iSnare.com - Free Content Articles Directory
Search Results Clustering Demystified

 
Danny Wirken

Clustering can mean two or more computer systems working together, or multiple servers linked together, to handle variable workloads and to keep operating if one fails. It can also refer to data clustering, a data analysis technique that divides a data set into subsets whose elements share common traits. Search result clustering aims to change the way people search online by organizing search results into folders that group similar items together.

Why Clustering is Needed

The vast amount of information available online cannot be put to full use unless there is an effective means of organizing it. Clustering engines group search results based on textual and linguistic similarity. This basic similarity is supplemented by heuristics that programmers code based on what users prefer to see in clustered documents. Clusters are presented in the style of folders and sub-folders.
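To make the idea of grouping by textual similarity concrete, here is a minimal sketch in Python. It is an illustration only, not the method of any engine named in this article: the word-overlap (Jaccard) measure, the greedy grouping strategy, the 0.2 threshold and the function names are all assumptions chosen for simplicity.

```python
import re

def tokens(text):
    """Lowercase a snippet and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Share of words two token sets have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cluster_snippets(snippets, threshold=0.2):
    """Greedy single-pass clustering: put each snippet in the first
    cluster whose seed snippet is similar enough, else start a new one."""
    clusters = []  # list of (seed_tokens, member_snippets)
    for s in snippets:
        t = tokens(s)
        for seed, members in clusters:
            if jaccard(seed, t) >= threshold:
                members.append(s)
                break
        else:
            # No existing cluster was close enough (assumed threshold).
            clusters.append((t, [s]))
    return [members for _, members in clusters]

snippets = [
    "Jaguar cars for sale and dealer prices",
    "Used Jaguar cars and dealer listings",
    "Jaguar habitat in the rainforest",
    "The jaguar is a big cat of the rainforest",
]
groups = cluster_snippets(snippets)
for group in groups:
    print(group)
```

With these sample snippets the car-related results and the animal-related results end up in separate groups. A production engine would likely add the linguistic processing and hand-tuned heuristics described above, which this sketch leaves out.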

When a search engine provides millions of results for a particular query, the searcher can either sift through the endless pages of results or depend on the search engine’s judgment as to the most relevant results. Neither approach ensures that the searcher reaches the targeted information, which may remain buried under pages of results or may not meet the search engine’s criteria. Just as other things are clustered or organized, web searching becomes more useful when search results are organized.

Clustering engines automatically group results into categories that are intelligently selected from words and phrases contained in the search results. Categories are intended to reach human-level accuracy and to offer hierarchical drill-down capability in a familiar folder-style interface. Mind-numbing lists need not be scrolled through or ignored, as the main themes of the first 300 to 500 results can be viewed right on the first page. A quick overview of the types of information available on a topic lets the searcher focus immediately on the area of interest.
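As a rough illustration of how a folder name might be selected from the words in the results themselves, the following Python sketch labels each group with the word that appears in the most snippets. The stop-word list, the tie-breaking rule and the function name `label_cluster` are assumptions made for this example; it produces flat labels only and does not attempt the hierarchical drill-down described above.

```python
import re
from collections import Counter

# A tiny, assumed stop-word list; real systems use much larger ones.
STOPWORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}

def label_cluster(snippets, query):
    """Pick a folder name: the word appearing in the most snippets,
    ignoring stop words and the query terms themselves."""
    skip = STOPWORDS | set(query.lower().split())
    counts = Counter()
    for s in snippets:
        # Count each word once per snippet.
        words = set(re.findall(r"[a-z0-9]+", s.lower())) - skip
        counts.update(words)
    if not counts:
        return "Other"
    # Highest snippet count wins; ties are broken alphabetically.
    word, _ = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return word.capitalize()

groups = [
    ["Jaguar cars for sale", "Used Jaguar cars and dealers"],
    ["Jaguar habitat in the rainforest", "The jaguar is a rainforest cat"],
]
folders = {label_cluster(group, "jaguar"): group for group in groups}
for name, members in folders.items():
    print(name, members)
```

Excluding the query terms matters here: every result for "jaguar" contains that word, so it cannot distinguish one folder from another.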

As search engines became better at returning a large number of relevant results, it became more difficult to navigate meaningfully through all of them. A typical searcher does not take the time to view results beyond the first page, and so is likely to miss results that would have been relevant and useful. Clusters make it possible for results found on the tenth page to be just a click away. Related items can also be viewed together without much effort. Clustering can even reveal unexpected relationships between words, ideas and concepts.

A good cluster has a readable description and helps narrow down a search to find exact results. Many clustering engines query multiple search engines and combine the results to be clustered and displayed on one screen. Each result list comes with information on the total number of results retrieved and clustered. The clustering engine’s own heuristics determine which pages are favored. Search engines sometimes return multiple copies of the same page with slightly different URLs, but this is minimized in search result clustering because clustering engines do not repeat results with similar descriptions. Clusters are specific enough that repeated documents are very rare. Some engines offer advanced search features that allow searchers to specify which sources should be searched, the number of results desired, the allowable waiting time, the desired language and the filtering out of offensive content.
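The combining step can be sketched as follows. This Python example merges result lists from several sources and drops copies of a page whose URLs differ only slightly. The normalization rules (ignoring "www.", a trailing slash, the query string and the fragment) are assumptions for illustration; the article describes engines comparing descriptions, which this sketch does not do, and dropping the query string would wrongly merge pages on sites that rely on it.

```python
from urllib.parse import urlsplit

def normalize_url(url):
    """Reduce a URL to a comparison key: lowercase host without 'www.',
    path without trailing slash, query string and fragment dropped."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    return host + path

def merge_results(result_lists):
    """Combine result lists from several engines, keeping the first
    occurrence of each page and dropping near-duplicate URLs."""
    seen = set()
    merged = []
    for results in result_lists:
        for url, title in results:
            key = normalize_url(url)
            if key not in seen:
                seen.add(key)
                merged.append((url, title))
    return merged

# Hypothetical result lists standing in for two different engines.
engine_a = [("http://www.example.com/cats/", "Cats"),
            ("http://example.org/dogs", "Dogs")]
engine_b = [("http://example.com/cats", "All about cats"),
            ("http://example.net/birds", "Birds")]
merged = merge_results([engine_a, engine_b])
for url, title in merged:
    print(url, title)
```

The merged list keeps the first copy of the cats page and discards the second, leaving three results to pass on to the clustering step.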

Search Engines that Cluster

Google Sets does not provide search results; rather, it helps in finding terms similar to the ones entered. This allows the user to create more complex queries in one area and to brainstorm on how to put a search together. Google Sets is Google Labs’ clustering agent.

WiseNut is a full-text search engine that provides related topics in addition to the results for any search term entered. This feature is called the WiseGuide. Some results have subtopics, which are shown underneath the clustered results. Next to each clustered result is a link whose keywords can be used to run another search, which produces a different set of clustered results in addition to the web page results. This search engine has been bought by LookSmart.

Teoma has been dubbed the “Google Killer” because of its very interesting clustering technology. A single search produces four sets of results. Those at the top left are sponsored results, those at the bottom left are non-sponsored website results, those at the top right are suggestions for refining the search and those at the bottom right are link collections from experts and enthusiasts. The link collections are suitable for general information needs, while the suggestions are for more specific searches. Clicking on any of them runs the search again and returns a different set of site results. Teoma has been purchased by AskJeeves.

Infonetware.com is more of a demonstration of Infonetware’s Real Term Technology than a search engine. The results page is framed: the left frame lists topics related to the search term, while the web page search results appear in the right frame. It works with full searching.

Oingo uses the Open Directory Project as its search source. The search results page gives a drop-down list of potential meanings. Beneath it are the list of categories, in order of relevance to the search, and the site results from the directory itself. It is more useful for general term searches or for search terms that fall in a broad category.

Vivisimo is a meta-search engine that clusters its results. It provides a very simple front page with search results that are organized in groups. The page design makes it easy to explore several categories without having to “lose your place”. Clusty is the consumer search destination powered and owned by Vivisimo. It queries results from Ask, MSN, Open Directory, LookSmart, Gigablast and WiseNut. These sites were chosen because of their accurate results and quick return speeds.

Query Server offers several types of search on the left side of the front page. Each search type has more or less the same interface, and all of them cluster their results. Search results are presented in a frame on the right side of the site.

Surfwax offers both subscription-based and free services. A focus link appears in the upper left corner after a search is entered. The focus words it offers can be used in addition to the search term. They are divided into narrower and broader categories and contain generic words rather than links to specific people or places.

Northern Light News search requires a search to have a certain number of results before they are clustered into folders. However, the folder listing does not provide information about the contents of a particular folder, although subfolders are provided for broad topics. Search results are listed in date order.

Clustering search engines break up several hundred results into manageable packages. Suggestions are provided so that the use of information is maximized and the search itself becomes a lot easier. This matters because a search query cannot always be specific enough to target the right information at once.

Article published on July 24, 2006 at Isnare.com
 
