iSnare.com - Free Content Articles Directory
The Google Goal Of Indexing 100 Billion Web Pages

 
Danny Wirken

Google’s Goal of Quality Search

In their paper 'The Anatomy of a Large-Scale Hypertextual Web Search Engine', Sergey Brin and Lawrence Page make it evident that Google’s goal has always been to be one of the best search engines in terms of the quality of its results. Brin and Page knew, however, that to achieve this Google needed to store information efficiently and cost-effectively and to have excellent crawling, indexing, and sorting techniques. Google aimed not only to give quality results but also to produce them as fast as possible. Google started as a high-quality search engine and continues to be the best search engine today. It has stayed true to its original intent: to be a search engine that not only crawls and indexes the web efficiently but also produces more satisfying results than other existing search engines.

To stay true to their goal of providing the best search results, Google knew right from the start that the search engine had to be designed to keep up with the web’s growth. According to Brin and Page, “In designing Google we have considered both the rate of growth of the Web and technological changes. Google is designed to scale well to extremely large data sets. It makes efficient use of storage space to store the index.” They knew that they needed a great deal of space to store an ever-growing index.

Google’s index, which started out at 24 million web pages, was large for its time and has since grown to around 25 billion web pages, still keeping Google ahead of its competitors. However, Google is a company that doesn’t settle for just beating the competition. It truly aims to give its users the best service there is, and for a search engine that means giving users access to all, or at least most, of the quality information available on the web.

Google’s New System for Indexing More Pages

As mentioned earlier, Google aims to give access to even more information and has been devoting much time and effort to realizing this goal. A new patent entitled 'Multiple Index Based Information Retrieval System', filed by Google employee Anna Patterson, might be the answer to the problem. The patent, filed in January 2005 and published in May 2006, suggests that Google might be aiming to expand its index to as many as 100 billion web pages or even more.

According to the patent, conventional information retrieval systems, more commonly known as search engines, are able to index only a small part of the documents available on the Internet. Estimates put the number of web pages on the Internet as of last year at around 200 billion; however, Patterson claimed that even the best search engine (that is, Google) was able to index only 6 to 8 billion web pages. The disparity between the number of indexed pages and the number of existing pages clearly signaled the need for a new breed of information retrieval system. Conventional systems simply could not index enough web pages to give users access to a large enough share of the information available on the web.
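To put that gap in perspective, the figures quoted above can be turned into a rough coverage percentage. The short Python sketch below uses only the numbers cited in this article (an estimated 200 billion pages and an index of 6 to 8 billion pages); the function and constant names are hypothetical, and the result is only as good as those estimates.

```python
def coverage_percent(indexed_pages: float, total_pages: float) -> float:
    """Return the share of existing pages that are indexed, as a percentage."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return 100.0 * indexed_pages / total_pages


# Estimate of existing web pages quoted in the article (an assumption, not a measurement).
ESTIMATED_WEB_PAGES = 200e9

# Lower and upper bounds of the index size attributed to the best search engine.
low = coverage_percent(6e9, ESTIMATED_WEB_PAGES)
high = coverage_percent(8e9, ESTIMATED_WEB_PAGES)

print(f"Indexed share of the web: {low:.0f}% to {high:.0f}%")
```

Under these estimates, even the largest index covered only a few percent of the web.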

The Multiple Index Based Information Retrieval System, however, is up to the challenge and is Google’s answer to the problem. Two characteristics of the new system make it stand out from conventional systems. One is its “capability to index an extremely large number of documents, on the order of a hundred billion or more”. The other is its capability to “index multiple versions or instances of documents for archiving…enabling a user to search for documents within a specific range of dates, and allowing date or version related relevance information to be used in evaluating documents in response to a search query and in organizing search results.” With the new system developed by Patterson, Google now has the ability to expand its index to unprecedented proportions as well as to improve document analysis and processing, document annotation, and even ranking according to contained and anchor phrases.

History of Google’s Index Size

Google started out with an index of around 24 million web pages in 1996. By August of 2000, Google had managed to grow its index to approximately one billion web pages. In September of 2003, Google’s front page boasted an index of 3.3 billion web pages. Microdoc, however, revealed that the actual number of web pages Google had indexed at that time was already more than five billion. In its article 'Google Understates the Size of Its Database', Microdoc emphasized that Google specialized not only in simplicity but also in understating its power and complexity. Google was still managing to stay ahead of its competitors and continued to surprise everyone with what it had up its sleeve.

As Google’s index continued to grow, the number on its front page grew impressively large as well before it plateaued at eight billion web pages. This was around the time that Patterson filed the new patent. Then in 2005, with controversies over index size growing, Google decided to stop counting in public and simply claimed that its index was three times larger than that of its nearest competitor. Google also maintained that what mattered was not just the number of indexed pages but how relevant the returned results were. In September of 2005, as part of Google’s 7th anniversary, Anna Patterson, the same software engineer who filed the patent on the Multiple Index Based Information Retrieval System, posted an entry on Google’s official blog claiming that the index was now 1,000 times larger than the original index. This pegged the index at around 24 billion web pages, about a fourth of Google’s goal of indexing 100 billion web pages. It seems, then, that Google must have started using the new system in mid-2005. With the new system in place, we can only wait and see how fast Google will reach the goal of 100 billion web pages in its index. Most likely, once Google has reached that goal, it will set an even higher one in order to keep providing quality service.
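The arithmetic behind the 24 billion figure and the "about a fourth" comparison can be checked with a few lines of Python. This is only a sketch built on the numbers quoted in this article (an original index of 24 million pages, a 1,000-fold increase, and a goal of 100 billion pages); the constant names are hypothetical.

```python
# Figures quoted in the article; treated here as assumptions, not measurements.
ORIGINAL_INDEX_PAGES = 24_000_000      # index size in 1996
GROWTH_FACTOR = 1_000                  # "1,000 times larger than the original index"
GOAL_PAGES = 100_000_000_000           # goal of 100 billion pages

# Implied index size in September 2005.
implied_index = ORIGINAL_INDEX_PAGES * GROWTH_FACTOR

# Fraction of the 100 billion goal that the implied index represents.
share_of_goal = implied_index / GOAL_PAGES

print(f"Implied index size: {implied_index:,} pages")
print(f"Share of the goal: {share_of_goal:.0%}")
```

The implied index is 24 billion pages, or 24 percent of the goal, which is consistent with the article's "about a fourth".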

Article published on September 15, 2006 at Isnare.com
 