Approach:

  • Editor-based priority management.
  • Maintains an internal database of Yahoo sites.
  • Refers to the greater Google architecture for general web searches.
  • Small local database allows for human interaction in prioritizing.

Analysis:

This approach has several inherent advantages. Primarily, anything referenced by Yahoo has been approved by human content editors. These individuals have decided that something on the page pertains to a specific topic. This helps mitigate the tendency for users to find irrelevant "net data" that claims to be about a given topic but in reality has little or no relation to it. The editorial system also allows users to browse pages by topic, limiting the amount of data they need to review.

Unfortunately, this model for web searching also imposes some very strict limits on Yahoo. Human editors cannot physically catalogue the 2 billion pages currently in existence, let alone the 7 million new pages created daily, so the size of Yahoo's database is bounded by human capacity. By limiting themselves to sites that fit into their hierarchy of categories, they also lose the fringe elements of the web. Anyone hoping to access these pages will need to use one of the broader search engines.
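To put this limit in numbers, here is a rough back-of-the-envelope sketch. The page counts are the figures quoted above; the review rate of 200 pages per editor per day and the 250 working days per year are assumptions chosen only for illustration, not Yahoo figures.

```python
# Figures quoted in the text above.
EXISTING_PAGES = 2_000_000_000
NEW_PAGES_PER_DAY = 7_000_000

# Assumptions for illustration only (not Yahoo figures).
PAGES_PER_EDITOR_PER_DAY = 200
WORKING_DAYS_PER_YEAR = 250


def editors_needed_for_new_pages(new_per_day=NEW_PAGES_PER_DAY,
                                 rate=PAGES_PER_EDITOR_PER_DAY):
    """Editors required just to keep pace with newly created pages."""
    return -(-new_per_day // rate)  # ceiling division


def editor_years_for_backlog(existing=EXISTING_PAGES,
                             rate=PAGES_PER_EDITOR_PER_DAY,
                             working_days=WORKING_DAYS_PER_YEAR):
    """Editor-years required to review every page that already exists."""
    return existing / (rate * working_days)


if __name__ == "__main__":
    print(editors_needed_for_new_pages())
    print(editor_years_for_backlog())
```

Under these assumed rates, tens of thousands of editors would be needed for the new pages alone, which is why a hand-built directory can only ever cover a small slice of the web.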

Approach:

  • Makes use of PageRank and Anchor Text prioritizing algorithms.
  • Has the largest database of cached sites of all major competitors.
  • Emphasizes high reliability in searches - they will return valid and pertinent results.

Analysis:

Google's approach to web searching has led them to the peak of the industry in a very short time. Eclipsing "established" engines like GoTo and AltaVista, their rapid ascent rests mostly on their ability to deliver valuable results for users' searches. This hinged on the development of their PageRank prioritizing algorithm.

This algorithm is designed to return valuable content to the user. It models a random web surfer who starts at some website and keeps clicking on random links: the probability that the surfer ends up on any given page corresponds directly to that page's PageRank. Once the search is narrowed to pages that match the content-specific terms of the query, ordering those pages by PageRank gives Google's design.
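The random-surfer idea can be sketched as a short power iteration. This is a minimal illustration, not Google's production code: the damping factor of 0.85 (the chance that the surfer follows a link rather than jumping to a random page), the uniform handling of pages without outgoing links, and the four-page toy graph are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank for a small link graph.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its rank; ranks sum to 1.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the surfer equally likely to be on any page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a random page.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Otherwise the surfer follows one outgoing link at random.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no links: treat it as linking to every page.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to "c", so it should rank highest.
TOY_WEB = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}

if __name__ == "__main__":
    for page, score in sorted(pagerank(TOY_WEB).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 4))
```

Note that a page's rank depends only on which pages link to it and how highly those pages rank themselves; nothing on the page itself enters the calculation.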

Additional benefits of this approach include the elimination of the human dependency element. This means that the size of Google's database is unbounded by the efforts of editors - it can grow as fast as the technology can support it. This approach also makes Google resistant to the most common forms of Internet spam. Repetition of a phrase on a website, for instance, does nothing to raise that site's PageRank, because the rank depends on links from other pages rather than on the page's own text.

The only significant drawback currently facing Google is the speed of its crawler and the cost of memory. If the Internet continues to double in size annually, Google must increase the speed of its web crawler to match and also acquire space to store all of these new sites. This could prove both costly and difficult.
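A short sketch shows how quickly annual doubling compounds. The starting point of 2 billion pages is the figure quoted earlier in this document; the average stored size of 10 KB per page is an assumption made purely for illustration.

```python
def projected_pages(current_pages, years, growth_factor=2):
    """Pages on the web after `years` of growth at `growth_factor` per year."""
    return current_pages * growth_factor ** years


def storage_terabytes(pages, kilobytes_per_page):
    """Storage needed for `pages` cached pages, in decimal terabytes."""
    return pages * kilobytes_per_page / 1_000_000_000


if __name__ == "__main__":
    AVG_PAGE_KB = 10  # assumed average cached page size, illustration only
    for year in range(4):
        pages = projected_pages(2_000_000_000, year)
        print(year, pages, storage_terabytes(pages, AVG_PAGE_KB))
```

Whatever the real page size, the point is the same: both the crawl workload and the storage requirement grow eightfold in three years if the doubling continues.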

Approach:

  • Auction off priority to the highest bidding company.
  • Provide users with reliable data based on this bidding process.

Analysis:

The business model being supported by GoTo represents a company in need of profits and willing to do anything to achieve them. It is difficult to believe that, simply because a company can afford to purchase a high priority, its data on a given topic is more valid than that of any other source. This penalizes sites that cannot afford to pay to be listed but offer an unbiased look at the field in question. Clearly, this plan jeopardizes the reliability of the search and causes users to question the value of their results.
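The objection can be made concrete with a small sketch that orders the same listings two ways. The `Listing` type, the sample sites, the bids, and the relevance scores are all hypothetical; GoTo's actual ranking code is not described in this document.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    url: str
    bid: float        # hypothetical amount paid for placement, in dollars
    relevance: float  # hypothetical relevance score between 0 and 1


def rank_by_bid(listings):
    """Order results the auction way: highest bidder first."""
    return sorted(listings, key=lambda item: item.bid, reverse=True)


def rank_by_relevance(listings):
    """Order results by how well they match the topic."""
    return sorted(listings, key=lambda item: item.relevance, reverse=True)


# Made-up listings for a single query.
SAMPLE = [
    Listing("independent-review.example", bid=0.00, relevance=0.9),
    Listing("big-retailer.example", bid=1.50, relevance=0.4),
    Listing("small-shop.example", bid=0.25, relevance=0.7),
]

if __name__ == "__main__":
    print([item.url for item in rank_by_bid(SAMPLE)])
    print([item.url for item in rank_by_relevance(SAMPLE)])
```

In this made-up data the most relevant site pays nothing and therefore lands last in the auction ordering, which is exactly the penalty described above.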

Approach:

  • Provide a three-tiered prioritizing system to maximize results for users.
  • Focus on enhanced web crawling capabilities to maximize scalability - goal is to surpass Google when it can no longer keep up with demand.

Analysis:

This is an outline of the WiseNut backend structure. It contains both the user query lookup aspect and the web crawling component. The major advances supported by WiseNut take place in Zyborg, their web crawler. It can crawl up to 50 million websites per day, which will allow it to keep up with the rate at which the web is growing.
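As a rough check on that claim, the crawl rate can be compared with the growth figures quoted earlier in this document (2 billion existing pages, 7 million new pages per day). The sketch below assumes that the "50 million websites per day" figure can be read as 50 million pages per day, and that every crawled page is new to the database; both are simplifications.

```python
def days_to_catch_up(backlog_pages, crawl_per_day, new_pages_per_day):
    """Days needed to crawl the backlog while the web keeps growing.

    Returns None if the crawler is no faster than the web's growth,
    in which case it never catches up.
    """
    net_progress = crawl_per_day - new_pages_per_day
    if net_progress <= 0:
        return None
    return backlog_pages / net_progress


if __name__ == "__main__":
    print(days_to_catch_up(2_000_000_000, 50_000_000, 7_000_000))
```

With these figures the crawler gains on the web by 43 million pages a day, so an initial pass over the existing web would take on the order of weeks rather than years, although this ignores the need to revisit pages that change.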

WiseNut also implements the latest prioritizing algorithms. In addition to using a modified version of PageRank, Zyborg reviews the text near each link for keywords as it scans in a page. It then uses these keywords to help place the page into a category, creating an automated category directory similar to the manually created one used by Yahoo. This would give WiseNut the benefits of the Yahoo directory without the manual editors that limit the size of the cached database.
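A minimal sketch of this kind of keyword-based categorization follows. It is not WiseNut's actual algorithm, which is not described here: the category names, the keyword sets, and the simple count-the-matches rule are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical categories and keywords, for illustration only.
CATEGORY_KEYWORDS = {
    "sports": {"football", "league", "score", "team"},
    "finance": {"stock", "market", "bank", "loan"},
}


def categorize(text_near_link, category_keywords=CATEGORY_KEYWORDS):
    """Pick the category whose keywords appear most often near the link.

    Returns None when no keyword from any category is found.
    """
    words = re.findall(r"[a-z]+", text_near_link.lower())
    counts = Counter()
    for word in words:
        for category, keywords in category_keywords.items():
            if word in keywords:
                counts[category] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    print(categorize("Latest football league score tables"))
    print(categorize("Compare bank loan rates and stock market news"))
    print(categorize("Click here"))
```

Because the category comes from text a crawler has already fetched, the directory can grow at crawl speed rather than at the speed of human editors.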

The problems being encountered by WiseNut have very little to do with their technological savvy. Currently, they are up against the extremely well-entrenched household name Google. Only by overcoming their publicity disadvantage can they ever hope to force a showdown between the two technologies. WiseNut is also several years behind in crawling the web - they need to build their database of cached sites up to match that of Google. Once they have done so, they can begin the battle in earnest for the title of king of the web.