About Random Sampling from a Search Engine's Corpus
Random Sampling from a Search Engine's Corpus, by Ziv Bar-Yossef and Maxim Gurevich. Technical report, August 2006. Two novel random sampling algorithms are used to collect comparative statistics on the corpora of Google, MSN Search, and Yahoo. The Open Directory Project (ODP) is used to build a test search engine and query pool.