About An Investigation of Documents from the World Wide Web
An Investigation of Documents from the World Wide Web: A paper by Woodruff, Aoki, Brewer, Gauthier, and Rowe describing their analysis of over 2.6 million HTML documents collected by their Inktomi Web crawler. The authors examined many characteristics of these documents, including size, number and types of tags and attributes, file extensions, and links.