Wednesday, May 27, 2015

How Large Is the Internet?

John McAfee | May 26, 2015

According to current estimates, the Deep Web, or UnderNet, contains somewhere between 750,000 petabytes and 2 exabytes of information (1 exabyte = 1,000,000 terabytes (TB)). The numbers defy comprehension by simple human minds.
To put 2 exabytes into perspective, the world's largest library, the Library of Congress, which contains virtually everything written since the invention of writing, houses approximately 8TB of data. As an aside, the Wire in 2013 estimated that the NSA collects 29TB of data from its various sources every day.
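For readers who want to check the scale themselves, the comparison can be worked through in a few lines of Python. This is only a rough sketch: it assumes decimal (SI) prefixes, so that 1 exabyte = 1,000,000 TB, and it takes the 2 exabyte, 8TB and 29TB-per-day figures quoted above at face value. The variable names are illustrative.

```python
# Assumption: decimal (SI) prefixes, with everything expressed in terabytes.
TB_PER_EB = 1_000_000

deep_web_tb = 2 * TB_PER_EB      # upper Deep Web estimate quoted above
library_of_congress_tb = 8       # Library of Congress figure quoted above
nsa_tb_per_day = 29              # daily NSA collection figure quoted above

# How many Libraries of Congress fit into 2 exabytes?
libraries = deep_web_tb / library_of_congress_tb

# How long would it take to gather 2 exabytes at the quoted daily rate?
years_at_nsa_rate = deep_web_tb / nsa_tb_per_day / 365

print(f"{libraries:,.0f} Libraries of Congress")
print(f"{years_at_nsa_rate:,.0f} years at the quoted collection rate")
```

On these figures, 2 exabytes is about 250,000 times the Library of Congress, and collecting that much at 29TB a day would take roughly 189 years.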
The Deep Web is the new frontier of information science, but massive technical challenges are still to be understood and resolved in order to mine this wealth of information.


Some sites have login passwords that restrict the ability of web crawlers to gain access and to index information. There are data incompatibilities, and form data that requires specific inputs in order to gain specific responses. There are internal pages with no external links, unpublished and unlisted posts, and information in JPEG or MP4 format, the contents of which cannot be analysed by indexing mechanisms.
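The obstacles listed above can be pictured as a simple checklist that a crawler would run against each page. The sketch below is purely illustrative: the `Page` record, its fields and the `indexing_obstacles` function are hypothetical names invented for this example, not part of any real crawler, and the URL is a placeholder.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    """Hypothetical description of a page as a crawler might see it."""
    url: str
    requires_login: bool = False
    needs_form_input: bool = False
    inbound_links: int = 1
    content_type: str = "text/html"


def indexing_obstacles(page: Page) -> List[str]:
    """Return the reasons, taken from the list above, why a page resists indexing."""
    obstacles = []
    if page.requires_login:
        obstacles.append("login required")
    if page.needs_form_input:
        obstacles.append("content returned only for specific form inputs")
    if page.inbound_links == 0:
        obstacles.append("no external links point to it")
    if page.content_type in ("image/jpeg", "video/mp4"):
        obstacles.append("JPEG or MP4 content that cannot be read as text")
    return obstacles


# Placeholder example: an unlinked video sitting behind a login.
hidden = Page("https://example.com/members/clip", requires_login=True,
              inbound_links=0, content_type="video/mp4")
for reason in indexing_obstacles(hidden):
    print(reason)
```

A page with none of these properties returns an empty list, which is the small, easily indexed portion of the web that ordinary search engines already cover.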


