Wayback is Huge

How the Wayback Machine Works

In the Wayback Machine, currently there are 10 billion Web pages, collected over five years. That amounts to 100 terabytes, which is 100 million megabytes. So if a book is a megabyte, which is about what it is, and the Library of Congress has 20 million books, that's 20 terabytes. This is 100 terabytes. At that size, this is the largest database ever built. It's larger than Walmart's, American Express', the IRS. It's the largest database ever built. And it's receiving queries -- because every page request when people are surfing around is a query to this database -- at the rate of 200 queries per second.
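The figures in that passage are easy to check. Here is a minimal sketch of the arithmetic, assuming decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and taking the quoted estimates at face value. The variable names are mine, and the average page size and queries-per-day numbers are derived from the quoted figures rather than stated in the interview.

```python
# Back-of-the-envelope check of the numbers quoted above.
# Assumption: decimal (SI) units, which is what makes
# "100 terabytes = 100 million megabytes" come out exactly.
MB = 10**6
TB = 10**12

archive_bytes = 100 * TB        # quoted size of the archive
pages = 10 * 10**9              # quoted number of Web pages
book_bytes = 1 * MB             # quoted rough size of one book
loc_books = 20 * 10**6          # quoted number of Library of Congress books
queries_per_second = 200        # quoted query rate

archive_mb = archive_bytes // MB            # archive size in megabytes
loc_bytes = loc_books * book_bytes          # Library of Congress estimate in bytes
loc_tb = loc_bytes // TB                    # same estimate in terabytes
ratio = archive_bytes / loc_bytes           # archive size relative to that estimate

# Derived figures (not stated in the interview):
avg_page_bytes = archive_bytes // pages     # implied average bytes per page
queries_per_day = queries_per_second * 86_400

print(f"Archive: {archive_mb:,} MB")
print(f"Library of Congress estimate: {loc_tb} TB")
print(f"Archive / Library of Congress: {ratio:.0f}x")
print(f"Implied average page size: {avg_page_bytes:,} bytes")
print(f"Implied queries per day: {queries_per_day:,}")
```

So under these assumptions the archive is about five times the Library of Congress estimate, and the implied average page is roughly 10 KB.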

1 Comment

Great interview... Interesting way of doing parallel computing -- instead of using some PSI, MOSIX, etc., which require you to write programs with that nature in mind, just write a batch processing system in Perl...

About this Entry

This page contains a single entry by fozbaca published on January 22, 2002 8:54 PM.
