So now it finally comes out: there's a method to the madness of Google's super-generosity in supplying users with mega-gigabytes of free storage space for e-mail, photos, and even upload/download space. Simply put, Google is in league with the New World Order / the Trilateral Commission / the Freemasons / the IRS / the CIA and/or the NFL (take your pick).
Or not. But if not, that raises a question: why preserve users' search data, not as an aggregate, but as information tied to your user account?
The latest Internet scandal caught major search engine AOL committing a serious (from the user's point of view, of course) gaffe when it inadvertently released information about 19 million search requests made by more than 658,000 AOL subscribers during the three months ending in May. It's all over the Internet! Legal niceties prevent me from publishing addresses where one can obtain this data – which is nothing more than a backed-up log file that somehow made its way into the public domain from an AOL server – but an astute Web search engine user should be able to find a copy of the 439 MB gzipped download, which expands to just over 2 GB, in a matter of seconds (as I write this, I count no fewer than 50 backup servers hosting this file!).
So how did this potentially embarrassing (if not incriminating) information get off its backup server and onto an AOL server open to the Internet? Well, AOL won't tell, of course, but as one who was for many years a systems administrator “insider” involved in this kind of backup work, I can think of about 100 ways for a zipped backup file to end up on the wrong server or disk. It need not be the result of a hack by an outsider or a deliberate attempt by an insider to embarrass the company. It can simply be part of the very common day-to-day events that occur in a system back office: having to reformat a server and copy its contents to another computer for backup, for example, or an automated backup script directing the data to the wrong backup server.
The truth of the matter is that this information means little to outsiders like you and me; all it contains is a list of search queries associated with user account numbers. Without the account data, of course, there's no way to match up specific users with the leaked/lost information. But that account data exists somewhere in the company, and if push came to shove, the authorities could demand the records that link specific users to their search queries, for use as evidence in criminal cases, for example.
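To make that point concrete, here is a minimal Python sketch of the difference between holding only the log and holding the log plus the account records. Everything in it is an assumption for illustration: the account numbers, the queries, the subscriber labels, and the function names are invented, and nothing here reflects the actual layout of the AOL file.

```python
from collections import defaultdict

# Hypothetical leaked log: (account number, search query). All values invented.
leaked_log = [
    (1001, "cheap flights to denver"),
    (1002, "how to fix a leaky faucet"),
    (1001, "denver hotels downtown"),
]

# Hypothetical account table that never leaves the company.
internal_accounts = {1001: "subscriber_a", 1002: "subscriber_b"}


def group_by_account(log):
    """Group queries by account number; no subscriber identity is involved."""
    grouped = defaultdict(list)
    for account_number, query in log:
        grouped[account_number].append(query)
    return dict(grouped)


def link_to_subscribers(log, accounts):
    """Join the log against an account table to attach a subscriber to each query.

    Accounts missing from the table come back as None.
    """
    return [(accounts.get(account_number), query) for account_number, query in log]
```

With the log alone, `group_by_account(leaked_log)` yields queries bucketed under bare numbers. Only someone who also holds a table like `internal_accounts` can call `link_to_subscribers` and get subscribers attached to queries, which is exactly the data the authorities would have to demand.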
This is a potential problem for anyone who types anything into any search engine, of course, but this case is relevant to both AOL and Google users. (Google owns a 5 percent stake in AOL, which also accounted for about $330 million of the search engine's revenue during the first half of this year. AOL also depends on Google's algorithms for its search results.)