Will the internet become a new archaeological record?

The internet preserves unimaginable amounts of data, including countless minutiae of people's everyday lives. So here's the question: is this information worth saving for future generations to study, or is this the sort of data that's best forgotten?

According to recent estimates, two-thirds of all Americans store at least some personal information on the internet, and about half use social networks. When you throw in stuff like your browsing and search histories, that's a ton of data even for a single person, and it all adds up to what's been dubbed your "digital soul." And there's a debate brewing between "preservationists" and "deletionists" over just what should be done with all that data.

In theory, all this data could prove hugely useful to future archaeologists and sociologists eager to understand not just how the world worked in the early 21st century, but also how regular people interacted with it. Seemingly mundane documents and records can take on great importance even a few centuries later, and a well-preserved internet could prove to be the definitive archaeological record of our life and times.

But parts of the internet are already threatened with erasure - just look at the recent deletion and subsequent archival resurrection of Geocities - and the debate is crystallizing over whether this information is really worth keeping. That's the issue Sumit Paul-Choudhury deals with over at New Scientist, and he relates this analysis from Geocities preservationist Jason Scott:

The one question that gets asked most often, says Scott, is "Why bother to save this junk anyway?" His answer is that it's not junk: it's history. Geocities is a huge time capsule from the infancy of the World Wide Web. Its design values speak to the limitations of dial-up connections; its structure captures a time when no one had figured out how to navigate the web, where people built online homes in themed "neighbourhoods" called Hollywood or EnchantedForest. Its users' interactions with each other - via email addresses and guestbooks published openly without fear of spam - offer valuable insights into the birth of online culture.

But for all its potential scholarly value, internet preservation raises other, far thornier issues that can rear their heads in the shorter term. Viktor Mayer-Schönberger of the Oxford Internet Institute offers this example:

"A woman called in to a radio programme to tell me that her long-spent criminal conviction had been inadvertently revealed online. It had instantly destroyed her standing in the small community where she lived, the fresh start she had worked for years to achieve. This wasn't even something she had posted: it was someone else."

Mayer-Schönberger argues that forgetfulness is fundamental to the human experience, and so too should it have a place in our online lives. He's proposed files that "forget" over time, quietly erasing themselves after reaching a built-in expiration date. He's also suggested a sort of "digital rust", in which files actually start to decay unless active measures are taken to preserve them.

Ultimately, there probably isn't much sense in trying to either destroy or preserve all this data - like any other archaeological record, some of it will endure, some of it will fade away, and we're going to have to trust future archaeologists to understand it in some sort of context. The only difference, of course, is the sheer incomprehensible volume of information that we're leaving behind - and if future archaeologists ultimately just throw up their hands and delete the whole damn thing, I can't exactly blame them.

Check out the full article at New Scientist.