Day: December 4, 2003

Bitrot

Simson Garfinkel writes about bitrot, saying, in so many words, that it won’t be that big of a problem. Jeremy Hedley warns, Cassandra-like, that for individuals, it might be pretty bad indeed.

I side with Garfinkel.

Like pretty much everyone who has been using a computer for more than 15 minutes, I’ve lost data. The problem of bitrot is one that is pretty widely recognized by now, even if we’re not sure exactly how best to guard against it. This awareness in itself is probably going to help minimize the problem: we may look back on the period from, say, the fifties to the nineties as an anomaly when we didn’t routinely plan on making data available to our future selves.

Bitrot is a three-layered problem:

The physical layer
If you can’t read a floppy, or whatever physical medium you’re using, you are sunk. This really breaks down into a couple of sub-layers: either the media itself has degraded (all media has a lifespan before it starts losing data; for some, like floppies, it’s pretty short), or the drive requires a connector and/or software drivers you can’t use with any current device.
The data layer
Fine, so by some chance your floppy is still good, but back in 1993 you were using MS Works 2 to store your business data, and there aren’t any programs that can read those files.
The cultural layer
This ties in with the data layer–some formats will almost certainly be well supported in the future, at least to the extent that format translators will exist to convert Ye Olde Data Phyle into the sleek and modern DataFile 3000. This comes down to how popular a format is/was, and whether it is clearly and publicly specified. The file format used by Word 2000, for example, is not publicly specified but is so widely used that a number of programmers have done pretty good jobs of reverse-engineering it. The PDF and RTF formats are publicly specified and very widely used. But MS Works 2? Nope.

So what can we do to avoid the heartbreak of bitrot in our own lives? A few things.

Back up
This should be obvious. My own backup strategy is to back up my home folder to an external hard drive daily, and to a magneto-optical disk (estimated to have 50-year data integrity) weekly.
Save files in publicly specified formats
As I wrote to a friend recently, “every time I save a file in Word format, I’m afraid I’m doing something that will come back to haunt me.” From now on, I’m saving my work as RTF. Plain text would be better, but RTF strikes a balance between preserving formatting and universality.
Move forward
This does not mean jumping on the bleeding edge and buying every gadget that comes along. It means recognizing when a physical or data format is on the way out, finding a safe successor, and moving to that. As long as you’ve got data you can read on a hard drive that works with your computer, and a backup you can read somewhere else, you should be in the clear indefinitely. Eventually we will see net-based storage that is convenient and affordable (we’re not quite there yet), and at that point, we won’t have any excuse for failures at the physical layer.
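For the "back up" and "move forward" advice, one way to catch trouble at the physical layer early is to compare checksums between your working files and the backup copy. What follows is only a rough sketch, not a description of my own setup: the function names `file_digest` and `find_mismatches` are made up for illustration, and it assumes the backup is a plain directory tree that mirrors the original.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_mismatches(original_dir, backup_dir):
    """List files that are missing from the backup or differ from it.

    Hypothetical helper: assumes backup_dir mirrors original_dir's layout.
    """
    original_dir = Path(original_dir)
    backup_dir = Path(backup_dir)
    problems = []
    for src in sorted(original_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original_dir)
        copy = backup_dir / rel
        if not copy.is_file():
            problems.append((rel.as_posix(), "missing"))
        elif file_digest(src) != file_digest(copy):
            problems.append((rel.as_posix(), "differs"))
    return problems
```

A "differs" entry does not tell you which copy went bad, only that the two no longer agree, which is the cue to look before the older medium gets any older.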

Pop the stack

Some programming languages have a concept called a “stack,” which is sort of like a stack of trays in a cafeteria. Each “tray” represents a value; you can “push” values onto the stack or “pop” them off. This is handy for a number of reasons, but the programmer has to keep careful track of what’s at the top of the stack–that is, how many times he’s pushed and popped. Push too much stuff on, and you’ve got a problem, because the computer only sets aside a certain amount of space for the stack.
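For the curious, here is a minimal sketch of the cafeteria-tray idea in Python. The class name `BoundedStack` and its `capacity` parameter are invented for this example; Python lists grow on their own, so the fixed limit is added by hand to mimic a computer that only sets aside so much room.

```python
class BoundedStack:
    """A stack with a fixed amount of space (illustrative, not a real API)."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed limit on how many trays fit
        self._items = []

    def push(self, value):
        # Pushing past the reserved space is the "problem" described above.
        if len(self._items) >= self.capacity:
            raise OverflowError("stack overflow")
        self._items.append(value)

    def pop(self):
        # Popping more times than you pushed is the other way to lose track.
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def depth(self):
        return len(self._items)


trays = BoundedStack(capacity=2)
trays.push("reality")
trays.push("TV show about reality")
top = trays.pop()  # the most recently pushed tray comes off first
```

Pushing a third tray onto a stack of capacity two raises `OverflowError` here, which is about how the rest of this post feels.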

A few recent items leave me feeling as if we’ve been pushing the stack on reality too much and risk overloading it.

  • The Onion reports on an “alternate reality TV show,” a brilliant idea that, as usual, captures something going on in the zeitgeist.
  • A hilarious review of non-existent games in Wired mentions Maximum Gamer:

    In this role-playing game, you are Todd Kellman, a world-class cyberathlete from the US. (Japanese and European versions are pending.) Gamers experience all the thrills of sitting in front of a computer screen as Kellman sits in front of his computer screen controlling the destiny of a fully rendered, computer-generated nerd sitting in front of a computer screen. This one was really popular, attracting crowds of attendees waiting for a chance to play. Or to watch somebody play. Or to watch somebody watch somebody play.

  • David Cronenberg played a character out of a David Cronenberg movie on Alias. David Cronenberg is the master of pushing and popping the reality stack so many times you get dizzy (cf. Videodrome, eXistenZ).
  • OK. So that’s all in fun. Then this morning, via an article linked in BoingBoing, I learned about There. There is a massively multiplayer online roleplaying game (MMORPG), but it is not like most such games, such as EverQuest. Rather than going around fighting and accumulating treasure, in There, you spend real money to acquire fake money, so that you can go fake shopping. As the article puts it, “Why go There when it looks just like here?”

In short, I think we collectively may need to get out more [he wrote, while sitting at his computer].