Babbage | Information storage

Tape rescues big data

A 60-year-old solution to a modern problem: where to put all those bits and bytes

By Economist.com

WHEN physicists throw the “on” switch on the Large Hadron Collider (LHC), between three and six gigabytes of data spew out of it every second. That is, admittedly, an extreme example. But the flow of data from smaller sources than CERN, the European particle-research organisation outside Geneva that runs the LHC, is also growing inexorably. At the moment it is doubling every two years. These data need to be stored. And that need for mass storage is reviving a technology which, only a few years ago, seemed destined for the scrapheap: magnetic tape.

Tape is the oldest computer storage medium still in use. It was first put to work on a UNIVAC computer in 1951. Tape sales fell steadily from 2008 and dropped by 14% in 2012, according to the Santa Clara Consulting Group. But that decline has now gone into reverse, with a 1% rise in the last quarter of 2012 and a 3% rise expected this year.

Alberto Pace, head of data and storage at CERN, says that tape has four advantages over hard disks for the long-term preservation of data. The first is speed. Although it takes about 40 seconds for an archive robot to select the right tape and put it in a reader, once it has loaded, extracting data from that tape is about four times as fast as reading from a hard disk.
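The trade-off between the 40-second loading delay and the fourfold reading speed can be put into numbers. The sketch below is illustrative only: the article gives no absolute disk speed, so the figure of 100 megabytes a second is an assumption, and the function names are invented for this example. It computes the size of file above which tape, despite the wait for the robot, delivers the data sooner than disk.

```python
# Break-even file size between disk and tape, from the figures in the text:
# a 40-second tape mount, then reads about four times as fast as disk.

def disk_seconds(size_mb, disk_mb_per_s):
    """Time to read size_mb megabytes straight from disk."""
    return size_mb / disk_mb_per_s

def tape_seconds(size_mb, disk_mb_per_s, mount_s=40.0, speedup=4.0):
    """Time to mount a tape, then read at speedup times the disk rate."""
    return mount_s + size_mb / (speedup * disk_mb_per_s)

def break_even_mb(disk_mb_per_s, mount_s=40.0, speedup=4.0):
    """Solve size/r = mount + size/(speedup*r) for size."""
    return mount_s * disk_mb_per_s * speedup / (speedup - 1.0)

# ASSUMPTION: 100 MB/s disk throughput, chosen only for illustration.
ASSUMED_DISK_MB_PER_S = 100.0

size = break_even_mb(ASSUMED_DISK_MB_PER_S)
print(f"Break-even size: about {size:.0f} MB")
print(f"Disk: {disk_seconds(size, ASSUMED_DISK_MB_PER_S):.1f} s, "
      f"tape: {tape_seconds(size, ASSUMED_DISK_MB_PER_S):.1f} s")
```

Under that assumption the break-even point is a little over five gigabytes. Anything larger, which describes most archival retrievals, comes off tape faster.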

The second advantage is reliability. When a tape snaps, it can be spliced back together. The loss is rarely more than a few hundred megabytes—a bagatelle in information-technology circles. When a terabyte hard disk fails, by contrast, the result is usually that all the data on it are lost. The consequence at CERN, specifically, is that a few hundred megabytes of its 100 petabyte tape repository are lost every year. Of the 50 petabytes of data held on hard disk, however, it loses a few hundred terabytes in the same period.
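A rough calculation shows how lopsided those loss rates are. The article says only "a few hundred" in each case, so the value of 300 used below is an assumption made purely for illustration. The ratio between the two rates does not depend on it, provided the same number is used for both.

```python
# Annual loss fractions at CERN, using decimal units (1 PB = 10**15 bytes).
MB = 10**6
TB = 10**12
PB = 10**15

# ASSUMPTION: "a few hundred" is taken as 300 in both cases.
ASSUMED_FEW_HUNDRED = 300

tape_lost_bytes = ASSUMED_FEW_HUNDRED * MB   # lost per year from tape
disk_lost_bytes = ASSUMED_FEW_HUNDRED * TB   # lost per year from disk

tape_loss_fraction = tape_lost_bytes / (100 * PB)  # 100 PB tape repository
disk_loss_fraction = disk_lost_bytes / (50 * PB)   # 50 PB held on disk

ratio = disk_loss_fraction / tape_loss_fraction

print(f"Tape loses {tape_loss_fraction:.1e} of its data a year")
print(f"Disk loses {disk_loss_fraction:.1e} of its data a year")
print(f"Disk loss rate is about {ratio:,.0f} times that of tape")
```

On these figures a byte held on disk is roughly two million times more likely to vanish in a given year than a byte held on tape.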

The third benefit of tapes is that they do not need power to preserve data held on them. Disks, by contrast, cannot simply be switched off when idle: stopping a disk rotating by temporarily turning off the juice, a process called power cycling, increases the likelihood that it will fail.

The fourth benefit is security. If a hacker with a grudge managed to break into CERN’s data centre, he could delete all 50 petabytes of the disk-held data in minutes. To delete the same amount from the organisation’s tapes would take years.

And tape has two other benefits, as Evangelos Eleftheriou, manager of storage technologies at IBM’s research laboratory in Zurich, points out. It is cheaper than disks (a gigabyte of disk storage costs 10 cents; of tape 4 cents), and it lasts longer. Tapes can still be read reliably three decades after something is recorded on them. For disks, that figure is around five years.
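To see what that price gap means at scale, the sketch below applies the two per-gigabyte figures to an archive the size of CERN's 100 petabyte tape repository. It is a back-of-the-envelope estimate under stated assumptions: it counts storage media only (no drives, robots, power or staff) and uses decimal units, with one petabyte equal to a million gigabytes.

```python
# Media cost of a 100 PB archive at the quoted prices per gigabyte.
GB_PER_PB = 1_000_000          # decimal units, an assumption of this sketch

DISK_CENTS_PER_GB = 10         # from the text
TAPE_CENTS_PER_GB = 4          # from the text
ARCHIVE_PB = 100               # size of CERN's tape repository, from the text

archive_gb = ARCHIVE_PB * GB_PER_PB

# Work in whole cents, then convert to dollars, to avoid rounding error.
disk_cost_dollars = archive_gb * DISK_CENTS_PER_GB // 100
tape_cost_dollars = archive_gb * TAPE_CENTS_PER_GB // 100
saving_dollars = disk_cost_dollars - tape_cost_dollars

print(f"Disk: ${disk_cost_dollars:,}")
print(f"Tape: ${tape_cost_dollars:,}")
print(f"Saving: ${saving_dollars:,}")
```

That works out at $10m for disk against $4m for tape, before accounting for the fact that the disks would need replacing several times over the life of the tapes.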

Tape will never be the whole answer to storing data, according to Dr Eleftheriou. But it forms a crucial part of a “storage hierarchy”. At the top of this are so-called hot data, those that need to be available for immediate access. These are best held in flash memory. Lukewarm data—those that people need to look at frequently, but not instantaneously—are best stored on disks. Cold data, the stuff in long-term storage, can be recorded on tape. But this cold store is by far the biggest repository. A report published in 2008 by Andrew Leung of the University of California, Santa Cruz, found that in general 90% of an organisation’s data become cold after a couple of months.

However, even today’s tape cartridges, which can hold up to six terabytes of compressed data, are not up to the job of dealing with the data deluge that is around the corner. Much higher densities than that are needed.

In 2010 Dr Eleftheriou and his team, in collaboration with Fujifilm, set a new record. They demonstrated a tape that can store 29.5 gigabits per square inch—which, for a standard 1km tape, translates as 35 terabytes of data on a single cartridge. But even that is not enough for Dr Eleftheriou. He has now set himself the challenge of developing a tape with a density of 100 gigabits per square inch, and also creating the equipment necessary to read it. If he is successful, a single cartridge will be able to store more than 100 terabytes.
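The claim of more than 100 terabytes follows from simple scaling. The sketch below assumes that cartridge capacity rises in direct proportion to areal density, with tape length, width and formatting overhead all unchanged from the 2010 demonstration. That is a simplification rather than anything IBM has stated, and the variable names are invented for this example.

```python
# Scale the 2010 record (29.5 Gbit per square inch, 35 TB per cartridge)
# up to the target density of 100 Gbit per square inch.
RECORD_DENSITY_GBIT_PER_SQ_IN = 29.5   # from the text
RECORD_CAPACITY_TB = 35.0              # from the text
TARGET_DENSITY_GBIT_PER_SQ_IN = 100.0  # from the text

# ASSUMPTION: capacity is proportional to areal density, all else equal.
scale = TARGET_DENSITY_GBIT_PER_SQ_IN / RECORD_DENSITY_GBIT_PER_SQ_IN
projected_capacity_tb = RECORD_CAPACITY_TB * scale

print(f"Density increase: {scale:.2f}x")
print(f"Projected capacity: about {projected_capacity_tb:.0f} TB per cartridge")
```

The projection comes to just under 120 terabytes, which squares with the article's "more than 100".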

The biggest challenge he and his colleagues face is not squeezing more barium-ferrite magnetic particles on to a tape in order to record more stuff; it is, rather, positioning the read/write head to within 10 nanometres, in order to read back correctly what has been recorded, while the tape is travelling under it at a speed of five metres a second. To put this task in perspective, a virus is about 40 nanometres across. Dr Eleftheriou is nevertheless confident that he will have a demonstration model ready in 2014.
