Biocomputing Storage Solutions: The Science and Where It Actually Stands

The data storage problem is quietly becoming one of the more urgent engineering challenges of the next decade. Global data creation is growing at approximately 23 percent per year. Current storage technologies (hard drives, solid-state drives, magnetic tape) require physical space, significant energy, and temperature-controlled environments, and they have functional lifespans measured in years to decades. The global data centres that house humanity’s accumulated information already consume more electricity than many entire countries.

DNA data storage is not a new idea: scientists have been exploring it theoretically since the 1960s. It has recently crossed from theoretical possibility to experimentally demonstrated reality, in ways that make practical applications plausible within a reasonable timeframe. The biology is extraordinary: a single gram of DNA can, in theory, store approximately 215 petabytes of data. At that density, all the data generated by humanity so far could be stored in a volume of DNA smaller than a room. No engineered medium comes close to this information density, which is the product of roughly four billion years of evolution.
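To put the density figure in perspective, the arithmetic can be sketched in a few lines of Python. The 215 petabytes per gram figure is the one quoted above. The 100-zettabyte archive size is purely an illustrative assumption, not a measured total, and the function name is hypothetical.

```python
# Back-of-the-envelope mass estimate for DNA storage.
# DENSITY_BYTES_PER_GRAM comes from the 215 PB/g figure quoted in the text;
# the archive size used below is an illustrative assumption only.

PETABYTE = 10**15
ZETTABYTE = 10**21

DENSITY_BYTES_PER_GRAM = 215 * PETABYTE


def dna_mass_grams(num_bytes: float) -> float:
    """Return the mass of DNA (in grams) needed to hold num_bytes."""
    return num_bytes / DENSITY_BYTES_PER_GRAM


if __name__ == "__main__":
    assumed_archive = 100 * ZETTABYTE  # hypothetical archive size
    kilograms = dna_mass_grams(assumed_archive) / 1000
    print(f"{kilograms:.0f} kg of DNA")  # prints "465 kg of DNA"
```

Under that assumption the result is roughly 465 kg of DNA. The estimate ignores the storage medium itself, redundancy, and indexing overhead, all of which would increase the real figure.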

How DNA Storage Works

Data is encoded into DNA by converting binary information into sequences of the four DNA bases — adenine, thymine, guanine, and cytosine. These sequences are then synthesised as physical DNA strands using DNA synthesis technology, which has become substantially cheaper and faster over the last decade (following the same kind of cost curve that semiconductor manufacturing followed in an earlier era). The data is stored as physical DNA in a suitable medium — typically a dry or freeze-dried form for stability — and retrieved by sequencing the DNA and running the output through a decoding algorithm.
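The conversion step can be illustrated with the simplest possible mapping, in which every pair of bits becomes one base. This is a minimal sketch, not the scheme used by any particular research group: the mapping table (A=00, C=01, G=10, T=11) and the function names are assumptions chosen for illustration, and real encoders typically add constraints (such as avoiding long runs of the same base) and error correction on top.

```python
# Simplest possible binary-to-DNA mapping: two bits per base.
# Assumption: the A=00, C=01, G=10, T=11 table is arbitrary and illustrative;
# practical encoders add sequence constraints and error correction.

BASES = "ACGT"  # index 0..3 corresponds to bit pairs 00, 01, 10, 11


def encode(data: bytes) -> str:
    """Convert bytes to a DNA base string, four bases per byte."""
    bases = []
    for byte in data:
        # Take bit pairs from most significant to least significant.
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases)


def decode(strand: str) -> bytes:
    """Convert a base string produced by encode() back to bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError on anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)  # prints "CAGACGGC"
    assert decode(strand) == b"Hi"
```

In this sketch the two bytes of "Hi" become the eight-base strand CAGACGGC, and decoding the strand recovers the original bytes. Synthesis and sequencing correspond to physically writing and reading that string.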

Several milestones have made this more than conceptually interesting. In 2018, Microsoft Research and the University of Washington reported storing more than 200 MB of data in DNA and retrieving individual files through a random-access system. The retrieval was slow, taking hours rather than milliseconds, but the data was retrieved accurately. Separately, a team at ETH Zurich showed in 2015 that DNA encapsulated in silica has a stability profile suggesting data preservation for thousands of years without energy input, comparable to the natural stability that allows DNA from organisms tens of thousands of years old to be sequenced today.

The Current Limitations

The technology has real, significant limitations that are worth stating clearly. The cost of DNA synthesis, while declining rapidly, remains orders of magnitude higher per byte than magnetic storage for most use cases. Retrieval speed is fundamentally limited by the biochemical processes involved — DNA sequencing, however fast it gets, is not going to approach the microsecond access times of solid-state storage. And the error rates in both synthesis and sequencing, while manageable with error-correction algorithms, add complexity and computational overhead to both writing and reading.
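One way to see why error handling adds overhead is a per-position majority vote across several reads of the same strand. The sketch below is a deliberately simple stand-in for the error-correction algorithms mentioned above, and published systems rely on stronger codes. It assumes substitution errors only (no insertions or deletions), and the function name and sample reads are hypothetical.

```python
from collections import Counter
from typing import List


def consensus(reads: List[str]) -> str:
    """Return the per-position majority base across equal-length reads."""
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        # Assumption: substitutions only, so every read has the same length.
        raise ValueError("this sketch assumes equal-length reads (no indels)")
    result = []
    for column in zip(*reads):
        # most_common(1) gives the base seen most often at this position;
        # on a tie it falls back to whichever base was seen first.
        result.append(Counter(column).most_common(1)[0][0])
    return "".join(result)


if __name__ == "__main__":
    noisy_reads = [
        "CAGACGGC",
        "CAGTCGGC",  # substitution at index 3
        "CAGACGGA",  # substitution at index 7
    ]
    print(consensus(noisy_reads))  # prints "CAGACGGC"
```

Even this toy approach requires every strand to be read several times, which is the kind of extra work on both the writing and reading side that the paragraph above describes.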

The use case that DNA storage suits best in the near term is therefore not everyday computing storage but archival storage: data that needs to be preserved for very long periods, is accessed infrequently, and should occupy the smallest possible physical footprint. Cultural heritage archiving, genomic databases, regulatory records with long retention requirements, and scientific data sets are all candidate applications. National archives in several countries are actively investigating DNA storage for exactly these reasons.

The Longer Timeline

The truly transformative version of biocomputing storage, which would not just store data in DNA but also process it there, using biological mechanisms for computation as well as storage, remains further out. DNA computing, pioneered by Leonard Adleman in the 1990s, showed that DNA hybridisation can carry out certain types of computation (Adleman’s 1994 experiment solved a small Hamiltonian path problem in a test tube), but practical DNA computers remain laboratory curiosities rather than useful machines. The convergence of storage and processing in biological substrates is a long-horizon possibility rather than a near-term prospect.

What the near term offers is a compelling answer to a real problem: the need to store vast, growing archives of data in a way that doesn’t require buildings full of powered hard drives. That’s not the science fiction endpoint, but it’s a meaningful advance on an important problem.