Molecular Storage in Synthetic DNA

Digital data in DNA.


Technology Overview

Molecular storage of information in synthetic DNA is a new area of research and development that may pave the way for cheaper storage and replication of petabyte-scale data.

Using the chemical building blocks of DNA, molecules can be constructed that encode vast amounts of data. As well as being tiny and incredibly space-efficient, DNA molecules can be replicated easily and cheaply, even billions of times, and can later be read back using well-understood sequencing methods.
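To make the idea of encoding data in DNA's building blocks concrete, here is a minimal sketch in Python. It assumes the simplest possible mapping, two bits per nucleotide (A, C, G, T), which is an illustrative choice and not the scheme used by any of the groups mentioned here. The names `BASES`, `encode`, and `decode` are hypothetical. Practical systems typically add error correction, addressing, and constraints on the sequences they synthesize, none of which is shown.

```python
# Hypothetical mapping for illustration only: two bits per nucleotide.
# A = 00, C = 01, G = 10, T = 11
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn each byte into four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Reverse encode(): pack every four nucleotides back into one byte."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError for any letter outside A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # CAGACGGC
    print(decode(strand))  # b'Hi'
```

Under this toy mapping, every byte becomes four nucleotides, so the sequence length is exactly four times the size of the data in bytes.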

The dream is that, in a few decades, exabytes of data could fit in a beaker or a packet of inexpensive pellets containing hundreds of billions of synthetic DNA molecules. These enormous datasets could be easily copied, backed up, and shared as a new form of cold storage and data retrieval.

Molecular storage of big data in DNA is of interest to The Arch Mission Foundation as a candidate for future Arch™ Library technologies.

In keeping with our mission to explore and catalyze the cutting edge of innovation on the frontiers of storage, archiving, and retrieval, we began working with several pioneering groups in this area, including Microsoft, Catalog, and Carverr.

Microsoft and the University of Washington created the Molecular Information Systems Laboratory (MISL) to study how biology can be used to benefit the IT industry. The DNA data storage project is the first effort of its kind undertaken by the MISL team.

They have encoded a book collection in DNA, composed of 20 public domain titles sourced from the Project Gutenberg library, and have verified that the information can be recovered with no bit errors.

They are now creating a photo collection to be encoded in DNA as part of their ‘Memories in DNA’ project. The project invites people to share images they wish to remember forever, which MISL researchers will then preserve in synthetic DNA with Microsoft and Twist Bioscience Corporation.

The Memories in DNA Archive features 10,000 crowdsourced images and the full text of 20 important books, among other items. The data is encoded into billions of synthetic DNA molecules and encapsulated for long-term preservation. Collectively, this data will represent the first Molecular Collection of the Lunar Library, which the Arch Mission Foundation announced last spring.

Members of the public can contribute their own images to the Lunar Library by submitting them online or by emailing them as an attachment. For this project, Twist Bioscience is synthesizing the DNA molecules.

The Arch Mission Foundation is designing a special payload to contain and protect this material for long-term backup on the Moon in a future mission.