DNAmaker

Automating the Storage of Digital Data on Long DNA Molecules


J. Leblanc – Y. Audic – O. Boullé – G. Andrade-Barosso – D. Lavenier

Context

DNAmaker is an extension of the dnarXiv project (2021-2024), which explored how digital information can be stored efficiently on DNA molecules. The DNAmaker project ran for 18 months, starting in July 2023. Both projects were completed in December 2024. DNAmaker was organized into several phases of experimentation and progressive technical development. It was deployed in Rennes, mainly within the IRISA/Inria and IGDR laboratories. The automation devices are hosted in Catherine André’s laboratory (IGDR). The research team brings together scientists from IRISA/Inria (D. Lavenier, O. Boullé, J. Leblanc, G. Andrade-Barosso) and from IGDR (Y. Audic).

Objective

The DNAmaker project aimed to fully automate the construction of long DNA molecules for digital data storage, following a protocol that had been tested successfully by hand in the dnarXiv project. Current DNA synthesis technologies only produce short oligonucleotide sequences (< 300 nucleotides); DNAmaker automatically assembles these short fragments into much longer DNA molecules (several thousand nucleotides).

This platform addresses a major challenge: existing biotechnological processes for DNA assembly are fully manual and require 3 to 4 days of work to build a single 25-kilobase molecule. Automation is therefore essential to scale up and make this storage technology operational.
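To give an order of magnitude for the work behind one molecule, the sketch below computes a lower bound on the number of oligonucleotides needed. The 300-nucleotide and 25-kilobase figures come from the text above; the function name is illustrative, and the calculation assumes that every oligonucleotide contributes its full length to the final molecule, which ignores any junction or overhang sequence the actual protocol requires.

```python
def min_fragments(target_nt: int, max_oligo_nt: int) -> int:
    """Lower bound on the number of oligos needed to cover target_nt nucleotides."""
    if target_nt <= 0 or max_oligo_nt <= 0:
        raise ValueError("lengths must be positive")
    # Ceiling division without floating point.
    return -(-target_nt // max_oligo_nt)


# Figures taken from the text: oligos under 300 nt, one 25-kilobase molecule.
print(min_fragments(25_000, 300))  # 84
```

Under these assumptions, a single 25-kilobase molecule requires at least 84 fragments to be handled and joined, which helps explain the several days of manual work.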

Motivations

DNA-based data storage offers three decisive advantages for future archiving:

  • Exceptional density: terabytes of information can be stored inside a tiny capsule
  • Outstanding longevity: DNA can remain stable for hundreds of years without special maintenance
  • Low total cost of ownership: an economically viable solution for ultra–long-term archiving

Given the exponential growth of global data production, this technology represents a strategic response. In addition, most of the required infrastructure (computing platform, biological protocols, nanopore sequencing) has already been developed and validated during the dnarXiv project.

The DNAmaker platform

The core of the platform is composed of three robotic devices:

  • Opentrons OT-2 robot: a precision pipetting system used for routine laboratory operations. It is programmed to execute the DNA assembly protocols
  • MG400 robotic arm: a modular robotic arm that automatically supplies the Opentrons robot with well plates containing oligonucleotides and chemical reagents
  • Robot-operated refrigerator: a refrigerator that can be controlled automatically to store reagents and products

These three devices are fully controlled through a Python API, allowing complete synchronization and execution of long, complex experiments without human intervention.
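The coordination of the three devices can be pictured with the minimal sketch below. It does not use the real Opentrons or MG400 interfaces: the classes `Fridge`, `Arm` and `Pipettor`, their methods, and the protocol name `golden_gate` are hypothetical stand-ins that only record what they are asked to do, and the order of steps is an assumption about how one plate might be processed.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Collects the actions issued to the devices, in order."""
    events: list = field(default_factory=list)

    def record(self, device: str, action: str) -> None:
        self.events.append((device, action))


class Fridge:
    """Hypothetical stand-in for the robot-operated refrigerator."""

    def __init__(self, log: EventLog) -> None:
        self.log = log

    def release_plate(self, plate_id: str) -> None:
        self.log.record("fridge", f"release {plate_id}")

    def store_plate(self, plate_id: str) -> None:
        self.log.record("fridge", f"store {plate_id}")


class Arm:
    """Hypothetical stand-in for the robotic arm that carries plates."""

    def __init__(self, log: EventLog) -> None:
        self.log = log

    def move_plate(self, plate_id: str, source: str, destination: str) -> None:
        self.log.record("arm", f"move {plate_id} {source}->{destination}")


class Pipettor:
    """Hypothetical stand-in for the pipetting robot."""

    def __init__(self, log: EventLog) -> None:
        self.log = log

    def run_protocol(self, name: str, plate_id: str) -> None:
        self.log.record("pipettor", f"run {name} on {plate_id}")


def run_assembly(plate_ids: list, log: EventLog) -> list:
    """Process each plate in sequence: fetch, pipette, return to cold storage."""
    fridge, arm, pipettor = Fridge(log), Arm(log), Pipettor(log)
    for plate_id in plate_ids:
        fridge.release_plate(plate_id)
        arm.move_plate(plate_id, "fridge", "deck")
        pipettor.run_protocol("golden_gate", plate_id)
        arm.move_plate(plate_id, "deck", "fridge")
        fridge.store_plate(plate_id)
    return log.events


for device, action in run_assembly(["plate_A"], EventLog()):
    print(device, action)
```

Driving all devices from a single Python process keeps the sequencing explicit: each step starts only after the previous call has returned, which is the simplest form of synchronization.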

The construction of long DNA molecules involves two major steps: Golden Gate assembly and PCR amplification. This construction method was developed and published as part of the dnarXiv project. This approach allows the progressive assembly of double-stranded DNA molecules, which are much more stable and suitable for storage than the original single-stranded sequences.
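Golden Gate assembly relies on Type IIS restriction enzymes, which leave short single-stranded overhangs so that fragments ligate in a defined order. The sketch below illustrates this ordering principle in silico. It is a simplified single-strand model, not the dnarXiv design: the 4-nucleotide overhangs, the fragment sequences and the function name `assemble` are illustrative assumptions.

```python
def assemble(fragments, start_overhang):
    """Chain fragments whose overhangs match and return the joined sequence.

    Each fragment is a (left_overhang, body, right_overhang) tuple read on
    one strand. A fragment attaches where its left overhang equals the
    right overhang of the growing molecule.
    """
    by_left = {}
    for left, body, right in fragments:
        if left in by_left:
            # Two fragments competing for the same junction would make
            # the assembly order ambiguous.
            raise ValueError(f"overhang {left} is used twice")
        by_left[left] = (body, right)

    sequence = start_overhang
    current = start_overhang
    while current in by_left:
        body, right = by_left.pop(current)
        sequence += body + right
        current = right

    if by_left:
        raise ValueError("some fragments could not be placed")
    return sequence


# Fragments are given out of order; the overhangs determine the final order.
fragments = [
    ("GGAG", "AAAA", "TACT"),
    ("AATG", "CCCC", "GCTT"),
    ("TACT", "GGGG", "AATG"),
]
print(assemble(fragments, "GGAG"))  # GGAGAAAATACTGGGGAATGCCCCGCTT
```

Because every junction has a distinct overhang, the fragments can be mixed in a single reaction and still join in one order only, which is what makes the method suitable for automation.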

The final demonstration consisted of the automated construction of several DNA molecules encoding the 1789 Declaration of the Rights of Man and of the Citizen, stored in three unique copies. A scientific publication is currently being prepared.



Advantages of the DNAmaker Platform

Compared to earlier approaches based solely on short oligonucleotides, assembling long DNA molecules provides several concrete benefits:

  • Cost reduction: a major improvement in the ratio of useful data to total data, increasing throughput while lowering costs
  • Simplified reconstruction: instead of millions of short fragments, only a few thousand long ones need to be processed, making retrieval easier and faster
  • Increased stability: double-stranded DNA is significantly more stable than single-stranded DNA
  • Compatibility with next-generation sequencing: nanopore sequencing technologies work best with molecules several thousand nucleotides long
  • Enhanced random-access capabilities: enabling improved design for storage and retrieval systems
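The improvement in the ratio of useful data to total data can be illustrated with a short calculation. The sketch assumes that every molecule carries a fixed number of non-data nucleotides (for example, primer or index sequences) regardless of its length. The 40-nucleotide overhead and the two molecule lengths are hypothetical values chosen for illustration, not measurements from the project.

```python
def payload_fraction(total_nt: int, overhead_nt: int) -> float:
    """Fraction of a molecule's nucleotides that carry user data."""
    if not 0 <= overhead_nt < total_nt:
        raise ValueError("overhead must be non-negative and smaller than the molecule")
    return (total_nt - overhead_nt) / total_nt


OVERHEAD_NT = 40  # hypothetical fixed per-molecule overhead

for length in (200, 10_000):
    print(length, round(payload_fraction(length, OVERHEAD_NT), 3))
# 200 0.8
# 10000 0.996
```

With a fixed overhead per molecule, the share of nucleotides that must be synthesized and sequenced without carrying data shrinks as molecules get longer.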

Perspectives and Challenges

DNAmaker marks a decisive step toward demonstrating the feasibility of DNA-based data storage using long DNA molecules. Automating the construction process paves the way for storage experiments involving significant data volumes (several megabytes and beyond) and strengthens the credibility of our long-molecule storage approach, an emerging paradigm within the community.

https://www.youtube.com/watch?v=VA38WajV7hc