Microsoft and University of Washington researchers set record for DNA storage

Researchers at Microsoft and the University of Washington have reached an early but important milestone in DNA storage by storing a record 200 megabytes of data on the molecular strands.

The impressive part is not just how much data they were able to encode onto synthetic DNA and then decode. It’s also the space they were able to store it in.

Once encoded, the data occupied a spot in a test tube “much smaller than the tip of a pencil,” said Douglas Carmean, the partner architect at Microsoft overseeing the project.

Think of the amount of data in a big data center compressed into a few sugar cubes. Or all the publicly accessible data on the Internet slipped into a shoebox. That is the promise of DNA storage – once scientists are able to scale the technology and overcome a series of technical hurdles.

Digital data from more than 600 basic smartphones can be stored in the faint pink smear of DNA at the end of this test tube. Photo by Tara Brown Photography/University of Washington.

The Microsoft-UW team stored digital versions of works of art (including a high-definition video by the band OK Go), the Universal Declaration of Human Rights in more than 100 languages, the top 100 books of Project Gutenberg and the nonprofit Crop Trust’s seed database on DNA strands.

Demand for data storage is growing exponentially, and the capacity of existing storage media is not keeping pace. That’s making it hard for organizations that need to store a lot of data – such as hospitals with vast databases of patient data or companies with lots of video footage – to keep up. And it means information is being lost, and the problem will only worsen without a new solution.

DNA could be the answer.

It has several advantages as a storage medium. It’s compact, durable – capable of lasting for a very long time if kept in good conditions (DNA from woolly mammoths was recovered several thousand years after they went extinct, for instance) – and will always be current, the researchers believe.

“As long as there is DNA-based life on the planet, we’ll be interested in reading it,” said Karin Strauss, the principal Microsoft researcher on the project. “So it’s eternally relevant.”

This explains why the Microsoft-UW team is just one of a number of research groups around the globe pursuing the potential of DNA as a vast digital attic.

The researchers acknowledge they have a long way to go.

Luis Henrique Ceze, a UW associate professor of computer science and engineering and the university’s principal researcher on the project, said the biotechnology industry has made big advances in both “synthesizing” (encoding) and “sequencing” (decoding) data in recent years. Even so, he said, the team still has a long way to go to make DNA storage viable as an archival technology.

But the researchers are upbeat.

They note that their diverse team of computer scientists, computer architects and molecular biologists has already increased storage capacity a thousandfold in the last year. And they believe they can make big advances in speed by applying computer science principles like error correction to the process.

Carmean, who was involved in development of Intel’s microprocessor architecture beginning in 1989, puts it this way:

“It’s one of those serendipitous partnerships where a strong understanding of processors and computation married with molecular biology experts has the potential of producing major breakthroughs.”

To get an idea of how the Microsoft-UW team does its work, flash back to high school biology and recall that DNA – or deoxyribonucleic acid – is a molecule that contains the biological instructions used in the growth, development, functioning and reproduction of all known living organisms.

“DNA is an amazing information storage molecule that encodes data about how a living system works. We’re repurposing that capacity to store digital data — pictures, videos, documents,” said Ceze, who is conducting research in the team’s Molecular Information Systems Lab (MISL), which is housed in a basement on the University of Washington campus. “This is one important example of the potential of borrowing from nature to build better computer systems.”

Storing digital data on DNA works like this:

First the data is translated from 1s and 0s into the “letters” of the four nucleotide bases of a DNA strand — (A)denine, (C)ytosine, (G)uanine and (T)hymine.

Karin Strauss. Photo by Scott Eklund/Red Box Pictures

Then they have vendor Twist Bioscience “translate those letters, which are still in electronic form, into the molecules themselves, and send them back,” Strauss said. “It’s essentially a test tube and you can barely see what’s in it. It looks like a little bit of salt was dried in the bottom.”
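As a rough illustration of the first step, translating 1s and 0s into bases, the Python sketch below maps each pair of bits to one nucleotide letter. The mapping table and the function names encode and decode are assumptions made for this example. The article does not describe the team's actual encoding, which would also need addressing and redundancy that this sketch leaves out.

```python
# Hypothetical mapping of two bits per base; chosen for illustration only,
# not taken from the Microsoft-UW project.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of A/C/G/T letters, two bits per letter."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse encode(): turn A/C/G/T letters back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"OK")
print(strand)           # CATTCAGT
print(decode(strand))   # b'OK'
```

At two bits per base, each byte becomes four letters, so the letter sequence sent to a synthesis vendor in this sketch is four times as long as the byte count of the file.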

Reading the data uses a biotech tweak to random access memory (RAM), another concept borrowed from computer science. The team uses polymerase chain reaction (PCR), a technique that molecular biologists use routinely to manipulate DNA, to multiply or “amplify” the strands it wants to recover. Once they’ve sharply increased the concentration of the desired snippets, they take a sample, sequence or decode the DNA and then run error correction computations.
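The read path described above can be sketched in the same spirit. In this Python example, selecting reads by a leading tag stands in for PCR amplification of the desired strands, and a per-position majority vote stands in for the error correction step. The tag sequences, the function names select_by_primer and consensus, and the choice of majority voting are all assumptions for illustration. The article does not say which error correction scheme the team uses, and this sketch handles only substitution errors in reads of equal length.

```python
from collections import Counter


def select_by_primer(pool, primer):
    """Stand-in for PCR: keep reads that start with the primer, minus the primer."""
    return [read[len(primer):] for read in pool if read.startswith(primer)]


def consensus(reads):
    """Majority vote at each position across equal-length reads."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


# Hypothetical pool: three noisy copies of one payload plus an unrelated strand.
pool = [
    "ACGT" + "CATTCAGT",
    "ACGT" + "CATTCAGA",  # simulated sequencing error in the last base
    "ACGT" + "CATTCAGT",
    "TTGA" + "GGGGCCCC",  # belongs to a different file
]

reads = select_by_primer(pool, "ACGT")
recovered = consensus(reads)
print(recovered)  # CATTCAGT
```

The vote only works because several copies of the same strand are read, which loosely mirrors why raising the concentration of the desired snippets before sequencing helps recovery.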

The lab tour complete, one question needed asking: Why an OK Go video?

“We like that a lot because there are many parallels with the work,” Strauss said with a laugh. “They’re very innovative and are bringing different things from different areas into their field and we feel we are doing something very similar.”

[Microsoft]

July 13, 2016
