Scientists work toward storing digital information in DNA

Genetic material could become a durable format to archive data in the future.

July 24, 2016 10:55 pm | Updated October 18, 2016 03:13 pm IST - NEW YORK:

A gel image of the DNA of 96 horses displayed on a computer monitor at the UC Davis veterinary genetics lab in Davis, California. — Photo: AP

Her computer, Karin Strauss says, contains her “digital attic,” a place where she stores that published math paper she wrote in high school, and computer science schoolwork from college.

She’d like to preserve the stuff “as long as I live, at least,” says Ms. Strauss, 37. But computers must be replaced every few years, and each time she must copy the information over, “which is a little bit of a headache.”

It would be much better, she says, if she could store it in DNA, the stuff our genes are made of.

Ms. Strauss, who works at Microsoft Research in Redmond, Washington, is working to make that sci-fi fantasy a reality.

She and other scientists are not focused on finding ways to stow high school projects or snapshots or other things an average person might accumulate, at least for now. Rather, they aim to help companies and institutions archive huge amounts of data for decades or centuries, at a time when the world is generating digital data faster than it can store it.

It’s not that the data will disappear from the tape. A bigger problem is familiar to anybody who has come across an old eight-track tape or floppy disk and realised he no longer has a machine to play it. Technology moves on, and data can’t be retrieved if the means to read it is no longer available, Mr. Starr says.

So for that and other reasons, long-term archiving requires repeatedly copying the data to new technologies.

The difference between DNA and digital devices

Into this world comes the notion of DNA storage. DNA is by its essence an information-storing molecule; the genes we pass from generation to generation transmit the blueprints for creating the human body. That information is stored in strings of what’s often called the four-letter DNA code. That really refers to sequences of four building blocks abbreviated as A, C, T and G found in the DNA molecule. Specific sequences give the body directions for creating particular proteins.

Digital devices, on the other hand, store information in a two-letter code that produces strings of ones and zeroes. A capital ‘A’, for example, is 01000001.
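As a quick illustration of that two-letter code, the short Python sketch below prints the eight bits behind each character. It assumes the standard ASCII encoding, which is what the 01000001 example corresponds to; the function name `to_bits` is made up for this sketch.

```python
def to_bits(text: str) -> str:
    """Return the 8-bit binary form of each ASCII character, space-separated."""
    return " ".join(format(byte, "08b") for byte in text.encode("ascii"))

print(to_bits("A"))    # 01000001
print(to_bits("DNA"))  # 01000100 01001110 01000001
```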

How to convert from digital to DNA?

Converting digital information to DNA involves translating between the two codes. In one lab, for example, a capital A can become ATATG. The idea is once that transformation is made, strings of DNA can be custom-made to carry the new code, and hence the information that code contains.
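To make the idea of translating between the two codes concrete, here is a minimal Python sketch. It uses a hypothetical mapping of two bits per DNA letter (00 to A, 01 to C, 10 to G, 11 to T), chosen only because it is easy to follow. It is not the scheme of the lab mentioned above, which evidently uses a different code, since there a capital A becomes ATATG. The names `encode_to_dna` and `decode_from_dna` are likewise invented for illustration.

```python
# Hypothetical mapping: two bits per DNA letter. Illustration only,
# not the coding scheme used by any particular lab.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode_to_dna(data: bytes) -> str:
    """Turn bytes into a string of DNA letters, two bits at a time."""
    bits = "".join(format(byte, "08b") for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode_from_dna(strand: str) -> bytes:
    """Reverse the mapping: DNA letters back to bits, then bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

print(encode_to_dna(b"A"))      # CAAC
print(decode_from_dna("CAAC"))  # b'A'
```

Under this toy mapping every byte takes four DNA letters, and decoding simply runs the same table backwards.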

What are the advantages?

* One selling point is durability. Scientists can recover and read DNA sequences from fossils of Neanderthals and even older life forms. So as a storage medium, “it could last thousands and thousands of years,” says Luis Ceze of the University of Washington, who works with Microsoft on DNA data storage.

* Advocates also stress that DNA crams information into very little space. Almost every cell of your body carries about six feet of it; that adds up to billions of miles in a single person. In terms of information storage, that compactness could mean storing all the publicly accessible data on the Internet in a space the size of a shoebox, Mr. Ceze says.

* DNA storage would avoid the problem of having to repeatedly copy stored information into new formats as the technology for reading it becomes outmoded.

Developments in this area

Getting the information into DNA takes some doing. Once scientists have converted the digital code into the four-letter DNA code, they have to custom-make DNA. For some recent research Ms. Strauss and Mr. Ceze worked on, that involved creating about 10 million short strings of DNA.

Twist Bioscience of San Francisco used a machine to create the strings letter by letter, like snapping together Lego pieces to build a tower. The machine can build up to 1.6 million strings at a time.

Each string carried just a fragment of information from a digital file, plus a chemical tag to indicate what file the information came from.

To read a file, scientists use the tags to assemble the relevant strings. A standard lab machine can then reveal the sequence of DNA letters in each string.
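The tagging and reading steps can be mimicked in a few lines of Python. This is only a loose software analogy under stated assumptions: the chemical tag is modelled as a plain text label, and each string also carries a position number so the fragments can be put back in order. The article does not say how ordering is handled, so that position field, like the names `split_into_strings` and `read_file`, is hypothetical.

```python
import random

def split_into_strings(file_tag: str, payload: str, size: int) -> list[tuple[str, int, str]]:
    """Cut a payload into short pieces, each labelled (file tag, position, fragment)."""
    return [(file_tag, i // size, payload[i:i + size])
            for i in range(0, len(payload), size)]

def read_file(pool: list[tuple[str, int, str]], file_tag: str) -> str:
    """Pick out the strings carrying file_tag and rejoin them in position order."""
    matching = sorted((pos, frag) for tag, pos, frag in pool if tag == file_tag)
    return "".join(frag for _, frag in matching)

# Two files share one mixed-up pool, as strings from different files would in a tube.
pool = split_into_strings("paper", "CAACGTTAGC", 4) + split_into_strings("photo", "TTGACA", 4)
random.shuffle(pool)

print(read_file(pool, "paper"))  # CAACGTTAGC
```

Because every fragment carries its own label, the order in which strings sit in the pool does not matter when a file is read back.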

What are the challenges?

Sri Kosuri of the University of California, Los Angeles, who has worked on DNA information storage but has now largely moved on to other pursuits, says one challenge in making the technology practical is making it much cheaper.

Scientists custom-build fairly short strings of DNA now for research, but scaling up enough to handle information storage in bulk would require a “mind-boggling” leap in output, Mr. Kosuri says. With current technology, that would be hugely expensive.
