DNA Cryptography

5/26/2011

Bio-Cryptography

Bio-cryptography, or bioencryption, is a proposed next-generation security mechanism that could store almost a million gigabytes of data inside bacteria. Research from two prominent universities indicates that it is not only possible but also practical to store digital data in the genome of a living organism and retrieve that data hundreds or even thousands of years later, after the organism has copied its genetic material through hundreds of generations.

Note: A milliliter of liquid can contain up to 1 billion bacteria, so the potential capacity of bacteria-based memory is enormous. The idea of storing data inside bacteria has been around for about a decade. Even very simple bacteria have long strands of DNA with many bases available for encoding data, and bacteria are by nature far more resilient to damage than traditional electronic storage. Bacteria are among nature's hardiest survivors, capable of surviving disasters that would finish off a regular hard drive. In addition, bacteria's natural reproduction would create many redundant copies of the data, which would help preserve the integrity of the information and make retrieval easier.

Preparing traditional data for storage inside bacteria is simple enough. Four DNA bases can be used to make up the DNA strings: adenine, cytosine, guanine, and thymine. That means we're working with a four-symbol number system, also known as base 4 or quaternary, in which each base carries two bits.

In a presentation on their breakthrough, the Hong Kong researchers showed how to change the word "iGEM" into DNA-ready code. They used the ASCII table to convert each of the individual letters into a numerical value (i=105, G=71, E=69, M=77), which can then be changed from base 10 to base 4 (105=1221, 71=1013, 69=1011, 77=1031). Finally, those digits can be changed into their DNA base equivalents, with 0, 1, 2, and 3 replaced with A, T, C, and G. Under that mapping, iGEM becomes TCCTTATGTATTTAGT.
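The conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the function names `to_base4`, `text_to_dna`, and `dna_to_text` are hypothetical, and the sketch assumes every character fits in one byte so that four base-4 digits are always enough. Applying the stated mapping (0, 1, 2, 3 to A, T, C, G) literally, "iGEM" comes out as TCCTTATGTATTTAGT.

```python
# Digit-to-base mapping as stated in the text: 0->A, 1->T, 2->C, 3->G.
DIGIT_TO_BASE = "ATCG"


def to_base4(value: int, width: int = 4) -> str:
    """Return `value` as a zero-padded base-4 string (4 digits cover 0..255)."""
    digits = []
    for _ in range(width):
        digits.append(str(value % 4))
        value //= 4
    return "".join(reversed(digits))


def text_to_dna(text: str) -> str:
    """Encode text as DNA bases, four bases per character."""
    out = []
    for ch in text:
        code = ord(ch)
        if code > 255:
            # Assumption of this sketch: one byte per character.
            raise ValueError(f"character {ch!r} does not fit in one byte")
        out.append("".join(DIGIT_TO_BASE[int(d)] for d in to_base4(code)))
    return "".join(out)


def dna_to_text(dna: str) -> str:
    """Decode a base string produced by text_to_dna back into text."""
    if len(dna) % 4:
        raise ValueError("length must be a multiple of 4 bases")
    chars = []
    for i in range(0, len(dna), 4):
        value = 0
        for base in dna[i:i + 4]:
            value = value * 4 + DIGIT_TO_BASE.index(base)
        chars.append(chr(value))
    return "".join(chars)


print(to_base4(105), to_base4(71))  # 1221 1013
print(text_to_dna("iGEM"))          # TCCTTATGTATTTAGT
```

Because each base carries two bits, one byte always maps to exactly four bases, which is why the decoder can read the sequence in fixed groups of four.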

Once the raw data is ready, the researchers say a few algorithms can be used to weed out redundant and repetitive information. That doesn't just save space: heavy repetition in the DNA sequence can be biologically harmful to the DNA and the bacteria, so this step neatly solves two problems at once.
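The article does not name the algorithms involved, so the following is only a stand-in sketch. It assumes DEFLATE (via Python's standard `zlib` module) as the compression step and uses a hypothetical helper, `longest_run`, to measure the longest stretch of one repeated base before and after compression. All function names here are illustrative.

```python
import zlib

DIGIT_TO_BASE = "ATCG"  # 0->A, 1->T, 2->C, 3->G, as in the text


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases (two bits per base, high bits first)."""
    return "".join(
        DIGIT_TO_BASE[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(dna: str) -> bytes:
    """Inverse of bytes_to_dna."""
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | DIGIT_TO_BASE.index(base)
        out.append(byte)
    return bytes(out)


def compress_then_encode(data: bytes) -> str:
    """Compress with DEFLATE (an assumed stand-in), then encode as bases."""
    return bytes_to_dna(zlib.compress(data, 9))


def decode_then_decompress(dna: str) -> bytes:
    """Reverse compress_then_encode."""
    return zlib.decompress(dna_to_bytes(dna))


def longest_run(dna: str) -> int:
    """Length of the longest stretch of a single repeated base."""
    best = current = 0
    previous = ""
    for base in dna:
        current = current + 1 if base == previous else 1
        best = max(best, current)
        previous = base
    return best


raw = bytes(200)                    # 200 zero bytes: highly repetitive input
plain = bytes_to_dna(raw)           # a single long run of "A"
packed = compress_then_encode(raw)  # much shorter, with shorter runs

print(len(plain), len(packed))
print(longest_run(plain), longest_run(packed))
```

General-purpose compression shortens repetitive input but does not guarantee the absence of repeated bases in its output, so a real pipeline would likely need an explicit check like `longest_run` or a dedicated encoding step.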

DNA strands aren't long enough to store complicated information like a photograph or a book, so the best available solution is to fragment the data into lots of little pieces and spread them among different cells. To make that work, the researchers had to create a system that allows the fragments to be identified and ultimately put back in the right order. So they created a three-part structure for all the DNA: header, message, and checksum.

The header is an 8-base-long sequence that is divided into four levels of identifying information (zone, region, area, and district), which allows each fragment to be put back in the right order. The message carries the actual usable data. The checksum that follows it repeats the original header, which is useful in controlling for minor mutations to the bacteria.
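The three-part layout can be sketched as a small data structure. Several details below are assumptions rather than facts from the source: the 8 header bases are split evenly into 2 bases per level, the header simply counts fragments in base 4, each fragment carries 16 message bases, and a fragment whose checksum no longer matches its header is discarded. The names `Fragment`, `split_into_fragments`, and `reassemble` are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Tuple

DIGIT_TO_BASE = "ATCG"   # 0->A, 1->T, 2->C, 3->G, as in the text
LEVELS = 4               # zone, region, area, district
BASES_PER_LEVEL = 2      # assumption: 8 header bases split evenly
HEADER_LEN = LEVELS * BASES_PER_LEVEL


class Fragment(NamedTuple):
    header: str    # 8 bases identifying the fragment's position
    message: str   # the payload bases
    checksum: str  # repetition of the header, per the description above


def index_to_header(index: int) -> str:
    """Write a fragment index as 8 base-4 digits, most significant first."""
    if not 0 <= index < 4 ** HEADER_LEN:
        raise ValueError("index does not fit in the header")
    digits = []
    for _ in range(HEADER_LEN):
        digits.append(DIGIT_TO_BASE[index % 4])
        index //= 4
    return "".join(reversed(digits))


def header_to_index(header: str) -> int:
    """Inverse of index_to_header."""
    value = 0
    for base in header:
        value = value * 4 + DIGIT_TO_BASE.index(base)
    return value


def header_levels(header: str) -> Tuple[str, ...]:
    """Split a header into its (zone, region, area, district) parts."""
    return tuple(
        header[i:i + BASES_PER_LEVEL]
        for i in range(0, HEADER_LEN, BASES_PER_LEVEL)
    )


def split_into_fragments(dna: str, message_len: int = 16) -> List[Fragment]:
    """Cut a long base string into header/message/checksum fragments."""
    fragments = []
    for start in range(0, len(dna), message_len):
        header = index_to_header(start // message_len)
        fragments.append(Fragment(header, dna[start:start + message_len], header))
    return fragments


def reassemble(fragments: Iterable[Fragment]) -> str:
    """Drop fragments whose checksum disagrees with the header, then sort."""
    intact = [f for f in fragments if f.header == f.checksum]
    intact.sort(key=lambda f: header_to_index(f.header))
    return "".join(f.message for f in intact)


dna = "TCCTTATGTATTTAGT" * 3
pieces = split_into_fragments(dna)
print(pieces[1].header, header_levels(pieces[1].header))
print(reassemble(reversed(pieces)) == dna)  # True
```

Repeating the header as the checksum only detects a mutation in the addressing bases; it says nothing about damage to the message itself, which is where comparing multiple copies comes in.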

So, let's say the information has been encrypted and placed in lots of different cells of bacteria. How then does someone retrieve the data on the other end? The decrypter would take the DNA and run it through what's known as next-generation high-throughput sequencing, or NGS. This particular type of sequencing analyzes and compares multiple copies of the same sequence and then uses majority-voting to figure out which bases are correct if parts of the data have decayed. Then the compression algorithms could be reversed to restore the raw data to its original form.
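The majority-voting idea can be illustrated with a short sketch. It assumes the reads have already been aligned and trimmed to the same length, which a real sequencing pipeline would have to do first; the function name `majority_vote` and the sample reads are made up for illustration.

```python
from collections import Counter
from typing import Sequence


def majority_vote(reads: Sequence[str]) -> str:
    """Return the consensus sequence: the most common base at each position."""
    if not reads:
        raise ValueError("at least one read is required")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must be aligned and of equal length")
    # zip(*reads) yields one column of bases per position.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


reads = [
    "TCCTTATG",
    "TCCTTATG",
    "TCGTTATG",  # one mutated base at position 2
    "TCCTTATC",  # one mutated base at position 7
    "TCCTTATG",
]
print(majority_vote(reads))  # TCCTTATG
```

With an odd number of copies and isolated mutations, the correct base wins at every position. If two bases tie, `Counter.most_common` returns whichever was seen first, so a production version would want an explicit tie-breaking rule.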

The last step would be snapping the fragments back together in the correct order so that the DNA strands could be translated back into useful data. This is where we go from just data storage to data encryption. The person trying to read the data needs a formula that will reveal the right order of the headers and checksums - without that formula, the data remains meaningless.
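The source does not describe the formula itself, so the following is a purely hypothetical illustration of the idea: a secret key determines which label each fragment receives, and only someone holding the key can map labels back to positions. It uses HMAC-SHA256 from Python's standard library to derive a key-dependent ordering; the names `keyed_permutation`, `scramble_labels`, and `restore_order` are invented for this sketch and are not the researchers' scheme.

```python
import hashlib
import hmac
from typing import List, Sequence, Tuple


def keyed_permutation(key: bytes, count: int) -> List[int]:
    """Return a key-dependent ordering of 0..count-1.

    Entry p of the result is the label assigned to the fragment at position p.
    """
    def tag(i: int) -> bytes:
        return hmac.new(key, i.to_bytes(4, "big"), hashlib.sha256).digest()

    return sorted(range(count), key=tag)


def scramble_labels(messages: Sequence[str], key: bytes) -> List[Tuple[int, str]]:
    """Pair each message with a key-derived label instead of its true position."""
    labels = keyed_permutation(key, len(messages))
    return [(labels[pos], msg) for pos, msg in enumerate(messages)]


def restore_order(labelled: Sequence[Tuple[int, str]], key: bytes) -> List[str]:
    """Use the key to map labels back to positions and sort the messages."""
    labels = keyed_permutation(key, len(labelled))
    position_of = {label: pos for pos, label in enumerate(labels)}
    ordered = sorted(labelled, key=lambda item: position_of[item[0]])
    return [msg for _, msg in ordered]


messages = ["TCCT", "TATG", "TATT", "TAGT"]
key = b"shared secret"  # hypothetical key known to sender and reader

stored = sorted(scramble_labels(messages, key))  # as seen when sorted by label
print(stored)
print(restore_order(stored, key) == messages)    # True
```

Hiding only the order of fragments is weak protection on its own, since the message bases are still readable in each fragment; a stronger design would also encrypt the payload before encoding it.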

Now, there does seem to be one potential concern with using E. coli to store data: isn't E. coli dangerous? It appears there's not too much to worry about there. The researchers used non-virulent strains of the bacteria, and the bacteria can't do much more than store the data and reproduce. The DNA sequences that represent the data are total gibberish when it comes to encoding potentially dangerous proteins.

(Content extracted from Web)

