Construction of bio-constrained code for DNA data storage

dc.contributor.author: Wang, Yixin
dc.contributor.author: Noor-A-Rahim, Md.
dc.contributor.author: Gunawan, Erry
dc.contributor.author: Guan, Yong Liang
dc.contributor.author: Poh, Chueh Loo
dc.contributor.funder: National University of Singapore [en]
dc.contributor.funder: Horizon 2020 [en]
dc.date.accessioned: 2019-05-02T11:10:59Z
dc.date.available: 2019-05-02T11:10:59Z
dc.date.issued: 2019-04-22
dc.date.updated: 2019-05-02T11:03:32Z
dc.description.abstract: With extremely high density and durable preservation, DNA data storage has become one of the most cutting-edge techniques for long-term data storage. Like traditional storage systems, which impose restrictions on the form of encoded data, DNA storage systems subject stored data to two biochemical constraints: a maximum homopolymer run limit and a balanced GC content limit. Previous studies satisfied these two constraints in successive steps, so the resulting process suffers from low efficiency and high complexity. In this paper, we propose a novel content-balanced run-length limited (C-RLL) code with an efficient code construction method, which generates short DNA sequences that satisfy both constraints simultaneously. In addition, we develop an encoding method that maps binary data into long DNA sequences for DNA data storage and ensures both local and global stability in terms of satisfying the biochemical constraints. The proposed encoding method achieves a high effective code rate of 1.917 bits per nucleotide with low coding complexity. [en]
dc.description.status: Peer reviewed [en]
dc.description.version: Accepted Version [en]
dc.format.mimetype: application/pdf [en]
dc.identifier.citation: Wang, Y., Noor-A-Rahim, M., Gunawan, E., Guan, Y. L. and Poh, C. L. (2019) 'Construction of bio-constrained code for DNA data storage', IEEE Communications Letters. doi: 10.1109/LCOMM.2019.2912572 [en]
dc.identifier.doi: 10.1109/LCOMM.2019.2912572 [en]
dc.identifier.eissn: 1558-2558
dc.identifier.endpage: 4 [en]
dc.identifier.issn: 1089-7798
dc.identifier.journaltitle: IEEE Communications Letters [en]
dc.identifier.startpage: 1 [en]
dc.identifier.uri: https://hdl.handle.net/10468/7836
dc.language.iso: en [en]
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE) [en]
dc.relation.project: info:eu-repo/grantAgreement/EC/H2020::MSCA-COFUND-FP/713567/EU/Cutting Edge Training - Cutting Edge Technology/EDGE [en]
dc.relation.uri: https://ieeexplore.ieee.org/abstract/document/8695057
dc.rights: © 2019, IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. [en]
dc.subject: DNA [en]
dc.subject: Memory [en]
dc.subject: Silicon [en]
dc.subject: Complexity theory [en]
dc.subject: 3G mobile communication [en]
dc.subject: Precoding [en]
dc.subject: DNA data storage [en]
dc.subject: Run-length limited code [en]
dc.subject: Long term data storage [en]
dc.subject: Constrained code [en]
dc.title: Construction of bio-constrained code for DNA data storage [en]
dc.type: Article (peer-reviewed) [en]
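
The two biochemical constraints named in the abstract can be illustrated with a small checker. This is only a sketch and not the authors' C-RLL construction, which this record does not describe. The function names, the maximum run length of 3, and the 40% to 60% GC window are all assumptions chosen for illustration.

```python
def max_homopolymer_run(seq: str) -> int:
    """Return the length of the longest run of identical consecutive nucleotides."""
    longest = current = 0
    previous = None
    for base in seq:
        # Extend the current run if the base repeats, otherwise start a new run.
        current = current + 1 if base == previous else 1
        longest = max(longest, current)
        previous = base
    return longest


def gc_content(seq: str) -> float:
    """Return the fraction of nucleotides that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def satisfies_constraints(seq: str, max_run: int = 3,
                          gc_low: float = 0.4, gc_high: float = 0.6) -> bool:
    """Check both constraints at once.

    The default thresholds are illustrative assumptions, not values
    taken from the paper.
    """
    return (max_homopolymer_run(seq) <= max_run
            and gc_low <= gc_content(seq) <= gc_high)
```

For example, `satisfies_constraints("ACGTACGT")` passes both checks, while `"AAAACGCG"` fails on run length and `"ATATATAT"` fails on GC content. For scale, an unconstrained four-letter alphabet carries at most log2(4) = 2 bits per nucleotide, so the 1.917 bits per nucleotide quoted in the abstract is roughly 96% of that bound.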
Files
Original bundle
Name: 08695057.pdf
Size: 387.63 KB
Format: Adobe Portable Document Format
Description: Accepted Version