TY - JOUR
T1 - Reassembling haplotypes in a mixture of pooled amplicons when the relative concentrations are known
T2 - A proof-of-concept study on the efficient design of nextgeneration sequencing strategies
AU - Ranjard, Louis
AU - Wong, Thomas K.F.
AU - Rodrigo, Allen G.
N1 - Publisher Copyright:
© 2018 Ranjard et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2018/4
Y1 - 2018/4
N2 - Next-generation sequencing can be costly and labour intensive. Usually, the sequencing cost per sample is reduced by pooling amplified DNA = amplicons) derived from different individuals on the same sequencing lane. Barcodes unique to each amplicon permit short-read sequences to be assigned appropriately. However, the cost of the library preparation increases with the number of barcodes used. We propose an alternative to barcoding: by using different known proportions of individually-derived amplicons in a pooled sample, each is characterised a priori by an expected depth of coverage. We have developed a Hidden Markov Model that uses these expected proportions to reconstruct the input sequences. We apply this method to pools of mitochondrial DNA amplicons extracted from kangaroo meat, genus Macropus. Our experiments indicate that the sequence coverage can be efficiently used to index the short-reads and that we can reassemble the input haplotypes when secondary factors impacting the coverage are controlled. We therefore demonstrate that, by combining our approach with standard barcoding, the cost of the library preparation is reduced to a third.
AB - Next-generation sequencing can be costly and labour intensive. Usually, the sequencing cost per sample is reduced by pooling amplified DNA = amplicons) derived from different individuals on the same sequencing lane. Barcodes unique to each amplicon permit short-read sequences to be assigned appropriately. However, the cost of the library preparation increases with the number of barcodes used. We propose an alternative to barcoding: by using different known proportions of individually-derived amplicons in a pooled sample, each is characterised a priori by an expected depth of coverage. We have developed a Hidden Markov Model that uses these expected proportions to reconstruct the input sequences. We apply this method to pools of mitochondrial DNA amplicons extracted from kangaroo meat, genus Macropus. Our experiments indicate that the sequence coverage can be efficiently used to index the short-reads and that we can reassemble the input haplotypes when secondary factors impacting the coverage are controlled. We therefore demonstrate that, by combining our approach with standard barcoding, the cost of the library preparation is reduced to a third.
UR - http://www.scopus.com/inward/record.url?scp=85045148194&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0195090
DO - 10.1371/journal.pone.0195090
M3 - Article
SN - 1932-6203
VL - 13
JO - PLoS ONE
JF - PLoS ONE
IS - 4
M1 - e0195090
ER -