An assembly-free method of phylogeny reconstruction using short-read sequences from pooled samples without barcodes

Thomas K.F. Wong*, Teng Li, Louis Ranjard, Steven H. Wu, Jeet Sukumaran, Allen G. Rodrigo*

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    Abstract

    A current strategy for obtaining haplotype information from several individuals involves short-read sequencing of pooled amplicons, where the fragments from each individual are identified by a unique DNA barcode. In this paper, we report a new method to recover the phylogeny of haplotypes from short-read sequences obtained using pooled amplicons from a mixture of individuals, without barcoding. The method, AFPhyloMix, accepts an alignment of the mixture of reads against a reference sequence, obtains the single-nucleotide polymorphism (SNP) patterns along the alignment, and constructs the phylogenetic tree from those SNP patterns. AFPhyloMix adopts a Bayesian inference model to estimate the phylogeny of the haplotypes and their relative abundances, given that the number of haplotypes is known. In our simulations, AFPhyloMix achieved at least 80% accuracy at recovering the phylogenies and relative abundances of the constituent haplotypes, for mixtures with up to 15 haplotypes. AFPhyloMix also worked well on a real data set of kangaroo mitochondrial DNA sequences.
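
    As a rough illustration of the first step described above (reading variation off an alignment of pooled reads), the sketch below tabulates base counts per reference position and reports sites where a second allele is present at appreciable frequency. It is not AFPhyloMix's implementation: the function names, the gap-free `(start, sequence)` read representation, and the `min_depth` and `min_minor_freq` thresholds are all assumptions made for this example, and the Bayesian tree and abundance inference is not shown.

```python
from collections import Counter


def site_allele_counts(reads, ref_len):
    """Count bases observed at each reference position.

    `reads` is a list of (start, sequence) pairs, where `start` is the
    0-based reference position of the first base. Reads are assumed to be
    gap-free relative to the reference (a simplification for this sketch).
    """
    counts = [Counter() for _ in range(ref_len)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            # Ignore bases that fall outside the reference or are ambiguous.
            if 0 <= pos < ref_len and base in "ACGT":
                counts[pos][base] += 1
    return counts


def candidate_snp_sites(counts, min_depth=4, min_minor_freq=0.1):
    """Return {position: {base: frequency}} for sites with a second allele.

    Both thresholds are hypothetical defaults chosen for the toy data below.
    """
    sites = {}
    for pos, site_counts in enumerate(counts):
        depth = sum(site_counts.values())
        if depth < min_depth:
            continue
        ranked = site_counts.most_common(2)
        if len(ranked) < 2:
            continue  # only one allele observed at this site
        minor_freq = ranked[1][1] / depth
        if minor_freq >= min_minor_freq:
            sites[pos] = {base: n / depth for base, n in site_counts.items()}
    return sites


# Toy pooled sample: five unbarcoded reads against an 8-base reference.
reads = [
    (0, "ACGTACGT"),
    (0, "ACGTACGT"),
    (0, "ACGTACGT"),
    (0, "ACTTACGT"),  # carries T instead of G at position 2
    (2, "TTACGT"),    # partial read, also T at position 2
]
counts = site_allele_counts(reads, ref_len=8)
sites = candidate_snp_sites(counts)
print(sites)
```

    In a pooled sample without barcodes, the per-site allele frequencies are the main signal tying reads back to haplotypes, since a variant's frequency reflects the combined abundance of the haplotypes that carry it. The sketch stops at tabulating those frequencies.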

    Original language: English
    Article number: e1008949
    Journal: PLoS Computational Biology
    Volume: 17
    Issue number: 9
    Publication status: Published - Sept 2021
