Kmer2SNP: Reference-free SNP calling from raw reads based on matching

Yanbo Li, Hardip Patel, Yu Lin*

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    3 Citations (Scopus)

    Abstract

    SNP calling is a fundamental problem in genetic analysis and has many applications, such as gene-disease diagnosis, drug design, and ancestry inference. Prior approaches either require a high-quality reference genome or suffer from low recall, low precision, or high runtime. We develop Kmer2SNP, a reference-free algorithm that calls SNPs directly from raw reads by modeling SNP calling as a maximum weight matching problem. We benchmark Kmer2SNP against reference-free methods, including hybrid (assembly-based) and assembly-free methods, on both simulated and real datasets. Experimental results show that Kmer2SNP achieves better SNP calling quality while being an order of magnitude faster than the state-of-the-art methods. Kmer2SNP shows the potential of calling SNPs using only k-mers from raw reads, without assembly. The source code is freely available at https://github.com/yanboANU/Kmer2SNP.
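
The abstract does not describe how the matching graph is built or how edges are weighted, so the following is only a minimal sketch of the general idea, not the Kmer2SNP implementation. It assumes that candidate edges join k-mers that differ only at their middle base, that an edge weight is the ratio of the two k-mer coverages, and that reverse complements can be ignored. All function names are hypothetical, and the matching is found by exhaustive search, which is exact but only practical for very small graphs.

```python
from collections import Counter
from itertools import combinations


def count_kmers(reads, k):
    """Count every k-mer (forward strand only) across the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def candidate_edges(counts, k):
    """Pair k-mers that differ only at the middle base.

    Assumption: the weight is the coverage ratio min/max, in (0, 1],
    so pairs with similar coverage score higher.
    """
    mid = k // 2
    groups = {}
    for kmer in counts:
        # K-mers sharing both flanks can differ only at the middle base.
        groups.setdefault((kmer[:mid], kmer[mid + 1:]), []).append(kmer)
    edges = []
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            ca, cb = counts[a], counts[b]
            edges.append((a, b, min(ca, cb) / max(ca, cb)))
    return edges


def max_weight_matching(edges):
    """Exact maximum weight matching by exhaustive search (tiny inputs only)."""
    best = (0.0, [])

    def search(start, used, weight, chosen):
        nonlocal best
        if weight > best[0]:
            best = (weight, list(chosen))
        for j in range(start, len(edges)):
            a, b, w = edges[j]
            if a not in used and b not in used:
                chosen.append((a, b))
                search(j + 1, used | {a, b}, weight + w, chosen)
                chosen.pop()

    search(0, frozenset(), 0.0, [])
    return best[1]


if __name__ == "__main__":
    # Two toy haplotypes that differ at a single position (A vs G).
    reads = ["GATTACAGG"] * 3 + ["GATTGCAGG"] * 3
    k = 5
    pairs = max_weight_matching(candidate_edges(count_kmers(reads, k), k))
    for a, b in pairs:
        print(a, b, "alleles:", a[k // 2], b[k // 2])
```

Each matched pair is read as one candidate SNP, with the two middle bases as its alleles. For realistic data, a polynomial-time matching algorithm would be needed in place of the exhaustive search.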

    Original language: English
    Title of host publication: Proceedings - 2020 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2020
    Editors: Taesung Park, Young-Rae Cho, Xiaohua Tony Hu, Illhoi Yoo, Hyun Goo Woo, Jianxin Wang, Julio Facelli, Seungyoon Nam, Mingon Kang
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 208-212
    Number of pages: 5
    ISBN (Electronic): 9781728162157
    Publication status: Published - 16 Dec 2020
    Event: 2020 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2020 - Virtual, Seoul, Korea, Republic of
    Duration: 16 Dec 2020 to 19 Dec 2020

    Publication series

    Name: Proceedings - 2020 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2020

    Conference

    Conference: 2020 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2020
    Country/Territory: Korea, Republic of
    City: Virtual, Seoul
    Period: 16/12/20 to 19/12/20
