Physical separation of haplotypes in dikaryons allows benchmarking of phasing accuracy in Nanopore and HiFi assemblies with Hi-C data

Hongyu Duan, Ashley W. Jones, Tim Hewitt, Amy Mackenzie, Yiheng Hu, Anna Sharp, David Lewis, Rohit Mago, Narayana M. Upadhyaya, John P. Rathjen, Eric A. Stone, Benjamin Schwessinger, Melania Figueroa, Peter N. Dodds, Sambasivam Periyannan, Jana Sperschneider*

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    32 Citations (Scopus)

    Abstract

    Background: Most animals and plants have more than one set of chromosomes and package these haplotypes into a single nucleus within each cell. In contrast, many fungal species carry multiple haploid nuclei per cell. Rust fungi are such species with two nuclei (karyons) that contain a full set of haploid chromosomes each. The physical separation of haplotypes in dikaryons means that, unlike in diploids, Hi-C chromatin contacts between haplotypes are false-positive signals.

    Results: We generate the first chromosome-scale, fully-phased assembly for the dikaryotic leaf rust fungus Puccinia triticina and compare Nanopore MinION and PacBio HiFi sequence-based assemblies. We show that false-positive Hi-C contacts between haplotypes are predominantly caused by phase switches rather than by collapsed regions or Hi-C read mis-mappings. We introduce a method for phasing of dikaryotic genomes into the two haplotypes using Hi-C contact graphs, including a phase switch correction step. In the HiFi assembly, relatively few phase switches occur, and these are predominantly located at haplotig boundaries and can be readily corrected. In contrast, phase switches are widespread throughout the Nanopore assembly. We show that haploid genome read coverage of 30–40 times using HiFi sequencing is required for phasing of the leaf rust genome, with 0.7% heterozygosity, and that HiFi sequencing resolves genomic regions with low heterozygosity that are otherwise collapsed in the Nanopore assembly.

    Conclusions: This first Hi-C based phasing pipeline for dikaryons and comparison of long-read sequencing technologies will inform future genome assembly and haplotype phasing projects in other non-haploid organisms.

    Original language: English
    Article number: 84
    Journal: Genome Biology
    Volume: 23
    Issue number: 1
    Publication status: Published - Dec 2022
