A Robust Parallel Computing Data Extraction Framework for Nanopore Experiments

Y. M. N. D. Y. Bandara, Shankar Dutt, Buddini I. Karawdeniya, Jugal Saharia, Patrick Kluth, Antonio Tricoli

Research output: Contribution to journal › Article › peer-review

Abstract

The success of a nanopore experiment relies not only on the quality of the experimental design but also on the performance of the analysis program used to decipher the ionic perturbations necessary for understanding the fundamental molecular intricacies. An event extraction framework is developed that leverages parallel computing, efficient memory management, and vectorization, yielding significant performance enhancement. The newly developed abf-ultra-simple function extracts key parameters from the header that are critical for the operation of an open-seek-read-close data loading architecture running on multiple cores. This underpins the swift analysis of large files, where an ≈18× improvement is found for a 100 min-long file (≈4.5 GB) compared to the more traditional single (cell) array data loading method. The application is benchmarked against five other analysis platforms, showcasing significant performance enhancement (>2×–1120×). The integrated provisions for batch analysis enable multiple files to be analyzed concurrently (vital for high-bandwidth experiments). Furthermore, the application is equipped with multi-level data fitting based on abrupt changes in the event waveform. The application condenses the extracted events into a single binary file, improving data portability (e.g., a 16 GB file with 28 182 events reduces to 47.9 MB, a 343× size reduction), and enables a multitude of post-analysis extractions to be done efficiently.
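
The open-seek-read-close loading pattern mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a flat binary file with a fixed-size header followed by 16-bit signed samples in native byte order, and it assumes the header size and sample count are already known (the role the abstract assigns to abf-ultra-simple). The real ABF layout is not reproduced, and all names here (`read_chunk`, `load_parallel`, `make_demo_file`) are hypothetical. Threads are used to keep the sketch short. A process pool would be closer to the multi-core setup the abstract describes.

```python
import os
import tempfile
from array import array
from concurrent.futures import ThreadPoolExecutor

SAMPLE_BYTES = 2  # assumption: 16-bit signed samples (array type code "h")


def read_chunk(path, header_bytes, start_sample, n_samples):
    """Open the file, seek to one chunk, read it, then close the file."""
    with open(path, "rb") as fh:                               # open
        fh.seek(header_bytes + start_sample * SAMPLE_BYTES)    # seek
        raw = fh.read(n_samples * SAMPLE_BYTES)                # read
    # The file is closed here, when the with-block exits.      # close
    samples = array("h")
    samples.frombytes(raw)
    return samples


def load_parallel(path, header_bytes, total_samples, chunk_samples, workers=4):
    """Split the sample range into chunks and read them concurrently."""
    jobs = [
        (start, min(chunk_samples, total_samples - start))
        for start in range(0, total_samples, chunk_samples)
    ]
    out = array("h")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in job order, so the trace stays in sequence.
        for chunk in pool.map(lambda job: read_chunk(path, header_bytes, *job), jobs):
            out.extend(chunk)
    return out


def make_demo_file(header_bytes, values):
    """Write a placeholder header plus samples to a temporary file (demo only)."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as fh:
        fh.write(b"\x00" * header_bytes)   # stand-in for a real header
        fh.write(array("h", values).tobytes())
    return path
```

Because each worker holds its own file handle only for the duration of one chunk, no handle is shared between workers and the full trace never has to pass through a single read call. The chunk size is a tuning parameter that this sketch leaves to the caller.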

Original language: English
Pages (from-to): 2400045
Number of pages: 13
Journal: Small Methods
Early online date: Jul 2024
Publication status: Published - 5 Jul 2024
