SOAP3: GPU-based Compressed Indexing and Ultra-fast Parallel Alignment of Short Reads

Chi Man Liu, Tak-Wah Lam, Thomas Wong, Edward Wu, Siu-Ming Yiu, Ruibang Luo, Bingqiang Wang, Chang Yu, Xiaowen Chu, Kaiyong Zhao, Ruiqiang Li

Research output: Contribution to conferencePaperpeer-review

Abstract

As the cost eciency of the next generation DNA sequencing technology keeps improving,
there is an ever-increasing demand for high-throughput software to align the enormous number of short reads (patterns) with reference genomes (such as the human genome). In the past few years, a number of very fast alignment software (e.g., Maq, SOAP2, ZOOM, Bowtie, BWA) have been developed; most of them are exploiting some kind of compressed index (like BWT) of the reference sequence. These tools can align at a speed of a few hundred seconds per one million reads. Yet such speed is still behind the throughput of the next generation sequencing machines. In this paper we show the rst implementation of a compressed index on the GPU. The technical issues include how to reduce the memory accesses to the index from individual threads (cores) and how to control the branching and divergence of the threads to avoid unnecessary idle time. Based on this new index, we are able to exploit the parallelism of GPU to speed up the alignment of short reads. Our experiments show that with respect to the human genome, the GPU takes only a few seconds to perform exact alignment of one million length-100 reads, and tens of seconds when a few mismatches are required. Further improvement has been obtained by utilizing the host CPU to share the load of the GPU concurrently.
Original languageEnglish
Publication statusPublished - 2011
Externally publishedYes
EventThird Workshop on Massive Data Algorithmics - Paris, Paris, France
Duration: 16 Jun 201116 Jun 2011
https://cs.au.dk/~gerth/madalgo/old-madalgo.au.dk/html_sider/2_5_Events/MASSIVE_2011/Call_For_Papers_2011.html

Conference

ConferenceThird Workshop on Massive Data Algorithmics
Country/TerritoryFrance
CityParis
Period16/06/1116/06/11
Internet address

Fingerprint

Dive into the research topics of 'SOAP3: GPU-based Compressed Indexing and Ultra-fast Parallel Alignment of Short Reads'. Together they form a unique fingerprint.

Cite this