Efficient subsequence search in databases

Rohit Jain, Mukesh K. Mohania, Sunil Prabhakar

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

1 Citation (Scopus)

Abstract

Finding tuples in a database that match a particular subsequence (with gaps) is an important problem for a range of applications. Subsequence search is equivalent to searching for regular expressions of the type.*q 1.*q 2.*.*q l .*, where the subsequence is q 1 q 2.q l . For efficient execution of these queries, there is a need for appropriate index structures that are both efficient and can scale to large problem sizes. This paper presents two index structures for such queries based on trie and bitmap. These indices are disk-resident, hence can be easily used by large databases with limited memory availability. Our indices are applicable to dynamic databases, where tuples can be added or deleted. Both indices are implemented and validated against a naive approach. The results show that the proposed indices are efficient, having low I/O and time overhead.

Original languageEnglish
Title of host publicationWeb-Age Information Management - 14th International Conference, WAIM 2013, Proceedings
PublisherSpringer Verlag
Pages440-452
Number of pages13
ISBN (Print)9783642385612
DOIs
Publication statusPublished - 2013
Externally publishedYes
Event14th International Conference on Web-Age Information Management, WAIM 2013 - Beidaihe, China
Duration: 14 Jun 201316 Jun 2013

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume7923 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference14th International Conference on Web-Age Information Management, WAIM 2013
Country/TerritoryChina
CityBeidaihe
Period14/06/1316/06/13

Fingerprint

Dive into the research topics of 'Efficient subsequence search in databases'. Together they form a unique fingerprint.

Cite this