TY - GEN
T1 - Efficient subsequence search in databases
AU - Jain, Rohit
AU - Mohania, Mukesh K.
AU - Prabhakar, Sunil
PY - 2013
Y1 - 2013
N2 - Finding tuples in a database that match a particular subsequence (with gaps) is an important problem for a range of applications. Subsequence search is equivalent to searching for regular expressions of the type.*q 1.*q 2.*.*q l .*, where the subsequence is q 1 q 2.q l . For efficient execution of these queries, there is a need for appropriate index structures that are both efficient and can scale to large problem sizes. This paper presents two index structures for such queries based on trie and bitmap. These indices are disk-resident, hence can be easily used by large databases with limited memory availability. Our indices are applicable to dynamic databases, where tuples can be added or deleted. Both indices are implemented and validated against a naive approach. The results show that the proposed indices are efficient, having low I/O and time overhead.
AB - Finding tuples in a database that match a particular subsequence (with gaps) is an important problem for a range of applications. Subsequence search is equivalent to searching for regular expressions of the type.*q 1.*q 2.*.*q l .*, where the subsequence is q 1 q 2.q l . For efficient execution of these queries, there is a need for appropriate index structures that are both efficient and can scale to large problem sizes. This paper presents two index structures for such queries based on trie and bitmap. These indices are disk-resident, hence can be easily used by large databases with limited memory availability. Our indices are applicable to dynamic databases, where tuples can be added or deleted. Both indices are implemented and validated against a naive approach. The results show that the proposed indices are efficient, having low I/O and time overhead.
UR - http://www.scopus.com/inward/record.url?scp=84880033208&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-38562-9_45
DO - 10.1007/978-3-642-38562-9_45
M3 - Conference contribution
SN - 9783642385612
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 440
EP - 452
BT - Web-Age Information Management - 14th International Conference, WAIM 2013, Proceedings
PB - Springer Verlag
T2 - 14th International Conference on Web-Age Information Management, WAIM 2013
Y2 - 14 June 2013 through 16 June 2013
ER -