Abstract
Record or data linkage is an important enabling technology in the health sector, as linked data is a cost-effective resource that can help to improve research into health policies, detect adverse drug reactions, reduce costs, and uncover fraud within the health system. Significant advances, mostly originating from data mining and machine learning, have been made in recent years in many areas of record linkage. Most of these new methods are not yet implemented in current record linkage systems, or are hidden within "black box" commercial software. This makes it difficult for users to learn about new record linkage techniques, as well as to compare existing linkage techniques with new ones. What is required are flexible tools that enable users to experiment with new record linkage techniques at low cost. This paper describes the Febrl (Freely Extensible Biomedical Record Linkage) system, which is available under an open source software licence. It contains many recently developed advanced techniques for data cleaning and standardisation, indexing (blocking), field comparison, and record pair classification, and encapsulates them in a graphical user interface. Febrl can be seen as a training tool suitable for users to learn and experiment with both traditional and new record linkage techniques, as well as for practitioners to conduct linkages with data sets containing up to several hundred thousand records.
Original language | English |
---|---|
Title of host publication | Proceedings of Australasian Workshop on Health Data and Knowledge Management (HDKM 2008) |
Editors | James R Warren, Ping Yu, John Yearwood, Jon D Patrick |
Place of Publication | Australia |
Publisher | Australian Computer Society Inc. |
Pages | 17-25 |
Edition | Peer Reviewed |
ISBN (Print) | 97819206862613 |
Publication status | Published - 2008 |
Event | Australasian Workshop on Health Data and Knowledge Management (HDKM 2008) - Wollongong, Australia. Duration: 1 Jan 2008 → … |
Conference
Conference | Australasian Workshop on Health Data and Knowledge Management (HDKM 2008) |
---|---|
Period | 1/01/08 → … |
Other | January 22-25 2008 |