A fused forensic text comparison system using lexical features, word and character N-grams: A likelihood ratio-based analysis in Predatory Chatlog messages

Shunichi Ishihara*

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    2 Citations (Scopus)

    Abstract

    This study investigates the degree to which the performance of a likelihood ratio (LR)-based forensic text comparison (FTC) system improves when logistic-regression fusion is applied to LRs that were estimated separately by three different procedures: lexical features, word-based N-grams and character-based N-grams. The data are predatory chatlog messages, and each group of messages is modelled using 500 words. The performance of the FTC system is assessed in terms of its validity (= accuracy) and reliability (= precision) using the log-likelihood-ratio cost (Cllr) and 95% credible intervals (CI), respectively. It is demonstrated that 1) of the three procedures, the lexical features procedure performed best in terms of Cllr; and that 2) the fused system outperformed all three single procedures. The Cllr value of the fused system is better (lower) than that of the lexical features procedure by 0.14. It is also reported that the validity and reliability of a system are negatively correlated: the fused system, which yielded the best result in terms of Cllr, has the worst CI value.
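
    The two quantities named in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes the standard definition of Cllr, in which same-author comparisons are penalised by log2(1 + 1/LR) and different-author comparisons by log2(1 + LR), and it assumes that fusion takes the usual logistic-regression form of a weighted sum of the per-procedure log LRs plus an offset. The function names, the weights and the offset below are hypothetical; in a real system the weights would be trained by logistic regression on separate calibration data.

```python
import numpy as np


def cllr(same_author_lrs, diff_author_lrs):
    """Log-likelihood-ratio cost for a set of LRs (lower is better).

    same_author_lrs: LRs from comparisons known to be same-author.
    diff_author_lrs: LRs from comparisons known to be different-author.
    """
    ss = np.asarray(same_author_lrs, dtype=float)
    ds = np.asarray(diff_author_lrs, dtype=float)
    # Same-author LRs should be large, so small values are penalised.
    penalty_ss = np.mean(np.log2(1.0 + 1.0 / ss))
    # Different-author LRs should be small, so large values are penalised.
    penalty_ds = np.mean(np.log2(1.0 + ds))
    return 0.5 * (penalty_ss + penalty_ds)


def fuse_log_lrs(log_lrs, weights, offset):
    """Linear fusion of per-procedure log LRs.

    log_lrs: array of shape (n_comparisons, n_procedures).
    weights, offset: assumed to come from a logistic-regression fit
    (hypothetical values are used in the example below).
    """
    return np.asarray(log_lrs, dtype=float) @ np.asarray(weights, dtype=float) + offset


# A system that always reports LR = 1 carries no information: Cllr is 1.
uninformative = cllr([1.0], [1.0])

# One comparison scored by three procedures, fused with hypothetical weights.
fused = fuse_log_lrs([[1.0, 2.0, 3.0]], weights=[0.5, 0.25, 0.25], offset=0.1)
```

    A Cllr of 1 corresponds to a system that provides no useful information, so values well below 1 indicate better validity.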

    Original language: English
    Title of host publication: Proceedings of the 2014 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2014
    Editors: Douglas E. Comer, Peter Mueller, Bhawna Mallick, Sougata Mukherjea, Sabu M. Thampi, Dilip Krishnaswamy, Axel Sikora
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 2762-2768
    Number of pages: 7
    ISBN (Electronic): 9781479930791
    Publication status: Published - 26 Nov 2014
    Event: 3rd International Conference on Advances in Computing, Communications and Informatics, ICACCI 2014 - Delhi, India
    Duration: 24 Sept 2014 to 27 Sept 2014

    Publication series

    Name: Proceedings of the 2014 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2014

    Conference

    Conference: 3rd International Conference on Advances in Computing, Communications and Informatics, ICACCI 2014
    Country/Territory: India
    City: Delhi
    Period: 24/09/14 to 27/09/14
