Mahalanobis distance with an adapted within-author covariance matrix: An authorship verification experiment

Shunichi Ishihara*

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review


    Abstract

    The rotated delta, which has been argued to be a theoretically better-grounded distance measure, has so far received no empirical support for its superiority. This study revisits the rotated delta (more commonly known as the Mahalanobis distance in other fields) with two different covariance matrices estimated from training data. The first covariance matrix represents between-author variability, and the second within-author variability. A series of likelihood ratio-based authorship verification experiments was carried out with several distance measures. The experiments used documents compiled from a large database of text messages, which allowed for a total of 2,160 same-author and 4,663,440 different-author comparisons. The Mahalanobis distance with the between-author covariance matrix performed far worse than the other distance measures, whereas the Mahalanobis distance with the within-author covariance matrix performed better than the other measures. However, its superiority over the cosine distance depends on the word lengths and/or the order of the feature vector. Follow-up experiments further showed that the covariance matrix representing within-author variability needs to be trained on a sufficiently large amount of data to outperform the cosine distance: the higher the order of the vector, the more data are required for training. The quantitative results also imply that the two sources of variability, namely within- and between-author variability, are independent of each other to the extent that the latter cannot accurately approximate the former.
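
As an illustration of the distance measures the abstract compares, the following is a minimal sketch in Python with NumPy, not the author's implementation. It assumes that documents are already represented as numeric feature vectors, that the within-author covariance is the pooled covariance of documents centered on their own author's mean, and that a small ridge term is added to keep the matrix invertible. The function names and the `ridge` parameter are hypothetical, and the likelihood ratio step of the experiments is not shown.

```python
import numpy as np


def within_author_covariance(X, authors, ridge=0.0):
    """Pooled within-author covariance (illustrative assumption).

    X: (n_docs, n_features) array; authors: one label per document.
    Each document is centered on its own author's mean before pooling.
    """
    X = np.asarray(X, dtype=float)
    authors = np.asarray(authors)
    labels = np.unique(authors)
    centered = np.empty_like(X)
    for label in labels:
        mask = authors == label
        centered[mask] = X[mask] - X[mask].mean(axis=0)
    dof = X.shape[0] - len(labels)  # one degree of freedom lost per author
    if dof <= 0:
        raise ValueError("need more documents than authors")
    cov = centered.T @ centered / dof
    # The ridge term is an assumption of this sketch, not part of the source.
    return cov + ridge * np.eye(X.shape[1])


def mahalanobis_distance(x, y, cov):
    """sqrt((x - y)^T cov^{-1} (x - y)), without forming the inverse."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


def cosine_distance(x, y):
    """One minus the cosine similarity of two non-zero vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(1.0 - (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y)))


# Toy data: two hypothetical authors with two documents each.
X_train = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 14.0]])
author_ids = np.array([0, 0, 1, 1])
cov_within = within_author_covariance(X_train, author_ids)
d_mahal = mahalanobis_distance([0.0, 0.0], [3.0, 4.0], cov_within)
d_cos = cosine_distance([1.0, 2.0], [2.0, 4.0])
```

The estimated matrix has as many rows and columns as the feature vector has components, so the number of free parameters grows quadratically with the order of the vector. This is consistent with the abstract's observation that higher-order vectors require more training data.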

    Original language: English
    Pages (from-to): 1051-1072
    Number of pages: 22
    Journal: Digital Scholarship in the Humanities
    Volume: 37
    Issue number: 4
    Publication status: Published - 1 Dec 2022
