A new local distance-based outlier detection approach for scattered real-world data

Ke Zhang*, Marcus Hutter, Huidong Jin

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    330 Citations (Scopus)

    Abstract

    Detecting outliers which are grossly different from or inconsistent with the remaining dataset is a major challenge in real-world KDD applications. Existing outlier detection methods are ineffective on scattered real-world datasets due to implicit data patterns and parameter setting issues. We define a novel Local Distance-based Outlier Factor (LDOF) to measure the outlier-ness of objects in scattered datasets which addresses these issues. LDOF uses the relative location of an object to its neighbours to determine the degree to which the object deviates from its neighbourhood. We present theoretical bounds on LDOF's false-detection probability. Experimentally, LDOF compares favorably to classical KNN and LOF based outlier detection. In particular it is less sensitive to parameter values.
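The abstract describes LDOF only in words, so the following is a minimal sketch rather than the authors' implementation. It assumes the usual reading of the definition: the mean distance from an object to its k nearest neighbours, divided by the mean pairwise distance among those neighbours. The function name `ldof`, the Euclidean metric, and the sample points are illustrative assumptions, not taken from the record above.

```python
import math
from itertools import combinations


def ldof(points, index, k):
    """Sketch of a Local Distance-based Outlier Factor for points[index].

    Assumed definition: (mean distance to the k nearest neighbours)
    divided by (mean pairwise distance among those k neighbours).
    """
    if k < 2 or k >= len(points):
        raise ValueError("k must satisfy 2 <= k < len(points)")

    p = points[index]
    others = [q for i, q in enumerate(points) if i != index]

    # k nearest neighbours of p under Euclidean distance (an assumption).
    neighbours = sorted(others, key=lambda q: math.dist(p, q))[:k]

    # Mean distance from p to its neighbours.
    knn_distance = sum(math.dist(p, q) for q in neighbours) / k

    # Mean distance between every unordered pair of neighbours.
    pairs = list(combinations(neighbours, 2))
    inner_distance = sum(math.dist(a, b) for a, b in pairs) / len(pairs)

    if inner_distance == 0:
        # All neighbours coincide; the ratio is undefined, so report infinity
        # unless p coincides with them too.
        return math.inf if knn_distance > 0 else 0.0
    return knn_distance / inner_distance


# Four points on a unit square plus one far-away point.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
scores = [ldof(points, i, k=3) for i in range(len(points))]
```

Under this reading, a score near 1 means the object sits about as far from its neighbours as they sit from each other, while a much larger score marks an object lying well outside its neighbourhood. In the sample data, the point `(10, 10)` receives a far higher score than the four corners of the square.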

    Original language: English
    Title of host publication: 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009
    Pages: 813-822
    Number of pages: 10
    Publication status: Published - 2009
    Event: 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009 - Bangkok, Thailand
    Duration: 27 Apr 2009 to 30 Apr 2009

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 5476 LNAI
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009
    Country/Territory: Thailand
    City: Bangkok
    Period: 27/04/09 to 30/04/09
