An effective pattern based outlier detection approach for mixed attribute data

Ke Zhang, Huidong Jin*

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    19 Citations (Scopus)

    Abstract

    Detecting outliers in mixed attribute datasets is one of the major challenges in real-world applications. Existing outlier detection methods lack effectiveness on such datasets, mainly because they cannot account for interactions among attributes of different types, e.g., numerical and categorical attributes. To address this issue, we propose a novel Pattern based Outlier Detection approach (POD). A pattern in this paper is defined to describe the majority of the data as well as to capture interactions among different types of attributes. In POD, the more an object deviates from these patterns, the higher its outlier factor. We use logistic regression to learn patterns and then formulate the outlier factor for mixed attribute datasets. A series of experimental results illustrates that POD performs statistically significantly better than several classic outlier detection methods.
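
The idea of scoring an object by how far it deviates from a learned pattern can be illustrated with a minimal sketch. This is not the POD formulation from the paper, whose details the abstract does not give. It assumes a toy setting with one numerical attribute and one binary categorical attribute, fits a logistic regression that predicts the category from the numerical value, and takes one minus the predicted probability of the observed category as an illustrative deviation score. The function names (`fit_logistic`, `outlier_scores`), the toy data, and the score definition are all assumptions made for this example.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the log-loss w.r.t. the logit
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def outlier_scores(xs, ys):
    """Illustrative score (an assumption, not the paper's outlier factor):
    1 - probability that the learned pattern assigns to the observed category."""
    w, b = fit_logistic(xs, ys)
    scores = []
    for x, y in zip(xs, ys):
        p1 = sigmoid(w * x + b)
        p_observed = p1 if y == 1 else 1.0 - p1
        scores.append(1.0 - p_observed)
    return scores

# Toy mixed-attribute data: numerical value x, binary category y.
# Low x goes with category 0 and high x with category 1,
# except the last object, which breaks that pattern.
xs = [0.1, 0.2, 0.3, 0.4, 0.5, 2.5, 2.6, 2.7, 2.8, 2.9, 2.85]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

scores = outlier_scores(xs, ys)
top = max(range(len(scores)), key=scores.__getitem__)
print(top, round(scores[top], 3))
```

In this toy run the object whose category contradicts its numerical value receives the largest score, which mirrors the abstract's statement that a larger deviation from the patterns yields a higher outlier factor.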

    Original language: English
    Title of host publication: AI 2010
    Subtitle of host publication: Advances in Artificial Intelligence - 23rd Australasian Joint Conference, Proceedings
    Pages: 122-131
    Number of pages: 10
    Publication status: Published - 2010
    Event: 23rd Australasian Joint Conference on Artificial Intelligence, AI 2010 - Adelaide, SA, Australia
    Duration: 7 Dec 2010 to 10 Dec 2010

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 6464 LNAI
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 23rd Australasian Joint Conference on Artificial Intelligence, AI 2010
    Country/Territory: Australia
    City: Adelaide, SA
    Period: 7/12/10 to 10/12/10
