Comparative Study of Outlier Detection Algorithms for Machine Learning

Zahra Nazari, Seong Mi Yu, Dongshik Kang, Yousuke Kawachi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Outliers are unusual data points that are inconsistent with other observations. Human error, mechanical faults, fraudulent behavior, instrument error, and changes in the environment are some of the causes of outliers. Several types of outlier detection algorithms have been developed, and a number of surveys and overviews have been carried out to identify their advantages and disadvantages. Multivariate outlier detection algorithms are more widely used than other types, so we concentrate on them. In this work, we compare the effects of multivariate outlier detection algorithms on machine learning problems. For this purpose, three multivariate outlier detection algorithms are evaluated: distance-based, statistical-based, and clustering-based. The Heart disease, Breast cancer, and Liver disorder benchmark datasets are used for the experiments. To assess the effectiveness of these algorithms, the datasets are classified by Support Vector Machines (SVM) before and after outlier detection. Finally, a comparative review is performed to identify the advantages and disadvantages of each algorithm and their respective effects on the accuracy of SVM classifiers.
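
As a rough illustration of the distance-based family mentioned in the abstract, the sketch below scores each point by the distance to its k-th nearest neighbor and flags points whose score is far above the median. This is not the authors' implementation: the function names, the toy data, the choice of k, and the "factor times median" cutoff are all assumptions made for the example. The classification step is omitted; in a before/after comparison of the kind the abstract describes, the classifier would be trained once on all rows and once on the rows that remain after the flagged indices are dropped.

```python
import math
import statistics

def knn_distance_scores(points, k=2):
    """Score each point by the Euclidean distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

def find_outliers(points, k=2, factor=3.0):
    """Return indices whose score exceeds factor * median score.

    The cutoff rule is a hypothetical choice for this sketch, not a
    rule taken from the paper.
    """
    scores = knn_distance_scores(points, k)
    cutoff = factor * statistics.median(scores)
    return [i for i, s in enumerate(scores) if s > cutoff]

# Toy data (invented for illustration): two tight groups and one isolated point.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.2, 4.9), (4.9, 5.1), (5.1, 5.2),
    (20.0, 20.0),
]

outliers = find_outliers(points)
cleaned = [p for i, p in enumerate(points) if i not in outliers]
print(outliers, len(cleaned))
```

With this toy data only the isolated point at (20.0, 20.0) is flagged. On real datasets the result depends heavily on k, on the cutoff, and on feature scaling, so those settings would need to be tuned rather than copied from this sketch.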

Original language: English
Title of host publication: ICDLT 2018 - 2018 2nd International Conference on Deep Learning Technologies
Publisher: Association for Computing Machinery
Pages: 47-51
Number of pages: 5
ISBN (Electronic): 9781450364737
Publication status: Published - 2018 Jun 27
Externally published: Yes
Event: 2nd International Conference on Deep Learning Technologies, ICDLT 2018 - Chongqing, China
Duration: 2018 Jun 27 → 2018 Jun 29

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2nd International Conference on Deep Learning Technologies, ICDLT 2018
Country/Territory: China
City: Chongqing
Period: 18/6/27 → 18/6/29

Keywords

  • Machine Learning
  • Outlier Detection
  • Support Vector Machines

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
