Detecting outliers in high-dimensional neuroimaging datasets with robust covariance estimators

Virgile Fritsch, Gaël Varoquaux, Benjamin Thyreau, Jean Baptiste Poline, Bertrand Thirion

Research output: Contribution to journal › Article › peer-review

44 Citations (Scopus)


Medical imaging datasets often contain deviant observations, the so-called outliers, due to acquisition or preprocessing artifacts or resulting from large intrinsic inter-subject variability. These can undermine the statistical procedures used in group studies, as the latter assume that the cohorts are composed of homogeneous samples with anatomical or functional features clustered around a central mode. The effects of outlying subjects can be mitigated by detecting and removing them with explicit statistical control. With the emergence of large medical imaging databases, exhaustive manual data screening is no longer possible, and automated outlier detection methods are gaining interest. The datasets used in medical imaging are often high-dimensional and strongly correlated. The outlier detection procedure should therefore rely on high-dimensional multivariate statistical models. However, state-of-the-art procedures, based on the Minimum Covariance Determinant (MCD) estimator, are not well suited to such high-dimensional settings. In this work, we introduce regularization in the MCD framework and investigate different regularization schemes. We carry out extensive simulations to support practical choices in the absence of ground-truth knowledge. We demonstrate on functional neuroimaging datasets that outlier detection can be performed with small sample sizes and improves group studies.
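As an illustration of the baseline that the abstract refers to, the following is a minimal sketch of outlier detection with the standard (unregularized) MCD estimator, using scikit-learn's `MinCovDet`. It is not the regularized variant introduced in the article. The helper name `flag_outliers`, the chi-square cutoff at the 0.975 quantile, and the synthetic data are assumptions made for this example only, and the setting is deliberately low-dimensional (many more samples than features), which is the regime where plain MCD is expected to behave well.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet


def flag_outliers(X, alpha=0.975, random_state=0):
    """Flag rows of X whose robust squared Mahalanobis distance is large.

    Hypothetical helper for illustration: fits a plain MCD estimator and
    compares distances with a chi-square quantile (an assumed, conventional
    cutoff, not necessarily the rule used in the article).
    """
    mcd = MinCovDet(random_state=random_state).fit(X)
    d2 = mcd.mahalanobis(X)  # squared Mahalanobis distances to the robust fit
    threshold = chi2.ppf(alpha, df=X.shape[1])
    return d2 > threshold, d2


# Synthetic data (assumption): 95 Gaussian inliers and 5 shifted outliers.
rng = np.random.default_rng(0)
inliers = rng.normal(size=(95, 3))
outliers = rng.normal(loc=8.0, size=(5, 3))
X = np.vstack([inliers, outliers])

mask, d2 = flag_outliers(X)
print("flagged indices:", np.flatnonzero(mask))
```

When the number of features approaches or exceeds the number of samples, the covariance estimate on which these distances rely becomes ill-conditioned, which is the limitation that motivates the regularization studied in the article.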

Original language: English
Pages (from-to): 1359-1370
Number of pages: 12
Journal: Medical Image Analysis
Issue number: 7
Publication status: Published - 2012 Oct
Externally published: Yes


Keywords

  • High-dimension
  • Minimum covariance determinant
  • Neuroimaging
  • Outlier detection
  • Robust estimation

ASJC Scopus subject areas

  • Radiological and Ultrasound Technology
  • Radiology, Nuclear Medicine and Imaging
  • Computer Vision and Pattern Recognition
  • Health Informatics
  • Computer Graphics and Computer-Aided Design

