Abstract
We have developed a method that automatically detects incidents by detecting abnormal sound events in audio signals recorded in real environments. The proposed method uses a multi-stage Gaussian mixture model (GMM), which learns rare sounds using multiple GMMs. In this work, we first investigated the relationship between the sound environment and detection performance, and found that performance deteriorates in noisy environments and depends largely on the signal-to-noise ratio (SNR) of the abnormal sounds. Next, we investigated methods for determining the hyperparameters of the multi-stage GMM, which comprise the intermediate thresholds, the numbers of mixture components of the GMMs, and the detection threshold. The experimental results showed that the combination of percentile-based threshold determination and Bayesian information criterion (BIC)-based mixture determination was the most effective. However, with the automatically determined parameters, detection performance deteriorated by up to 20%.
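To make the two hyperparameter rules named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single one-dimensional GMM rather than the multi-stage model, synthetic "normal" data, candidate mixture counts of 1 to 3, a 5th-percentile cut on training log-likelihoods, and a nearest-rank percentile definition. None of these choices are stated in the abstract, and all function names (`fit_gmm_1d`, `select_by_bic`, `percentile_threshold`) are hypothetical.

```python
import math
import random


def normal_pdf(v, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (v - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(x, k, n_iter=100, seed=0):
    """Fit a k-component 1-D GMM by EM; returns (weights, means, variances)."""
    rng = random.Random(seed)
    n = len(x)
    grand_mean = sum(x) / n
    grand_var = sum((v - grand_mean) ** 2 for v in x) / n
    weights = [1.0 / k] * k
    means = rng.sample(x, k)  # initialise means at random data points
    variances = [grand_var] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for v in x:
            p = [w * normal_pdf(v, m, s) for w, m, s in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            var_j = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, x)) / nj
            variances[j] = max(var_j, 1e-6)  # variance floor avoids collapse
    return weights, means, variances


def log_likelihood(v, model):
    """Log-density of one sample under the fitted mixture."""
    weights, means, variances = model
    p = sum(w * normal_pdf(v, m, s) for w, m, s in zip(weights, means, variances))
    return math.log(max(p, 1e-300))


def bic(x, model):
    """BIC = -2 * log-likelihood + (free parameters) * ln(n); lower is better."""
    k = len(model[0])
    n_params = 3 * k - 1  # (k - 1) weights + k means + k variances
    total_ll = sum(log_likelihood(v, model) for v in x)
    return -2.0 * total_ll + n_params * math.log(len(x))


def select_by_bic(x, candidates):
    """Fit one GMM per candidate mixture count and keep the lowest-BIC model."""
    models = {k: fit_gmm_1d(x, k) for k in candidates}
    best_k = min(models, key=lambda k: bic(x, models[k]))
    return best_k, models[best_k]


def percentile_threshold(scores, q):
    """Nearest-rank q-th percentile of the scores (an assumed definition)."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]


# Synthetic stand-in for features of normal (non-incident) audio.
rng = random.Random(1)
normal_data = ([rng.gauss(0.0, 1.0) for _ in range(150)]
               + [rng.gauss(10.0, 1.0) for _ in range(150)])

best_k, model = select_by_bic(normal_data, candidates=(1, 2, 3))
train_scores = [log_likelihood(v, model) for v in normal_data]
threshold = percentile_threshold(train_scores, q=5)

# A sample far from the training data scores below the threshold.
is_abnormal = log_likelihood(50.0, model) < threshold
```

In this sketch the threshold is read directly from the score distribution of the training data, so it needs no labelled abnormal sounds. How the paper applies the percentile rule to the intermediate thresholds between stages is not described in the abstract and is not modelled here.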
Original language | English |
---|---|
Pages (from-to) | 301-304 |
Number of pages | 4 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Publication status | Published - 2011 Dec 1 |
Event | 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy (Duration: 2011 Aug 27 → 2011 Aug 31) |
Keywords
- Abnormal sound detection
- CCTV
- Gaussian mixture model
- Incident detection
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modelling and Simulation