Predicting hard drive failures in datacenters helps avoid wasted resources and long recovery times. Anomaly detection on sensing data is commonly used to predict such failures. Conventional threshold-based anomaly detection methods usually consider each sensor independently. However, choosing an optimal threshold for each type of sensor is not trivial, especially for large-scale datacenter systems. To detect failures that conventional methods miss, multimodal anomaly detection, which integrates sensing data from different types of sensors, becomes crucial. This work proposes a correlation-based multimodal anomaly detection approach. The approach is applied to a Network-Attached Storage (NAS) system with multiple hard disk drives (HDDs) and three sensing modalities: a thermal camera, a microphone, and system performance logs. The unimodal results show that the auditory and system performance models can detect temporal anomalies, and that the thermal model can detect spatial anomalies. The multimodal results show that, even with simple filtering and detection algorithms, the multimodal approach detected signs of failure before the actual failure occurred, and earlier than the auditory unimodal approach did.
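To make the idea of correlation-based detection concrete, the following is a minimal illustrative sketch, not the method used in this work. It assumes two hypothetical, time-aligned sensor streams (here called `load` and `sound`) that normally move together, and flags sliding windows in which their Pearson correlation falls below a threshold. The function names, the window length, the threshold `min_corr`, and the synthetic data are all assumptions chosen for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def correlation_anomalies(a, b, window=20, min_corr=0.5):
    """Return start indices of sliding windows where the two streams decorrelate."""
    flagged = []
    for start in range(len(a) - window + 1):
        r = pearson(a[start:start + window], b[start:start + window])
        if r < min_corr:
            flagged.append(start)
    return flagged


# Synthetic data (made up for illustration): the second stream tracks the
# first with small noise, then becomes unrelated noise from sample 150 on.
rng = random.Random(0)
load = [math.sin(t / 5.0) for t in range(200)]
sound = [v + rng.gauss(0, 0.05) for v in load]
for t in range(150, 200):
    sound[t] = rng.gauss(0, 1.0)

flags = correlation_anomalies(load, sound)
print("first flagged window starts at sample", flags[0] if flags else None)
```

A single cross-sensor correlation threshold replaces one threshold per sensor in this sketch, which is the kind of simplification that motivates multimodal detection. A real system would still need filtering, time alignment, and tuning of the window and threshold.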