TY - GEN
T1 - Improving generalization ability of deep neural networks for visual recognition tasks
AU - Okatani, Takayuki
AU - Liu, Xing
AU - Suganuma, Masanori
N1 - Funding Information:
Acknowledgments. This work was partly supported by JSPS KAKENHI Grant Number JP15H05919, JST CREST Grant Number JPMJCR14D1, and the ImPACT Program “Tough Robotics Challenge” of the Council for Science, Technology, and Innovation (Cabinet Office, Government of Japan).
Publisher Copyright:
© Springer Nature Switzerland AG 2019.
PY - 2019
Y1 - 2019
N2 - This article discusses the generalization ability of convolutional neural networks (CNNs) for visual recognition, with a special focus on robustness to image degradation. It has long been claimed that CNNs surpass human vision, for example, in object recognition tasks. However, such claims simply report experimental results in which CNNs perform better than humans on a closed set of test inputs. In fact, CNNs can easily fail on images to which noise has been added if they have not been trained on noisy images; this is the case even when humans are barely affected by the added noise. As a solution to this problem, we discuss an approach that first restores the clean image from a distorted input image and then uses it for the target recognition task, where a CNN trained only on clean images is used. As solutions for the first step, we present our recent studies of image restoration. There are many types of image distortion, such as noise, defocus/motion blur, rain streaks, raindrops, and haze. We first introduce our recent study of the architectural design of CNNs for image restoration targeting a single, identified type of distortion. We then introduce another study, which proposes using a single CNN to remove a combination of multiple types of distortion with an unknown mixture ratio. Although it achieves lower accuracy than the first method in the case of a single, identified type of distortion, this method will be more useful in practical applications.
KW - Convolutional neural networks
KW - Generalization ability
KW - Visual recognition
UR - http://www.scopus.com/inward/record.url?scp=85064230405&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85064230405&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-13940-7_1
DO - 10.1007/978-3-030-13940-7_1
M3 - Conference contribution
AN - SCOPUS:85064230405
SN - 9783030139391
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 3
EP - 13
BT - Computational Color Imaging - 7th International Workshop, CCIW 2019, Proceedings
A2 - Trémeau, Alain
A2 - Horiuchi, Takahiko
A2 - Tominaga, Shoji
A2 - Schettini, Raimondo
PB - Springer Verlag
T2 - 7th Computational Color Imaging Workshop, CCIW 2019
Y2 - 27 March 2019 through 29 March 2019
ER -