Training CNNs with normalized kernels

Mete Ozay, Takayuki Okatani

Research output: Conference contribution

1 Citation (Scopus)

Abstract

Several methods of normalizing convolution kernels have been proposed in the literature to train convolutional neural networks (CNNs), and have shown some success. However, our understanding of these methods has lagged behind their success in application; many open questions remain, such as why a certain type of kernel normalization is effective and what type of normalization should be employed for each (e.g., higher or lower) layer of a CNN. As a first step towards answering these questions, we propose a framework that enables us to use a variety of kernel normalization methods at any layer of a CNN. A naive integration of kernel normalization with a general optimization method, such as SGD, often entails instability while updating parameters. Thus, existing methods employ ad hoc procedures to empirically assure convergence. In this study, we pose estimation of convolution kernels under normalization constraints as constraint-free optimization on kernel submanifolds that are identified by the employed constraints. Note that naive application of the established optimization methods for matrix manifolds to the aforementioned problems is not feasible because of the hierarchical nature of CNNs. To address this, we propose an algorithm for optimization on kernel manifolds in CNNs by appropriate scaling of the space of kernels based on the structure of CNNs and statistics of data. We theoretically prove that the proposed algorithm is guaranteed to converge almost surely to a solution at a single minimum. Our experimental results show that the proposed method can successfully train popular CNN models using several different types of kernel normalization methods. Moreover, they show that the proposed method improves classification performance of baseline CNNs, and provides state-of-the-art performance for major image classification benchmarks.
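
To illustrate what "constraint-free optimization on a kernel submanifold" can look like in the simplest case, the following is a minimal sketch, assuming the normalization constraint is unit Euclidean norm, so that a flattened kernel lives on the unit sphere. It shows a generic textbook Riemannian gradient step (project the gradient onto the tangent space, then retract by renormalizing). It is not the algorithm proposed in the paper, which additionally scales the space of kernels using the structure of the CNN and statistics of the data. The function names, the toy objective, and the `target` vector are hypothetical and chosen only for illustration.

```python
import math


def norm(v):
    """Euclidean norm of a flattened kernel."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Scale a flattened kernel to unit norm (a point on the unit sphere)."""
    n = norm(v)
    return [x / n for x in v]


def sphere_sgd_step(w, grad, lr):
    """One gradient step on the unit sphere (illustrative, not the paper's method)."""
    # Project the Euclidean gradient onto the tangent space at w
    # by removing its component along w.
    radial = sum(g * x for g, x in zip(grad, w))
    tangent = [g - radial * x for g, x in zip(grad, w)]
    # Move against the tangent gradient, then retract onto the sphere
    # by renormalizing.
    return normalize([x - lr * t for x, t in zip(w, tangent)])


# Hypothetical toy objective: f(w) = 0.5 * ||w - target||^2, restricted to the sphere.
target = normalize([3.0, 4.0, 0.0])
w = normalize([1.0, 0.0, 1.0])
for _ in range(200):
    grad = [x - t for x, t in zip(w, target)]  # Euclidean gradient of f at w
    w = sphere_sgd_step(w, grad, lr=0.1)
```

Because every update ends with a retraction, the kernel satisfies the norm constraint after each step without a separate correction or penalty term, which is the sense in which the problem becomes constraint-free on the manifold.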

Original language: English
Title of host publication: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
Publisher: AAAI Press
Pages: 3884-3891
Number of pages: 8
ISBN (Electronic): 9781577358008
Publication status: Published - 2018
Event: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 - New Orleans, United States
Duration: 2 Feb 2018 to 7 Feb 2018

Publication series

Name: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018

Other

Other: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
Country: United States
City: New Orleans
Period: 2 Feb 2018 to 7 Feb 2018

ASJC Scopus subject areas

  • Artificial Intelligence
