Training Neural Networks by Sub-Problems for Avoiding Local Minima

Hossein Mobahi, Department of Computer Science, University of Illinois at Urbana-Champaign

Multilayer neural networks with sigmoidal activation functions are of interest because they are general function approximators. However, the popular error backpropagation algorithm used to train these networks usually fails to reach a globally optimal solution and becomes trapped in local minima. In this work we present a new approach to this problem based on a chain of sub-problems. The chain starts from a problem that is free of local minima (a convex objective function) and is gradually deformed into the original problem. Under certain assumptions, the way the sub-problems are constructed preserves global optimality throughout the deformation. The sub-problems are created systematically by reformulating the problem as a constrained nonlinear program.
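The deformation scheme described above resembles graduated optimization (numerical continuation). A minimal sketch of that general idea on a toy one-dimensional objective, assuming a simple linear blend between a convex starting function and a non-convex target rather than the paper's constrained-program construction (all function names and the schedule below are illustrative), might look like:

```python
import numpy as np

# Continuation sketch: minimize a blend (1 - t) * g + t * f, sweeping t
# from 0 (convex, local-minima-free) to 1 (the true non-convex objective),
# warm-starting each stage at the previous stage's minimizer. This is a
# generic illustration, not the paper's exact sub-problem construction.

def f(w):
    # toy non-convex objective with several local minima (a hypothetical
    # stand-in for a network's training error surface)
    return np.sin(3.0 * w) + 0.1 * w**2

def g(w):
    # convex starting objective, free of local minima
    return (w - 4.0)**2

def grad(h, w, eps=1e-5):
    # central-difference gradient of a scalar function h at w
    return (h(w + eps) - h(w - eps)) / (2.0 * eps)

def minimize(h, w, lr=0.01, steps=2000):
    # plain gradient descent from warm start w
    for _ in range(steps):
        w -= lr * grad(h, w)
    return w

w = 0.0
for t in np.linspace(0.0, 1.0, 21):  # deformation schedule
    blended = lambda x, t=t: (1.0 - t) * g(x) + t * f(x)
    w = minimize(blended, w)         # warm start from previous stage
```

At `t = 1` the blended objective coincides with `f`, so the final `w` is a stationary point of the original objective reached by tracking the minimizer along the deformation path.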