2021 Annual Meeting

(303f) Initializing the Internal States of LSTM Neural Networks Via Manifold Learning

Authors

Saurabh Malani - Presenter

Felix Kemeth, Johns Hopkins University

Thomas Bertalan, Johns Hopkins University

Nikolaos Evangelou, Johns Hopkins University

Tianqi Cui, Johns Hopkins University

Ioannis G. Kevrekidis, Princeton University

There has been a long-standing effort to derive dynamical systems from data, in particular for tasks such as prediction and control. One class of models that excels at this is recurrent neural networks. Long short-term memory (LSTM) networks have gained increasing attention in recent years, in particular due to their ability to deal with the vanishing gradient problem and their potential to model partially observed high-dimensional systems using a set of internal cell states [1]. For accurate predictions, however, the internal states have to be initialized properly, and a systematic procedure that is guaranteed to find these initial values is still missing.

Here, we present a manifold-learning approach to initialize the internal state values of LSTM recurrent neural networks consistent with initial observed input data. Our approach is based on learning the intrinsic data manifold from the observed variables as a preprocessing step. Using concepts such as generalized synchronization, we argue that the converged internal states are a function on this learned manifold. We show that the dimension of this manifold indicates the required amount of observed input data for proper initialization. This ansatz is demonstrated on a partially observed chemical model system, where we show that initializing the internal LSTM states using this approach yields visibly improved performance compared to earlier "warm-start" initialization approaches [2]. We furthermore discuss the potential application of our approach to other recurrent neural network variants such as reservoir computing [3]. Finally, we show that learning the data manifold can transform the problem of partially observed dynamics into a fully observed one, facilitating the identification of nonlinear dynamical systems [4].
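
The synchronization argument behind "warm-start" initialization can be illustrated with a minimal sketch. It is not the authors' implementation: a plain tanh recurrent cell with random, contractive weights stands in for a trained LSTM, and the names `drive`, `W`, `w_in` and the sinusoidal observation `u` are assumptions made only for illustration. Under these assumptions, two different initial internal states driven by the same observed inputs converge to the same state, so the converged state depends only on the input history.

```python
import numpy as np

def drive(W, w_in, h0, inputs):
    """Iterate h_{t+1} = tanh(W h_t + w_in * u_t) and return the final state."""
    h = h0.copy()
    for u_t in inputs:
        h = np.tanh(W @ h + w_in * u_t)
    return h

rng = np.random.default_rng(0)
n_hidden = 20  # hypothetical number of internal states

# Random recurrent weights, rescaled to spectral norm 0.9 so that the
# state update is a contraction (an assumption, not a trained network).
W = rng.standard_normal((n_hidden, n_hidden))
W *= 0.9 / np.linalg.norm(W, 2)
w_in = rng.standard_normal(n_hidden)

# Scalar observations of a hypothetical partially observed system.
u = np.sin(0.3 * np.arange(200))

# Two different initial internal states, driven by the same observations.
h_a = drive(W, w_in, np.zeros(n_hidden), u)
h_b = drive(W, w_in, rng.standard_normal(n_hidden), u)

# Distance between the two internal states after the warm-up window.
gap = np.linalg.norm(h_a - h_b)
print(f"state mismatch after warm-up: {gap:.2e}")
```

In this toy setting the mismatch shrinks by at least a factor of 0.9 per step, which is why a sufficiently long warm-up window works. The abstract's approach instead aims to obtain the converged internal state directly as a function on the learned data manifold, using only a short segment of observations.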

[1] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997

[2] Hans-Georg Zimmermann, Christoph Tietz, and Ralph Grothmann. Forecasting with Recurrent Neural Networks: 12 Tricks, pages 687–707. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012

[3] Herbert Jaeger. The echo state approach to analysing and training recurrent neural networks. Technical report, Fraunhofer Institute for Autonomous Intelligent Systems, 2001

[4] Felix P. Kemeth, Tom Bertalan, Nikolaos Evangelou, Tianqi Cui, Saurabh Malani, and Ioannis G. Kevrekidis. Initializing LSTM internal states via manifold learning. (submitted), 2021
