(373c) Data-Driven Disentanglement of Latent variables and Effective parameters for Cellular Trajectories with Manifold Learning and Deep Learning Approaches | AIChE

Evangelou, N., Johns Hopkins University
Sroczynski, D., Princeton University
Phillip, J., Johns Hopkins University
Kevrekidis, I. G., Princeton University
We begin with sets of trajectories sampled from ordinary differential equations (ODEs) for various initial conditions from various populations. We illustrate two data-driven frameworks for discovering intrinsic information about the ODE system: latent variables and effective parameters. Our first approach uses the questionnaire Diffusion Maps metric [1-3]. Questionnaire Diffusion Maps iteratively refines its similarity matrix, which allows us to discover latent variables and effective parameters after a few successive steps in a fully unsupervised manner.
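
As a rough illustration of the underlying idea, the sketch below applies plain Diffusion Maps (not the questionnaire metric of [1-3], which adds an iterative refinement of the similarity matrix) to synthetic trajectories. Everything in it is an assumption made for illustration: the toy ODE dx/dt = -(a*b)x, the parameter names `a` and `b`, the fixed initial condition, the median kernel scale, and the helper names `simulate` and `diffusion_maps`. Because only the product a*b influences the trajectories, one would expect the leading diffusion coordinate to vary monotonically with that effective parameter.

```python
import numpy as np

def simulate(a, b, x0, t):
    # Toy ODE (assumed for illustration): dx/dt = -(a*b) * x.
    # Closed form: x(t) = x0 * exp(-a*b*t), so only the product a*b matters.
    return x0 * np.exp(-a * b * t)

def diffusion_maps(X, n_coords=2):
    # Pairwise squared Euclidean distances between trajectories (rows of X).
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    eps = np.median(d2)               # heuristic kernel scale (an assumption)
    K = np.exp(-d2 / eps)
    # Density normalization (alpha = 1) to reduce sampling-density effects.
    q = K.sum(axis=1)
    K = K / np.outer(q, q)
    # Symmetric conjugate of the row-stochastic Markov matrix.
    d = K.sum(axis=1)
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]    # sort eigenvalues in descending order
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(d)[:, None]  # right eigenvectors of the Markov matrix
    # Skip the trivial constant eigenvector (eigenvalue 1).
    return vals[1:n_coords + 1], psi[:, 1:n_coords + 1]

def ranks(v):
    # Rank transform (no ties expected for continuous data).
    return np.argsort(np.argsort(v))

rng = np.random.default_rng(0)
n = 200
a = rng.uniform(0.5, 2.0, size=n)     # hypothetical parameter 1
b = rng.uniform(0.5, 2.0, size=n)     # hypothetical parameter 2
t = np.linspace(0.0, 2.0, 50)
X = simulate(a[:, None], b[:, None], 1.0, t[None, :])  # one trajectory per row

eigvals, coords = diffusion_maps(X, n_coords=2)
k_eff = a * b                         # effective parameter of the toy model

# Rank correlation between the leading diffusion coordinate and a*b.
rho = np.corrcoef(ranks(coords[:, 0]), ranks(k_eff))[0, 1]
print("rank correlation with a*b:", rho)
```

In this toy setting neither `a` nor `b` is individually identifiable from the trajectories; the embedding can at best recover a one-to-one function of their product, which is the sense in which a*b plays the role of an effective parameter.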

Our second approach uses a) output-informed Diffusion Maps [4] to discover a set of latent observables that capture the intrinsic dimensionality of the system’s response, and b) a conformal Autoencoder Neural Network (AE) [5] to disentangle the contribution of the initial conditions from that of the parameters to the behavior of the ODE system. This second approach can be readily extended to stochastic trajectories sampled from heterogeneous populations by using a recently developed Neural-SDE approach [6].

We discuss the explainability of the identified data-driven observables (latent variables and effective parameters) for both approaches and illustrate that they can be physically interpretable. As a proof of concept, we consider pedagogical examples based on synthetic data generated from parameter-dependent ordinary differential equations. We then illustrate the ability of our scheme to discover data-driven observables from data obtained from wearable devices, and to identify the latent parameter given experimental cellular trajectories of dermal fibroblasts from individuals between 2 and 96 years of age [7].

[1] J.I. Ankenman, Geometry and Analysis of Dual Networks on Questionnaires, Ph.D. thesis, Yale University (2014).
[2] O. Yair, R. Talmon, R.R. Coifman, and I.G. Kevrekidis, Proc. Natl. Acad. Sci. U. S. A. 114, E7865 (2017).
[3] D.W. Sroczynski, O. Yair, R. Talmon, and I.G. Kevrekidis, Isr. J. Chem. 58, 787 (2018).

[4] Holiday, Alexander, et al. "Manifold learning for parameter reduction." Journal of Computational Physics 392 (2019): 419-431.

[5] Evangelou, Nikolaos, et al. "On the parameter combinations that matter and on those that do not: data-driven studies of parameter (non) identifiability." PNAS Nexus 1.4 (2022): pgac154.

[6] Dietrich, Felix, et al. "Learning effective stochastic differential equations from microscopic simulations: Linking stochastic numerics to deep learning." Chaos: An Interdisciplinary Journal of Nonlinear Science 33.2 (2023): 023121.

[7] Phillip, Jude M., et al. "Biophysical and biomolecular determination of cellular age in humans." Nature Biomedical Engineering 1.7 (2017): 0093.