(724h) Machine Learning of Macromolecular Folding Funnels from Univariate Measurements
The stable conformations and motions of biomolecules and macromolecules are governed by their underlying free energy surface. By integrating ideas from dynamical systems theory with nonlinear manifold learning, we have developed an approach to recover single-molecule free energy surfaces from a univariate time series of a single molecular observable. Using Takens’ Delay Embedding Theorem, we expand the measurement time series into a high-dimensional phase space in which the dynamics are equivalent to those of the macromolecule in real space. We then employ nonlinear manifold learning (diffusion maps and nonlinear principal component analysis) to extract a low-dimensional representation of the free energy surface that is diffeomorphic (i.e., related by a smooth transformation) to that which would have been recovered from complete knowledge of all molecular degrees of freedom. We have validated our approach in molecular dynamics simulations of a C24H50 polymer chain, demonstrating that the free energy surface extracted from knowledge of only the head-to-tail distance of the chain is geometrically and topologically equivalent to that recovered from complete knowledge of all the atomic coordinates. Our approach lays the theoretical foundations for extracting empirical polymer and protein folding landscapes directly from experimental measurements.
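The two-stage pipeline described above (delay embedding of a scalar time series, followed by a diffusion map on the delay vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `delay_embed` and `diffusion_map`, the embedding dimension, the lag, the kernel bandwidth `eps`, and the cosine toy signal standing in for a measured observable are all assumptions chosen for illustration, and the nonlinear principal component analysis step is omitted.

```python
import numpy as np


def delay_embed(x, dim, lag):
    """Stack lagged copies of a scalar series into delay vectors.

    Row i is [x[i], x[i + lag], ..., x[i + (dim - 1) * lag]].
    """
    x = np.asarray(x, dtype=float)
    n = x.size - (dim - 1) * lag
    if n <= 0:
        raise ValueError("series too short for this dim and lag")
    return np.stack([x[k * lag : k * lag + n] for k in range(dim)], axis=1)


def diffusion_map(points, eps, n_coords=2):
    """Basic diffusion map with a Gaussian kernel of bandwidth eps.

    Returns the leading nontrivial eigenvalues and eigenvectors of the
    row-normalized Markov matrix M = D^-1 K.
    """
    # Pairwise squared Euclidean distances between delay vectors.
    sq = np.sum(points**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * points @ points.T, 0.0)
    K = np.exp(-d2 / (2.0 * eps))
    deg = K.sum(axis=1)

    # S = D^-1/2 K D^-1/2 is symmetric and has the same eigenvalues as M.
    S = K / np.sqrt(np.outer(deg, deg))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]

    # Convert eigenvectors of S into right eigenvectors of M.
    psi = vecs / np.sqrt(deg)[:, None]

    # Drop the trivial pair (eigenvalue 1, constant eigenvector).
    return vals[1 : n_coords + 1], psi[:, 1 : n_coords + 1]


# Toy stand-in for a measured observable (hypothetical data, not from the study).
t = np.linspace(0.0, 8.0 * np.pi, 400)
signal = np.cos(t)

Y = delay_embed(signal, dim=3, lag=10)           # delay vectors in phase space
eigvals, coords = diffusion_map(Y, eps=0.1)      # low-dimensional coordinates
```

In a sketch like this, a free energy estimate over the low-dimensional coordinates could then be formed from a histogram of `coords` via F = -kT ln P, again as an illustrative assumption rather than the procedure used in the work.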