(550b) Integration of Umbrella Sampling and Nonlinear Dimensionality Reduction Using Diffusion Maps: Iterative Determination of the “Right” Order Parameters
The diffusion map [1,2,3] is a nonlinear dimensionality reduction technique that systematically extracts dynamically meaningful order parameters directly from a simulation trajectory. Umbrella sampling [4] is a venerable molecular simulation technique that improves exploration of phase space and efficiently computes free energy surfaces by artificially driving the system in one or more order parameters. The systematic determination of “good” variables in which to conduct the umbrella sampling is a long-standing problem. In this work, we present an extension of the diffusion map approach that integrates it with umbrella sampling to iteratively determine “good” sampling variables.
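For readers unfamiliar with the basic construction, the following is a minimal sketch of a standard (unweighted) diffusion map in Python with NumPy. It is an illustration under stated assumptions, not the implementation used in this work: the function name `diffusion_map`, the Gaussian kernel convention exp(-d²/2ε), and the use of Euclidean distance between rows of `X` are all choices made here for the example.

```python
import numpy as np

def diffusion_map(X, epsilon, n_coords=2):
    """Sketch of a basic diffusion map (hypothetical helper, not from the abstract).

    X        : (N, D) array, one simulation snapshot per row
    epsilon  : Gaussian kernel bandwidth (assumed convention: exp(-d^2 / (2*epsilon)))
    n_coords : number of nontrivial diffusion coordinates to return
    """
    # Pairwise squared Euclidean distances between snapshots.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Gaussian kernel matrix and its row sums.
    K = np.exp(-d2 / (2.0 * epsilon))
    d = K.sum(axis=1)

    # The Markov matrix M = D^-1 K is not symmetric, so diagonalize its
    # symmetric conjugate S = D^-1/2 K D^-1/2, which has the same eigenvalues.
    S = K / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(S)

    # Sort eigenpairs by decreasing eigenvalue.
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]

    # Right eigenvectors of M are recovered as psi = D^-1/2 v.
    psi = evecs / np.sqrt(d)[:, None]

    # Drop the trivial pair (eigenvalue 1, constant eigenvector).
    return evals[1:n_coords + 1], psi[:, 1:n_coords + 1]
```

The leading nontrivial eigenvectors returned here play the role of candidate order parameters; a gap in the eigenvalue spectrum is commonly used to decide how many to retain.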
Given an initial set of putative sampling variables, determined heuristically or from an exploratory molecular simulation, umbrella sampling is employed to compute the free energy surface (FES) parameterized by these variables [5]. By a slight reformulation of the technique, the diffusion map is applied to the umbrella sampling data, accounting for the Boltzmann weights of the constituent data points computed from the FES in the putative variables. The results of the diffusion map are then used to identify “leakage” into order parameters beyond those in which the umbrella sampling was conducted, which are then incorporated into a new round of umbrella sampling. The umbrella sampling/diffusion mapping iteration is repeated until no further order parameters emerge.
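One plausible way to account for Boltzmann weights can be sketched as follows. This is an assumption-laden illustration, not the reformulation developed in this work, whose details are not given here. It assumes that the pooled umbrella sampling snapshots have been binned in the putative variables, that the FES value of each bin is known, and that each snapshot receives a weight proportional to exp(-F/kT) of its bin divided by the number of pooled snapshots in that bin. The weights then enter the diffusion map by replacing the kernel K with W K W, which is equivalent to treating snapshot j as if it appeared w_j times. The names `boltzmann_weights` and `weighted_diffusion_map` are hypothetical.

```python
import numpy as np

def boltzmann_weights(bin_index, free_energy, kT):
    """Assumed reweighting scheme: w_i ~ exp(-F[bin_i]/kT) / (samples in bin_i).

    bin_index   : (N,) integer bin of each pooled umbrella-sampling snapshot
    free_energy : (B,) FES value per bin, in the same units as kT
    """
    counts = np.bincount(bin_index, minlength=len(free_energy))
    # Shift by the minimum for numerical stability; only ratios matter.
    p_bin = np.exp(-(free_energy - free_energy.min()) / kT)
    w = p_bin[bin_index] / counts[bin_index]
    return w / w.sum()

def weighted_diffusion_map(X, w, epsilon, n_coords=2):
    """Diffusion map in which snapshot j carries statistical weight w[j]."""
    # Pairwise squared Euclidean distances and Gaussian kernel
    # (assumed convention: exp(-d^2 / (2*epsilon))).
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-d2 / (2.0 * epsilon))

    # Weighted kernel A = W K W; the Markov matrix is M = P^-1 A,
    # with P the diagonal matrix of row sums of A.
    A = w[:, None] * K * w[None, :]
    p = A.sum(axis=1)

    # Diagonalize the symmetric conjugate P^-1/2 A P^-1/2.
    S = A / np.sqrt(np.outer(p, p))
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]

    # Right eigenvectors of M; drop the trivial constant one.
    psi = evecs / np.sqrt(p)[:, None]
    return evals[1:n_coords + 1], psi[:, 1:n_coords + 1]
```

In an iterative scheme of the kind described above, diffusion coordinates that do not correlate with the variables already biased would be candidates for the next round of umbrella sampling.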
We illustrate this methodology for blocked alanine dipeptide in explicit solvent, and find the system to be well characterized by the Φ, Ψ and θ dihedral angles, with transitions in the ζ dihedral relatively unimportant. Furthermore, our results show that the diffusion map approach effectively separates the conformational states of the peptide, and determines the “topology” of the conformational phase space by elucidating the dynamic connectivity between states.
1. Coifman, R.R.; Lafon, S.; Lee, A.B.; Maggioni, M.; Nadler, B.; Warner, F.; Zucker, S.W. “Geometric Diffusions as a Tool for Harmonic Analysis and Structure Definition of Data: Diffusion Maps” Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 21, 7426-7431.
2. Belkin, M.; Niyogi, P. “Laplacian Eigenmaps for Dimensionality Reduction and Data Representation” Neural Computation, 2003, 15, 1373-1396.
3. Nadler, B.; Lafon, S.; Coifman, R.R.; Kevrekidis, I.G. “Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck Operators” in Advances in Neural Information Processing Systems, MIT Press: Boston, 2005, 955-962.
4. Torrie, G.M.; Valleau, J.P. “Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling” J. Comput. Phys. 1977, 23, 2, 187.
5. Kumar, S.; Rosenberg, J.M.; Bouzida, D.; Swendsen, R.H.; Kollman, P.A. “The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method” J. Comput. Chem. 1992, 13, 8, 1011-1021.