(9g) High Quality Protein Structure Prediction Using Equivariant Convoluted Networks with Applications in Drug Design and Next Generation Biomaterials | AIChE

Authors 

Chowdhury, R. - Presenter, Harvard Medical School
Over the past six decades, researchers have determined and reported the three-dimensional geometries of proteins using experimental techniques such as cryo-electron microscopy, nuclear magnetic resonance, and X-ray crystallography. However, each method depends on considerable trial and error, which leads to both time and monetary overhead (often thousands of dollars per protein structure). This is why biologists are turning to AI-based methods as a proxy for this cumbersome process. A sophisticated protocol will accelerate research in drug design and in identifying novel biocatalytic scaffolds and other biomaterials. To this end, there exist three state-of-the-art tools (as of March 2020): AlphaFold (Google, DeepMind), trRosetta (David Baker Lab), and RGN (from our lab, by Mohammed AlQuraishi). These use convolutional and/or recurrent geometric neural networks to predict the most probable dihedrals from a sequence, which are then used to build a protein structure.

However, none of these methods can resolve (a) loop structures or (b) the cis/trans orientations of amino acids, because it is difficult to learn from the bimodal distribution of ω (omega), which is 0° for cis and 180° for trans amino acids. We propose a novel method that learns rotations and torsions, in addition to inter-residue distances and dihedrals, to predict a distogram that encodes information not only between residues i and i+1 but also across all N-choose-2, i.e. N(N-1)/2, residue pairs, using an equivariant neural network (see Figure 1). Subsequently, the best-fit Cα trace that meets the distance, dihedral, rotational, and torsional constraints would be obtained. Finally, built-in functions of PyRosetta would be used to build a PDB structure of the protein, with appropriate rotamer repacking, to obtain a lowest-energy structure.
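
To make the two geometric quantities above concrete, the following is a minimal sketch in plain Python, not the proposed network or its code. It computes a dihedral angle such as ω from four atom positions, labels it cis or trans, and enumerates the distances over all N(N-1)/2 residue pairs that a distogram would cover. The function names, the 90° cutoff used to separate the two modes, and the toy coordinates in the usage note are illustrative assumptions and do not come from the abstract.

```python
import math
from itertools import combinations


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees (range -180..180) about the p1-p2 bond.

    For omega, the four points would be CA(i), C(i), N(i+1), CA(i+1).
    Assumes p1 and p2 are distinct and no three consecutive points are collinear.
    """
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    m1 = _cross(n1, tuple(x / b1_len for x in b1))
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


def classify_omega(omega_deg, cutoff_deg=90.0):
    """Label an omega angle as 'cis' (near 0) or 'trans' (near 180).

    The 90-degree cutoff is an illustrative assumption, not from the abstract.
    """
    return "cis" if abs(omega_deg) < cutoff_deg else "trans"


def pair_distances(coords):
    """Distances for every unordered pair (i, j) with i < j: N(N-1)/2 entries."""
    return {
        (i, j): math.dist(coords[i], coords[j])
        for i, j in combinations(range(len(coords)), 2)
    }
```

As a usage example under the same assumptions, four coplanar toy points with the outer two on the same side of the central bond, such as (0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0), are labelled cis; moving the last point to (1, -1, 0) gives the trans label. For N points, `pair_distances` returns N(N-1)/2 entries, which is the all-pairs coverage the distogram is described as encoding beyond neighbouring residues i and i+1.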

I envision applications of this tool ranging from therapeutic drug design to next-generation materials for CAR-T cell therapy or for precise bioseparations and drug delivery using stable ensembles of block copolymers and designed channel proteins.