(569e) Signature Descriptors for Structure Generation of Optimal Reactants and Products

Authors: 
Dev, V., Auburn University
Chemmangattuvalappil, N. G., Auburn University
Eden, M. R., Auburn University

Process design techniques that integrate product design considerations in a reverse problem formulation have been used successfully to select optimal chemicals and maximize process efficiency. The combined process-product design problem is first solved for design targets expressed as properties, and these targets are then used to design optimal products. However, most of these techniques have been implemented only for non-reactive systems. Efforts to address the paucity of techniques for reactive systems have been restricted to a single unknown reactant. Other restrictions include reliance on linear property models and solution schemes that have no provision for treating property constraints. In this work, an algorithm has been developed for the design of optimal reactants and products that is not restricted by the number of unknown reactants and products. Nonlinear property models can be incorporated in the algorithm, and the solution scheme considers property constraints. The products of the reaction(s) are constrained by property bounds arising from the process to ensure the optimality of the process. An optimization problem is formulated for each product in order to select the chemical structures that satisfy the property bounds and have the best value of the dominant property. The recently developed molecular signature descriptors are used to perform the molecular design. The signature descriptor is a systematic codification system over an alphabet of atom types that describes the extended valence (the neighborhood up to a given height) of the atoms of a molecule. The signature of a molecule can be expressed as a linear combination of its atomic signatures. Topological indices of molecules, which form the basis of various property models, can be represented linearly in terms of molecular signature descriptors [1].
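As an illustration of how a molecular signature reduces to counts of atomic signatures, the following minimal sketch computes height-1 atomic signatures for a hydrogen-suppressed molecular graph. The string format `C(CO)`, the function names and the adjacency-list input are assumptions made for this example; they are a simplification and not the canonical signature notation of [1].

```python
from collections import Counter

def atomic_signature(atoms, bonds, root):
    """Height-1 signature: the root element followed by its sorted neighbor elements."""
    neighbors = sorted(atoms[j] for j in bonds[root])
    return f"{atoms[root]}({''.join(neighbors)})"

def molecular_signature(atoms, bonds):
    """Count how many times each atomic signature occurs in the molecule."""
    return Counter(atomic_signature(atoms, bonds, i) for i in range(len(atoms)))

# Heavy-atom skeleton of 1-propanol, C-C-C-O (hydrogens suppressed).
atoms = ["C", "C", "C", "O"]
bonds = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
occurrences = molecular_signature(atoms, bonds)
```

The resulting counter holds one entry per distinct atomic signature, which is the occurrence vector that the design problem treats as its decision variables.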
Various quantitative structure-activity/property relationships (QSARs/QSPRs) have been applied in the developed algorithm to estimate properties from molecular structures. QSARs/QSPRs can be expressed in terms of topological indices, and the target properties can therefore be expressed in terms of signatures. In the developed method, the problem is formulated in terms of the number of occurrences of each signature, i.e., the number of times each signature appears in the molecular structure. The dominant property to be optimized and the property constraints can thus ultimately be expressed in terms of the number of occurrences of the signatures in the product molecule. After the occurrence values have been identified from each of the formulated problems, the molecular structures of the products are generated using an algorithm developed by Chemmangattuvalappil et al. [2]. Next, the chemical equation is used to track the migration of groups from reactants to products, and the structures of the reactants are determined by considering the appropriate addition, removal and/or rearrangement of groups. Candidate reactant structures can thus be generated irrespective of the type and number of reactions, because the introduced algorithm designs the products first and then uses the chemical equations to identify the reactant structures. Graph theory principles have been utilized to track the signatures and to avoid the generation of infeasible molecular structures. To solve the resulting nonlinear integer optimization problem(s), MI-LXPM, a real coded genetic algorithm (RCGA) developed by Deep et al. [3], has been incorporated in the solution scheme. RCGAs are general-purpose, population-based search techniques that mimic the principles of natural selection and natural genetics laid down by Charles Darwin and encode the solutions (represented as chromosomes) as real numbers.
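The occurrence-based formulation and the graph-theoretic screening described above can be sketched as follows. The example assumes height-1 signatures written as `C(CO)` with single-letter element symbols, acyclic molecules, and purely hypothetical property coefficients. The two checks shown (the tree degree-sum condition and the matching of half-bonds between atom types) are necessary conditions only, and are not claimed to be the full set of constraints used in the developed algorithm.

```python
from collections import Counter

def parse(signature):
    """Split 'C(CO)' into the root 'C' and the neighbor list ['C', 'O']."""
    root, rest = signature.split("(")
    return root, list(rest.rstrip(")"))

def estimate_property(occurrences, coefficients):
    """Linear property model: sum of (coefficient x occurrence) over signatures."""
    return sum(coefficients[s] * k for s, k in occurrences.items())

def is_acyclic_feasible(occurrences):
    """Necessary conditions for the signatures to assemble into one acyclic molecule."""
    n_atoms = sum(occurrences.values())
    degree_sum = sum(len(parse(s)[1]) * k for s, k in occurrences.items())
    if degree_sum != 2 * (n_atoms - 1):      # a tree with n atoms has n - 1 bonds
        return False
    half_bonds = Counter()                   # (root, neighbor) -> number of half-bonds
    for s, k in occurrences.items():
        root, neighbors = parse(s)
        for nb in neighbors:
            half_bonds[(root, nb)] += k
    for (a, b), count in half_bonds.items():
        if a == b:
            if count % 2:                    # A-A bonds are counted from both ends
                return False
        elif count != half_bonds.get((b, a), 0):
            return False                     # A->B and B->A half-bonds must pair up
    return True

propanol = {"C(C)": 1, "C(CC)": 1, "C(CO)": 1, "O(C)": 1}
hypothetical_coefficients = {"C(C)": 1.0, "C(CC)": 2.0, "C(CO)": 3.0, "O(C)": 4.0}
value = estimate_property(propanol, hypothetical_coefficients)
feasible = is_acyclic_feasible(propanol)
```

A candidate occurrence vector that fails either check cannot correspond to a connected acyclic structure, so it can be discarded before any structure generation is attempted.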
Genetic algorithms (GAs) can handle property functions that are discontinuous, non-differentiable and/or highly nonlinear. Real coding is generally more efficient than binary and Gray coding, and it helps avoid the "Hamming cliff" problem. MI-LXPM is a robust and efficient GA for solving nonlinear integer programming problems. It utilizes the recently developed, self-adaptive Laplace crossover (LX) operator and a tunable power mutation (PM) operator [4]. Integer restrictions on variables are handled by a special truncation procedure instead of the rounding used in previous algorithms. The initial population need not consist of feasible points, because a parameter-free constraint handling procedure based on tournament selection is used. Thus, the search for feasible starting points is not a limiting factor. This contribution will illustrate the developed methods and highlight their use through a case study.
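The operators named above can be sketched as follows. The formulas follow commonly cited forms of Laplace crossover, power mutation, probabilistic integer truncation and feasibility-first tournament comparison; the function names, the default parameter values (`a`, `b`, `p`) and the `(objective, violation)` tuple layout are assumptions for illustration, and the exact expressions should be checked against [3] and [4].

```python
import math
import random

def laplace_crossover(x1, x2, rng, a=0.0, b=0.35):
    """Shift both parents by the same Laplace-distributed multiple of their distance."""
    y1, y2 = [], []
    for p, q in zip(x1, x2):
        u = 1.0 - rng.random()               # in (0, 1], so log(u) stays finite
        r = rng.random()
        beta = a - b * math.log(u) if r <= 0.5 else a + b * math.log(u)
        step = beta * abs(p - q)
        y1.append(p + step)
        y2.append(q + step)
    return y1, y2

def power_mutation(x, lower, upper, rng, p=4.0):
    """Move x toward one of its bounds by a power-distributed fraction s."""
    s = rng.random() ** p
    r = rng.random()
    # Equivalent to t < r with t = (x - lower) / (upper - x), without dividing.
    if (x - lower) < r * (upper - x):
        return x - s * (x - lower)
    return x + s * (upper - x)

def truncate_to_integer(x, rng):
    """Keep integers; otherwise pick floor(x) or floor(x) + 1 with equal probability."""
    base = math.floor(x)
    if x == base:
        return int(x)
    return base if rng.random() < 0.5 else base + 1

def tournament_winner(c1, c2):
    """Compare (objective, violation) pairs; objective is minimized, violation >= 0."""
    (f1, v1), (f2, v2) = c1, c2
    if v1 == 0 and v2 == 0:                  # both feasible: better objective wins
        return c1 if f1 <= f2 else c2
    if v1 == 0 or v2 == 0:                   # feasible beats infeasible
        return c1 if v1 == 0 else c2
    return c1 if v1 <= v2 else c2            # both infeasible: smaller violation wins

rng = random.Random(0)
child1, child2 = laplace_crossover([1.0, 4.0], [3.0, 4.0], rng)
mutated = power_mutation(child1[0], lower=-50.0, upper=50.0, rng=rng)
integer_gene = truncate_to_integer(2.4, rng)
```

Because the tournament comparison ranks infeasible candidates by their violation alone, no penalty parameter has to be tuned and the initial population may contain infeasible points.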

[1]   J.L. Faulon, D.P. Visco Jr., R.S. Pophale, 2003, Journal of Chemical Information and Computer Sciences, 43, 707-720

[2]   N.G. Chemmangattuvalappil, M.R. Eden, 2013, Industrial & Engineering Chemistry Research, 52, 7090-7103

[3]   K. Deep, K.P. Singh, M.L. Kansal, C. Mohan, 2009, Applied Mathematics and Computation, 212, 505-518

[4]   K. Deep, M. Thakur, 2007, Applied Mathematics and Computation, 193, 211-230