ALAMO: Machine Learning from Data and First Principles

Wilson, Z., Carnegie Mellon University

Models of systems and processes are routinely used to support design, optimization, and decision making. The growing availability of large, descriptive data sets has expanded the use of data-driven models for these tasks. ALAMO is a computational methodology for the data-driven learning of algebraic models. By applying explicit parametric transformations to the system inputs and then selecting a linear combination of the resulting model features, ALAMO builds models tailored for use in equation-oriented optimization algorithms. Knowledge drawn from first principles or from the modeler's own insight can be imposed directly on the modeled responses through constrained regression, a semi-infinite programming approach that enforces response constraints in the space of the model coefficients.
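The idea of transforming inputs into candidate features and then selecting a small linear combination of them can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from ALAMO: the candidate basis set, the function names, the synthetic data, the use of the Bayesian information criterion as the fitness metric, and the exhaustive enumeration of subsets, which stands in here for an integer optimization.

```python
import itertools
import math

# Hypothetical candidate features: explicit transformations of one input x.
BASIS = {
    "x": lambda x: x,
    "x^2": lambda x: x * x,
    "x^3": lambda x: x ** 3,
    "exp(x)": math.exp,
    "log(x)": math.log,  # requires x > 0
}

def solve(a, b):
    """Solve the square system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z

def fit(names, xs, ys):
    """Least-squares fit of y ~ b0 + sum_j b_j * basis_j(x) via the normal equations."""
    rows = [[1.0] + [BASIS[name](x) for name in names] for x in xs]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    beta = solve(ata, aty)
    sse = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, y in zip(rows, ys))
    return beta, sse

def select(xs, ys, max_terms=2):
    """Enumerate feature subsets and keep the one with the lowest BIC."""
    n = len(xs)
    best = None
    for size in range(1, max_terms + 1):
        for names in itertools.combinations(BASIS, size):
            beta, sse = fit(names, xs, ys)
            # BIC for Gaussian errors, up to an additive constant.
            bic = n * math.log(max(sse, 1e-12) / n) + (size + 1) * math.log(n)
            if best is None or bic < best[0]:
                best = (bic, names, beta)
    return best

# Synthetic data: y = 1 + 2 x^2 plus a small deterministic perturbation.
xs = [0.5 + 0.25 * i for i in range(12)]
ys = [1.0 + 2.0 * x * x + 0.01 * (-1) ** i for i, x in enumerate(xs)]
bic, names, beta = select(xs, ys)
print(names, beta)
```

Because the selected model is linear in its coefficients and built from explicit algebraic terms, it can be written out as a single equation and passed to an equation-oriented solver. Exhaustive enumeration is only practical for a handful of candidate features.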

The capabilities of ALAMO are demonstrated through a number of case studies. Alternative data-driven model building methodologies are compared by optimizing the models they produce. A linear model selection algorithm that combines search heuristics, a regularization filter, and a final integer optimization of a model fitness metric selects predictive models efficiently. New data are acquired through error maximization sampling, an adaptive design of experiments that samples the system efficiently and certifies model quality. Benchmark data sets are used to compare ALAMO's model selection and adaptive sampling algorithms against alternative approaches.
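The adaptive sampling loop described above can be sketched as follows: fit a surrogate, search the input domain for the point where the surrogate disagrees most with the system, sample there, and refit until the worst observed error falls below a tolerance. This is a minimal sketch under stated assumptions, not ALAMO's implementation. The `simulate` function, the straight-line surrogate, the tolerance, and the grid search over the domain are all hypothetical. The grid search in particular stands in for whatever optimizer a real implementation would use to limit the number of system evaluations.

```python
def simulate(x):
    """Hypothetical stand-in for the system being modeled."""
    return x * x

def fit_line(xs, ys):
    """Closed-form least-squares straight line, used here as a toy surrogate."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def worst_point(model, lo, hi, n_grid=101):
    """Grid search for the point of largest squared model error on [lo, hi].

    Every grid point costs one call to simulate, which is acceptable only
    because this toy system is cheap to evaluate.
    """
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    x_star = max(grid, key=lambda x: (simulate(x) - model(x)) ** 2)
    return x_star, abs(simulate(x_star) - model(x_star))

def adaptive_sample(initial_xs, lo, hi, tol, max_points):
    """Add the worst-error point and refit until the error or budget limit is hit."""
    xs = list(initial_xs)
    ys = [simulate(x) for x in xs]
    while True:
        model = fit_line(xs, ys)
        x_star, err = worst_point(model, lo, hi)
        if err < tol or len(xs) >= max_points:
            return xs, model, err
        xs.append(x_star)
        ys.append(simulate(x_star))

xs, model, err = adaptive_sample([0.0, 0.2], 0.0, 1.0, tol=0.2, max_points=10)
print(len(xs), err)
```

The stopping test doubles as a quality check: if the search cannot find a point whose error exceeds the tolerance, the surrogate is accepted over the sampled domain, subject to how thorough the search is.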


This paper has an Extended Abstract file available; you must purchase the conference proceedings to access it.

