(237j) A Novel Approach to Data Processing in High Throughput Material Science Experimentation | AIChE


Advancements in materials modeling techniques have led to the rapid development of new materials. Testing the properties of these modeled materials demands a correspondingly rapid approach to materials science research. High-throughput experimentation (HTE) meets this challenge by providing a methodology for streamlining experiments and optimizing their efficiency. In current HTE campaigns, large amounts of data are collected across a range of material compositions. While HTE was once limited by data collection, advances in computing have largely overcome these limitations. To address the growing big-data problem in materials science, we have been developing novel data processing algorithms. A typical experiment scans a sample with a synchrotron light source, such as the one at the Stanford Linear Accelerator Center (SLAC), to gather information about the underlying crystal structure and composition. The resulting large volume of data would take a long time to process manually; an algorithm can process it in seconds. A typical algorithm begins with background subtraction, performed quickly with a cubic spline fit. After the background is subtracted, peaks are fit with a pseudo-Voigt profile. Using the full width at half maximum, peak location, and peak height for each sample, the features across a sample can be analyzed quickly. This type of signal processing is widely used and greatly accelerates the data interpretation step of an experiment. We have been developing assisted machine learning algorithms to more quickly analyze crystal phase with respect to sample composition and other properties. The algorithm first separates the data into many groups, called clusters. This initial clustering step requires no human input and forms clusters based on common traits in the XRD spectra.
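As an illustration of the processing steps described above, the following is a minimal Python sketch (not the authors' actual code) of cubic-spline background subtraction followed by a pseudo-Voigt peak fit, assuming NumPy and SciPy; the function names and the choice of hand-picked background anchor points are assumptions for illustration:

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.optimize import curve_fit

def subtract_background(two_theta, intensity, anchor_idx):
    """Fit a cubic spline through background-only anchor points
    (anchor_idx is a hypothetical choice) and subtract it."""
    spline = CubicSpline(two_theta[anchor_idx], intensity[anchor_idx])
    return intensity - spline(two_theta)

def pseudo_voigt(x, height, center, fwhm, eta):
    """Pseudo-Voigt profile: a linear mix of Gaussian and Lorentzian
    components sharing one full width at half maximum (FWHM)."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    gauss = np.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))
    lorentz = 1.0 / (1.0 + ((x - center) / (fwhm / 2.0)) ** 2)
    return height * (eta * lorentz + (1.0 - eta) * gauss)

def fit_peak(two_theta, intensity, guess):
    """Least-squares fit of one peak on background-subtracted data;
    returns the fitted (height, center, fwhm, eta)."""
    popt, _ = curve_fit(pseudo_voigt, two_theta, intensity, p0=guess)
    return popt
```

The fitted height, center, and FWHM per peak are exactly the features the text describes comparing across a sample library.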
The next part of the algorithm requires a human to identify features in characteristic data selected by the algorithm from each cluster. The algorithm then uses this human input to create standard criteria for crystal phase identification across the rest of the data. This greatly reduces the time from data collection to sample characterization, and ultimately to new material discovery. By developing new data processing techniques, experiments are optimized for both efficiency and precision.
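The clustering and human-labeling workflow might be sketched as follows. This is a hypothetical illustration: the abstract does not name a specific clustering method, so k-means over normalized patterns is assumed here, and all function names are invented for the example:

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def cluster_patterns(patterns, n_clusters, seed=0):
    """Group XRD patterns (one per row) into clusters without human
    input; normalization is an assumed preprocessing choice."""
    norm = patterns / np.linalg.norm(patterns, axis=1, keepdims=True)
    centroids, assignments = kmeans2(norm, n_clusters, seed=seed, minit='++')
    return norm, centroids, assignments

def representative_indices(norm_patterns, centroids, assignments):
    """Pick, per cluster, the pattern closest to its centroid; these are
    the 'characteristic' spectra shown to the human expert."""
    reps = []
    for k, c in enumerate(centroids):
        members = np.where(assignments == k)[0]
        dists = np.linalg.norm(norm_patterns[members] - c, axis=1)
        reps.append(members[np.argmin(dists)])
    return reps

def propagate_labels(assignments, cluster_phase):
    """Apply the expert's per-cluster phase label to every member,
    extending the human identification to the rest of the data."""
    return [cluster_phase[k] for k in assignments]
```

The point of the design is the division of labor: the unsupervised step needs no human input, while the expert labels only one representative pattern per cluster rather than every spectrum.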