(362i) Virtual Sample Generation Based on Quantile Regression Variational Generative Adversarial Network for Soft-Sensing Modeling | AIChE

Authors 

Zhang, X., Beijing University of Chemical Technology
Zhu, Q., Beijing University of Chemical Technology
In the modern process industries, an accurate and reliable soft-sensing model plays a vital role in the development of decision-making systems [1,2]. Some key process variables are difficult to measure directly because suitable sensing instruments are unavailable. A soft-sensing model predicts these unmeasured key variables from the available sensor readings of other process variables. Data-driven modeling has been recognized as an effective tool for the soft sensing of complex industrial processes with uncertainty and high nonlinearity. To achieve a high level of accuracy and resilience, a data-driven model typically needs a sufficient number of training samples. However, for certain processes it can be very expensive, time-consuming, or labor-intensive to collect as many data samples as most existing modeling algorithms require. This limitation leads to a “small sample problem” in the data-driven modeling of complex chemical processes. The limited number of useful samples, together with the strong nonlinearity and uncertainty of the available samples, makes it very challenging to establish accurate data-driven models, which are the basis of achieving optimal process operation [3,4].

Virtual sample generation addresses the loss of model accuracy under insufficient data by generating additional virtual samples for modeling [5,6]. The generated virtual samples augment the training data set and improve data quality, which can enhance the prediction capability of soft-sensing models. By drawing on machine learning and statistical learning, virtual sample generation methods can handle small sample problems in complex environments and have been widely used in different engineering fields. Among them, deep generative models have strong learning abilities and can generate high-quality virtual samples. Virtual sample generation methods based on deep learning use deep neural networks with multiple hidden layers, which learn and combine low-level features of the data to obtain high-level abstract feature representations [7,8].

To ensure that the generative model can not only generate virtual samples but also handle regression prediction problems, the Quantile Regression-based Variational Generative Adversarial Network (QRVAE-GAN) is proposed. This deep generative learning framework combines a Variational Auto-Encoder with a Generative Adversarial Network. The Encoder of the Variational Auto-Encoder maps a real sample to a latent vector. The Generator of the Generative Adversarial Network reconstructs the original sample from the given latent vector and matches the characteristics of the original sample, which establishes the relationship between the latent vector space and the real sample space. The Discriminator judges whether an input sample belongs to the probability distribution of the real samples. The mapping function of the Encoder reduces the training difficulty of the Generator and improves the training speed of the model. In addition, QRVAE-GAN embeds the Quantile Regression output y of the sample as an additional condition in the generative adversarial structure. This condition influences the generation of the input variables, so that the model has better prediction ability. The resulting deep generative model can generate labeled samples and can be used for sample augmentation in regression prediction problems. QRVAE-GAN can improve the quality of the generated virtual samples and increase sample diversity.
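
The data flow described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the single untrained linear layers created by the hypothetical helper `dense`, the reparameterized sampling of the latent vector, and the choice to condition by concatenating y to the Generator and Discriminator inputs are all assumptions made for illustration. The function `pinball_loss` is the standard quantile regression loss; how QRVAE-GAN actually uses the quantile output is not specified in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def pinball_loss(y_true, y_pred, tau):
    """Standard quantile (pinball) loss for a quantile level tau in (0, 1)."""
    diff = y_true - y_pred
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))

def dense(n_in, n_out):
    """Hypothetical linear layer with random, untrained weights."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

# Assumed sizes: 8 samples, 5 input variables, 2 latent dimensions.
n, x_dim, z_dim = 8, 5, 2
x = rng.normal(size=(n, x_dim))  # real input samples
y = rng.normal(size=(n, 1))      # output y used as the condition

# Encoder: real sample -> mean and log-variance of the latent vector.
W_mu, b_mu = dense(x_dim, z_dim)
W_lv, b_lv = dense(x_dim, z_dim)
mu = x @ W_mu + b_mu
log_var = x @ W_lv + b_lv
# Sample the latent vector with the reparameterization trick.
z = mu + np.exp(0.5 * log_var) * rng.normal(size=mu.shape)

# Generator: latent vector plus condition y -> reconstructed sample.
W_g, b_g = dense(z_dim + 1, x_dim)
x_hat = np.concatenate([z, y], axis=1) @ W_g + b_g

# Discriminator: sample plus condition y -> probability of being real.
W_d, b_d = dense(x_dim + 1, 1)

def discriminate(x_in, y_in):
    logits = np.concatenate([x_in, y_in], axis=1) @ W_d + b_d
    return 1.0 / (1.0 + np.exp(-logits))

p_real = discriminate(x, y)      # score for real samples
p_fake = discriminate(x_hat, y)  # score for reconstructed samples
```

With untrained weights the Discriminator scores carry no meaning; the sketch only shows which tensors feed which component and where the condition y could enter.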

In this work, we add the generated virtual samples to the original data set and use the expanded training data set to train data-driven models. In this way, the accuracy and robustness of the data-driven soft-sensor models can be improved. We use multivariable benchmark functions to verify the effectiveness of the proposed method. The proposed method is also applied to the data-driven modeling of two practical industrial processes: the High-Density Polyethylene production process and the Purified Terephthalic Acid production process. The verification results on multiple benchmark function data sets and two actual industrial process data sets show that the proposed data augmentation method can further improve the performance of data-driven models.
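
The augmentation workflow can be sketched as follows, under loudly stated assumptions. The function `benchmark` is a made-up two-variable test function, not one used in the paper. The function `stand_in_generator` is only a noise-jitter placeholder for a trained generative model and does not implement QRVAE-GAN. The quadratic least-squares regressor stands in for whatever data-driven soft-sensor model is actually trained.

```python
import numpy as np

rng = np.random.default_rng(42)

def benchmark(x):
    """Hypothetical multivariable benchmark function (an assumption)."""
    return np.sin(x[:, 0]) + 0.5 * x[:, 1] ** 2

def stand_in_generator(x_real, y_real, n_virtual, noise=0.05):
    """Placeholder generator: resamples real (x, y) pairs and jitters them.
    A trained generative model would replace this function."""
    idx = rng.integers(0, len(x_real), size=n_virtual)
    x_virt = x_real[idx] + rng.normal(scale=noise, size=(n_virtual, x_real.shape[1]))
    y_virt = y_real[idx] + rng.normal(scale=noise, size=n_virtual)
    return x_virt, y_virt

def features(x):
    """Quadratic feature map with a bias column."""
    return np.column_stack([x, x ** 2, np.ones(len(x))])

def fit(x, y):
    """Least-squares fit of the quadratic model."""
    coef, *_ = np.linalg.lstsq(features(x), y, rcond=None)
    return coef

def rmse(coef, x, y):
    """Root-mean-square prediction error of the fitted model."""
    return float(np.sqrt(np.mean((features(x) @ coef - y) ** 2)))

# Small original training set (the "small sample problem") and a test set.
x_train = rng.uniform(-1.0, 1.0, size=(15, 2))
y_train = benchmark(x_train) + rng.normal(scale=0.05, size=15)
x_test = rng.uniform(-1.0, 1.0, size=(200, 2))
y_test = benchmark(x_test)

# Generate virtual samples and append them to the original data set.
x_virt, y_virt = stand_in_generator(x_train, y_train, n_virtual=60)
x_aug = np.vstack([x_train, x_virt])
y_aug = np.concatenate([y_train, y_virt])

# Train on the original and on the expanded set, then compare test error.
rmse_original = rmse(fit(x_train, y_train), x_test, y_test)
rmse_augmented = rmse(fit(x_aug, y_aug), x_test, y_test)
print(f"RMSE original: {rmse_original:.4f}, augmented: {rmse_augmented:.4f}")
```

Whether the expanded set lowers the test error depends on the quality of the generator; a jitter placeholder like this one gives no guarantee of improvement, which is the gap a learned generative model is meant to close.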

References

[1] Yan W, Tang D, Lin Y. A data-driven soft sensor modeling method based on deep learning and its application. IEEE Transactions on Industrial Electronics, 2016, 64(5): 4237-4245.

[2] Shang C, Yang F, Huang D, et al. Data-driven soft sensor development based on deep learning technique. Journal of Process Control, 2014, 24(3): 223-233.

[3] Gong H F, Chen Z S, Zhu Q X, et al. A Monte Carlo and PSO based virtual sample generation method for enhancing the energy prediction and energy optimization on small data problem: An empirical study of petrochemical industries. Applied Energy, 2017, 197: 405-415.

[4] Li D C, Wen I H. A genetic algorithm-based virtual sample generation technique to improve small data set learning. Neurocomputing, 2014, 143: 222-230.

[5] Wedyan M, Crippa A, Al-Jumaily A. A novel virtual sample generation method to overcome the small sample size problem in computer aided medical diagnosing. Algorithms, 2019, 12(8): 160.

[6] Lopez-Martin M, Carro B, Sanchez-Esguevillas A, et al. Conditional variational autoencoder for prediction and feature recovery applied to intrusion detection in IoT. Sensors, 2017, 17(9): 1967.

[7] Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks. Communications of the ACM, 2020, 63(11): 139-144.

[8] Chen Z S, Hou K R, Zhu M Y, et al. A virtual sample generation approach based on a modified conditional GAN and centroidal Voronoi tessellation sampling to cope with small sample size problems: Application to soft sensing for chemical process. Applied Soft Computing, 2021, 101: 107070.