(254y) Analysis of Multivariate Statistical Methods Performance Applied to Tropospheric Ozone Level Prediction in the Metropolitan Area of São Paulo

Authors: 
Ramos Rodrigues de Paula, R., Universidade de São Paulo
Guardani, R., University of Sao Paulo
According to the World Health Organization (WHO), in 2012, around 7 million people died due to exposure to atmospheric pollution. That is equivalent to one in eight of total global deaths. Because of that, air pollution is now considered the largest single environmental health risk (WHO, 2014).

For this reason, the prediction of critical air pollution episodes is a growing concern for authorities, especially in large urban centers, where there is a high concentration of people and, therefore, high levels of air pollutant emissions.

Tropospheric ozone is one of the most concerning pollutants. Acute human exposure to high levels of ozone may lead to biochemical and cellular changes in the lung (Devlin et al., 1991). Ozone also affects forest trees in many ways, decreasing growth and productivity (Karnosky et al., 2006).

Tropospheric ozone forms through the oxidation of Volatile Organic Compounds (VOCs) and the photolysis of nitrogen oxides (NOx). The chemical complexity of atmospheric VOC oxidation is illustrated by the Harwell photochemical trajectory model of Derwent and Jenkin, which contains 384 chemical species and 684 chemical reactions (Finlayson-Pitts and Pitts Jr., 1993).

Modeling ozone formation is further complicated by the fact that the ozone precursors, VOCs and NOx, are commonly released to the atmosphere by mobile sources, such as cars and trucks, which cannot be monitored, and that the dispersion of these reactants depends on the local wind velocity and direction.

The air quality in large urban centers is affected by a large number of variables that represent the meteorological conditions and the emission sources. Therefore, model results are highly dependent on the local conditions of each specific region of interest. The existing deterministic models for predicting ozone levels in the lower atmosphere have limitations, given that not all of the chemical and physical phenomena involved in the production and dispersion of pollutants in the urban atmosphere are completely understood.

The proposed models are based on multivariate statistical methods from the field of Statistical Machine Learning, such as Neural Networks, Random Forests and Discriminant Analysis (Pavón-Domingues, Jiménez-Hornero & Gutiérres de Ravé, 2014; Burrows et al., 1994; Guardani et al., 2003).

Neural Networks and Random Forests have been particularly effective in capturing the complex and nonlinear relations between meteorological variables and air pollutant concentrations (Arhami, Kamali, & Rajabi, 2013; Ochando et al., 2015; Borges, Andrade, & Guardani, 2012).

This study is an extension of previous studies carried out at the University of São Paulo and CETESB, the São Paulo State Environmental Protection Agency (Guardani et al., 2003; Borges et al., 2012). It compares the performance of statistical models, such as Neural Networks and Random Forests, for the prediction of tropospheric ozone levels in the São Paulo Metropolitan Area (SPMA). The SPMA is characterized by a high concentration of inhabitants and intense economic activity, and its air quality is particularly affected by episodes of high ozone levels.

The study focused on the prediction of daily maximum levels of ozone based on hourly values of meteorological variables such as Temperature, Pressure, Relative Humidity and Wind Velocity in two directions, North-South and East-West.
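As an illustration of the two wind inputs, the sketch below splits a wind measurement into North-South and East-West components. It assumes the raw data arrive as a scalar speed plus a meteorological direction (degrees clockwise from north, pointing to where the wind blows from); the abstract does not state how the components were obtained, and the function name `wind_components` is hypothetical.

```python
import math

def wind_components(speed, direction_deg):
    """Split wind speed and direction into (east_west, north_south).

    Assumption: direction_deg is the direction the wind blows FROM,
    measured clockwise from north (meteorological convention).
    Positive east_west means air moving toward the east;
    positive north_south means air moving toward the north.
    """
    theta = math.radians(direction_deg)
    east_west = -speed * math.sin(theta)
    north_south = -speed * math.cos(theta)
    return east_west, north_south
```

Under this convention, a wind blowing from the west (270 degrees) gives a positive East-West component and a North-South component close to zero.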

Two different sets of features were used during the tests: 1) the morning and afternoon averages of each variable; 2) the maximum daily values of Temperature, the minimum daily values of Pressure and Relative Humidity, and the averages of Wind Velocity in the two directions. The selection of features affected the performance of the models.
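The two feature sets can be sketched as simple aggregations over one day of hourly records. This is a minimal illustration, not the authors' code: the hour windows for "morning" and "afternoon", the variable names, and the function names are assumptions, since the abstract does not define them.

```python
from statistics import mean

# Hypothetical hour windows; the abstract does not define them.
MORNING_HOURS = range(6, 12)     # 06:00 to 11:00
AFTERNOON_HOURS = range(12, 18)  # 12:00 to 17:00
# Hypothetical variable names for the five meteorological inputs.
VARIABLES = ("temperature", "pressure", "humidity", "wind_ns", "wind_ew")

def feature_set_1(hourly):
    """Morning and afternoon averages for every variable.

    `hourly` maps a variable name to a list of 24 values indexed by hour.
    """
    features = {}
    for var in VARIABLES:
        features[var + "_morning_mean"] = mean(hourly[var][h] for h in MORNING_HOURS)
        features[var + "_afternoon_mean"] = mean(hourly[var][h] for h in AFTERNOON_HOURS)
    return features

def feature_set_2(hourly):
    """Daily max temperature, daily min pressure and humidity, mean wind components."""
    return {
        "temperature_max": max(hourly["temperature"]),
        "pressure_min": min(hourly["pressure"]),
        "humidity_min": min(hourly["humidity"]),
        "wind_ns_mean": mean(hourly["wind_ns"]),
        "wind_ew_mean": mean(hourly["wind_ew"]),
    }
```

With these definitions the first set yields ten features per day and the second yields five, so the two sets differ in both size and the kind of information they retain.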

Time series of predicted values showed good agreement with the observed maximum ozone levels, both for Neural Networks and for Random Forests. However, the highest ozone concentrations were underestimated by the predictors, indicating that different interactions among the factors exist in these scenarios.
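One way to quantify underestimation of this kind is to compute the mean prediction bias separately for ordinary days and for high-ozone days. The sketch below is an assumption about how such a check could be done, not the evaluation used in the study; the threshold and the numbers are synthetic and chosen only for illustration.

```python
from statistics import mean

def mean_bias(observed, predicted):
    """Mean of (predicted - observed); negative values mean underestimation."""
    return mean(p - o for o, p in zip(observed, predicted))

def bias_by_regime(observed, predicted, threshold):
    """Return (bias on days with observed <= threshold, bias on days above it).

    A regime with no days yields None.
    """
    pairs = list(zip(observed, predicted))
    low = [(o, p) for o, p in pairs if o <= threshold]
    high = [(o, p) for o, p in pairs if o > threshold]
    low_bias = mean_bias(*zip(*low)) if low else None
    high_bias = mean_bias(*zip(*high)) if high else None
    return low_bias, high_bias

# Synthetic values for illustration only, not results from the study.
observed = [60.0, 80.0, 100.0, 180.0, 220.0]
predicted = [62.0, 78.0, 101.0, 160.0, 190.0]
low_bias, high_bias = bias_by_regime(observed, predicted, threshold=160.0)
```

A clearly negative bias in the high regime alongside a near-zero bias in the low regime would correspond to the pattern described above.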

REFERENCES

Arhami, M., Kamali, N., & Rajabi, M. M. (2013). Predicting hourly air pollutant levels using artificial neural networks coupled with uncertainty analysis by Monte Carlo simulations.

Borges, A. S., Andrade, M. F., & Guardani, R. (2012). Ground level ozone prediction using a neural network model based on meteorological variables and applied to the metropolitan area of São Paulo. Int. J. Environment and Pollution.

Burrows, W. R., Benjamin, M., Beauchamp, S., Lord, E. R., McCollor, D., & Thompson, B. (1994). CART Decision-Tree Statistical Analysis and Prediction of Summer Season Maximum Surface Ozone for the Vancouver, Montreal, and Atlantic Regions of Canada. Journal of Applied Meteorology, pp. 1848-1862.

Karnosky, D. F., Skelly, J. M., Percy, K. E., Chappelka, A. H. (2006). Perspectives regarding 50 years of research on effects of tropospheric ozone air pollution on US forests. Elsevier. Environmental Pollution 147, pp. 489-506.

Devlin, B. R., McDonnell, W. F., Mann, R., Becker, S., House, D. E., Schreinemachers, D., Hillel S. K. (1991). Exposure of Humans to Ambient Levels of Ozone for 6.6 Hours Causes Cellular and Biochemical Changes in the Lung. American Journal of Respiratory Cell and Molecular Biology, Vol. 4, pp. 72-81.

Finlayson-Pitts, B. J., Pitts Jr., J. N. (1993). Atmospheric Chemistry of Tropospheric Ozone Formation: Scientific and Regulatory Implications. Air & Waste, 43:8, 1091-1100.

Guardani, R., Aguiar, J. L., Nascimento, C. A., Lacava, C. I., & Yanagi, Y. (2003). Ground-Level Ozone Mapping in Large Urban Areas Using Multivariate Statistical Analysis: Application to the São Paulo Metropolitan Area. J. of Air & Waste Manage. Assoc.

Guardani, R., Nascimento, C., Guardani, M., Martins, M., & Romano, J. (1999). Study of Atmospheric Ozone Formation by Means of a Neural Network-Based Model. Journal of the Air & Waste Management Association.

WHO. (2014). World Health Organization. Accessed 31 May 2014, available at http://www.who.int/mediacentre/news/releases/2014/air-pollution/en/

Pavón-Domingues, P., Jiménez-Hornero, F. J., & Gutiérres de Ravé, E. (2014). Proposal for estimating ground-level ozone concentrations at urban areas based on multivariate statistical methods. Elsevier.

Ochando, L. C., Julián, C. I. F., Ochando, F. C., Ferri, C. (2015). Airvlc: An application for real-time forecasting urban air pollution.