Predicting Hourly PM10 Concentration in Seberang Perai and Petaling Jaya Using Log-Normal Linear Model

Predicting particulate matter (PM10) concentration supports future planning for environmental problems. This study aims to predict hourly PM10 concentration by accounting for its probability distribution, the serial correlation between successive observations, and its seasonal pattern. We propose analyzing the data using log-normal linear models with potential predictors. In the initial study, three probability distributions (Weibull, gamma, and log-normal) were fitted to the hourly PM10 concentration at two stations in Peninsular Malaysia, Seberang Perai and Petaling Jaya, for 2008-2009 and 2014-2015. Among these distributions, the log-normal distribution was found to be appropriate. Log-normal linear models with sine and cosine terms and lagged terms as predictors were then fitted to the PM10 concentration data. The Likelihood Ratio Test (LRT) and the Akaike Information Criterion (AIC) were used to assess model appropriateness. Diagnostic QQ plots indicate that the models fit the data well except in the extreme tails. The models were fitted using the first 75% of the data and validated using the remainder. Using the model, hourly PM10 concentrations can be predicted where records are inadequate.

Index Terms- air quality, lagged, log-normal linear model, regression, statistical modelling.
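
As a rough illustration of the modelling approach summarised above, the following Python sketch fits a linear model to log-transformed hourly PM10 values using sine and cosine terms and one lagged value, trains on the first 75% of the series, and checks predictions on the remainder. It is a sketch under stated assumptions rather than the study's actual specification: the 24-hour period, the single lag of one hour, the simulated data, and the function names (simulate_pm10, design_matrix, fit_and_validate) are all hypothetical choices made for the example.

```python
import numpy as np


def simulate_pm10(n=2000, seed=0):
    """Simulate a synthetic hourly PM10 series (illustrative only, not real data)."""
    rng = np.random.default_rng(seed)
    log_pm = np.empty(n)
    log_pm[0] = 3.5
    for i in range(1, n):
        log_pm[i] = (
            1.4
            + 0.6 * log_pm[i - 1]                 # assumed lag-1 dependence
            + 0.10 * np.sin(2 * np.pi * i / 24)   # assumed 24-hour cycle
            + 0.05 * np.cos(2 * np.pi * i / 24)
            + rng.normal(0.0, 0.1)
        )
    return np.exp(log_pm)


def design_matrix(log_pm, period=24.0, lag=1):
    """Columns: intercept, sine term, cosine term, lagged log-PM10."""
    t = np.arange(log_pm.size, dtype=float)[lag:]
    return np.column_stack([
        np.ones(t.size),
        np.sin(2 * np.pi * t / period),
        np.cos(2 * np.pi * t / period),
        log_pm[:-lag],
    ])


def fit_and_validate(pm10, train_frac=0.75, period=24.0, lag=1):
    """Fit by least squares on the log scale; validate on the held-out tail."""
    log_pm = np.log(pm10)
    X = design_matrix(log_pm, period, lag)
    y = log_pm[lag:]
    n_train = int(train_frac * y.size)

    # Least squares on log(PM10) is the maximum-likelihood fit for the
    # coefficients when the log-scale errors are normal.
    beta, *_ = np.linalg.lstsq(X[:n_train], y[:n_train], rcond=None)
    resid = y[:n_train] - X[:n_train] @ beta
    sigma2 = resid @ resid / (n_train - X.shape[1])

    # exp(X @ beta) is the predicted median on the original scale;
    # the predicted mean would be exp(X @ beta + sigma2 / 2).
    pred = np.exp(X[n_train:] @ beta)
    observed = pm10[lag:][n_train:]
    rmse = np.sqrt(np.mean((observed - pred) ** 2))
    return beta, sigma2, rmse


pm10 = simulate_pm10()
beta, sigma2, rmse = fit_and_validate(pm10)
print("coefficients:", np.round(beta, 3))
print("residual variance (log scale):", round(float(sigma2), 4))
print("validation RMSE:", round(float(rmse), 3))
```

Comparing nested versions of such a model, for example with and without the sine and cosine columns, is where a likelihood ratio test or AIC comparison of the kind mentioned above would apply.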