Paper Title
Urban Air Quality Prediction with Feature Voting
Abstract
In the 21st century, urbanization, industrialization, vehicle emissions, and other factors have increased air
pollution and made breathing in Indian cities difficult. Under the National Clean Air Programme (NCAP) initiated by the Union
government, 131 Indian cities were classified as ‘non-attainment cities’ because they fell short of the minimum air quality
levels required by the National Ambient Air Quality Standards (NAAQS). It is therefore in the public interest to produce an
accurate prediction of the air quality of an area using machine learning. Because weather data are dynamic, such
datasets typically require extensive preprocessing and effective feature selection as a prerequisite for satisfactory
training of a machine learning model. Trimming the dataset so that it contains only the essential pollutants is preferable. However,
standalone feature selection models can be unreliable, as they can produce contradictory results on the same dataset. Hence, it
is imperative to combine multiple feature selection models and devise a generalized, cost-effective approach to eliminate
redundant features. Eliminating redundant features can improve model performance and accuracy. Feature Voting can be used to mitigate
indecision during the process of feature selection. Furthermore, predicting hourly AQI values for each city can help
identify the pollutants that matter in that city and their individual causes. This will help policymakers mitigate urban
air pollution more effectively.
Keywords – Machine Learning; Feature Selection; Air Quality; Feature Voting; Random Forest Regression
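
The idea of combining several feature selection models through voting can be illustrated with a minimal sketch. It is not the authors' implementation: the function name feature_vote, the strict-majority threshold, the selector names, and the pollutant lists are all assumptions chosen for illustration. Each selector is assumed to have already produced its own list of selected features; the sketch only shows how the votes could be tallied.

```python
from collections import Counter


def feature_vote(selections, min_votes=None):
    """Keep features chosen by at least `min_votes` selectors.

    selections: dict mapping a selector name to the features it selected.
    min_votes:  vote threshold; defaults to a strict majority of selectors
                (an assumption, the abstract does not fix a threshold).
    Returns (sorted list of kept features, dict of vote counts).
    """
    if not selections:
        raise ValueError("need at least one selector")
    if min_votes is None:
        min_votes = len(selections) // 2 + 1

    votes = Counter()
    for chosen in selections.values():
        # set() ensures each selector casts at most one vote per feature
        votes.update(set(chosen))

    kept = sorted(name for name, count in votes.items() if count >= min_votes)
    return kept, dict(votes)


# Hypothetical selector outputs, for illustration only (not results
# from the paper).
selections = {
    "correlation_filter": ["PM2.5", "PM10", "NO2"],
    "mutual_information": ["PM2.5", "PM10", "O3"],
    "random_forest_importance": ["PM2.5", "NO2", "CO"],
}

kept, votes = feature_vote(selections)
print(kept)   # features backed by at least two of the three selectors
print(votes)  # per-feature vote counts
```

Raising min_votes to the number of selectors keeps only the features that every method agrees on, while lowering it to 1 keeps the union; a strict majority sits between the two and is one plausible way to settle disagreement among selectors.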