Project 1 - Just Breathe

Description

Shiny Visualization link

The project analyses air quality data for states and counties across the United States of America from 1980 to 2018. It is developed using R, Shiny and several R packages, and is hosted on shinyapps.io.

Libraries Used:
shiny
shinydashboard
ggplot2
lubridate
DT
grid
leaflet
scales
shinycssloaders


The application has five tabs in the sidebar for navigation. The functionality of each tab is described below.

AQI data for the year:
This tab consists of the following:
- Pie chart showing percentage of days (good / moderate / unhealthy for sensitive groups / unhealthy / very unhealthy / hazardous) for the selected year.
- Bar chart showing values (categorized by days) for the selected year.
- Table showing values (categorized by days) for the selected year.
- Pie chart showing percentage of pollutants (CO, NO2, Ozone, SO2, PM2.5, PM10) for the selected year.
- Bar chart showing values (categorized by pollutants) for the selected year.
- Table showing values (categorized by pollutants) for the selected year.

A different state and county can be selected from the sidebar to see how the AQI data changes for the chosen year.
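
The percentages behind the day-category pie chart can be computed along the following lines. This is a minimal sketch, not the app's actual code: the data frame `aqi`, the helper `day_percentages` and the column names (`Good.Days` and so on) are assumptions modeled on how `read.csv` would name the columns of the EPA annual AQI by county files, and the numbers are made up for illustration.

```r
# Illustrative data: one row per state/county/year (values are made up).
aqi <- data.frame(
  State = c("Illinois", "Illinois"),
  County = c("Cook", "Cook"),
  Year = c(2017, 2018),
  Good.Days = c(100, 120),
  Moderate.Days = c(200, 180),
  Unhealthy.for.Sensitive.Groups.Days = c(50, 40),
  Unhealthy.Days = c(15, 20),
  Very.Unhealthy.Days = c(0, 5),
  Hazardous.Days = c(0, 0)
)

# Columns that hold the number of days in each AQI category (assumed names).
day_cols <- c(
  "Good.Days", "Moderate.Days", "Unhealthy.for.Sensitive.Groups.Days",
  "Unhealthy.Days", "Very.Unhealthy.Days", "Hazardous.Days"
)

# Hypothetical helper: percentage of days per category for one county and year.
day_percentages <- function(data, state, county, year) {
  row <- data[data$State == state & data$County == county & data$Year == year,
              day_cols]
  stopifnot(nrow(row) == 1)
  counts <- unlist(row)
  counts <- counts[counts > 0]  # zero categories are dropped before plotting
  round(100 * counts / sum(counts), 1)
}

pct <- day_percentages(aqi, "Illinois", "Cook", 2018)
print(pct)
```

The resulting named vector could be passed to a pie or bar chart; categories with zero days are left out, which matches the filtering described for the pie and bar charts.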

Over the years (1980-2018):
This tab consists of the following:
- Line chart showing the change in Median AQI, 90th Percentile AQI and Max AQI over the years for the state and county selected from the sidebar.
- Line chart showing the percentage change in pollutant values over the years for the selected state and county.
- Table showing the percentage change in pollutant values over the years for the selected county.

As in the previous tab, a different state and county can be selected from the sidebar to see how the trend changes over the years.

County Map:
This tab plots the selected state and county on a map using the latitude/longitude values derived from the dataset. The leaflet package must be installed to use this map. Counties are plotted on the map based on the options selected from the sidebar.
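
A minimal sketch of how the sites for one county could be selected and plotted with leaflet is shown below. It is an illustration under stated assumptions rather than the app's own code: the data frame `sites`, the helper `county_sites` and the column names (`State.Name`, `County.Name`, `Latitude`, `Longitude`) are assumed, and the coordinates are made up. The map is only built if the leaflet package is installed.

```r
# Illustrative site table (coordinates are made up; a 0 marks a missing value).
sites <- data.frame(
  State.Name = c("Illinois", "Illinois", "New York"),
  County.Name = c("Cook", "Lake", "Queens"),
  Latitude = c(41.84, 0, 40.74),
  Longitude = c(-87.82, 0, -73.82)
)

# Hypothetical helper: rows for one state/county, without zero-latitude rows.
county_sites <- function(data, state, county) {
  keep <- data$State.Name == state &
    data$County.Name == county &
    data$Latitude != 0
  data[keep, , drop = FALSE]
}

cook <- county_sites(sites, "Illinois", "Cook")

# Build the map only when leaflet is available.
if (requireNamespace("leaflet", quietly = TRUE)) {
  map <- leaflet::leaflet(cook)
  map <- leaflet::addTiles(map)
  map <- leaflet::addMarkers(map, lng = ~Longitude, lat = ~Latitude,
                             popup = ~County.Name)
}
```

In a Shiny app, a map like this would typically be returned from `leaflet::renderLeaflet` and displayed with `leaflet::leafletOutput`.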

Compare different counties:
This tab lets the user select three years and three state/county pairs in order to compare data across counties. It consists of the following:
- Line chart comparing the change in 90th Percentile AQI / Max AQI / Median AQI for the three selected counties over the years 1980-2018.
- Bar chart comparing 90th Percentile AQI / Max AQI / Median AQI for the three counties for the selected year.
- Line chart comparing the percentage change in CO / NO2 / Ozone / SO2 / PM2.5 / PM10 for the three selected counties over the years 1980-2018.
- Bar chart comparing CO / NO2 / Ozone / SO2 / PM2.5 / PM10 values for the three counties for the selected year.
- Line chart comparing the percentage change in Good / Moderate / Unhealthy for Sensitive Groups / Unhealthy / Very Unhealthy / Hazardous days for the three selected counties over the years 1980-2018.
- Bar chart comparing Good / Moderate / Unhealthy for Sensitive Groups / Unhealthy / Very Unhealthy / Hazardous days for the three counties for the selected year.

Information:
This tab contains information about the coursework, the developer of the project, the libraries used to visualize the data and the source from which the data was downloaded.

Snapshots

About the data

The data was downloaded from https://aqs.epa.gov/aqsweb/airdata/download_files.html. The data available on the website is organized by year, from 1980 to 2018. I did a little preprocessing before using it. First, I combined all the separate county-wise data files into a single dataset, assigned it to a data variable and used that variable for plotting. Based on the user input (year/state/county), the necessary rows are filtered from the data variable using R commands. The latitude/longitude values required for the map are derived from the aqs_sites.csv file. All 0 values are filtered out before plotting the pie and bar charts, but they are kept in the line charts to show the variation (change in trend) of the data over the years. Rows with a latitude of 0 are also filtered out before plotting a county on the map.
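
The combine-then-filter step described above might look roughly like the following sketch. It is not the project's actual code: the folder, the file names (`annual_aqi_by_county_<year>.csv`), the column names and the values are all assumptions made so that the example runs on its own.

```r
# Write two tiny per-year files to a temporary folder so the sketch is
# self-contained (file names, columns and values are made up).
data_dir <- file.path(tempdir(), "aqi_files")
dir.create(data_dir, showWarnings = FALSE)
write.csv(
  data.frame(State = "Illinois", County = "Cook", Year = 2017, Max.AQI = 151),
  file.path(data_dir, "annual_aqi_by_county_2017.csv"),
  row.names = FALSE
)
write.csv(
  data.frame(State = c("Illinois", "New York"), County = c("Cook", "Queens"),
             Year = 2018, Max.AQI = c(174, 115)),
  file.path(data_dir, "annual_aqi_by_county_2018.csv"),
  row.names = FALSE
)

# Read every yearly file and stack them into one data frame.
files <- list.files(data_dir, pattern = "^annual_aqi_by_county_.*\\.csv$",
                    full.names = TRUE)
all_data <- do.call(rbind, lapply(files, read.csv, stringsAsFactors = FALSE))

# Filter by the user's choices (here hard-coded instead of Shiny inputs).
selected <- subset(all_data,
                   State == "Illinois" & County == "Cook" & Year == 2018)
print(selected)
```

Reading the files once at start-up and filtering the combined data frame on each input change keeps the per-interaction work small, at the cost of holding all years in memory.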

Source Code

Link to download the code (also contains the data files and the Readme.txt file)
YouTube Video

Insights from the data

As the image above shows, the Max AQI value has fallen significantly since the 1980s: it was around 220 in the 1980s and had dropped to about 175 by 2018. The 90th Percentile AQI and Median AQI are also significantly lower than in earlier years. This suggests that pollution is being reduced. The image above is for Cook County, Illinois, but all three values have also decreased over time for most other counties.

As far as the individual pollutants are concerned, the values of Ozone, PM2.5 and PM10 have increased over the years for almost all counties, while the values of the other pollutants have been decreasing since 1980. This may indicate that machines emitting these three pollutants have come into wider use. Even though the Max AQI / 90th Percentile AQI / Median AQI values tend to fall, these pollutant values have increased significantly over the years.

Most of the counties in New York are less polluted than I expected. Even though New York is often considered heavily polluted because of its number of industries and automobiles, the AQI data shows otherwise. The Max AQI / 90th Percentile AQI / Median AQI values and the pollutant values are lower than in states like Illinois: the Max AQI values are close to 100 in New York, whereas they are close to 200 in states like Illinois. One possible explanation is that new regulations in New York have led to a reduction in air pollution.

North Dakota, which I expected to be the least polluted because it is less developed than other states, turns out to be about as polluted as New York. In the image above, its Max AQI / 90th Percentile AQI / Median AQI values are close to those of New York, and the pollutant values show a similar trend. The Max AQI value reached 150 in 2017, which was much higher than New York and close to Illinois. This suggests that the level of development is not the only factor behind air pollution and that other, less obvious factors need to be considered as well. One possibility is the machinery used for farming, which is considered to be widespread in those areas.