GLOBAL WARMING'S EFFECT ON TEMPERATURE AND HUMIDITY
PERFORMING ANALYSIS OF METEOROLOGICAL DATA
Hypothesis to be tested: the influence of global warming on temperature and humidity.
Libraries: pandas, NumPy, matplotlib, seaborn
INSTALL LIBRARIES
For this data analysis project we use Jupyter Notebook as the editor. Install the required libraries with pip install numpy pandas matplotlib seaborn. Now start with the following code:
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
For this project we use a CSV file that is available on the Kaggle platform. Download it from the link below and import it into the project:
Dataset
df = pd.read_csv('weatherHistory.csv')
df.head()
Before analysing any dataset we first check its size and shape and whether it contains null values. If it does, the data has to be cleaned, but here the dataset is already clean.
df.shape
df.dtypes
df.isnull().sum()
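If a dataset did contain null values, the check and one simple cleaning step could look like the sketch below. The toy frame and its values are made up for illustration and are not taken from weatherHistory.csv; dropping rows is only one of several possible strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the weather data (not the real file).
toy = pd.DataFrame({
    "Apparent Temperature (C)": [7.4, np.nan, 9.4, 5.9],
    "Humidity": [0.89, 0.86, np.nan, 0.83],
})

# Count missing values per column, the same check as df.isnull().sum().
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# One simple option: drop every row that has at least one missing value.
toy_clean = toy.dropna()
print(toy_clean.shape)
```

Whether to drop or fill missing values depends on how many rows are affected, so it is worth looking at the counts first.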
To test our hypothesis we need to organise the data by date, so we first convert the date column to a datetime format and use it as the index.
df['Formatted Date'] = pd.to_datetime(df['Formatted Date'], utc=True)
df = df.set_index("Formatted Date")
df.head(2)
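To see what utc=True does, here is a small sketch with two timestamps that carry different UTC offsets. The sample strings are assumptions chosen for illustration, not rows copied from the dataset.

```python
import pandas as pd

# Hypothetical timestamps with different UTC offsets (summer and winter time).
raw = pd.Series(["2006-04-01 00:00:00+02:00", "2006-11-01 00:00:00+01:00"])

# utc=True converts every value to UTC, giving one timezone-aware dtype.
parsed = pd.to_datetime(raw, utc=True)
print(parsed)
print(parsed.dt.tz)
```

Note that local midnight on the first day of a month becomes a late-evening time on the previous day in UTC, so a few hourly readings can land in the neighbouring month when the data is grouped by month later.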
To draw a conclusion we need the average monthly temperature and humidity, so we first resample the data by month. The 'MS' rule labels each group with the first day of its month.
data_columns = ['Apparent Temperature (C)', 'Humidity']
df_monthly_mean = df[data_columns].resample('MS').mean()
df_monthly_mean.head()
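The resampling step can be checked on a small synthetic series. In this sketch the daily values are simply set to the month number (an assumption made so the expected monthly means are obvious); the column name Value is hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic daily index covering January to March 2006, in UTC.
idx = pd.date_range("2006-01-01", "2006-03-31", freq="D", tz="UTC")

# Each day's value equals its month number, so the monthly means are 1, 2, 3.
toy = pd.DataFrame({"Value": np.asarray(idx.month, dtype=float)}, index=idx)

# 'MS' groups rows by calendar month and labels each group by its start date.
monthly = toy.resample("MS").mean()
print(monthly)
```

The result has one row per month, indexed by the first day of that month, which is the same shape df_monthly_mean has for the real data.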
Graphical representation of monthly mean apparent temperature and monthly mean humidity from 2006 to 2016:
import seaborn as sns
import warnings
warnings.filterwarnings("ignore")
plt.figure(figsize=(14,6))
plt.title("Variation in Apparent Temperature and Humidity with time")
sns.lineplot(data=df_monthly_mean)
The code above produces the graph below. According to this graph, despite the effects of global warming, there is not much difference in the average apparent temperature and the average humidity. Humidity stays on almost the same line, while temperature changes: it goes up in 2009, goes down in 2010, drops significantly in 2012, and rises again in 2016.
RETRIEVING DATA FOR THE MONTH OF APRIL
Here we retrieve the apparent temperature and humidity for one particular month, April, of every year.
df1 = df_monthly_mean[df_monthly_mean.index.month==4]
print(df1)
df1.dtypes
We use the matplotlib library for the graphical representation here.
USING PYPLOT
import matplotlib.dates as mdates
fig, ax = plt.subplots(figsize=(18,7))
ax.plot(df1.loc['2006-04-01':'2016-04-01', 'Apparent Temperature (C)'], marker='o', linestyle='-',label='Apparent Temperature (C)')
ax.plot(df1.loc['2006-04-01':'2016-04-01', 'Humidity'], marker='o', linestyle='-',label='Humidity')
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d %m %Y'))
ax.legend(loc ='center right')
ax.set_xlabel('Month of April')
USING LMPLOT
sns.lmplot(x='Apparent Temperature (C)',y='Humidity',data=df_monthly_mean)
plt.show()
USING HEATMAP
corr = df_monthly_mean.corr()
sns.heatmap(corr)
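To make clear what the heatmap is colouring, the sketch below computes the same kind of correlation matrix on a toy frame. The values are invented so that humidity falls exactly linearly as temperature rises; they are not results from the dataset.

```python
import pandas as pd

# Hypothetical values: humidity decreases linearly as temperature increases.
toy = pd.DataFrame({
    "Apparent Temperature (C)": [0.0, 5.0, 10.0, 15.0, 20.0],
    "Humidity": [0.9, 0.8, 0.7, 0.6, 0.5],
})

# DataFrame.corr() returns pairwise Pearson correlations by default.
corr_toy = toy.corr()
print(corr_toy)

# The off-diagonal entry is the temperature/humidity correlation.
r = corr_toy.loc["Apparent Temperature (C)", "Humidity"]
print(round(r, 3))
```

Passing annot=True to sns.heatmap prints these numbers inside the cells, which makes the plot easier to read than colour alone.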
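USING DISTPLOT
# distplot is deprecated since seaborn 0.11; histplot with kde=True draws a comparable histogram with a density curve
sns.histplot(df['Humidity'], color='red', stat='density', kde=True)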
sns.relplot(data=df,x="Apparent Temperature (C)", y="Humidity", color="purple", hue="Summary")
plt.figure(figsize = (14 , 5))
sns.barplot(x = 'Apparent Temperature (C)', y = 'Humidity', data = np.round(df1, decimals=2))
According to the graphs and analysis above, there is no change in average humidity. Average apparent temperature rose in 2009, fell in 2010, rose slightly in 2011, dropped significantly in 2015, and rose again in 2016.
I am thankful to the mentors at https://internship.suvenconsultants.com for providing awesome problem statements and giving many of us a coding internship experience. Thank you, www.suvenconsultants.com
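One way to put a number on a "no change" observation is to fit a straight line to the yearly April means and look at its slope. The sketch below uses made-up humidity values, not the ones from the dataset, and a least-squares line from np.polyfit; it is an optional check, not part of the original analysis.

```python
import numpy as np
import pandas as pd

years = np.arange(2006, 2017)

# Hypothetical April mean humidity per year (NOT the values from the dataset).
humidity = pd.Series(
    [0.64, 0.63, 0.64, 0.63, 0.64, 0.63, 0.64, 0.63, 0.64, 0.63, 0.64],
    index=years,
)

# Fit humidity = slope * (years since 2006) + intercept by least squares.
x = years - years[0]
slope, intercept = np.polyfit(x, humidity.values, deg=1)
print(f"change per year: {slope:.5f}")
```

With the real data, the same fit could be run on df1.index.year and df1['Humidity'] (or the temperature column). A slope close to zero supports the reading that the series has no overall trend, even if individual years go up and down.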