Time Series Analysis and Prediction for Covid-19 Infection in India

[This article was originally published on Medium on May 29, 2020.]

As the coronavirus is spreading exponentially and has caused so much damage to mankind across the globe, it is interesting to analyse and predict the trend of its spread.

Here we'll perform a time series analysis of Covid-19 infected patients in India and forecast the next 2 months (June and July 2020).

About The Dataset

The dataset was downloaded from Kaggle and was formatted and cleaned before use.

Dataset Link

Importing Necessary Libraries

Load The Dataset

Datetime Conversion
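
A minimal sketch of the loading and datetime-conversion steps. The column names `Date` and `Confirmed`, the day-first date format, and the four rows of numbers are all assumptions made for illustration; the real Kaggle file may look different, and the values below are placeholders rather than actual case counts.

```python
import io

import pandas as pd

# Hypothetical CSV layout; placeholder numbers, NOT real case counts.
csv_text = """Date,Confirmed
01/03/2020,10
02/03/2020,12
03/03/2020,15
04/03/2020,19
"""

# In practice this would be pd.read_csv("<path to the cleaned file>").
df = pd.read_csv(io.StringIO(csv_text))

# Convert the day-first date strings to datetime values and index by date.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")
df = df.set_index("Date").sort_index()

print(df.index.dtype)
print(df.head())
```

Indexing by date matters because later steps such as rolling windows and forecasting rely on a proper time index.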

The graph below shows the Covid-19 infection count (01 Mar to 28 May).

Visualizing Trend, Seasonal and Irregular components

Plotting Rolling-Mean and Rolling-Standard-Deviation for the dataset
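
Rolling statistics can be computed directly with pandas. This is a sketch on a synthetic series; the 7-day window is an assumed value, not necessarily the one used for the plot in this article.

```python
import numpy as np
import pandas as pd

# Synthetic, steadily growing series (NOT real case counts).
dates = pd.date_range("2020-03-01", "2020-05-28", freq="D")
series = pd.Series(np.arange(len(dates), dtype=float) ** 2, index=dates)

window = 7  # assumed window length in days

# The first (window - 1) values are NaN because the window is not yet full.
rolling_mean = series.rolling(window=window).mean()
rolling_std = series.rolling(window=window).std()

print(rolling_mean.tail(3))
print(rolling_std.tail(3))
```

If the rolling mean or rolling standard deviation drifts over time, as it does here, that is a visual hint that the series is not stationary.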

Performing Augmented Dickey-Fuller Test to check if dataset is stationary

We can see from the ADF test above that the dataset is not stationary.

Taking the Log of the Dataset Since It Is Not Stationary
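
A small sketch of why the log helps, using a synthetic, noise-free exponential series (not the real counts): after the transform, exponential growth becomes a straight line, so the day-to-day change is constant.

```python
import numpy as np
import pandas as pd

# Synthetic exponential growth (NOT real data): 3 * e^(0.1 * day).
dates = pd.date_range("2020-03-01", "2020-05-28", freq="D")
t = np.arange(len(dates))
series = pd.Series(3 * np.exp(0.1 * t), index=dates)

# The natural log requires strictly positive values.
log_series = np.log(series)

# On the log scale the daily increments are constant (0.1 here).
print(log_series.diff().dropna().head())
```

The log transform alone does not guarantee stationarity; it mainly tames the growth in level and variance, and differencing is often applied on top of it.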

Train/Test Split of the Dataset
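
For time series the split has to be chronological rather than random, so that the model is always tested on dates later than the ones it was trained on. A sketch on a synthetic series, assuming an 80/20 split (the ratio used in this article may differ):

```python
import numpy as np
import pandas as pd

# Synthetic log-scale series (NOT real data).
dates = pd.date_range("2020-03-01", "2020-05-28", freq="D")
log_series = pd.Series(1.0 + 0.1 * np.arange(len(dates)), index=dates)

# Assumed 80/20 chronological split: earliest rows train, latest rows test.
split = int(len(log_series) * 0.8)
train, test = log_series.iloc[:split], log_series.iloc[split:]

print(len(train), len(test))
print(train.index.max(), test.index.min())
```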

Using AR model to Fit the train data

Make Predictions

Checking Mean Squared Error

Plotting AR model For the Train data

Using ARIMA Model for Prediction

The output:

Fitting the model:

Plotting Predictions for 2 More Months (June and July) with 95% Confidence:

Forecasting for 2 Months:

Plotting Of The Forecast

The Final Forecast

The graph below shows the final prediction. The blue line represents the actual data points, while the red line represents the prediction. The prediction runs through July 2020.


I have taken the utmost care in analysing the dataset, but suggestions are welcome. The prediction is based on the current trend and might change if the trend changes.

Hope you enjoyed reading!
