COVID-19 Vaccination Data Analysis

Overview

This repository contains code and documentation for analyzing COVID-19 vaccination data. The analysis includes data preprocessing, exploratory data analysis, statistical tests, and data visualization. This README provides an overview of the project's phases and instructions for running and replicating the analysis.

Phases

The analysis is divided into five phases:

Phase 1: Data Preparation

Load the vaccination data from the provided CSV file.
Inspect the data's structure using df.info(), df.tail(), and df.columns.
Remove unnecessary columns ('source_name' and 'source_website') using df.drop().
Summarize the numeric columns using df.describe().
Handle missing values by filling them with zero, so that the count columns can be cast to integers.
Convert data types: 'total_vaccinations', 'people_vaccinated', 'people_fully_vaccinated', 'daily_vaccinations_raw', and 'daily_vaccinations' to 'int64', and 'iso_code' to 'string'.
Save the preprocessed data to a new CSV file using df.to_csv().
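
The Phase 1 steps can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the inline sample rows are made up, the 'country', 'date', and 'vaccines' columns are assumptions about the CSV layout, and missing values are filled only in the count columns named above.

```python
import io

import pandas as pd

# Made-up sample rows standing in for the real CSV (its file name is not given here).
RAW_CSV = """country,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,vaccines,source_name,source_website
Albania,ALB,2021-01-10,0,0,,,,Pfizer/BioNTech,Ministry of Health,https://example.org
Albania,ALB,2021-01-11,,,,,64,Pfizer/BioNTech,Ministry of Health,https://example.org
Albania,ALB,2021-01-12,128,128,,,64,Pfizer/BioNTech,Ministry of Health,https://example.org
"""

COUNT_COLUMNS = [
    "total_vaccinations",
    "people_vaccinated",
    "people_fully_vaccinated",
    "daily_vaccinations_raw",
    "daily_vaccinations",
]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop source columns, fill missing counts with zero, and fix dtypes."""
    df = df.drop(columns=["source_name", "source_website"])
    # Missing counts become 0 so the columns can be cast to int64.
    df = df.fillna({col: 0 for col in COUNT_COLUMNS})
    dtypes = {col: "int64" for col in COUNT_COLUMNS}
    dtypes["iso_code"] = "string"
    return df.astype(dtypes)


raw = pd.read_csv(io.StringIO(RAW_CSV))
raw.info()  # structure check, as in the inspection step
clean = preprocess(raw)

# With no path, to_csv returns the CSV text; pass a file path to write to disk.
csv_text = clean.to_csv(index=False)
```

Note that filling with zero treats "not reported" the same as "no doses given", which affects any averages computed later.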

Phase 2: Exploratory Analysis and Visualization

Load the preprocessed data.
Perform exploratory analysis and visualization using Python libraries (e.g., pandas, seaborn, matplotlib).
Calculate the mean, min, max, and correlations within the dataset.
Explore country data, including the number of unique countries.
Explore the minimum and maximum values of fully vaccinated people.
Explore the minimum and maximum dates in the dataset.
Visualize the number of daily vaccinations over time.
Visualize the distribution of total vaccinations by vaccine.
Visualize the relationship between total vaccinations and people vaccinated.
Visualize the comparison between countries and the number of fully vaccinated people.
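
A few of the Phase 2 steps might look like the sketch below. It uses a tiny made-up DataFrame instead of the preprocessed file, assumes the dataset has 'country' and 'date' columns, and uses plain matplotlib for brevity. The by-vaccine distribution plot is omitted because it depends on how the vaccine column is encoded.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration only; 'country' and 'date' are assumed column names.
df = pd.DataFrame(
    {
        "country": ["A", "A", "A", "B", "B", "B"],
        "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03"] * 2),
        "daily_vaccinations": [10, 20, 30, 5, 15, 25],
        "total_vaccinations": [10, 30, 60, 5, 20, 45],
        "people_vaccinated": [10, 28, 50, 5, 18, 40],
        "people_fully_vaccinated": [0, 2, 10, 0, 2, 5],
    }
)

# Summary figures: unique countries, date range, fully vaccinated range, correlations.
n_countries = df["country"].nunique()
date_min, date_max = df["date"].min(), df["date"].max()
fully_min = df["people_fully_vaccinated"].min()
fully_max = df["people_fully_vaccinated"].max()
corr = df.select_dtypes("number").corr()

# Aggregations that feed the plots.
daily = df.groupby("date")["daily_vaccinations"].sum()
fully_by_country = df.groupby("country")["people_fully_vaccinated"].max()

fig, (ax_time, ax_scatter, ax_bar) = plt.subplots(1, 3, figsize=(15, 4))
ax_time.plot(daily.index, daily.values)
ax_time.set_title("Daily vaccinations over time")
ax_scatter.scatter(df["people_vaccinated"], df["total_vaccinations"])
ax_scatter.set_xlabel("people_vaccinated")
ax_scatter.set_ylabel("total_vaccinations")
ax_bar.bar(fully_by_country.index, fully_by_country.values)
ax_bar.set_title("Fully vaccinated people by country")
fig.tight_layout()
```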

Phase 3: Statistical Analysis

Import the necessary Python libraries (pandas, numpy, scipy.stats).
Select the 'total_vaccinations' data and define an 'expected_mean' to test against.
Perform a one-sample t-test to compare the selected data to the expected mean.
Print the test statistic (t) and p-value.
Check for statistical significance based on a chosen alpha level.
Calculate descriptive statistics for the 'total_vaccinations' data (mean, median, standard deviation, variance).
Conduct correlation analysis between numeric columns.
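
As an illustration of the Phase 3 steps, the sketch below runs a one-sample t-test with scipy.stats.ttest_1samp on made-up numbers. The sample values, the expected_mean of 100, and the alpha of 0.05 are all placeholders chosen for the example, not values taken from the project.

```python
import pandas as pd
from scipy import stats

# Made-up values for illustration; the real analysis uses the preprocessed dataset.
df = pd.DataFrame(
    {
        "total_vaccinations": [100, 120, 130, 150, 170, 180, 200, 220],
        "people_vaccinated": [90, 100, 115, 130, 140, 160, 170, 190],
    }
)

sample = df["total_vaccinations"]
expected_mean = 100  # hypothetical reference value
alpha = 0.05  # hypothetical significance level

# One-sample t-test: does the sample mean differ from expected_mean?
t_stat, p_value = stats.ttest_1samp(sample, popmean=expected_mean)
significant = p_value < alpha
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, significant at alpha={alpha}: {significant}")

# Descriptive statistics (pandas uses the sample standard deviation and variance).
summary = {
    "mean": sample.mean(),
    "median": sample.median(),
    "std": sample.std(),
    "var": sample.var(),
}

# Pairwise Pearson correlation between the numeric columns.
corr = df.select_dtypes("number").corr()
```

A p-value below alpha indicates that the sample mean differs from expected_mean by more than chance alone would readily explain. It does not say how large or how important that difference is.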

Phase 4: Model Building

Split the data into training and testing datasets.
Scale the features (for example, by standardization) so they are on a comparable scale.
Choose a model (e.g., simple linear regression) based on problem formulation and dataset characteristics.
Fit the data to the selected model.
Evaluate model performance, focusing on the R-squared value.
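
The Phase 4 workflow could be sketched with scikit-learn as shown below. The README does not state which columns serve as feature and target, so predicting total_vaccinations from people_vaccinated is an assumption here, and the data is synthetic.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic data for illustration: a noisy linear relationship.
rng = np.random.default_rng(42)
people_vaccinated = rng.uniform(1_000, 100_000, size=200)
total_vaccinations = 1.8 * people_vaccinated + rng.normal(0, 5_000, size=200)

X = people_vaccinated.reshape(-1, 1)  # assumed feature
y = total_vaccinations  # assumed target

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Simple linear regression, evaluated by R-squared on the held-out rows.
model = LinearRegression().fit(X_train_scaled, y_train)
r2 = r2_score(y_test, model.predict(X_test_scaled))
print(f"R-squared on the test set: {r2:.3f}")
```

Fitting the scaler on the training split alone keeps information from the test rows out of the model, so the reported R-squared reflects unseen data.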

Phase 5: Visualization using IBM Cognos Analytics

Load the preprocessed dataset into IBM Cognos Analytics.
Create and customize visualizations using the software:
- Visualize the number of daily vaccinations over time.
- Visualize the distribution of total vaccinations by vaccine.
- Visualize the relationship between total vaccinations and people vaccinated.
- Visualize the comparison between countries and the number of fully vaccinated people.

