This repository hosts multiple data analysis projects, showcasing a variety of real-time and batch processing pipelines. Each project highlights different tools and technologies, offering comprehensive solutions for data streaming, storage, and visualization.

evanmathew/Data-Analysis-Projects

Data Science and Machine Learning Portfolio

This repository contains a portfolio of data science projects completed for academic, self-learning, and professional purposes, presented as Jupyter Notebooks.

Tools

  • Python: NumPy, Pandas, Seaborn, Matplotlib, Plotly, OOP, scikit-learn
  • Machine Learning: Linear and Logistic Regression, Random Forest Classification, Decision Tree
  • Database Management: MySQL, Excel
  • Data Visualization Tools: Tableau, Microsoft Power BI, Microsoft Excel

Contents

  • 🤖🧠 Machine Learning

    • Heart Failure Analysis & Prediction Using Logistic Regression Model: I leveraged my data analysis and machine learning skills to conduct a comprehensive study on heart failure prediction using logistic regression. Heart failure is a critical health concern, and early detection can significantly improve patient outcomes. By applying data science techniques to a large dataset of patient records, I aimed to develop an accurate predictive model.
    • Titanic Survival Analysis & Classification ML Model: In this project I analyzed the Titanic dataset to uncover insights into passenger demographics and survival factors. As a beginner in data science, I built predictive models using several machine learning algorithms, including Logistic Regression, Random Forest, and Decision Tree, among others. I also competed in the Kaggle competition and learned valuable skills in feature engineering and model evaluation.
    • [New incoming]
  • 📊 Data Analysis and Visualization (Python Programming)

    • Black Market Sales Analysis: Based on the Black Friday sales details of a retail store. I cleaned and transformed the data, then used Seaborn and Matplotlib to visualize patterns and trends, revealing insights into customer purchasing behavior and the top 5 most-purchased products.
    • Apple Iphone Flipkart Sales Analysis: This project analyzed iPhone sales on Flipkart to understand sales trends, popular iPhone models, and the impact of discounted prices on sales performance. It used Python and Jupyter Notebook to analyze Flipkart sales data and customer behavior during promotional periods.
    • Sugarcane Production By Every Countries Analysis: I aimed to analyze and visualize the production of sugarcane across different countries. I collected data on several key aspects of sugarcane production, including total production, production per person in kilograms, the country with the highest production, and the yield of sugarcane in each country.
    • Exploratory Data Analysis (EDA) for Twitter Sentiment Analysis on Flight-Related Tweets: In this engaging project, I aim to dive deep into the world of social media data to analyze sentiments expressed by users on Twitter regarding their flight experiences.
    • Rossman Store Sale Analysis: In this project, I conducted a comprehensive analysis of Rossman Store Sales data to derive insights and inform strategic decisions. The project covered sales trends, store performance, customer behavior, and the impact of promotions and holidays.
  • 🕸️ Web Scraping

    • YouTube Data Scraping: I used web automation and data scraping techniques with Selenium to gather information from YouTube, one of the world's largest video-sharing platforms. By automating the data collection process, I was able to retrieve and analyze video data such as views, likes, publishing date, and description.
    • Scraping Wikipedia content through Google search: The primary objective of this project is to develop a Python script that scrapes relevant Wikipedia content through Google search queries. It aims to demonstrate web scraping and data manipulation using Python.
    • Stock Image Data Scraping: This project was an exercise in gathering information about images, including the number of likes, comments, image URLs, and unique image IDs, to enable more informed content selection and analysis.
  • 🔌📈 Power BI Dashboard

    • Employee Presence HR Data Analysis: Leveraging real data from Atliq Technology, this tool optimizes project timelines and office space usage. I also had the opportunity to enhance my Power BI skills with Data Analysis Expressions (DAX).
  • Minor Projects

If you enjoyed what you saw and want to chat with me about the portfolio, work opportunities, or collaboration, feel free to contact me on LinkedIn: https://www.linkedin.com/in/evansajumathew
