Mehroos Ali mehroosali

🎯 Focusing

MS CS graduate at the University of Texas at Dallas | Data Engineer | Software Developer

10 followers · 9 following

Richardson, Texas

mehroosali/README.md

< Hello World, I'm Mehroos Ali />

I am a collaborative data engineering professional with experience in the analysis, design, development, implementation, migration, convergence, management, and support of large-scale databases, data warehouses, and big data systems. I build architectures and frameworks that help organizations capture, store, process, visualize, and analyze large volumes of structured, semi-structured, unstructured, and streaming heterogeneous data.
I am currently pursuing my Master's in Computer Science at the University of Texas at Dallas, specializing in Intelligent Systems.
I interned at Amazon as a Data Engineer this past summer, where I gained experience designing and developing streaming data pipelines.
I previously worked as a Data Engineer at Onward Technologies, a global IT service provider in domains such as data analytics, data science, artificial intelligence (AI), and machine learning (ML). Before that, I worked at Cognizant on its flagship core banking and insurance customer, Suncorp.
I am interested in Big Data Engineering, Cloud Data Warehousing, DevOps, and Full-Stack Development.
📩 Feel free to reach me at [email protected].

🛠 My Toolkit

🏆 GitHub Stats

🤝 Let's stay connected!

Pinned

databricks-F1-Project Public

A data pipeline project built on Databricks and Azure to demonstrate the lifecycle of a cloud data project.

Jupyter Notebook 6 5
s3-redshift-batch-etl-pipeline Public

Built a functional Python ETL script with functions that initialize Spark clusters using the PySpark library to extract songs stored in an S3 bucket. Partitioned the songs data by year and artist_id and compre…

Python 5 3
bigquery-sparksql-batch-etl Public

Batch ETL pipeline project on GCP to load and transform daily flight data using Spark to update tables in BigQuery. The pipeline is automated using Airflow.

Python 2
ABCStoresPipeline Public

Batch ETL data pipeline built on HDP 3.0 to process daily sales and business data to produce Power BI reports. The pipelines are automated using Airflow.

Scala
Twitter-Sentiment-Analysis Public

Personal project that pulls live Twitter data using the NiFi GetTwitter processor and pushes it to a Kafka topic, which is then consumed by a Spark Streaming application where basic sentiment analysis is perfor…

Scala 2 1
Realtime-Customer-Viewership-Analysis Public

A data pipeline using the Lambda architecture that unifies and consolidates real-time customer web events, weblogs, and profile data into a Hive warehouse for ad hoc analysis.

Scala