prembhajaj/README.md

Hey there, I'm Prem Bhajaj! 👋

Thanks for stopping by. I'm excited to share a bit about who I am, what I've done, and what I'm passionate about. Feel free to explore and reach out if you want to chat or collaborate!


A Bit About Me 🎓

I'm a Computer Science grad student at Binghamton University - SUNY, currently diving deep into data structures, machine learning, and social media data science. Before that, I completed my Bachelor's in Computer Engineering at the University of Mumbai, where I built a strong foundation in everything from cloud computing to data mining.


My Technical Toolbox ⚙️

I love working with a variety of technologies. Here's a quick rundown:

  • Languages & Technologies: Python 🐍, R, MuleSoft, Java ☕, JavaScript, Node.js, React.js, REST APIs, Rust, MongoDB, SQL
  • Frameworks & Libraries: LangChain, PySpark, Hadoop, FastAPI, Flask, PyTorch, TensorFlow, NumPy, Pandas, Scikit-learn, Spring
  • Tools & Platforms: Azure, AWS, GCP, Spark, Hive, Sqoop, Power Automate, Git, Maven, Postman, Docker, Messaging Queues, CI/CD, Tableau
  • Certifications: Microsoft AZ-900, AWS Technical Accreditation, Machine Learning A-Z™: Python & R in Data Science

What I've Been Up To 💼

At Binghamton University - SUNY

Research Assistant – Data Science & Analytics
Sep 2024 – Present
I'm working on some pretty cool projects:

  • Fine-tuning large language models (think BERT and GPT-3.5) on hundreds of multimodal patient records to boost cancer imaging diagnosis accuracy by 12% 🎯.
  • Engineering features using PCA and optimizing SVM classifiers to hit high accuracy rates while reducing false positives 📈.
  • Using PySpark to process hundreds of thousands of genomic samples, making data harmonization a breeze for clinical research 🔄.
  • Running statistical analyses to unearth key biomarkers that could shape future treatments 🔍.

My Time at LTIMindtree Ltd. in Mumbai

Senior Software Engineer – Data Engineering
Nov 2021 – Jan 2024
I dove deep into data engineering and machine learning:

  • Developed text summaries using Hugging Face models, speeding up issue prioritization by 50% ⚡.
  • Scaled AWS EMR workflows to process over 10M records daily and significantly cut runtimes ⏱️.
  • Built a serverless ML monitoring system that slashed manual reporting from 8 hours a week to just half an hour 📊.
  • Enhanced data pipelines with Spark to boost throughput and feature engineering capabilities 🚀.

Graduate Engineer Trainee
Jul 2021 – Nov 2021

  • Developed Python-based ML microservices on Azure Functions, reducing deployment costs by 40% 💡.
  • Integrated Azure tools to enhance scaling and security within CI/CD pipelines 🔒.

Early Adventures at Peregrine PR Pvt Ltd.

Web Development Intern
Jun 2019 – Aug 2019
I got my start by:

  • Building a timesheet application with React, Flask, and SQL that saved managers time and streamlined billing ⏳.
  • Training an NLP-powered chatbot to answer FAQs with impressive accuracy 🤖.
  • Assisting with third-party API integrations to boost website functionality and user engagement 🌐.

Projects & Passions 🚀

I love taking on projects that challenge me and help me grow. Here are a few highlights:

  • Employee Timesheet and Billing Cost Calculator:
    A handy web app using React JS and Python Flask that helps employees log work, managers approve submissions, and clients review project costs.

  • Group Chat and Large-File Transfer Application:
    Built with Python's Socket and GUI libraries, this app supports real-time chatting and file transfers up to 2GB 💬📁.

  • Facial Image Generation for Suspect Identification:
    Trained a TensorFlow-based DCGAN on over 200K images to generate high-resolution facial images, achieving a strong Fréchet Inception Distance score. The work was published in Springer ICSES 2021 🖼️✨.

  • Comprehensive Study of Failed Machine Learning Applications:
    A research project employing a 3C (Consolidation, Classification, Case Studies) approach, culminating in a co-authored chapter in a Taylor & Francis ML journal 📚.

  • Forest Cover Classification and Clustering:
    Leveraged PCA and SVM (with GridSearchCV) to classify forest cover data with high accuracy, and applied K-Means clustering to validate the results 🌲📊.

  • Gene Mutation and RNAseq Data Analysis:
    Processed and analyzed data from over 500 NSCLC patient records to study gene expression and survival outcomes 🧬.

  • Social Media Sentiment Analysis - Reddit and 4chan:
    Developed a sentiment scoring pipeline with Spark NLP and Logistic Regression, deployed on AWS EMR to analyze over 100K stock market-related posts 📈💬.

  • Virtual Chemistry Lab:
    A web-based simulation of 25 chemistry experiments that even led to a published paper in an international journal (IRJET) 🔬.


Beyond Coding: Leadership & Community 👥

I believe in sharing knowledge and building community:

  • Editorial Head, ACM Student Chapter:
    Led the publication of our annual technical magazine in 2021 📰.
  • Co-Technical Head, ICACTA-2020:
    Helped organize and manage the technical aspects of an international conference 🌐.

Let's Connect! 💬

If you're curious about my work, have a question, or just want to chat about tech and innovation, feel free to reach out.

Looking forward to connecting and collaborating!

Cheers,
Prem
