  • Databricks
  • Mountain View

kaiyang-code/README.md

About

My name is Kaiyang, and I am a master's student at Carnegie Mellon University. I have always been fascinated by programming, and I consider myself lucky to have found a career that aligns with my passions. Throughout my career, I have worked with a variety of programming languages and technologies, which has allowed me to grow as a programmer. While I have gained considerable knowledge and experience from working at an early-stage startup, at a big tech company, and on several academic research projects, I am always eager to keep learning and improving. I'm currently looking for full-time software engineering opportunities, with a particular interest in distributed systems, cloud computing, and operating systems. Thank you for viewing my profile, and I hope you find what you are looking for :)

Experience

Databricks | Full-Time | 02/2025 - 🚀

  • Part of the IAM Infra team designing and developing highly scalable next-generation systems to serve millions of requests per second

Amazon Web Services | Full-Time | 07/2024 - 01/2025

  • Part of the DynamoDB Core Services team building a hyper-scale distributed system with sub-10ms response time for workloads of all sizes
  • Designed and implemented a high-impact tool for dynamic config overrides on storage nodes in Java, using RPC to efficiently publish overrides without full AZ redeployments, cutting operational overhead by nearly 100% and greatly boosting system uptime
  • Initiated and implemented a production-ready log trimming process in Java, based on an investigation of race conditions caused by table partition splits during node redeployment downtime, resulting in a $1.5 million annual reduction in log storage costs
  • Led the design and implementation of a fix to the low-level Paxos algorithm, resolving a long-standing issue with leader replica removal that caused recurring elections and system lag, resulting in a reduction of over 100ms in P100 latency and greatly enhancing system responsiveness

Analytics 4 Everyone LLC | Internship | 05/2023 - 08/2023

  • Implemented and migrated a large-scale ML-native service on GCP using Python for an early-stage startup, resulting in a highly scalable and fault-tolerant model that analyzed over 60 million podcast episodes on the fly for 10,000+ users
  • Designed the entire backend workflow and moved the backend build to an IaC environment from scratch using Terraform and shell scripts, with GKE, Dataproc, PostgreSQL, HBase, Kafka, and Cloud VPC/Firewalls/DNS as the backend infrastructure
  • Developed an archiving process that periodically deletes MP3 files on HBase using Pydoop, reducing disk usage by around 55%; containerized the program using Docker and deployed it on GKE as a CronJob
  • Built an advanced HPA algorithm that scales Kubernetes Pods based on Kafka messaging load using KEDA, reducing the processing time by around 88% (from 800+ hours to within 72 hours) while ensuring pod resiliency on GKE
  • Created a secure CI/CD pipeline using Maven as a build tool in Jenkins; integrated the pipeline with Cloud Shell, GKE, and SonarQube, automatically pushing new code to the staging environment while detecting and alerting on vulnerabilities

Google | Internship | 05/2022 - 08/2022

  • Implemented three multi-cloud data analytics workflow DAG scripts using Python and SQL, providing sample code for data transfer and processing jobs between GCP, AWS, and Azure in Cloud Composer (managed Apache Airflow)
  • Authored and published three tutorials that showcased the implementation, configuration, dependencies, and scheduling of multi-cloud DAGs in Cloud Composer, utilizing GCS, BigQuery, Dataproc, S3, and Blob Storage
  • Extended the original data analytics workflow using PySpark and DAG scripts, providing developers with a more advanced usage of Cloud Composer that incorporated a Dataproc Serverless batch job; my pipeline has been merged into the GCP sample code base
  • My work provided the first-ever usage of Cloud Composer in a multi-cloud environment, which fulfilled a highly demanded request and received a 95%+ positive feedback rate from Google's developer community

Multidisciplinary Design Program | Part-time | 01/2021 - 01/2022

  • Built an accessible driving simulator platform using modeling software OpenDS and Roadrunner to support research on driver distraction, driver workload, and interfaces for partially automated vehicles
  • Independently developed a car-following task consistent with the federal driver distraction protocol using Pygame and CARLA in Python; my simulation system will be used for research purposes shortly
  • Created a realistic map of the UMich campus with landscape features, such as roads and buildings, by utilizing three types of data (aerial data, LiDAR data, and elevation data) in Roadrunner, providing a virtual environment for the dynamic/static motion of automobiles operated by test subjects

University of Michigan | Internship | 05/2021 - 08/2021

  • Developed Trace Collector, a Python library that recorded both manual and automated web browser actions into JSON files, providing inputs for the backend program synthesis algorithm
  • Implemented and categorized more than 180 benchmarks to imitate actual user actions using Trace Collector and the Selenium API, which helped the development of our Domain-Specific Language (DSL) and the testing of frontend code (JavaScript, CSS, HTML)
  • Implemented a program synthesis technique that transformed unstructured expressions into looping programs using E-graph, Equality Saturation, and Inverse Transformation in Rust

Pinned

  1. FedScale-1 Public

    Forked from SymbioticLab/FedScale

    FedScale: Benchmarking Model and System Performance of Federated Learning

    Python

  2. GoogleCloudPlatform/python-docs-samples Public

    Code samples used on cloud.google.com

    Jupyter Notebook 7.6k 6.5k

  3. Distributed-System-6.5840 Public

    For self-study of the MIT distributed systems course 6.5840

    Go

  4. arrow-py/arrow Public

    🏹 Better dates & times for Python

    Python 8.8k 694