fix: 🐛 sync update on more
yambottle committed Mar 27, 2024
1 parent c8b6841 commit 7c9deee
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions more.html
@@ -135,7 +135,7 @@ <h3>Software Engineer(DevOps) - DataJoint</h3>
Company's [<a href="https://github.com/datajoint">Open-source Github</a>] and [<a href="https://datajoint.com/works">Commercialized Product</a>]
</div>
<div>
-<b class="italic-text">* AWS:</b> Administrated the company's AWS account and several other customers' AWS accounts. Configured <b>VPC</b>, <b>Subnet</b>, <b>Security Groups</b>, <b>IAM</b> role and policies, <b>S3</b> lifecycle management, <b>EFS</b> access point, <b>EC2</b> instances, <b>RDS</b> instances, <b>Lambda</b> triggered by <b>SQS</b> or <b>EventBridge</b>, <b>SNS</b> and <b>SES</b>, <b>CloudWatch</b> metrics and alarms, <b>Route 53</b> DNS records, <b>Secrets Manager</b> for deployment secrets.
+<b class="italic-text">* AWS:</b> Administrated DataJoint's AWS account and several other customers' AWS accounts. Configured <b>VPC</b>, <b>Subnet</b>, <b>Security Groups</b>, <b>IAM</b> role and policies, <b>S3</b> lifecycle management, <b>EFS</b> access point, <b>EC2</b> instances, <b>RDS</b> instances, <b>Lambda</b> triggered by <b>SQS</b> or <b>EventBridge</b>, <b>SNS</b> and <b>SES</b>, <b>CloudWatch</b> metrics and alarms, <b>Route 53</b> DNS records, <b>Secrets Manager</b> for deployment secrets.
</div>
<div>
<b class="italic-text">* CI/CD: </b> Developed generic <b>Github Actions</b> reusable workflows used by <b>30+</b> repositories followed by <a href="https://www.conventionalcommits.org/en/v1.0.0/">Conventional Commits</a>, <a href="https://learn.microsoft.com/en-us/devops/develop/how-microsoft-develops-devops">Release Flow</a> and <a href="https://opengitops.dev/">GitOps</a> best practices, to automate build, test, release, publish private or open-source <b>Python</b> packages[<a href="https://pypi.org/search/?q=datajoint">PyPI</a>] or deploy <b>Docker</b> images[<a href="https://hub.docker.com/u/datajoint">Dockerhub</a>].
@@ -144,7 +144,7 @@ <h3>Software Engineer(DevOps) - DataJoint</h3>
<b class="italic-text">* Kubernetes: </b> Provisioned Kubernetes clusters for development, staging and production environments using <b>k3d or kOps</b>. Developed utility <b>bash</b> scripts with <b>helm</b> and <b>kubectl</b> to manage Kubernetes clusters more efficiently, including configuring <b>Nginx ingress</b> controller, cert manager with <b>Let's encrypt</b> issuer, <b>Cillium</b> Container Network Interface(CNI), IAM Roles for Service Account(<b>IRSA</b>), <b>Cluster Autoscaler</b>, AWS Elastic Load Balancer(<b>ELB</b>) or deploying applications like Percona XtraDB Clusters, Keycloak, JupyterHub, Flask and ReactJS based web application, etc.
</div>
<div>
-<b class="italic-text">* Ephemeral Worker Clusters: </b> Designed and developed a worker lifecycle manager using Python in about a month to fulfill an <b>urgent</b> business requirement. This development <b>polls</b> jobs from a MySQL database, then provisions and configures ephemeral EC2 instances by <b>Packer(pre-build AMI), Terraform and cloud-init</b> to compute jobs <b>at scale</b>; implemented AWS S3 mount to significantly reduce raw data downloading <b>overhead</b> and added EFS as a file cache for intermediate steps to improve computation <b>failover</b>; configured <b>NVIDIA CUDA toolkit</b> and <b>NVIDIA container runtime</b> for <b>GPU</b> workers.
+<b class="italic-text">* Ephemeral Worker Clusters: </b> Designed and developed a worker lifecycle manager using Python within one month to fulfill an <b>urgent</b> business requirement. This development <b>polls</b> jobs from a MySQL database, then provisions and configures ephemeral EC2 instances by <b>Packer(pre-build AMI), Terraform and cloud-init</b> to compute jobs <b>at scale</b>; implemented AWS S3 mount to significantly reduce raw data downloading <b>overhead</b> and added EFS as a file cache for intermediate steps to improve computation <b>failover</b>; configured <b>NVIDIA CUDA toolkit</b> and <b>NVIDIA container runtime</b> for <b>GPU</b> workers.
</div>
<div>
<b class="italic-text">* Platform Automation: </b> To provision or terminate AWS resources using <b>boto3</b> or <b>Terraform</b>; manage customers' <b>RBAC</b> permissions using Keycloak and Github REST API; generating usage and billing report with <b>AWS S3 Inventory</b> report, <b>AWS CloudTrail</b> and <b>AWS Cost and Usage</b> report, made a <b>Plotly Dash</b> to analyze cost and usage efficiency.
@@ -153,13 +153,13 @@ <h3>Software Engineer(DevOps) - DataJoint</h3>
<b class="italic-text">* Jupyterhub: </b> Configured and maintained Jupyterhub deployment on a Kubernetes cluster with <b>Node Affinity</b> to assign pods onto different nodes by requirements and <b>Cluster Autoscaler</b> along with <b>AWS Auto Scaling Group</b> to accommodate <b>100+</b> active users; improved base images' <b>build time</b> and maintenance <b>overhead</b>.
</div>
<div>
-<b class="italic-text">* Observability:</b> Implemented small part of the metrics and alerts using <b>AWS CloudWatch</b>, and then later integrated <b>Datadog</b> for Kubernetes clusters' and ephemeral EC2 instances' metrics and logging through <b>OpenTelemetry</b> protocol, synthetic API testing, UI/UX monitoring.
+<b class="italic-text">* Observability:</b> Implemented a small part of the metrics and alerts using <b>AWS CloudWatch</b>, and then later integrated <b>Datadog</b> for Kubernetes clusters' and ephemeral EC2 instances' metrics and logging through <b>OpenTelemetry</b> protocol, synthetic API testing, and UI/UX monitoring.
</div>
<div>
-<b class="italic-text">* Security:</b> Set up codebase <b>vulnerability</b> scan with FOSSA; Set up <b>AWS Secrets Manager</b> working with <b>External Secret Store Operator</b> to secure Kubernetes secrets; Deployed and administrated self-hosted <b>Keycloak</b> for <b>RABC</b> authentication, further integrate it with <b>AWS IAM</b> as an <b>identity provider</b> to access AWS resources through <b>STS</b>, learned about OpenID Connect(<b>OIDC</b>) authentication flow such as authorization code flow, client credential flow, password grant flow etc.
+<b class="italic-text">* Security:</b> Set up codebase <b>vulnerability</b> scan with FOSSA; Set up <b>AWS Secrets Manager</b> working with <b>External Secret Store Operator</b> to secure Kubernetes secrets; Deployed and administrated self-hosted <b>Keycloak</b> for <b>RABC</b> authentication, further integrated it with <b>AWS IAM</b> as an <b>identity provider</b> to access AWS resources through <b>STS</b>, enabled OpenID Connect(<b>OIDC</b>) authentication flows such as authorization code flow, client credential flow, password grant flow etc.
</div>
<div>
-<b class="italic-text">* MySQL Database:</b> Maintained a self-hosted <b>Percona XtraDB Clusters</b> on database <b>daily backup</b> stored on <b>S3</b>, <b>mysqldump</b> backup redundancy, Point-in-Time Recovery(<b>PITR</b>), <b>deadlock</b> detection, slow query log.
+<b class="italic-text">* MySQL Database:</b> Maintained a self-hosted <b>Percona XtraDB Clusters</b> on database <b>daily backup</b> stored on <b>S3</b>, <b>mysqldump</b> backup redundancy, Point-in-Time Recovery(<b>PITR</b>), <b>deadlock</b> detection, and slow query log.
</div>
</div>
</li>