Experience

Senior Staff DevOps Engineer

Simpl — One Sigma Technologies • March, 2022 — June, 2024

  • Led migration of Flink pipelines from EC2 cluster to EKS for improved efficiency & resource isolation.
  • Introduced IP‑address‑based whitelisting for IAM user credentials to block access from outside known networks.
  • Troubleshot & fixed numerous issues in infrastructure over my tenure:
    • Built Grafana dashboards for quicker identification of issues across clusters.
    • Resolved critical issues in OpenSearch, Dask, JupyterHub & Concourse clusters.
    • Repeatedly fixed dependency issues in Dask & Airflow cluster deployments.
  • Enhanced PostgreSQL performance:
    • Implemented table partitioning.
    • Migrated data from a very large table (3 TB) to a new partitioned table.
    • Automated partition management/lifecycle with pg_partman & pg_cron.
  • Cost Optimization & Performance:
    • Implemented Karpenter on EKS utilizing Spot instances.
    • Migrated RDS PostgreSQL databases to Aurora IO‑Optimized.
    • Started the migration of a 6 TB DynamoDB table to S3, Athena & Glue.
    • Implemented Blue‑Green migrations for upgrading AWS RDS engine versions.
    • Implemented VPC Endpoints for S3 & DynamoDB to improve security & reduce costs.
    • Identified the need for & added a lifecycle rule to clean up incomplete multipart uploads.
    • Migrated big‑data infrastructure to Spot instances: Dask EC2 clusters & ECS clusters.
    • Contributed to the setup of scalable Jupyter clusters on ECS backed by Spot instances.

SRE Consultant

Ordway — Billing and Revenue Automation • March, 2021 — March, 2022

  • Hardened Docker images & EC2 AMIs.
  • Configured auto‑scaling on EKS clusters with custom metrics.
  • Introduced Spot Instances for non‑prod workloads to optimize costs.
  • Orchestrated migration from Heroku to AWS EKS for improved scalability:
    • Performed dozens of trial migrations.
    • Identified & fixed issues in Elasticsearch & database migrations before the actual migration.
    • Minimized downtime by working out the optimal ordering of migration steps.
    • Completed the migration with zero critical issues.

SRE Lead

Arcesium • April, 2019 — March, 2021

  • Led Incident Management & Root Cause Analysis.
  • Automated internal workflows & assisted L1 support.

Site Reliability Engineer

Adobe • February, 2017 — April, 2019

  • Saved significant costs by optimizing AWS S3 storage.
  • Handled on‑call duties & regular SRE tasks, including automation.
  • Developed an internal Chaos testing service using Flask, Celery & SaltStack.

Lead DevOps

GoFro.com — Bonavita Technologies • November, 2015 — February, 2017

  • Implemented monitoring with OMD server & check_mk.
  • Managed AWS & in‑house infrastructure, including deployment automation with Jenkins and SaltStack.

Education

International Institute of Information Technology, Pune

PGDBA — Information Technology • 2007 — 2009

  • Conducted Business Quiz
  • Conducted Linux Workshop

Kolhapur Institute of Technology's College Of Engineering — Shivaji University

Bachelor of Engineering — Electronics • 2007

Skills

  • Kubernetes
  • Terraform
  • Microsoft Azure
  • Python3
  • Boto3
  • Databases
  • Jenkins
  • High Availability
  • Resilience
  • Amazon Web Services
  • EKS, EC2, S3, Route 53, Athena, Glue, CloudFormation, RDS, DynamoDB, Load Balancers, SNS, SQS, Lambda, API Gateway, Auto Scaling

  • Linux System Administration
  • Shell Scripting, Troubleshooting, Hardening

  • Containers
  • containerd, Docker

  • Configuration Management
  • Ansible, SaltStack

Recognition

Awarded twice

Simpl • 2023

Awarded both for individual and team contributions.

Quarterly achievement award

Adobe • 2018

Received for individual contributions and achievements within a quarter. Nominated multiple times for these awards.

Awarded twice

Snapdeal • 2014

Received an annual & a quarterly award for individual achievements.

Additional Links