Open to Backend / Infra roles • Bay Area • MSCS @ Georgia Tech

I build infrastructure-grade backends that stay correct under retries and stay debuggable in production.

5+ years owning production services and shipping end-to-end systems: idempotency, concurrency, caching, and observability — plus cloud infrastructure (Kubernetes + Terraform).

Java / Python / C++
Kubernetes + Terraform
Redis + Postgres + SQL

Production impact (quick proof)
scan

Not just “built features” — reduced downtime, MTTR, and p99 latency on real systems.

50K+ daily active users (owned services)
40% p99 latency reduction (peak)
30–35% downtime / MTTR improvement
200K+ messages/min (load-tested)

What I’m best at
signal

Correctness-first design: idempotency, retries, isolation, observability. I optimize systems so incidents are rarer and debugging is faster.

Idempotency • Concurrency • Caching • SLO mindset • K8s + Terraform • Production debugging
Deployed Infrastructure & Systems

Cloud Resume Challenge (GCP)

GCP • Cloud Functions • Terraform • CI/CD
live
graph LR
  User[Visitor] -->|HTTPS| CDN[CDN/LB]
  CDN -->|Get Site| Bucket[GCS Bucket]
  CDN -->|API Call| func[Cloud Function]
  func -->|Update| DB[(Firestore)]
  Repo[GitHub] -->|Push| Build[Cloud Build]
  Build -->|Deploy| func & Bucket
  style func fill:#4285F4,stroke:#fff,color:#fff
  style DB fill:#FFCA28,stroke:#333,color:#000

Full-stack serverless architecture handling real-time visitor counting. Zero manual deployments permitted.

  • Infrastructure as Code: Entire GCP stack (Firestore, Functions, IAM) provisioned via Terraform.
  • CI/CD Pipeline: GitHub Actions automatically runs Python tests and triggers Cloud Build for deployment.
  • Backend Logic: Python Cloud Function implementing atomic database increments and CORS handling.
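
A minimal sketch of the handler logic described above, not the deployed code. It assumes a single allowed origin (`ALLOWED_ORIGIN` is a hypothetical placeholder) and uses an in-memory, lock-protected counter as a stand-in for the Firestore document. In a real deployment the increment would be delegated to the database (for example the Firestore client's `Increment` field transform) rather than a process-local lock.

```python
import json
import threading

ALLOWED_ORIGIN = "https://example.com"  # hypothetical site origin


class InMemoryCounter:
    """Stand-in for a Firestore document; a lock makes each increment atomic."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self, amount=1):
        # Read-modify-write happens under the lock, so no update is lost.
        with self._lock:
            self._value += amount
            return self._value


def cors_headers(origin):
    """Return CORS headers only when the request comes from the allowed origin."""
    if origin != ALLOWED_ORIGIN:
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
    }


def handle_visit(method, origin, counter):
    """Return (body, status, headers), the shape an HTTP function handler returns."""
    headers = cors_headers(origin)
    if method == "OPTIONS":
        # Preflight request: answer with headers only and do not count a visit.
        return "", 204, headers
    count = counter.increment()
    return json.dumps({"visits": count}), 200, headers
```

Keeping the preflight branch ahead of the increment means browsers' `OPTIONS` requests never inflate the visitor count.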

Idempotent Transaction Processor

Python • Redis • PostgreSQL • Correctness
backend
graph LR
  req[API Request] -->|Key| lock{Redis Lock}
  lock -->|Exists| cached[Return 200]
  lock -->|New| process[Process Payment]
  process --> db[(Postgres)]
  process -->|Success| cache_result[Cache Result]
  style lock fill:#b93f50,stroke:#fff,stroke-width:2px

Prevents double-billing during network retries using distributed locking and idempotency keys.

  • Deduplicates on idempotency keys so side effects apply exactly once under at-least-once message delivery.
  • Implemented atomic Redis `SETNX` locks with TTL to prevent deadlocks during crash loops.
  • Optimized for high-concurrency financial transaction throughput.
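
The flow in the diagram can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: `FakeRedis` is a hypothetical single-threaded, in-memory stand-in that mirrors the `set(key, value, nx=True, ex=ttl)` and `get` calls of the redis-py client, and `process_once`, the `idem:` key prefix and the TTL values are made up for the example. Against a real Redis server the same claim would be one atomic `SET key value NX EX ttl` command.

```python
import time


class FakeRedis:
    """In-memory stand-in for the subset of redis-py used below (not thread-safe)."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def _live(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # lazily expire, like a TTL running out
            return None
        return value

    def set(self, key, value, nx=False, ex=None):
        if nx and self._live(key) is not None:
            return None  # redis-py also returns None when NX fails
        expires_at = time.monotonic() + ex if ex is not None else None
        self._data[key] = (value, expires_at)
        return True

    def get(self, key):
        return self._live(key)

    def delete(self, key):
        self._data.pop(key, None)


IN_PROGRESS = "in_progress"


def process_once(store, idempotency_key, charge, lock_ttl=30, result_ttl=86400):
    """Run charge() at most once per key; return (result, executed_now)."""
    key = f"idem:{idempotency_key}"
    # Claim the key. The TTL frees it if the worker crashes mid-charge.
    if store.set(key, IN_PROGRESS, nx=True, ex=lock_ttl):
        try:
            result = charge()
        except Exception:
            store.delete(key)  # release so a later retry can run
            raise
        store.set(key, result, ex=result_ttl)  # cache the result for replays
        return result, True
    existing = store.get(key)
    if existing is None or existing == IN_PROGRESS:
        return None, False  # another attempt holds the key; retry later
    return existing, False  # replay: return the cached result, no second charge
```

One caveat worth noting: a TTL-based lock can expire while a slow charge is still running, so a durable guard such as a unique constraint on the idempotency key in Postgres is a common complement to the Redis check.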

Concurrent Messaging Engine

C++ • Threads • Systems • Profiling
performance
sequenceDiagram
  participant Client
  participant Queue
  participant Worker
  Client->>+Queue: Publish (Non-Blocking)
  loop Worker Thread
    Queue->>+Worker: Dequeue (Locked)
    Worker->>Worker: Batch Process
  end
  Worker-->>-Client: Async Ack

High-throughput messaging framework exploring thread safety, contention, and memory models.

  • Implemented thread-safe blocking queues using `std::mutex` and `std::condition_variable`.
  • Profiled CPU contention to tune worker thread pool size for optimal throughput.
  • Handled graceful shutdown and memory cleanup to prevent leaks.
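
The project itself is C++ (`std::mutex` and `std::condition_variable`). As a language-neutral illustration of the same pattern, here is a small Python sketch using `threading.Condition`, which pairs a lock with a condition variable. The names `ClosableQueue` and `run_workers` are hypothetical, and `None` is reserved as the "closed and drained" signal, so items themselves must not be `None`.

```python
import threading
from collections import deque


class ClosableQueue:
    """Blocking queue with a close() that lets consumers drain and exit."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()  # lock + condition variable
        self._closed = False

    def put(self, item):
        with self._cond:
            if self._closed:
                raise RuntimeError("queue is closed")
            self._items.append(item)
            self._cond.notify()  # wake one waiting consumer

    def get(self):
        """Block until an item arrives; return None once closed and empty."""
        with self._cond:
            # Loop guards against spurious wakeups and lost races.
            while not self._items and not self._closed:
                self._cond.wait()
            if self._items:
                return self._items.popleft()
            return None

    def close(self):
        with self._cond:
            self._closed = True
            self._cond.notify_all()  # wake every consumer so it can exit


def run_workers(queue, handle, num_workers=4):
    """Start worker threads that process items until the queue is closed."""

    def worker():
        while True:
            item = queue.get()
            if item is None:
                return  # graceful shutdown: queue closed and drained
            handle(item)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    return threads
```

Because `get` keeps returning items until the queue is both closed and empty, calling `close()` followed by `join()` on each worker finishes all queued work before shutdown.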
Experience
I’ve owned production backend services end-to-end: design, deployment, monitoring, and incident response.

Senior Software Developer — Backend & Infrastructure

Curves n’ Colors Pvt. Ltd • Multi-tenant platforms • 50K+ DAU
Aug 2019 – Sep 2022
  • Reduced peak p99 latency ~40% using Redis caching + SQL tuning
  • Improved availability ~35% by fixing concurrency + retry failure modes
  • Built metrics/logging + runbooks to reduce MTTR ~30%
Microservices • Redis • PostgreSQL/MySQL • CI/CD • Observability

Co-Founder & Lead Engineer

ThunderCodes Pvt. Ltd • E-commerce / booking platforms • 10x traffic spikes
Jul 2019 – Sep 2022
  • Owned full SDLC: architecture → deploy → monitor → maintain
  • Designed async workflows to decouple ingestion from downstream processing
  • Improved rollout safety using containerized deployments + rollback strategy
AWS • Docker • Kubernetes • Terraform

Senior Backend Developer

KlientScape Software • Financial / HR systems • correctness-first
Jul 2016 – Feb 2019
  • Improved throughput ~40% via schema refactors, indexing, and query tuning
  • Built robust error-handling pipelines reducing recurring failures ~25%
SQL • Schema design • Performance

MSCS Candidate (Systems) — Georgia Tech (OMSCS)

Operating Systems • Distributed Systems • Networks • Security
2023 – Present
  • Hands-on labs: concurrency, distributed patterns, failure recovery
  • Built local cloud-native labs using Kubernetes + Terraform
GIOS • Computer Networks • InfoSec
Skills
Focused keywords that match infra/backend roles (Discord/Orb-style): reliability, concurrency, observability.

Backend & Distributed Systems
core

Concurrency • Idempotency • Caching • Fault tolerance • Load balancing • Service isolation • Retries + backoff • Ordering guarantees

Cloud, Infra & Ops
infra

Kubernetes • Terraform • Docker • AWS (EC2/S3/IAM) • CI/CD • Incident response • Runbooks • SLO mindset

Observability
debug

Metrics • Logging • Alerting • Tracing • Prometheus • Grafana • ELK

Languages & Data
tools

Java • Python • C++ • SQL • Bash • PostgreSQL • MySQL • Redis
Leadership & Mentorship
Ownership signal: co-founder delivery + Code Ninjas center leadership.

Center Director / Instructor — Code Ninjas

Mentorship • Operations • Teaching
leadership

Lead operations and mentor learners in programming, debugging, and project execution.

  • Standardized onboarding + improved teaching consistency
  • Helped students build real projects (Unity / JS / Python)

Co-Founder — Startup Delivery

0→1 • Customers • On-call
ownership

Built and operated systems under real constraints: traffic spikes, outages, and customer expectations.

  • End-to-end responsibility: architecture → deploy → support
  • Created monitoring + playbooks to reduce blast radius

Systems learning (applied)

OMSCS • Labs • Reliability
growth

I build small, demonstrable labs that map directly to platform interviews: failure recovery, throughput, and debugging.

  • K8s + Terraform drills (self-healing + reproducibility)
  • Concurrency + message passing experiments
Contact

Fastest way

Email • Direct
contact

karmacharya.sushov@gmail.com

Profiles

GitHub • LinkedIn
proof

See code, writeups, and work history.

What I’m looking for

Backend • Platform • Infra
roles

Backend / Platform / Infra roles where reliability and ownership matter: correctness, latency, observability, safe deployments.
