
I’m a Software Engineer at Walmart Global Tech, building ML infrastructure and platform systems. I work on Kubernetes-based LLM serving, GPU scheduling, and production retrieval systems that handle tens of thousands of queries per second.
I hold an MS in Computer Science from UC Davis and a BTech from SRM Institute. Previously, I worked on RAG systems and inference optimization at Pure Storage and built HIPAA-compliant ML pipelines at UC Davis Health.
I like racing, cricket, and coffee.
I spend most of my time close to production: building systems that serve ML models reliably at scale, debugging latency under real load, and figuring out how to make GPUs do more for less. I care about infrastructure that actually works when it matters.
An open-source platform for regression testing LLM systems. Supports scheduled monitoring, cross-model comparison, LLM-as-a-judge evaluation, and evolutionary prompt optimization. Built with FastAPI, React, and SQLite.
A multilingual customer support agent for e-commerce. Uses embedding-based intent classification, a translate-process-translate architecture, and hard grounding thresholds to prevent hallucination. Includes an automated evaluation suite tracking faithfulness and latency.
A cycle-accurate microarchitecture simulator for branch prediction strategies (GShare, Perceptron). Runs sensitivity analysis across ML, HPC, and server workloads.
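For readers unfamiliar with GShare, here is a minimal Python sketch of the idea. It is not the simulator's actual code and it is not cycle-accurate. The class name `GShare`, the `accuracy` helper, the 8-bit history length, and the `pc >> 2` alignment shift are all assumptions chosen for illustration. The predictor XORs the branch address with a global history register to index a table of 2-bit saturating counters.

```python
class GShare:
    """Illustrative GShare predictor: PC XOR global history indexes 2-bit counters."""

    def __init__(self, history_bits=8):
        self.mask = (1 << history_bits) - 1
        self.history = 0                         # global branch history register
        self.table = [1] * (1 << history_bits)   # counters start at "weakly not taken"

    def _index(self, pc):
        # Drop the low 2 bits (assumes 4-byte aligned instructions), then XOR with history.
        return ((pc >> 2) ^ self.history) & self.mask

    def predict(self, pc):
        # Counter values 2 and 3 mean "predict taken".
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)
        # Shift the actual outcome into the global history.
        self.history = ((self.history << 1) | int(taken)) & self.mask


def accuracy(predictor, trace):
    """Fraction of correct predictions over a trace of (pc, taken) pairs."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(trace)


if __name__ == "__main__":
    # Hypothetical trace: one loop branch that is always taken.
    trace = [(0x400, True)] * 100
    print(accuracy(GShare(history_bits=8), trace))
```

A sensitivity analysis in this style would sweep parameters such as `history_bits` over different traces and compare the resulting accuracy.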