Hey, I’m Ajay 👋
Growing up, JARVIS wasn’t just a cool movie trick — it was a blueprint. Technology that works for you, not against you. That idea stuck. It’s still why I’m here.
I build and benchmark ML infrastructure for large-scale production systems — specifically at the intersection of inference optimization, evaluation frameworks, and high-concurrency LLM performance. At LinkedIn, that’s meant autorater pipelines, RLHF loops, and hallucination monitoring used by 15+ teams running 100+ eval jobs an hour.
What I actually spend time on:
- Inference engines: throughput, latency, and memory benchmarking across vLLM, SGLang, and Ray
- Cost architecture: semantic caching, dynamic routing, quantization tradeoffs that hold up under real load
- Rigorous evals: moving past superficial metrics to calibrated pipelines that give real signals
This site is a notebook of real builds — the architectural decisions, the things that failed in ways benchmarks never predicted, and what actually held up in production.
A JARVIS that hallucinates isn’t helpful. It’s just stressful in a new way. Everything here is about closing that gap.