Hey, I’m Ajay 👋

Growing up, JARVIS wasn’t just a cool movie trick — it was a blueprint. Technology that works for you, not against you. That idea stuck. It’s still why I’m here.

I build and benchmark ML infrastructure for large-scale production systems — specifically at the intersection of inference optimization, evaluation frameworks, and high-concurrency LLM performance. At LinkedIn, that’s meant autorater pipelines, RLHF loops, and hallucination monitoring used by 15+ teams running 100+ eval jobs an hour.

What I actually spend time on:

Inference engines — throughput, latency, and memory benchmarking across vLLM, SGLang, and Ray
Cost architecture — semantic caching, dynamic routing, quantization tradeoffs that hold up under real load
Rigorous evals — moving past superficial metrics to calibrated pipelines that give real signals

This site is a notebook of real builds — the architectural decisions, the things that failed in ways benchmarks never predicted, and what actually held up in production.

A JARVIS that hallucinates isn’t helpful. It’s just stressful in a new way. Everything here is about closing that gap.

Let’s connect →
