Machine Learning Engineer | Systems Builder | LLM Infra
I build high-performance AI systems that actually run in production.
From optimizing LLM inference pipelines to working deep in operating systems, I enjoy solving problems across the entire stack, from kernels to large language models.
- Optimize LLM/VLM systems for latency, throughput, and scale
- Build end-to-end AI pipelines (RAG, agents, training systems)
- Design intelligent systems that replace manual workflows
- Work across systems: OS → backend → ML → infra
- LLM inference optimization (TensorRT-LLM, vLLM, SGLang)
- Retrieval-Augmented Generation (RAG) systems
- Agentic workflows using LangGraph
- Distributed systems & Kubernetes-based deployments
Languages
Python, C++, Rust
ML / AI
PyTorch, TensorRT-LLM, Triton, vLLM, SGLang
Systems & Infra
Kubernetes, Docker, AWS, GCP
Other
AOSP, Linux Kernel, RAG Pipelines, LoRA, Quantization
- Improved LLM performance by 60%+ using speculative decoding & KV cache optimization
- Built a VLM-based document parsing system (98%+ accuracy)
- Developed autonomous agents for processing real-world business workflows
- Built RAG pipelines to generate automated integration tests from codebases
- Former Android OS engineer working on kernel, SELinux & device security
- Linux Kernel / systems programming
- LLM / ML infrastructure
- Developer tools & infra-heavy projects
- Email: check profile
- LinkedIn
- GitHub
- Telegram / Instagram / Unsplash: @pranavthombare
- I can use nunchucks
- Trekked to Everest Base Camp
Build things that are not just impressive, but useful, scalable, and real.



