ML Research Engineer / ML Systems Engineer
Model Quality · Efficient Training · Evaluation · PyTorch Systems
I improve AI models where quality meets systems: reproducible experiments, controlled ablations, robust evaluation, and training/inference optimization for speed, memory, and cost.
If I claim an ML improvement, it should have a baseline, an ablation, a metric, and a cost.
My focus is practical: making models better, more stable, faster, and cheaper to run through disciplined experimentation and systems-aware engineering.
| Area | Focus |
|---|---|
| Model Quality | training recipes, fine-tuning, robustness, calibration, data-centric improvements |
| ML Systems Efficiency | profiling, AMP, torch.compile, batching, checkpointing, latency, throughput, memory |
| Research Engineering | baselines, ablations, multi-seed evaluation, tracked configs, reproducible reports |
| Project | Focus | What it demonstrates |
|---|---|---|
| ml-systems-lab | Training/inference efficiency | Profiling PyTorch workloads, measuring latency/throughput/memory, reducing cost with AMP, compile, batching, checkpointing |
| vision-recipe-bench | Model quality through training recipes | Controlled ablations for optimizer, LR schedule, augmentation, EMA, regularization, robustness, calibration |
| small-lm-lab | Small Transformer LM training | Tokenization, sequence packing, perplexity, training loop discipline, efficiency-quality trade-offs |
| nlp-ft-discipline | Fine-tuning stability | Seed variance, calibration, validation hygiene, robust evaluation for Transformer classifiers |
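The latency and throughput measurements listed for ml-systems-lab can be sketched with a small timing harness. This is a minimal, standard-library-only illustration, not code from the repository: the `benchmark` function, its parameters, and the returned key names are all hypothetical.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(
    fn: Callable[[], object],
    batch_size: int,
    warmup: int = 3,
    iters: int = 20,
) -> Dict[str, float]:
    """Time repeated calls to fn() and report latency and throughput.

    Hypothetical helper for illustration. `fn` is assumed to process one
    batch of `batch_size` items per call; `iters` must be at least 2.
    """
    # Warm-up calls are excluded so one-time costs (caches, lazy
    # initialization, compilation) do not skew the measurement.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        # For GPU work in PyTorch, call torch.cuda.synchronize() before
        # reading the clock, because CUDA kernels launch asynchronously.
        samples.append(time.perf_counter() - start)

    median_s = statistics.median(samples)
    # quantiles(n=10) returns 9 cut points; the last is the 90th percentile.
    p90_s = statistics.quantiles(samples, n=10)[-1]
    return {
        "latency_ms_median": median_s * 1e3,
        "latency_ms_p90": p90_s * 1e3,
        "throughput_items_per_s": batch_size / median_s,
    }


if __name__ == "__main__":
    # Toy CPU workload standing in for a model forward pass.
    print(benchmark(lambda: sum(range(100_000)), batch_size=32))
```

Reporting the median alongside a tail percentile keeps a single slow iteration from dominating the summary while still showing it.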
- Core ML: Python, PyTorch, Transformers, CNNs, small LMs
- Experimentation: W&B / MLflow, Hydra / config-driven runs, ablations, multi-seed evaluation
- Efficiency: CUDA/NVIDIA GPUs, AMP, torch.compile, profiling, checkpointing, batching
- Engineering: Linux, Docker, Git, GitHub Actions, reproducible pipelines
I am aiming for roles where I can work on the practical side of improving AI systems:
- ML Research Engineer Intern
- ML Systems Engineer Intern
- Applied ML / LLM Engineer Intern
- Model Quality / Evaluation Intern
My preferred work is at the intersection of:
better models + reliable experiments + efficient training/inference
For serious ML projects, I try to include:
- train.py, eval.py, configs/, scripts/
- fixed seeds and reproducible configs
- baseline + ablation table
- training curves and metric plots
- latency / throughput / memory measurements when relevant
- results.md with what worked, what failed, and what I would try next
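
As a rough illustration of the fixed-seed and baseline-plus-ablation items in the checklist above, the sketch below runs each variant over several seeds and renders a Markdown table with mean, standard deviation, and difference from the baseline. The names `run_seeds`, `ablation_table`, and `toy_run` are hypothetical, and `toy_run` is a placeholder that produces synthetic numbers, not results from any of the projects.

```python
import random
import statistics
from typing import Callable, Dict, List


def run_seeds(train_eval: Callable[[int], float], seeds: List[int]) -> Dict[str, float]:
    """Run one variant once per seed and summarize the metric."""
    scores = [train_eval(seed) for seed in seeds]
    return {
        "mean": statistics.mean(scores),
        # Sample standard deviation needs at least two runs.
        "std": statistics.stdev(scores) if len(scores) > 1 else 0.0,
        "n": len(scores),
    }


def ablation_table(results: Dict[str, Dict[str, float]], baseline: str) -> str:
    """Render per-variant summaries as a Markdown table."""
    base_mean = results[baseline]["mean"]
    lines = [
        "| Variant | Metric (mean ± std) | Delta vs baseline | Seeds |",
        "|---|---|---|---|",
    ]
    for name, r in results.items():
        delta = r["mean"] - base_mean
        lines.append(
            f"| {name} | {r['mean']:.3f} ± {r['std']:.3f} | {delta:+.3f} | {r['n']} |"
        )
    return "\n".join(lines)


def toy_run(seed: int, bonus: float = 0.0) -> float:
    """Placeholder for a real train + eval run; returns a synthetic score.

    A local Random instance makes the result depend only on the seed.
    A real PyTorch run would also call torch.manual_seed(seed).
    """
    rng = random.Random(seed)
    return 0.80 + bonus + rng.uniform(-0.01, 0.01)


if __name__ == "__main__":
    seeds = [0, 1, 2]
    results = {
        "baseline": run_seeds(lambda s: toy_run(s), seeds),
        "+ema": run_seeds(lambda s: toy_run(s, bonus=0.02), seeds),
    }
    print(ablation_table(results, baseline="baseline"))
```

Showing the standard deviation next to each mean makes it easier to judge whether a difference from the baseline is larger than seed-to-seed noise.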