Shlok_Limbhare ShlokVFX

🏠

Working from home

GPU dev

40 followers · 676 following

Pinned

100-days-cuda Public

This repository documents my 100-day journey of learning and writing CUDA kernels.

Jupyter Notebook · 30 stars · 1 fork

SageAttention Public

Forked from thu-ml/SageAttention

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda

ThunderKittens Public

Forked from HazyResearch/ThunderKittens

Tile primitives for speedy kernels

Cuda

Kernels Public

Python