Hamza's Blog
02 Oct, 2025
AWQ: Activation-Aware Weight Quantisation
11 Sep, 2025
Paged Attention from First Principles: A View Inside vLLM
25 Aug, 2025
Worklog: Optimising GEMM on NVIDIA H100 for cuBLAS-like Performance (WIP)