Hamza's Blog

02 Oct, 2025 AWQ: Activation-Aware Weight Quantisation
11 Sep, 2025 Paged Attention from First Principles: A View Inside vLLM
25 Aug, 2025 Worklog: Optimising GEMM on NVIDIA H100 for cuBLAS-like Performance (WIP)