Recent Blog Posts

Here are some of my latest blog posts:

Transformer LM — Resource Accounting (Parameters and FLOPs)

October 20, 2025

Transformer Pre Norm Transformer Block

October 17, 2025

Transformer NN Structure — Functional vs Module (Linear)

October 17, 2025

Transformer From Scratch — Transformer Language Model (TransformerLM)

October 17, 2025

Transformer From Scratch — Linear Module (Step-by-Step)

October 17, 2025

Transformer From Scratch — Position-Wise Feed-Forward Network (SwiGLU)

October 17, 2025

Transformer From Scratch — Embedding Module (Step-by-Step)

October 17, 2025

Transformer From Scratch — Pre-Norm Transformer Block

October 17, 2025

Transformer From Scratch — Softmax Implementation

October 17, 2025

Transformer From Scratch — Scaled Dot-Product Attention

October 17, 2025

Transformer From Scratch — Rotary Position Embeddings (RoPE) Implementation

October 17, 2025

Transformer From Scratch — Causal Multi-Head Self-Attention

October 17, 2025

Softmax: What It Does, Stability Trick, and a Brief History

October 10, 2025

Transformer FFN with SwiGLU

October 09, 2025

Rotary Position Embeddings (RoPE): Intuition, Math, and Examples

October 09, 2025

Architecting for Stability: Pre-Normalization, RMSNorm, and Mixed-Precision Training

October 08, 2025

From tokenizer to uint16 dataset with encode_iterable

September 21, 2025

10K vs 32K Tokenizers Yield Similar Bytes per Token

September 21, 2025

Bytes → UTF-8 → BPE — why not just number the alphabet?

September 21, 2025

Books List

March 25, 2024

Visual Explanation of Transformer with Dimensions

March 25, 2024

LLM Prompts Dump

March 17, 2024

Attention Mechanism

March 17, 2024

Speculative Inferencing

March 05, 2024

LLM Inferencing Optimization

February 28, 2024

Cheat Sheet: Understanding Kubernetes Architecture

February 01, 2024

Understanding LLM Inferencing Challenges and Tools

January 29, 2024

Mixture of Experts (MoE)

January 14, 2024

KV Cache in Transformers: Detailed and Simplified Guide

January 03, 2024

Quantisation LLM

December 22, 2023

LLMOps 1

November 01, 2023

LLM buy vs build

November 01, 2023

Python Production

November 01, 2023

Setting Up WSL Terminal Windows

October 21, 2023

Paper - ATTENTION SINKS

October 15, 2023

Open Source Contribution

October 09, 2023

Your Blog Post Title

October 09, 2023

Paper - PROMPTBREEDER

October 09, 2023

Welcome to Jekyll!

October 01, 2023