Learning
Concepts, notes, and questions I want to understand better.
- Agent-native gateway
A production-focused, visual explanation of the agent-native gateway pattern: how coding agents safely discover, authenticate, and call internal APIs, MCP servers, and third-party applications through an open-source enterprise stack.
June 7, 2026
- LLM inference, from request to token
A visual, source-grounded walkthrough of LLM inference and serving using nano-vllm and Mini-SGLang: tokens, prefill, decode, KV cache, scheduling, batching, prefix caching, CUDA graphs, tensor parallelism, and streaming.
May 31, 2026
- The die is not the whole product
A visual note on why advanced AI chips are shaped by packaging, memory, heat, alignment, power, cooling, and manufacturing.
May 24, 2026
- Follow the numbers through an AI chip
A visual explainer about why AI chips are built around data movement, and how that turns into latency, batching, memory bandwidth, and long-context costs.
May 23, 2026
- Ask for the other drafts
A small note on verbalized sampling and using AI to write more creatively.
May 2, 2026
- AI and software
Notes on how AI changes the way software gets made and learned.
April 25, 2026