gdymind's blog

26 posts in total


2026

04-27
DeepSeek V4 attention: how it handles longer context (English video)
04-19
Rotary Position Embedding (RoPE) deep dive
04-04
vLLM-Omni deep dive
03-08
Pallas examples by Sharad Vikram (Pallas author)
03-07
jax.jit, torch.compile & CUDA graph
03-02
KV cache in sliding-window attention
02-26
XLA02 - shapes, layout & tiling
02-25
XLA01 - architecture & workflows
02-22
Knowledge Distillation 101
02-19
GPU mode - lecture2 - CUDA 101
