Rotary Position Embedding (RoPE) deep dive
0. Why positional encoding (PE)?Standard attention has no sense of order. Give it “I ate food” or “food ate I” —> same output We need to inject position info explicitly Two flavors: Absolute PE