May 11, 2026 TRL sequence packing → DeepSeek MLA: restoring the missing cu_seqlens
May 10, 2026 Modeling-side projection fusion for MLA training: q_a/kv_a batching + K-side absorption