Wonbeom Jang
weak-to-strong
an archive of posts with this tag
May 29, 2026
Removing RLHF Protections in GPT-4 via Fine-Tuning: Breaking a Frontier API with 340 Examples