Won's Blog

Sharing studies and experiments

A.X K1 Technical Report

A.X K1 paper review - the architecture, data pipeline, and Think-Fusion training strategy of a 519B MoE model

9 min read · 2026

TelAgentBench: A Multi-faceted Benchmark for Evaluating LLM-based Agents in Telecommunications

TelAgentBench paper review - a benchmark that evaluates five core capabilities of LLM agents in the telecom domain

24 min read · 2026

TelBench: A Benchmark for Evaluating Telco-Specific Large Language Models

TelBench paper review - the design, construction, and evaluation of a telecom-specific LLM benchmark

22 min read · 2026

FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling

FlashAttention-4 paper review - a pipeline redesign and a software exponential function tailored to the asymmetric scaling of Blackwell GPUs

11 min read · 2026

FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision

FlashAttention-3 paper review - attention optimization using asynchronous execution and FP8 on Hopper GPUs

19 min read · 2026

Triton 07: Flash Attention 3 - How Far Can Triton Go?

How closely can Triton match the Hopper-only Flash Attention 3? Covers extended autotune, persistent kernels, and the experiments that failed

10 min read · 2026

Triton 06: Flash Attention 2 - Five Optimizations over FA1

Implementing Flash Attention 2 in Triton - un-scaled accumulation, the exp2 trick, causal 2-stage, tl.dot accumulator, autotune

12 min read · 2026

Triton 05: Flash Attention - Capstone Project

Implementing Flash Attention in Triton - a full forward/backward implementation and per-architecture optimization points for RTX 4080, A100, H100, and B200

19 min read · 2026

FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

FlashAttention-2 paper review - fewer non-matmul FLOPs, better parallelism, and improved warp partitioning

12 min read · 2023

Refusal Direction & Abliteration - Refusal Is a Single Direction

White-Box Safety series #1 - shows that refusal behavior in open-weight LLMs is mediated by a single direction in the residual stream, and that weight orthogonalization disables alignment (Arditi et al., NeurIPS 2024)

11 min read · May 29, 2026

2026 · llm red-teaming safety paper abliteration refusal-direction white-box mechanistic-interpretability · paper
PKU-SafeRLHF-30K: A Dual-Preference Dataset for Safe-RLHF

Red-Teaming series #27 - the 30K preference sibling of BeaverTails, a dual-rating standard for RLHF that separates helpful and harmless into two labels (Ji et al., PKU-Alignment, NeurIPS 2023)

11 min read · May 29, 2026

2026 · llm red-teaming safety paper defense dataset rlhf · paper
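LLMs in Cybersecurity: The Landscape of Attack, Defense, and Evaluation

Introduction to the cybersecurity LLM series - the progression from secure coding to autonomous attack and defense, plus an overview of the benchmark landscape that measures it (Cybench, CVE-Bench, CyberSecEval, CTIBench, etc.)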

4 min read · May 26, 2026

2026 · llm cybersecurity security benchmark evaluation paper · paper
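Claude Mythos and Cybersecurity LLMs: An Inflection Point in Autonomous Vulnerability Discovery

A summary of the autonomous zero-day discovery and exploitation capabilities demonstrated by Anthropic's Claude Mythos, and of the cybersecurity LLM benchmarks that measure them (Cybench, CyberSecEval, CVE-Bench, etc.)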

9 min read · May 26, 2026

2026 · llm cybersecurity security benchmark evaluation paper · paper
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models

Cybench paper review - the de facto standard benchmark that evaluates LLM agents' autonomous cyber-attack capabilities using 40 professional CTF tasks and their subtasks

11 min read · May 26, 2026

2026 · llm cybersecurity security benchmark evaluation paper · paper
