Won's Blog

Sharing what I study and experiment with

A.X K1 Technical Report

A.X K1 paper review: the architecture, data pipeline, and Think-Fusion training strategy of a 519B MoE model

9 min read · 2026

TelAgentBench: A Multi-faceted Benchmark for Evaluating LLM-based Agents in Telecommunications

TelAgentBench paper review: a benchmark that evaluates five core capabilities of LLM agents in the telecommunications domain

24 min read · 2026

TelBench: A Benchmark for Evaluating Telco-Specific Large Language Models

TelBench paper review: the design, construction, and evaluation of a benchmark for telecom-specific LLMs

22 min read · 2026

FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling

FlashAttention-4 paper review: a pipeline redesign and a software exponential function tailored to the asymmetric scaling of Blackwell GPUs

11 min read · 2026

FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision

FlashAttention-3 paper review: attention optimization using asynchronous execution and FP8 on Hopper GPUs

19 min read · 2026

Triton 07: Flash Attention 3 - How Far Can Triton Go?

How closely can Triton match the Hopper-only Flash Attention 3? Covers extended autotune, persistent kernels, and the experiments that failed

10 min read · 2026

Triton 06: Flash Attention 2 - Five Optimizations over FA1

Implementing Flash Attention 2 in Triton: un-scaled accumulation, the exp2 trick, causal 2-stage, the tl.dot accumulator, and autotune

12 min read · 2026

Triton 05: Flash Attention - Capstone Project

Implementing Flash Attention in Triton: the full forward/backward implementation and per-architecture optimization points for RTX 4080, A100, H100, and B200

19 min read · 2026

FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

FlashAttention-2 paper review: fewer non-matmul FLOPs, better parallelism, and improved warp partitioning

12 min read · 2023

Everything About Pods - From Creation to Scheduling

The Pod, the smallest deployable unit in Kubernetes: the sidecar pattern, a hands-on nginx exercise, the five steps by which a Pod comes up, and lifecycle phases

22 min read · July 06, 2026

2026 · kubernetes infra pod scheduling · infra

Kubernetes Architecture - Control Plane and Node

etcd, kube-apiserver, scheduler, kubelet: the roles and design intent of the components that run a cluster, checked hands-on in a kind cluster

14 min read · July 06, 2026

2026 · kubernetes infra architecture etcd scheduler · infra

Building a Kubernetes Cluster on My Laptop - kind and kubectl

How to bring up a multi-node Kubernetes cluster on a laptop with kind, and how to work with the cluster using kubectl and kubeconfig (clusters/users/contexts)

15 min read · July 06, 2026

2026 · kubernetes infra kind minikube kubectl · infra

The Birth of Kubernetes - From Google Borg to CNCF

When does docker run stop being enough? From Google Borg to CNCF, the reasons Kubernetes was born and its history

14 min read · July 06, 2026

2026 · kubernetes infra docker cncf borg · infra

Tamper-Resistant Safeguards (TAR) - Safety That Withstands Fine-tuning Itself

White-Box Safety series #12 (final): tamper-resistant safeguards trained so that safety holds up even under thousands of steps of adversarial fine-tuning, the last line of defense in the open-weight era (Tamirisa et al., CAIS / UIUC / UC Berkeley et al., ICLR 2025)

6 min read · May 30, 2026

2026 · llm red-teaming safety paper defense tamper-resistance meta-learning fine-tuning-defense · paper
