Jehyeok Yeon

Oh, hey there! I’m Jehyeok Yeon, an incoming PhD candidate at the Max Planck Institute for Intelligent Systems, working with Maksym Andriushchenko on building intelligent systems that are both technically robust and socially responsible. I was previously at the University of Illinois Urbana-Champaign, where I was advised by Professor Gagandeep Singh at the FOCAL Lab.

My current research focuses on scalable oversight and “superalignment”. As AI models become more capable, I believe it’s more important than ever to keep evaluating and analyzing them, even if they become smarter than us. This could mean creating more difficult tasks with no ground-truth answer, finding scalable ways to control and interpret models, or reverse-engineering their internal representations to guarantee that their hidden objectives match our instructions.

Outside of research, I care a lot about writing, both creative and academic, and how it shapes the way we think. I spend a lot of time walking and thinking through ideas, and I try to catch theatre performances whenever I can. There’s something powerful about live storytelling that reminds me why I study intelligence in the first place. When I’m not writing or watching something live, I’m usually reading something; Moby Dick by Herman Melville is currently on my nightstand.

The fastest way to reach me is through tommy8289@gmail.com. Feel free to reach out if you have any interesting ideas to throw around!

news

Jun 17, 2026 InferenceBench accepted to the Agents in the Wild (AIWILD) workshop at ICML 2026 as a Spotlight!
May 28, 2026 Excited to be joining London AI Safety Research Labs as a Research Scholar this summer, working on the Science of Evaluations with the UK AI Security Institute!
May 28, 2026 GSAE accepted to the AI4GOOD workshop at ICML 2026!
May 5, 2026 FlowGuard accepted to ICML 2026 as a Spotlight paper!
Apr 13, 2026 Offered an NSF Graduate Research Fellowship (GRFP)!

selected publications

2026

  1. Securing Multimodal AI through Internal Information Decomposition
    Jehyeok Yeon, Hyeonjeong Ha, Qiusi Zhan, and 1 more author
    Proceedings of the 43rd International Conference on Machine Learning (ICML 2026), 2026
    Spotlight
  2. InferenceBench: A Benchmark for Open-Ended LLM Inference Optimization by AI Agents
    Jehyeok Yeon, Ben Rank, and Maksym Andriushchenko
    ICML 2026 Agents in the Wild (AIWILD) Workshop, 2026
    Spotlight
  3. Certifying Robustness of Agent Tool-Selection Under Adversarial Attacks
    Jehyeok Yeon, Isha Chaudhary, and Gagandeep Singh
    ICLR 2026 Agentic AI in the Wild (AIWILD) Workshop, 2026

2025

  1. TRAP: Targeted Redirecting of Agentic Preferences
    Jehyeok Yeon*, Hangoo Kang*, and Gagandeep Singh
    Proceedings of the 39th Conference on Neural Information Processing Systems (NeurIPS 2025), 2025