publications

2026

  1. Securing Multimodal AI through Internal Information Decomposition
    Jehyeok Yeon, Hyeonjeong Ha, Qiusi Zhan, and 1 more author
    Proceedings of the 43rd International Conference on Machine Learning (ICML 2026), 2026
    Spotlight

2025

  1. GSAE: Graph-Regularized Sparse Autoencoders for Robust LLM Safety Steering
    Jehyeok Yeon, Federico Cinus, Yifan Wu, and 1 more author
    2025
    Preprint. Under review.
  2. Certifying Robustness of Agent Tool-Selection Under Adversarial Attacks
    Jehyeok Yeon, Isha Chaudhary, and Gagandeep Singh
    ICLR 2026 Agentic AI in the Wild Workshop, 2025
  3. TRAP: Targeted Redirecting of Agentic Preferences
    Jehyeok Yeon*, Hangoo Kang*, and Gagandeep Singh
    Proceedings of the 39th Conference on Neural Information Processing Systems (NeurIPS 2025), 2025
  4. The Power of Friendship: Analyzing Leadership and Adversarial Attacks in Multi-Agent Collaboration
    Jehyeok Yeon
    2025
    Poster accepted to ACM Collective Intelligence 2025; Non-archival