Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions

Publication
Forty-second International Conference on Machine Learning
Yuxin Xiao

Yuxin Xiao is a Ph.D. student at MIT IDSS. His research focuses on ethical and deployable LLMs for healthcare, with a particular interest in evaluating and enhancing LLM alignment in safety and faithfulness. Yuxin obtained his M.S. in Machine Learning at Carnegie Mellon University and his B.S. in Computer Science and B.S. in Statistics and Mathematics at the University of Illinois at Urbana-Champaign.
