MedPerturbing LLMs: A Comparative Study of Toxicity, Prompt Tuning, and Jailbreaks in Medical QA

Published in Proceedings of the AAAI Symposium Series, 2025

This comparative study evaluates the robustness and safety of Large Language Models (LLMs) in medical question answering. It probes model vulnerabilities along three axes: toxicity generation, the impact of prompt tuning, and susceptibility to jailbreak techniques.

Recommended citation: Asgari, A., Naziri, A., & Seyyed-Kalantari, L. (2025). "MedPerturbing LLMs: A Comparative Study of Toxicity, Prompt Tuning, and Jailbreaks in Medical QA." Proceedings of the AAAI Symposium Series, 7(1), 438-447.