
Few people speak or write in verse these days, but a new study has found that phrasing requests as poetry can slip past an AI chatbot's safety guardrails.
Poetic Prompts May Bypass AI Chatbots' Safety Guardrails
A new study published by Icaro Labs describes an exploit that relies on poetry to bypass generative AI chatbots' safety guardrails and draw out prohibited content or topics.
The study, titled "Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models," details how the researchers discovered the technique and tested whether it could indeed get past the systems' safeguards.
The study shows how the researchers reworded regular, conversational prompts into verse before submitting them to the AI, and this poetic framing achieved a high success rate in tricking the models into giving out prohibited responses.
According to the researchers, their tests found that the poetic form "operates as a general-purpose jailbreak operator" for these chatbots.
The team behind the study did not reveal the poem-style prompts used in the jailbreak attempts to coax chatbots into sharing prohibited content, nor did they publish them in their paper.
According to the team (via Wired), the prompts are too dangerous to share with the public, particularly because they were able to extract highly sensitive information from the chatbots during the study.
According to Icaro Labs, the technique got chatbots to respond to requests for the steps and materials needed to make a nuclear bomb, as well as for child sexual abuse material (CSAM) and self-harm content.
Study Reveals Which Chatbots Users May Bypass
According to the team, they tested the poetic prompting approach on popular chatbots in the industry, including OpenAI's ChatGPT, Google's Gemini, Anthropic's Claude, and others.
Based on their findings, models from Google Gemini, DeepSeek, and Mistral AI were the chatbots that most consistently gave answers when the poetic exploit was used to extract prohibited content. The researchers did not reveal the specific answers they received from these chatbots.
That said, OpenAI's ChatGPT running GPT-5 and Anthropic's Claude running the Haiku 4.5 model performed better, with the researchers reporting that these were the models least likely to be bypassed by the poetic form.




