{"description":"Trending threats, MITRE ATT\u0026CK coverage, and detection metadata — refreshed continuously.","feed_url":"https://feed.craftedsignal.io/tags/ai-jailbreak/","home_page_url":"https://feed.craftedsignal.io/","items":[{"_cs_actors":[],"_cs_cves":[],"_cs_exploited":false,"_cs_products":["M365 Copilot"],"_cs_severities":["high"],"_cs_tags":["prompt-injection","ai-jailbreak","m365","copilot"],"_cs_type":"advisory","_cs_vendors":["Microsoft"],"content_html":"\u003cp\u003eMicrosoft 365 Copilot is susceptible to jailbreak attempts via prompt injection, where users craft prompts designed to bypass or override its safety controls. These attacks inject malicious instructions into user prompts to manipulate the AI\u0026rsquo;s behavior, potentially leading to the disclosure of sensitive information, the generation of harmful content, or the execution of unauthorized actions. They leverage techniques such as rule manipulation, system bypass commands, and AI impersonation requests to circumvent built-in safety mechanisms. 
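\u003c/p\u003e\n\u003cp\u003eAs a rough illustration of the keyword screening described under Recommendation below, the following Python sketch flags prompt text containing the phrases this brief associates with jailbreak attempts. The \u003ccode\u003eSubject_Title\u003c/code\u003e field name and the phrase list come from this brief; the rest of the sketch is a hypothetical simplification, not the shipped Sigma logic.\u003c/p\u003e\n\u003cpre\u003e\u003ccode\u003e# Hypothetical keyword screen; phrases and field name taken from this brief.\nimport re\n\nJAILBREAK_PHRASES = [\n    'act as', 'bypass', 'ignore', 'override', 'pretend you are', 'rules=',\n]\nPATTERN = re.compile('|'.join(re.escape(p) for p in JAILBREAK_PHRASES), re.IGNORECASE)\n\ndef is_jailbreak_attempt(subject_title):\n    # True when the logged prompt title matches any known jailbreak phrase.\n    return PATTERN.search(subject_title) is not None\n\u003c/code\u003e\u003c/pre\u003e\n\u003cp\u003eA screen this broad also matches benign text (\u0026ldquo;act as\u0026rdquo; occurs inside phrases like \u0026ldquo;contact assistance\u0026rdquo;), which is why a false-positive filter such as the macro referenced below belongs in front of any alerting.\u003c/p\u003e\n\u003cp\u003e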
Successful jailbreaks can compromise the integrity and security of Copilot, enabling threat actors to exploit the AI for malicious purposes.\u003c/p\u003e\n\u003ch2 id=\"attack-chain\"\u003eAttack Chain\u003c/h2\u003e\n\u003col\u003e\n\u003cli\u003eAn attacker crafts a malicious prompt containing specific keywords and phrases designed to manipulate Copilot\u0026rsquo;s behavior.\u003c/li\u003e\n\u003cli\u003eThe attacker injects the prompt into M365 Copilot through a standard user interface, such as a chat window.\u003c/li\u003e\n\u003cli\u003eCopilot processes the prompt, attempting to interpret the user\u0026rsquo;s intent.\u003c/li\u003e\n\u003cli\u003eIf the injection succeeds, the crafted instructions bypass or override Copilot\u0026rsquo;s safety controls.\u003c/li\u003e\n\u003cli\u003eCopilot generates a response based on the manipulated instructions in the prompt, potentially providing unauthorized access to information or functionality.\u003c/li\u003e\n\u003cli\u003eThe attacker exfiltrates sensitive data or uses Copilot to perform actions outside its intended scope.\u003c/li\u003e\n\u003cli\u003eThe attacker leverages the compromised Copilot to create and disseminate malicious content.\u003c/li\u003e\n\u003c/ol\u003e\n\u003ch2 id=\"impact\"\u003eImpact\u003c/h2\u003e\n\u003cp\u003eSuccessful jailbreak attempts can lead to the disclosure of sensitive company data, generation of harmful or inappropriate content, and circumvention of organizational security policies. A single successful jailbreak can affect multiple users if the generated content is shared. A compromised internal copilot could also be used to craft phishing messages or to generate code that gives an attacker a reverse shell on a machine. 
The widespread adoption of M365 Copilot across industries further amplifies this risk.\u003c/p\u003e\n\u003ch2 id=\"recommendation\"\u003eRecommendation\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eEnable M365 Exported eDiscovery Prompts logging to capture user interactions with Copilot, as this log source is crucial for detecting jailbreak attempts.\u003c/li\u003e\n\u003cli\u003eDeploy the Sigma rules provided in this brief to your SIEM to identify potential jailbreak attempts based on suspicious keywords and patterns in user prompts.\u003c/li\u003e\n\u003cli\u003eImplement filtering mechanisms based on the \u003ccode\u003em365_copilot_jailbreak_attempts_filter\u003c/code\u003e macro to reduce false positives and focus on high-risk activities.\u003c/li\u003e\n\u003cli\u003eMonitor the \u003ccode\u003eSubject_Title\u003c/code\u003e field in the M365 eDiscovery prompt logs for the presence of jailbreak keywords and phrases such as \u0026ldquo;act as,\u0026rdquo; \u0026ldquo;bypass,\u0026rdquo; \u0026ldquo;ignore,\u0026rdquo; \u0026ldquo;override,\u0026rdquo; \u0026ldquo;pretend you are,\u0026rdquo; and \u0026ldquo;rules=\u0026rdquo;.\u003c/li\u003e\n\u003cli\u003eInvestigate and remediate any identified jailbreak attempts to prevent further exploitation of M365 Copilot.\u003c/li\u003e\n\u003c/ul\u003e\n","date_modified":"2024-01-03T12:00:00Z","date_published":"2024-01-03T12:00:00Z","id":"/briefs/2024-01-03-m365-copilot-jailbreak/","summary":"The detection identifies attempts to jailbreak Microsoft 365 Copilot through prompt injection techniques that seek to circumvent built-in safety controls via rule manipulation, system bypass commands, or AI impersonation requests.","title":"Microsoft 365 Copilot Jailbreak Attempts via Prompt Injection","url":"https://feed.craftedsignal.io/briefs/2024-01-03-m365-copilot-jailbreak/"}],"language":"en","title":"CraftedSignal Threat Feed — AI-Jailbreak","version":"https://jsonfeed.org/version/1.1"}