{"description":"Trending threats, MITRE ATT\u0026CK coverage, and detection metadata. Fed continuously.","feed_url":"https://feed.craftedsignal.io/tags/rag/","home_page_url":"https://feed.craftedsignal.io/","items":[{"_cs_actors":[],"_cs_cpes":[],"_cs_cves":[],"_cs_exploited":false,"_cs_has_poc":false,"_cs_poc_references":[],"_cs_products":["open-webui (\u003c= 0.8.12)"],"_cs_severities":["high"],"_cs_tags":["rag","poisoning","web-application"],"_cs_type":"advisory","_cs_vendors":["pip"],"content_html":"\u003cp\u003eOpen WebUI, a retrieval-augmented generation (RAG) application, is susceptible to unauthorized knowledge base modification. The vulnerability lies in the \u003ccode\u003eprocess_web\u003c/code\u003e endpoint within \u003ccode\u003ebackend/open_webui/routers/retrieval.py\u003c/code\u003e. Specifically, the \u003ccode\u003ePOST /api/v1/retrieval/process/web\u003c/code\u003e endpoint lacks authorization checks, which allows any authenticated user with knowledge of a target knowledge base UUID to overwrite it with arbitrary content. This is possible due to the \u003ccode\u003eoverwrite\u003c/code\u003e parameter, which defaults to \u003ccode\u003eTrue\u003c/code\u003e and triggers the deletion of the existing vector collection before new content is written via the \u003ccode\u003esave_docs_to_vector_db\u003c/code\u003e function. The issue affects the current main branch (commit \u003ccode\u003e6fdd19bf1\u003c/code\u003e) and likely all versions with RAG functionality. 
An attacker can leverage this vulnerability to poison the RAG system by injecting malicious content into the knowledge base.\u003c/p\u003e\n\u003ch2 id=\"attack-chain\"\u003eAttack Chain\u003c/h2\u003e\n\u003col\u003e\n\u003cli\u003eAttacker gains a valid user account on the Open WebUI instance.\u003c/li\u003e\n\u003cli\u003eAttacker discovers the victim\u0026rsquo;s knowledge base UUID, potentially through the \u003ccode\u003eknowledge-bases\u003c/code\u003e meta-collection (as mentioned in the report).\u003c/li\u003e\n\u003cli\u003eAttacker crafts a POST request to the \u003ccode\u003e/api/v1/retrieval/process/web\u003c/code\u003e endpoint, setting the \u003ccode\u003ecollection_name\u003c/code\u003e parameter to the victim\u0026rsquo;s KB UUID and ensuring \u003ccode\u003eoverwrite=true\u003c/code\u003e.\u003c/li\u003e\n\u003cli\u003eThe POST request includes a \u003ccode\u003eurl\u003c/code\u003e parameter pointing to an attacker-controlled URL containing malicious content.\u003c/li\u003e\n\u003cli\u003eThe Open WebUI server fetches the content from the attacker-controlled URL.\u003c/li\u003e\n\u003cli\u003eThe \u003ccode\u003esave_docs_to_vector_db\u003c/code\u003e function is called, which first deletes the existing vector collection associated with the victim\u0026rsquo;s knowledge base.\u003c/li\u003e\n\u003cli\u003eThe fetched content from the attacker\u0026rsquo;s URL is then embedded and stored as the new content for the knowledge base.\u003c/li\u003e\n\u003cli\u003eWhen the victim queries their knowledge base, the RAG system returns the attacker-controlled content, leading to potential misinformation or malicious actions.\u003c/li\u003e\n\u003c/ol\u003e\n\u003ch2 id=\"impact\"\u003eImpact\u003c/h2\u003e\n\u003cp\u003eSuccessful exploitation leads to data destruction, where the victim\u0026rsquo;s original knowledge base embeddings are permanently deleted from the vector store. 
Furthermore, the RAG system is poisoned with attacker-controlled content, causing the LLM to return misleading or malicious responses. This can enable indirect prompt injection and manipulation of the victim\u0026rsquo;s LLM behavior. The poisoned content persists until the knowledge base is rebuilt from the original source files. Versions of open-webui up to and including 0.8.12 are affected.\u003c/p\u003e\n\u003ch2 id=\"recommendation\"\u003eRecommendation\u003c/h2\u003e\n\u003cul\u003e\n\u003cli\u003eApply authorization checks to the \u003ccode\u003e/api/v1/retrieval/process/web\u003c/code\u003e endpoint to verify that the user has write access to the target collection, mitigating CVE-2026-44554.\u003c/li\u003e\n\u003cli\u003eMonitor web server logs for POST requests to \u003ccode\u003e/api/v1/retrieval/process/web\u003c/code\u003e with suspicious \u003ccode\u003ecollection_name\u003c/code\u003e parameters, using the Sigma rule \u0026ldquo;Detect Open WebUI Unauthorized Collection Overwrite Attempt\u0026rdquo; to identify potential exploitation attempts.\u003c/li\u003e\n\u003cli\u003eInspect network traffic for connections to suspicious URLs used in the \u003ccode\u003eurl\u003c/code\u003e parameter of the \u003ccode\u003e/api/v1/retrieval/process/web\u003c/code\u003e endpoint, such as the IOC \u003ccode\u003ehttps://attacker.com/poison\u003c/code\u003e.\u003c/li\u003e\n\u003c/ul\u003e\n","date_modified":"2024-01-18T12:00:00Z","date_published":"2024-01-18T12:00:00Z","id":"/briefs/2024-01-18-open-webui-rag-poisoning/","summary":"Open WebUI is vulnerable to knowledge base destruction and RAG poisoning due to a lack of authorization checks on the `/api/v1/retrieval/process/web` endpoint, allowing an attacker to overwrite a victim's knowledge base with attacker-controlled content.","title":"Open WebUI Knowledge Base Destruction and RAG Poisoning via Unauthorized Collection 
Overwrite","url":"https://feed.craftedsignal.io/briefs/2024-01-18-open-webui-rag-poisoning/"}],"language":"en","title":"CraftedSignal Threat Feed — Rag","version":"https://jsonfeed.org/version/1.1"}