LangChain Unsafe Deserialization Vulnerability

hello@craftedsignal.io — Thu, 04 Jan 2024 18:00:00 +0000

LangChain contains older runtime code paths that deserialize run inputs, run outputs, or other application-controlled payloads using overly broad object allowlists. These paths may call load() with allowed_objects='all', allowing any trusted LangChain-serializable object to be revived with attacker-supplied constructor arguments. The vulnerability exists when applications accept untrusted structured input (e.g., JSON), fail to validate it before invoking LangChain, preserve attacker-controlled nested dictionaries/lists in LangChain run data, and use affected API paths like RunnableWithMessageHistory, astream_log(), or astream_events(version="v1"). A related secret-marker validation bypass in the serialization layer also contributes to the issue. This vulnerability affects langchain-core versions >= 1.0.0 and <= 1.3.2, as well as versions <= 0.3.84.

Attack Chain

The attacker crafts a malicious JSON payload containing a LangChain serialized constructor dictionary, e.g., for an AIMessage object with attacker-controlled content.
The attacker submits the crafted JSON payload to a vulnerable application endpoint that accepts structured input.
The application, without proper validation or canonicalization, processes the untrusted input and passes it to LangChain.
The attacker-controlled nested dictionaries or lists are preserved in LangChain run inputs or outputs.
The application invokes an affected API path, such as RunnableWithMessageHistory, astream_log(), or astream_events(version="v1"), which uses load() with a broad object allowlist.
LangChain deserializes the malicious payload, instantiating the attacker-specified object (e.g., AIMessage) with attacker-controlled constructor arguments.
The instantiated object’s content is then used in subsequent application logic, potentially leading to prompt injection, chat history poisoning, or other malicious outcomes.
If the instantiated object reads environment credentials, creates clients, or contacts attacker-controlled endpoints during initialization, credential disclosure or server-side request forgery may occur.

Impact

Successful exploitation allows an attacker to inject LangChain serialized constructor payloads, potentially leading to persistent chat-history poisoning (if revived messages are stored by RunnableWithMessageHistory), prompt injection, or the instantiation of unexpected LangChain objects with attacker-controlled arguments. This may lead to credential disclosure, server-side request forgery, or further exploitation within the application. The number of affected applications is currently unknown, but the impact could be significant given the widespread use of LangChain.

Recommendation

Migrate away from the deprecated APIs: RunnableWithMessageHistory, astream_log(), and astream_events(version="v1") to the newer, recommended streaming and memory patterns.
Update LangChain to a patched version that tightens deserialization behavior.
Do not pass user-controlled data to load() or loads(). Only use these functions with trusted LangChain manifests or serialized objects from trusted storage.
Use a narrow allowed_objects value appropriate for the specific trusted manifest being loaded, instead of relying on broad defaults or allowed_objects="all".
Deploy the Sigma rule to detect suspicious process creation involving deserialization of LangChain objects.

Langchain-Core — CraftedSignal Threat Feed

LangChain Unsafe Deserialization Vulnerability

Attack Chain

Impact

Recommendation