Technology

Prompt injection

Prompt injection is the attack where malicious input data tricks an LLM into ignoring its system prompt and following the attacker's instructions. The defining security risk for AI agents that process user-supplied content (emails, support tickets, documents).

More detail

Mitigations: (1) treat all user-supplied content as data, never as instructions, (2) use system-prompt-only auth (the LLM trusts system messages, not embedded user text), (3) output validation — reject responses that contain trigger phrases like 'IGNORE PREVIOUS INSTRUCTIONS', (4) human-in-the-loop on any AI output that affects state. Aiprosol's agents have explicit prompt-injection defence in their system prompts plus output validation layers.

More detail

Related terms

More detail

Related terms