Technology
Prompt injection
Prompt injection is the attack where malicious input data tricks an LLM into ignoring its system prompt and following the attacker's instructions. The defining security risk for AI agents that process user-supplied content (emails, support tickets, documents).
More detail
Mitigations: (1) treat all user-supplied content as data, never as instructions, (2) use system-prompt-only auth (the LLM trusts system messages, not embedded user text), (3) output validation — reject responses that contain trigger phrases like 'IGNORE PREVIOUS INSTRUCTIONS', (4) human-in-the-loop on any AI output that affects state. Aiprosol's agents have explicit prompt-injection defence in their system prompts plus output validation layers.
