Prompt Injection
–-> Go to BOTwiki
Prompt injection is a type of attack against LLM-based systems in which attackers inject manipulated inputs to hijack the model’s behavior—for example, to bypass security rules, retrieve confidential content, or trigger unwanted actions. For a productive AI agent , prompt injection is one of the most significant threat models, which cannot be mitigated by a single protective measure but only through a combination of architecture, filtering, and monitoring.
Direct and Indirect Prompt Injection
- Direct injection: The user crafts the input in such a way that it overrides the system's instructions, for example by instructing it to ignore previous rules.
- Indirect injection: Malicious instructions are hidden within external content processed by the agent, such as web pages, emails, or documents accessed by a tool.
Why single-prompt tools are particularly vulnerable
Wrapper tools that process all requests through a single monolithic prompt are structurally vulnerable. They lack a clear separation between trusted instructions, user input, and external context; instead, everything ends up in the same token stream. Multi-agent orchestration makes such attacks significantly more difficult because each stage has clearly defined responsibilities.
Defense strategies
Effective protective measures consist of several layers:
- Strict separation of user input and system prompts, so that the model does not interpret instructions in the input as commands.
- Filters and detection layers for identifying suspicious patterns in input data and external content.
- Sandboxing external sources so that retrieved content does not enter the model context without being verified.
- Monitoring and alerting for atypical model behavior.
- Regular audits through internal and external penetration tests.
Prompt Injection in an Industry Context
In a service context, security-critical workflows are particularly affected: SAP write-backs in municipal utilities, patient inquiries in healthcare, and payment issues in publishing. Closely related to prompt injection are issues such as prompt jailbreaks and AI safety filters.
Frequently Asked Questions (FAQ)
There is no such thing as 100% security. However, with a multi-layered architecture, a multi-agent setup, and continuous monitoring, the risk can be significantly reduced.
A central one. Because responsibilities are distributed among specialized agents, a single compromised prompt has only a limited scope of impact.
They are often harder to detect because the malicious instructions are not directly embedded in the user input. This makes them a particularly important focus of modern security layers.
The GDPR and the EU AI Act apply to production systems in Germany, Austria, and Switzerland. BOTfriends complies with these requirements and provides the necessary audit logs and data protection impact assessments.
–> Back to the BOTwiki

AI Agent ROI Calculator
Free training: Chatbot crash course
Whitepaper: The acceptance of chatbots