MCP and tool security: why agent risk changes when LLMs can act
The security model changes when an LLM can act.
A chatbot can leak information or give a wrong answer. A tool-using agent can send an email, update a CRM record, move money, delete data, call an internal API, or expose a secret through a tool result.
That is why MCP, A2A, function calling, and internal tool integrations require a security review before production.
The new attack surface
The main risks are not exotic:
- prompt injection hidden in documents, tickets, emails, web pages, or tool output;
- tool misuse from ambiguous instructions;
- over-broad permissions;
- sensitive data in retrieval context;
- unsafe side effects without user confirmation;
- untrusted tool output treated as trusted instructions;
- missing audit logs for model and tool actions.
The model is only one part of the system. The dangerous part is the loop around it.
Controls I expect
For production agents, I look for:
- Least privilege. Tools should expose only the actions and fields needed for the workflow.
- Typed schemas. Tool arguments should be constrained, validated, and logged.
- Permission checks before context injection. Do not retrieve first and filter later.
- Human confirmation for side effects. Especially email, payments, deletions, account changes, and external sends.
- Untrusted-context boundaries. Documents and tool outputs are data, not instructions.
- Tracing and audit logs. Every important model call and tool call should be inspectable.
- Adversarial evals. Test prompt injection, data exfiltration, tool confusion, and refusal behavior.
These controls do not kill agent usefulness. They make it deployable.
The review question
Before shipping an agent, ask:
If malicious text appears inside a document, email, ticket, or tool response, can it change what the agent is allowed to do?
If the answer is yes or unclear, the agent is not ready for enterprise production.
Have a similar AI task?
Send a short brief and I will suggest the smallest paid next step: consultation, audit, security review, or build.