AcademyResourcesCompanyResearchBook a demo ↗
Threats · Deep dive

The New AI Attack Surface: Model, RAG, Tools, Memory, Identity

Securing the model is one-fifth of the job. A component-by-component tour of where AI systems actually get attacked.

Read time
11 min
Threat coverage
Multiple · LLM01–LLM10
Frameworks
OWASP LLM · OWASP Agentic · NIST AI RMF
Audience
Security architects

Ask most teams how they secure their AI and they'll describe what they did to the model — the guardrails, the refusals, the red-team prompts. That's one component of a system that now has five. The other four are where real incidents start.

Beyond the model

The model is the part everyone can see, so it gets the attention. But the model rarely holds the data, calls the APIs, or carries the credentials. The components wired around it do — and an attacker reasons about the whole system, not the part you hardened.

Model

The reasoning core. Attacked through jailbreaks and prompt manipulation that bypass alignment, plus extraction attempts that probe training data or system prompts. Real, but rarely the whole story — and over-indexed relative to the rest.

Retrieval (RAG)

Every document the system retrieves is untrusted input that can carry instructions. Poisoning a single record means it reaches every user who later retrieves it; retrieval manipulation and over-broad indexes leak data the user should never see. The control: treat retrieved content as data, scope the index to the user's entitlements.

Tools

The system's hands — APIs, functions, code execution. The risk is excessive agency: a tool scoped more broadly than the task turns a text manipulation into real action. The control: least-privilege tools, per-call authorization, and approval gates on anything irreversible.

Memory

Persistence makes a one-time manipulation durable. A planted instruction in long-term memory can steer decisions long after the original prompt is gone, and shared memory can bleed one user's context into another's. The control: validate what's written to memory, scope it per user, and expire it.

Identity

The credentials the system acts under — and the variable that sets your blast radius. It is almost always over-provisioned: a standing, broad token that an injected instruction inherits in full. The single highest-leverage control on this list is least-privilege, short-lived identity, because it caps the damage regardless of which other door was opened.

Model RAG Tools Memory Identity
Fig 1 — the five-component surface: a manipulation entering at retrieval can travel to action under the system's identity.

Framework mapping

ComponentPrimary riskOWASP LLM
ModelJailbreak / extractionLLM01 / LLM07
RetrievalPoisoning / leakageLLM03 / LLM06
ToolsExcessive agencyLLM06
MemoryPoisoning / persistenceLLM01
IdentityOver-privilegeLLM06

Checklist

  • All five components are in the threat model — not just the model.
  • Retrieved and tool-returned content is treated as untrusted data.
  • Tools are least-privilege with approval gates on irreversible actions.
  • Memory is validated, scoped per user, and expired.
  • The system's identity is least-privilege and short-lived.

Put the research to work.

See how SecuraAI discovers, scores, and governs every AI asset in your environment.