Section 03 · Threat Modeling
OWASP foundation. AI-specific overlay.
Threat modeling for embedded AI systems must account for traditional application-layer risks, data pipeline vulnerabilities, model integrity risks, inference-time manipulation, and third-party model exposure. This framework aligns with OWASP threat categories, then extends them for AI-specific attack surfaces and maps adversarial techniques against MITRE ATT&CK and ATLAS.
Do I need to threat-model this?
- Tier 1: Not required. Standard SDLC review only.
- Tier 2+: Required. Documented threat model using template below.
- Tier 3+: Governance review. Formal validation by central governance.
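The tier rules above can be sketched as a small lookup. This is a minimal illustration, assuming tiers are represented as integers starting at 1; the names `ThreatModelRequirement` and `requirement_for_tier` are hypothetical and not part of the framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreatModelRequirement:
    threat_model_required: bool
    governance_review_required: bool
    note: str


def requirement_for_tier(tier: int) -> ThreatModelRequirement:
    """Return the threat-modeling obligations for a risk tier.

    Assumption: tiers are integers and Tier 1 is the lowest.
    """
    if tier < 1:
        raise ValueError("tier must be 1 or higher")
    if tier == 1:
        # Tier 1: no threat model, standard SDLC review only.
        return ThreatModelRequirement(False, False, "Standard SDLC review only.")
    if tier == 2:
        # Tier 2+: a documented threat model is required.
        return ThreatModelRequirement(True, False, "Documented threat model.")
    # Tier 3+: documented threat model plus formal governance validation.
    return ThreatModelRequirement(
        True, True, "Documented threat model with central governance validation."
    )
```

A delivery pipeline could call `requirement_for_tier(system_tier)` at intake to decide which review gates to open, though where that check lives is a design choice outside this section.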
Map the system first
All threat modeling begins with a structured system map. Threats are evaluated against each component.
- 01 Data Sources
- 02 Feature Engineering Pipeline
- 03 Model Training / Fine-Tuning
- 04 Model Registry
- 05 Inference Layer
- 06 API Gateway
- 07 Business Logic Integration
- 08 Logging & Monitoring
- 09 Third-Party Dependencies
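One way to keep the system map machine-checkable is to encode the nine components as an enumeration and track which ones have been reviewed. The sketch below is illustrative only; the `Component` enum and the `unreviewed` helper are hypothetical names, not a prescribed schema.

```python
from enum import Enum


class Component(Enum):
    """The nine system-map components, in the order listed above."""
    DATA_SOURCES = "Data Sources"
    FEATURE_ENGINEERING_PIPELINE = "Feature Engineering Pipeline"
    MODEL_TRAINING = "Model Training / Fine-Tuning"
    MODEL_REGISTRY = "Model Registry"
    INFERENCE_LAYER = "Inference Layer"
    API_GATEWAY = "API Gateway"
    BUSINESS_LOGIC_INTEGRATION = "Business Logic Integration"
    LOGGING_MONITORING = "Logging & Monitoring"
    THIRD_PARTY_DEPENDENCIES = "Third-Party Dependencies"


def unreviewed(reviewed: set) -> list:
    """Return components not yet evaluated, in system-map order."""
    return [c for c in Component if c not in reviewed]
```

For example, `unreviewed({Component.DATA_SOURCES})` returns the remaining eight components, which makes gaps in a threat model visible before it is submitted.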
Core layer: OWASP application risks
These apply directly to AI APIs, feature pipelines, model registries, and training infrastructure. Treat them as the floor. The AI overlay is additive, not a replacement.
AI-specific threat overlay
Ten threat categories that traditional appsec models do not cover. Categories 9 and 10 (prompt injection and indirect prompt injection) are the dominant attack surface for LLM-based systems and are where governance, IAM, and secure RAG converge.
Training Data Poisoning
Malicious or biased data inserted into training sets to alter model behavior, induce backdoors, or skew predictions for protected classes.
Feature Manipulation
Upstream manipulation of model inputs (at the feature pipeline or data source) to alter outputs without compromising the model itself.
Model Extraction
Attackers reconstruct model behavior, weights, or decision boundaries via repeated queries, recovering IP or enabling downstream attacks.
Model Inversion
Reconstruction of sensitive training data via inference queries, which can expose PII or proprietary data even when the training set is private.
Drift Exploitation
Deliberate manipulation of input distributions to degrade model performance over time, evade detection, or push behavior past calibration limits.
Automation Amplification
Model outputs trigger cascading automated decisions. Small errors compound across downstream systems before any human intervention.
Vendor Opacity Risk
Limited visibility into third-party model architecture, training data, or change history. The vendor changes the model; the enterprise inherits the regression.
Unauthorized Retraining
Shadow updates, unauthorized version swaps, or pipeline tampering that replace a reviewed model with one that bypassed governance gates.
Prompt Injection
LLM-era. Direct manipulation of model instructions via user input, overriding system prompts, exfiltrating context, or hijacking the model's behavior.
Indirect Prompt Injection
LLM-era. Adversarial payloads embedded in retrieved documents, tool outputs, or web content that hijack the model when consumed as context. The user never types the attack; the model encounters it through the supply chain. This is where governance, IAM, and secure RAG converge.
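The ten overlay categories can be captured as a numbered enumeration so that a threat model can be checked for completeness. This is a rough sketch: the numbering follows the order above (with prompt injection and indirect prompt injection as 9 and 10), while `AIThreat`, `LLM_ERA`, and `missing_categories` are hypothetical names introduced for illustration.

```python
from enum import IntEnum


class AIThreat(IntEnum):
    """AI-specific overlay categories, numbered in the order listed above."""
    TRAINING_DATA_POISONING = 1
    FEATURE_MANIPULATION = 2
    MODEL_EXTRACTION = 3
    MODEL_INVERSION = 4
    DRIFT_EXPLOITATION = 5
    AUTOMATION_AMPLIFICATION = 6
    VENDOR_OPACITY_RISK = 7
    UNAUTHORIZED_RETRAINING = 8
    PROMPT_INJECTION = 9
    INDIRECT_PROMPT_INJECTION = 10


# The two categories tagged "LLM-era" in the list above.
LLM_ERA = frozenset({AIThreat.PROMPT_INJECTION, AIThreat.INDIRECT_PROMPT_INJECTION})


def missing_categories(considered: set) -> list:
    """Return overlay categories the threat model has not yet addressed."""
    return [t for t in AIThreat if t not in considered]
```

A reviewer could run `missing_categories(...)` against the categories named in a submitted threat model and pay particular attention to any member of `LLM_ERA` that is absent for an LLM-based system.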
Threat modeling workflow
For Tier 2+ systems. Tier 3–4 systems also require governance validation.
- 01 Map architecture
- 02 Identify OWASP application risks
- 03 Overlay AI-specific risks
- 04 Assign impact score
- 05 Define mitigation strategy
- 06 Log in centralized risk registry
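The six steps above can be tracked as an ordered checklist. The sketch below assumes the steps are completed strictly in sequence, which the numbering suggests but the section does not state outright; `WORKFLOW_STEPS` and `next_step` are hypothetical names.

```python
from typing import Optional

# Workflow steps in the order listed above.
WORKFLOW_STEPS = (
    "Map architecture",
    "Identify OWASP application risks",
    "Overlay AI-specific risks",
    "Assign impact score",
    "Define mitigation strategy",
    "Log in centralized risk registry",
)


def next_step(completed: int) -> Optional[str]:
    """Given the number of finished steps, return the next one.

    Returns None once all six steps are done.
    Assumption: steps are completed in order, never skipped.
    """
    if not 0 <= completed <= len(WORKFLOW_STEPS):
        raise ValueError("completed must be between 0 and 6")
    if completed == len(WORKFLOW_STEPS):
        return None
    return WORKFLOW_STEPS[completed]
```

Calling `next_step(2)` returns "Overlay AI-specific risks", so a tracker built this way cannot reach the registry step without passing through scoring and mitigation first.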
Risk scoring
Each threat is scored across exploitability, business impact, regulatory impact, model integrity impact, and automation amplification. High-risk threats require documented mitigation or formal risk acceptance, not silence.
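A scoring record along these five dimensions might look like the sketch below. The section does not define a scale, an aggregation rule, or a high-risk cutoff, so the 1 to 5 scale, the "worst dimension wins" rule, and the threshold of 4 are all assumptions chosen for illustration, as are the names `ThreatScore` and `needs_mitigation_or_acceptance`.

```python
from dataclasses import dataclass, astuple

# Assumption: scores of 4 or 5 on the overall scale count as high risk.
HIGH_RISK_THRESHOLD = 4


@dataclass(frozen=True)
class ThreatScore:
    """One threat scored on the five dimensions (assumed scale: 1 to 5)."""
    exploitability: int
    business_impact: int
    regulatory_impact: int
    model_integrity_impact: int
    automation_amplification: int

    def __post_init__(self) -> None:
        for name, value in vars(self).items():
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")

    @property
    def overall(self) -> int:
        # Assumption: the overall score is the worst single dimension,
        # so one severe dimension cannot be averaged away.
        return max(astuple(self))


def needs_mitigation_or_acceptance(score: ThreatScore) -> bool:
    """True when the threat needs documented mitigation or formal risk acceptance."""
    return score.overall >= HIGH_RISK_THRESHOLD
```

Taking the maximum rather than a mean is a deliberately conservative choice here; a weighted sum would be an equally valid reading if the governance body prefers one.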
Adversarial alignment: ATT&CK and ATLAS
For Tier 3–4 systems, identify relevant adversarial techniques from MITRE ATLAS, map potential attack paths to architecture, validate mitigations, and log technique coverage. This integrates adversarial thinking without creating research theater.
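Technique coverage can be logged as a simple list of mappings from technique to component to mitigation. The sketch below is one possible shape, not a prescribed registry format; `TechniqueMapping` and `coverage_summary` are hypothetical names, and the technique identifiers in the usage note are placeholders rather than real ATLAS IDs.

```python
from dataclasses import dataclass


@dataclass
class TechniqueMapping:
    technique_id: str  # placeholder string; substitute the real ATLAS technique ID
    component: str     # architecture component on the attack path
    mitigation: str    # control intended to address the technique
    mitigation_validated: bool = False


def coverage_summary(mappings: list) -> dict:
    """Summarize how many mapped techniques have a validated mitigation."""
    gaps = [m.technique_id for m in mappings if not m.mitigation_validated]
    return {
        "total": len(mappings),
        "validated": len(mappings) - len(gaps),
        "gaps": gaps,
    }
```

For instance, logging `TechniqueMapping("EXAMPLE-TECHNIQUE-1", "Inference Layer", "rate limiting", True)` alongside an unvalidated `"EXAMPLE-TECHNIQUE-2"` entry yields a summary with one validated technique and one gap, which is the figure a governance reviewer would look for.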