
AIPSA · AI Product Security

Acme Assistant Platform · Lite · High

Managed

Sample report

Acme Corp: B2B SaaS with RAG, agents, and enterprise customers. Scores, evidence gaps, and the roadmap are computed from realistic mock responses.

This is a sample AIPSA Scorecard, not an assessment of a real company. Scores are maturity signals based on submitted evidence and reviewed answers. They do not certify that a product or organization is secure.

AIPSA · AI Product Security Assessment · v2026.1.0

Acme Assistant Platform AIPSA Scorecard

Acme Corp · Security maturity assessment and evidence planning

Overall Maturity

Managed

3.0 / 5

Executive Interpretation

AI product security is managed across important domains, but high-risk gaps remain in incident response, prompt-injection resilience, agentic permissions, and RAG authorization. The priority is reducing time to evidence, closing repeatable controls, and improving verification. The score reflects one foundational cap rule that limited the overall maturity claim.

Target Next: Measured · 3.75–4.49

Foundational cap applied

The incident response cap limited overall maturity to 3.0.

Capped from 3.25 to 3.00: without AI incident and abuse operations, the program cannot be considered managed even if design-time controls exist.
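
The cap described above can be sketched in a few lines. This is a minimal illustration, not AIPSA's actual scoring code: the `CapRule` type, the `apply_caps` function, and the trigger threshold of 2.0 are assumptions made for the example. Only the incident response domain, its 1.4 score, and the 3.25 to 3.00 outcome come from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapRule:
    """Hypothetical cap rule: if a domain scores below `trigger_below`,
    the overall score may not exceed `cap`."""
    domain: str
    trigger_below: float  # assumed threshold, not stated in the report
    cap: float


def apply_caps(raw_overall: float, domain_scores: dict[str, float],
               rules: list[CapRule]) -> tuple[float, list[CapRule]]:
    """Return the capped overall score and the rules that actually lowered it."""
    capped = raw_overall
    applied = []
    for rule in rules:
        score = domain_scores.get(rule.domain)
        if score is not None and score < rule.trigger_below and capped > rule.cap:
            capped = rule.cap
            applied.append(rule)
    return capped, applied


rules = [CapRule(domain="Incident Response", trigger_below=2.0, cap=3.0)]
overall, applied = apply_caps(3.25, {"Incident Response": 1.4}, rules)
print(overall, [r.domain for r in applied])  # 3.0 ['Incident Response']
```

Returning the list of applied rules alongside the score is what lets a report say which cap bound the result, rather than only showing the lower number.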

Maturity scale (0–5): Ad Hoc · Repeatable · Managed · Measured · Adaptive

Domain Maturity

Domain scores combine answers, evidence confidence, and cap rules.

[Radar chart: domain maturity on a 1–5 scale for Inventory, Threat Modeling, Prompt Injection, Agentic Permissions, RAG Authorization, Supply Chain, Evaluation, Logging, Incident Response, Governance, SDLC, and Customer Trust. Legend: Unseen, Ad Hoc, Repeatable, Managed, Measured, Adaptive.]

Domain Scores

Each domain is scored 0–5 using weighted question answers. Domains are sorted weakest first.
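
As a rough sketch of what "weighted question answers" and "sorted weakest first" could mean in practice, the example below computes a weighted mean per domain and sorts ascending. The function name, the answers, and the weights are all hypothetical; the report does not publish the real questions, weights, or any evidence-confidence adjustment.

```python
def domain_score(answers: list[tuple[float, float]]) -> float:
    """Weighted mean of (answer_score, weight) pairs, each score on a 0-5 scale."""
    total_weight = sum(weight for _, weight in answers)
    if total_weight == 0:
        return 0.0
    return sum(score * weight for score, weight in answers) / total_weight


# Hypothetical answers as (score 0-5, weight). Not taken from the report.
responses = {
    "Logging": [(4.0, 2.0), (4.0, 1.0)],
    "Incident Response": [(1.0, 2.0), (2.0, 1.0)],
    "Prompt Injection": [(3.0, 1.0), (2.0, 1.0)],
}

scores = {name: round(domain_score(ans), 2) for name, ans in responses.items()}
weakest_first = sorted(scores.items(), key=lambda item: item[1])
for name, score in weakest_first:
    print(f"{name}: {score} / 5")
```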

AI Incident Response and Abuse Operations

Incident Response

Ad Hoc
1.4 / 5 · Low · High

Prompt Injection and Context Manipulation

Prompt Injection

Repeatable
2.6 / 5 · Low · Medium

Agentic Permissions and Tool Safety

Agentic Permissions

Repeatable
2.6 / 5 · Low · Medium

RAG, Data Access, and Authorization

RAG Authorization

Repeatable
2.7 / 5 · Low · Medium

Model, Dataset, and AI Supply Chain Security

Supply Chain

Managed
3.3 / 5 · Low · Medium

Governance, Policy, and Risk Acceptance

Governance

Managed
3.4 / 5 · Low · Medium

AI Product Threat Modeling

Threat Modeling

Managed
3.7 / 5 · Medium · Low

Evaluation, Testing, and Red Teaming

Evaluation

Managed
3.7 / 5 · Medium · Low

AI Inventory and System Boundaries

Inventory

Measured
4.0 / 5 · Medium · Low

Logging, Telemetry, and Forensics

Logging

Measured
4.0 / 5 · Medium · Low

Secure SDLC and Developer Enablement

SDLC

Measured
4.0 / 5 · Medium · Low

Customer Trust, Evidence, and Sales Enablement

Customer Trust

Measured
4.0 / 5 · Medium · Low

Top Gaps

Highest-risk blockers to the next maturity level.

1

AI Incident Response and Abuse Operations

Incident Response

Critical

AI incidents include prompt injection, data exposure, unauthorized tool actions, provider failures, model abuse, retrieval poisoning, and harmful automation. Traditional IR needs AI-specific playbooks.

Score: 1.4
2

Prompt Injection and Context Manipulation

Prompt Injection

High

Prompt injection is not a clever demo trick. It is the input-validation and confused-deputy problem of LLM application security.

Score: 2.6
3

Agentic Permissions and Tool Safety

Agentic Permissions

High

An agent with tools is a product actor. Its permissions, identity, approvals, blast radius, and audit trail must be engineered like any other privileged service path.

Score: 2.6
4

RAG, Data Access, and Authorization

RAG Authorization

High

RAG turns search, embeddings, chunks, documents, and access control into one product-security boundary. Leakage often happens through retrieval, not generation.

Score: 2.7
5

Model, Dataset, and AI Supply Chain Security

Supply Chain

Medium

AI products inherit risk from providers, models, data pipelines, prompts, plugins, dependencies, and evaluation artifacts. Supply chain visibility is product risk visibility.

Score: 3.3

30 / 60 / 90 Day Remediation Roadmap

Prioritized actions from your triggered findings, grouped by recommended time horizon.

Turn gaps into owners, evidence, and repeatable operating cadence.

30 Days

Foundation

Establish repeatable rails for critical AI systems.

1. 30 days

Create AI-specific incident runbooks

Incident Response Lead with Product Security

Create runbooks for prompt injection, data exposure, malicious tool action, retrieval poisoning, provider outage, model abuse, and harmful automation.

medium · high

60 Days

Scaling

Harden testing, logging, authorization, and incident handling.

1. 60 days

Implement tool-call mediation for prompt-influenced actions

AI Platform Engineering

Wrap sensitive tool calls with validation, policy checks, approval requirements, allowlists, deny rules, simulation, and audit logging.

large · high
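
One way to picture tool-call mediation is a single gate that every prompt-influenced tool call passes through before it runs. The sketch below is illustrative only: `ToolPolicy`, `mediate`, the tool names, and the example deny rule are hypothetical, and it covers only the allowlist, deny-rule, approval, and audit-logging parts of the roadmap item (not argument validation or simulation).

```python
import json
import logging
from dataclasses import dataclass, field
from typing import Callable

logger = logging.getLogger("tool_audit")


@dataclass
class ToolPolicy:
    """Hypothetical policy object; field names are assumptions."""
    allowed_tools: set[str]
    needs_approval: set[str] = field(default_factory=set)
    deny_rules: list[Callable[[str, dict], bool]] = field(default_factory=list)


def mediate(tool: str, args: dict, policy: ToolPolicy, approved: bool = False) -> str:
    """Return 'allow', 'deny', or 'needs_approval' and write an audit record."""
    if tool not in policy.allowed_tools:
        decision = "deny"  # default-deny anything off the allowlist
    elif any(rule(tool, args) for rule in policy.deny_rules):
        decision = "deny"
    elif tool in policy.needs_approval and not approved:
        decision = "needs_approval"
    else:
        decision = "allow"
    logger.info(json.dumps({"tool": tool, "args": args, "decision": decision},
                           default=str))
    return decision


policy = ToolPolicy(
    allowed_tools={"search_docs", "send_email"},
    needs_approval={"send_email"},
    # Example deny rule: block email to recipients outside an assumed domain.
    deny_rules=[lambda tool, args: tool == "send_email"
                and not str(args.get("to", "")).endswith("@example.com")],
)

print(mediate("search_docs", {"q": "refund policy"}, policy))
print(mediate("delete_user", {"id": 7}, policy))
print(mediate("send_email", {"to": "a@example.com"}, policy))
print(mediate("send_email", {"to": "a@example.com"}, policy, approved=True))
```

Every decision is logged, including denials, so the audit trail records what the agent attempted and not only what it was allowed to do.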

2. 60 days

Add approval safeguards for high-impact agent actions

Product Engineering

Require preview, approval, simulation, staged execution, or explicit confirmation for high-impact, irreversible, external, financial, administrative, or customer-visible actions.

large · high

3. 60 days

Add AI containment and kill switches

Product Engineering with SRE

Add mechanisms to disable or constrain risky models, tools, agents, retrieval sources, prompts, providers, and AI features during incidents.

large · high
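
A kill switch can be as simple as a registry that every model, tool, or retrieval call consults before doing any work. The following is a minimal in-process sketch; `KillSwitchRegistry`, `call_tool`, and the component kinds are hypothetical names, and a real deployment would likely back the registry with a shared configuration or feature-flag store so that a change reaches every instance.

```python
import threading


class KillSwitchRegistry:
    """In-memory registry of disabled components, keyed by (kind, name)."""

    def __init__(self) -> None:
        self._disabled: dict[tuple[str, str], str] = {}
        self._lock = threading.Lock()

    def disable(self, kind: str, name: str, reason: str) -> None:
        with self._lock:
            self._disabled[(kind, name)] = reason

    def enable(self, kind: str, name: str) -> None:
        with self._lock:
            self._disabled.pop((kind, name), None)

    def check(self, kind: str, name: str) -> None:
        """Raise RuntimeError if the component is currently disabled."""
        with self._lock:
            reason = self._disabled.get((kind, name))
        if reason is not None:
            raise RuntimeError(f"{kind} '{name}' is disabled: {reason}")


switches = KillSwitchRegistry()


def call_tool(name: str, payload: dict) -> dict:
    switches.check("tool", name)  # gate before any side effect happens
    return {"tool": name, "status": "executed", "payload": payload}


# During an incident, an operator disables one tool; others keep working.
switches.disable("tool", "send_email", reason="suspected prompt injection")
try:
    call_tool("send_email", {"to": "user@example.com"})
except RuntimeError as exc:
    print(exc)
print(call_tool("search_docs", {"q": "status"})["status"])
```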

4. 60 days

Add RAG provenance, deletion, and retention controls

RAG Platform Engineering

Ensure chunks preserve source, owner, timestamp, ACL, deletion, retention, freshness, and policy state through ingestion, indexing, retrieval, and citation.

medium · high
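
The idea of chunks carrying provenance and lifecycle state through to retrieval can be sketched as a metadata record plus a filter applied before any text reaches the model. The `Chunk` fields and `authorized_chunks` function below are assumptions chosen to mirror the attributes listed above (source, owner, timestamp, ACL, deletion, retention); they do not describe a specific vector store's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Chunk:
    text: str
    source_uri: str                  # provenance: where the chunk came from
    owner: str
    ingested_at: datetime
    allowed_groups: frozenset[str]   # ACL copied from the source document
    deleted: bool = False            # set when the source is deleted
    expires_at: Optional[datetime] = None  # retention limit, if any


def authorized_chunks(candidates: list[Chunk], user_groups: set[str],
                      now: datetime) -> list[Chunk]:
    """Drop chunks the user may not see or whose lifecycle state forbids use."""
    return [
        c for c in candidates
        if not c.deleted
        and (c.expires_at is None or c.expires_at > now)
        and c.allowed_groups & user_groups
    ]


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
chunks = [
    Chunk("Q3 pricing", "https://docs.example.com/pricing", "sales", now,
          frozenset({"sales"})),
    Chunk("HR policy", "https://docs.example.com/hr", "hr", now,
          frozenset({"hr"})),
    Chunk("Old pricing", "https://docs.example.com/old", "sales", now,
          frozenset({"sales"}), deleted=True),
    Chunk("Expired promo", "https://docs.example.com/promo", "sales", now,
          frozenset({"sales"}), expires_at=now - timedelta(days=1)),
]
visible = authorized_chunks(chunks, {"sales"}, now)
print([c.text for c in visible])  # ['Q3 pricing']
```

Filtering at retrieval time, rather than asking the model to withhold content, keeps the authorization decision outside anything a prompt can influence.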

90 Days

Optimization

Turn the program into a measurable operating model.

1. 90 days

Close the AI incident and findings learning loop

Product Security

Ensure incidents, near misses, red-team findings, abuse reports, and eval failures update controls, tests, runbooks, launch gates, and training.

small · medium

Evidence Gaps

Evidence items that are missing or insufficient based on low-scoring questions. Collect these to improve your maturity score.

What is missing, who owns it, and how long it should take to prove.

Do AI-specific incident runbooks exist?

Incident Response

High

Effort

2–4 weeks

Owner

Security Ops

Do AI incidents improve the program?

Incident Response

High

Effort

1–2 weeks

Owner

Security Ops

Are prompt-influenced tool calls mediated?

Prompt Injection

Medium

Effort

1–3 months

Owner

AppSec

Do high-impact agent actions require safeguards?

Agentic Permissions

Medium

Effort

1–3 months

Owner

AI Platform

Does retrieved context preserve provenance and lifecycle state?

RAG Authorization

Medium

Effort

2–4 weeks

Owner

Product Security

Can risky AI behavior be contained quickly?

Incident Response

Medium

Effort

1–3 months

Owner

Security Ops

Framework Crosswalk

Framework crosswalks are evidence-support tools. They do not replace formal audit or certification.

NIST

NIST AI RMF

Govern: 3.8

Strong evidence. Avg 3.81/5.

Map: 3.2

Partial evidence. Avg 3.19/5.

Measure: 3.4

Partial evidence. Avg 3.44/5.

Manage: 2.7

Partial evidence. Avg 2.67/5.

OWASP

OWASP AIMA

Strategy and Governance: 4.0

Strong evidence. Avg 4.00/5.

Design and Risk Assessment: 3.7

Partial evidence. Avg 3.69/5.

Implementation, Validation, Monitoring, and Assurance: 3.7

Partial evidence. Avg 3.69/5.

Implementation and Operations: 4.0

Strong evidence. Avg 4.00/5.

CSA

CSA AI Security Scorecard

AI Governance and Risk Management: 4.0

Strong evidence. Avg 4.00/5.

AI Risk Management: 3.7

Partial evidence. Avg 3.69/5.

AI Application Security and Deployment Governance: 2.6

Partial evidence. Avg 2.64/5.

Third-Party and Provider Risk: 3.3

Partial evidence. Avg 3.27/5.

Monitoring, Incident Response, and Assurance: 4.0

Strong evidence. Avg 4.00/5.

AI Incident Response and Abuse Operations: 1.4

Weak evidence. Avg 1.38/5.

Governance, Risk, and Compliance: 3.4

Partial evidence. Avg 3.42/5.

Compliance, Assurance, and Customer Trust: 4.0

Strong evidence. Avg 4.00/5.

ISO

ISO/IEC 42001

Context, planning, operation, and AI system inventory evidence: 4.0

Strong evidence. Avg 4.00/5.

Operational controls and data management evidence: 2.7

Partial evidence. Avg 2.69/5.

Supplier, resource, and operational control evidence: 3.3

Partial evidence. Avg 3.27/5.

Performance evaluation and improvement: 4.0

Strong evidence. Avg 4.00/5.

Operation, performance evaluation, and improvement: 1.4

Weak evidence. Avg 1.38/5.

Leadership, planning, operation, performance evaluation, and improvement: 3.4

Partial evidence. Avg 3.42/5.

Support, operation, and competence: 4.0

Strong evidence. Avg 4.00/5.

Documented information, performance evaluation, customer assurance, and improvement: 4.0

Strong evidence. Avg 4.00/5.

Legend: Aligned · Partially Aligned · Weak · Not Started
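
The crosswalk entries above pair an average score with an evidence label. A plausible way to produce such entries is to average the AIPSA domain scores mapped to each framework area and then bucket the result. In the sketch below, the mapping and both cutoffs are assumptions: 3.75 and 2.0 were chosen only because they are consistent with the sample values shown (3.81 strong, 3.69 and 2.64 partial, 1.38 weak). The report does not state the real thresholds or mappings.

```python
from statistics import mean

# Assumed cutoffs, consistent with the sample values but not confirmed.
STRONG_AT = 3.75
PARTIAL_AT = 2.0


def evidence_label(avg: float) -> str:
    """Bucket an average score into an evidence label."""
    if avg >= STRONG_AT:
        return "Strong evidence"
    if avg >= PARTIAL_AT:
        return "Partial evidence"
    return "Weak evidence"


def crosswalk_entry(area: str, domains: list[str],
                    domain_scores: dict[str, float]) -> str:
    """Average the mapped domain scores and format one crosswalk line."""
    avg = mean(domain_scores[d] for d in domains)
    return f"{area}: {evidence_label(avg)}. Avg {avg:.2f}/5."


# Domain scores from the scorecard; the area name and mapping are hypothetical.
domain_scores = {"Incident Response": 1.4, "Supply Chain": 3.3, "Logging": 4.0}
print(crosswalk_entry("Example area", ["Incident Response", "Supply Chain"],
                      domain_scores))
```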

Badge Eligibility

Current level: Managed. Public badges require scope, date, evidence, and caveats.

Eligible marks

  • AIPSA Program: Managed
  • AIPSA Evidence Pack, if scoped assessment is completed

Not eligible yet

  • AIPSA Verified Controls until retest evidence is complete

How scoring works

AIPSA combines domain responses, evidence confidence, and cap rules. High scores require not only stated controls, but evidence that controls are repeatable, monitored, and improving.

Export

Evidence pack outputs and shareable artifacts for briefing, customer review, and internal remediation tracking.