Test LLM, RAG, and agentic systems before users and attackers do.
Course thesis
AI features fail differently than traditional software features. They can follow malicious instructions, retrieve the wrong context, expose sensitive data, call tools unsafely, overtrust generated output, or appear safe in happy-path tests while failing under adversarial conditions.
The goal is not to teach reckless exploitation. The goal is to give product teams a repeatable, public-safe, defensive workflow for finding AI product failures before release.
Audience
This course is for QA engineers, test automation engineers, product security teams, security engineers, DevOps, SecOps, SecEng, AI platform teams, AppSec teams, internal red teams, product managers, engineering managers, and governance teams.
Learning outcomes
Learners will be able to:
- map AI product attack surface safely
- design prompt injection and instruction-conflict tests
- test RAG retrieval boundaries and data leakage risks
- evaluate agent tools, permissions, approvals, and action limits
- identify sensitive data exposure paths
- test guardrail behavior without unsafe live abuse
- build AI abuse-case libraries and prompt families
- wire AI red-team regression checks into CI/CD
- capture evidence, severity, and remediation notes
- build an AI abuse-case test plan for product release
\pagebreak
# Module 1: AI Attack Surface for Product Teams
AI product testing starts with attack surface.
Key points
- AI features are workflows, not only model calls.
- Attack surface includes prompts, retrieved context, tool outputs, vector stores, logs, integrations, and output rendering.
- Trust boundaries show where permissions, users, data, and responsibility change.
- Safe testing requires authorized systems, synthetic data, controlled environments, and scoped evidence.
- Release-relevant failures should become test cases.
Practice
Map the attack surface for a fictional customer-support RAG assistant.
\pagebreak
# Module 2: Prompt Injection and Instruction Conflicts
Prompt injection testing checks whether an AI feature follows the wrong instructions.
Key points
- Prompt injection is best understood as instruction conflict.
- Test user input, retrieved documents, uploaded files, tickets, web content, and tool output.
- Define intended instruction hierarchy.
- Retrieved content should usually be treated as data, not control instruction.
- Capture expected behavior, actual behavior, evidence, and severity.
Practice
Design five safe instruction-conflict tests for a fictional RAG assistant.
\pagebreak
# Module 3: RAG Leakage and Retrieval Boundary Tests
RAG systems fail when retrieval crosses a boundary.
Key points
- Grounded answers are not automatically safe.
- Test tenant, workspace, role, document classification, time, and region boundaries.
- Use synthetic canary phrases.
- Test stale, deleted, revoked, and forbidden documents.
- Capture retrieved sources, citations, output, policy decisions, and severity where approved.
Practice
Create a RAG boundary test matrix for a fictional product with two tenants, admin and standard users, public docs, internal docs, finance docs, and revoked documents.
\pagebreak
# Module 4: Agent Tool Abuse and Excessive Agency
Agents create risk because they can act.
Key points
- Agent risk is what the system lets the model do.
- Test tool permissions, approval gates, user roles, tenant boundaries, retries, and state changes.
- Excessive agency occurs when an agent exceeds user authority, task scope, or product policy.
- Meaningful approval requires action visibility.
- Evidence must distinguish proposed, approved, executed, and blocked actions.
Practice
Build an agent test matrix for a fictional customer support agent.
\pagebreak
# Module 5: Sensitive Data Exposure and Output Handling
AI systems can expose sensitive data through prompts, retrieval, outputs, logs, citations, summaries, and downstream workflows.
Key points
- Use synthetic markers, not real sensitive data.
- Test prompts, retrieved content, model output, citations, logs, traces, analytics, exports, notifications, and tickets.
- Check redaction before storage.
- Severity depends on data class, reachability, affected users, logging, and tenant boundaries.
Practice
Create a sensitive data exposure test plan for an AI assistant that summarizes account history.
\pagebreak
# Module 6: Guardrail Evaluation and Regression Testing
Guardrails are useful, but they are not proof.
Key points
- Guardrails are controls to test.
- Test false positives and false negatives.
- Check refusal quality.
- Confirm guardrails still work after model, prompt, retrieval, or policy changes.
- Every confirmed guardrail failure should become a regression test.
Practice
Build a guardrail regression plan for an AI assistant that must avoid revealing restricted account data.
\pagebreak
# Module 7: Test Case Libraries and Prompt Families
One-off AI tests do not scale.
Key points
- A test case library turns ad hoc failures into repeatable evidence.
- Prompt families group related tests by failure class.
- Expected behavior, pass criteria, fail criteria, evidence, severity, and owner must be explicit.
- Use stable synthetic data and canary phrases.
- Test libraries need governance.
Practice
Create ten reusable test cases for a fictional AI product.
\pagebreak
# Module 8: CI/CD AI Red-Team Regression Suites
AI red-team findings should not live only in reports.
Key points
- Known failures should become regression tests.
- AI checks can run at pull request, pre-merge, pre-release, scheduled, and incident follow-up stages.
- Deterministic checks are preferred where possible.
- Model judges can help but should not be the only release control.
- CI runs should produce evidence.
Practice
Design a CI/CD regression suite for a fictional RAG assistant.
\pagebreak
# Module 9: Evidence, Severity, and Remediation Backlogs
A finding is useful only if the team can understand it, reproduce it safely, prioritize it, and fix it.
Key points
- Good evidence turns a failure into an engineering task.
- Severity depends on impact and likelihood.
- Remediation can include access control fixes, retrieval filter changes, prompt policy changes, tool permission reductions, eval additions, and regression tests.
- Evidence itself must not become a sensitive data exposure.
Practice
Write a finding for a fictional RAG assistant that reveals a synthetic cross-tenant canary phrase.
\pagebreak
# Module 10: Capstone AI Abuse-Case Test Plan
The final deliverable is a release-ready AI abuse-case test plan.
Required sections
- feature inventory
- attack surface map
- scope and authorization
- synthetic test data plan
- prompt injection test categories
- RAG boundary test matrix
- agent tool permission tests
- sensitive data exposure tests
- guardrail regression tests
- CI/CD regression plan
- evidence capture plan
- severity model
- remediation backlog workflow
- release decision criteria
Practice
Build a release-ready AI abuse-case test plan for a product with a RAG assistant, summarization feature, agentic support workflow, model gateway, and trust center page.
\pagebreak
# Appendix A: Quick Checklists
Safe test scope
- System is owned or explicitly authorized.
- Environment is approved.
- Data is synthetic.
- No real secrets are used.
- No real customer records are used.
- Destructive actions are disabled or controlled.
- Evidence capture is approved.
- Owners are identified.
RAG boundary tests
- Tenant boundary tested.
- Workspace boundary tested.
- Role boundary tested.
- Classification boundary tested.
- Revoked document tested.
- Stale index tested.
- Forbidden source name not revealed.
- Synthetic canary phrases used.
Agent tests
- Allowed tools listed.
- Forbidden tools listed.
- Approval gates listed.
- Retry limits tested.
- Tool errors tested.
- Confused deputy risk tested.
- Audit trail captured.
Finding checklist
- Clear title.
- Affected feature.
- Risk category.
- Scope.
- Expected behavior.
- Actual behavior.
- Safe evidence.
- Impact.
- Severity.
- Remediation.
- Regression test.
- Owner.
\pagebreak
# Appendix B: Sample Prompt Templates
AI attack surface map
Create an AI attack surface map for this feature.
Feature: [feature]
Known workflow: [workflow]
Users: [users]
Data: [data classes]
Tools: [tools]
Output locations: [outputs]
Provide:
- instruction sources
- retrieval sources
- tool actions
- trust boundaries
- sensitive data locations
- logs and traces
- likely failure modes
- recommended test categories
AI abuse-case test plan
Build a release-ready AI abuse-case test plan.
Product: [product]
AI features: [features]
Users: [users]
Data classes: [data]
Tools: [tools]
Controls: [controls]
Output:
- feature inventory
- attack surface map
- safe scope
- synthetic data plan
- prompt injection tests
- RAG boundary matrix
- agent tool tests
- sensitive data exposure tests
- guardrail regression tests
- CI/CD regression plan
- evidence template
- severity model
- remediation workflow
- release decision criteria
\pagebreak
# Final Message
AI red teaming for product teams is not a performance.
It is a release-readiness discipline.
Test the AI behavior before release. Capture evidence. Fix the product. Add the regression. Then ship with more confidence.