Talent and role-design crisis
The Frankenstein Role
AI Security Engineer role language often bundles five historically separate capability families into one requisition.
Role breadth signal
Flagship thesis
aisecurity.llc
Core findings from job-description analysis, practitioner surveys, research signals, open-source activity, public adversary vocabulary, and industry coverage.
About this report
Job descriptions
Dataset publishing soon
Public-safe counts will appear after the corpus is released.
Survey respondents
386+
4 survey instruments — CISOs, hiring managers, practitioners, and adjacent engineers
Research papers
2,730
Top bucket: Prompt & Gen Security (1,233 papers).
Press items analyzed
613K+
6% classify into AI security themes. Top classified theme: AI Model Research (34K+ items).
AI-relevant CVEs
1,458
Top category: AI/ML Framework Vuln (378). 3 are on the CISA Known Exploited Vulnerabilities (KEV) list.
Open source repos
2,500
Top builder repositories: vellum-ai/vellum-assistant, nousresearch/hermes-agent, manaflow-ai/cmux, badmadec0/html2md. Research-to-builder ratio: 1.
ATLAS techniques
170
16 tactics, 35 mitigations, and 57 case studies in the curated bundle.
Knowledge pages tracked
3
Codification lag measures time from practice emergence to institutional vocabulary. Currently concentrated in: Prompt Injection Jailbreaks.
Frameworks mapped
8
42 directional crosswalk mappings across ATLAS, NIST AI RMF, and OWASP LLM. 3 machine-readable, 5 document-only.
Each layer uses the same AI security taxonomy. Convergence raises confidence. Divergence is the finding. The main finding is structural, not technical: Companies write team-shaped roles but budget for individuals. Governance language outruns engineering proof. Titles are broader than the work. The compliance reflex is 108:1. The tool gap is 30:1. Agentic roles are 0.24% of postings and rising.
Overview
Public-safe counts will appear after the corpus is released.
Company-level counts will appear with the released benchmark.
Current-period counts are being prepared for release.
Historical comparison counts are still being validated.
Average score estimating cross-discipline role bloat.
Average score estimating AI label without AI-specific security evidence.
Average score estimating unrealistic breadth of role requirements.
Map
Outline
Section 1
Frame the market tension: demand for AI security capability is rising faster than role architecture, operating models, and skills validation.
Section 2
Define the three structural gaps:
- title inflation versus capability specificity
- governance language versus engineering evidence
- executive urgency versus implementation readiness
Section 3
- The Frankenstein Role
- Skill Washing
- The Unicorn Index
- The Probability Pivot
- The Evidence Gap
- Agentic Anarchy
- The vCISO Vacuum
- Boardroom-to-Backlog Gap
- Skills Validation Gap
- Model Supply Chain Blind Spot
- Entry-Level Extinction
- The Red Team Misnomer
- The Compliance Reflex
- The Tool Incumbency Trap
- The Agentic Surface Emergence
Section 4
Ten vertical profiles with lead hypotheses, dominant signal patterns, blind spots, operating implications, hiring implications, and board-level communication angles.
Section 5
Nine practical AI security archetypes with mission boundaries, anti-patterns, and first 90-day outputs.
Section 6
Audience-specific decisions for CISOs, security leaders, hiring managers, recruiters, and practitioners.
Section 7
Explain claim-readiness labels, mandatory caveats, sponsor-independence boundaries, and what not to claim publicly. Based on analyzed job-description signals, not proof of any individual company's internal security maturity.
Findings
Talent and role-design crisis
AI Security Engineer role language often bundles five historically separate capability families into one requisition.
Role breadth signal
Flagship thesis
Title/substance mismatch
AI-labeled security titles often outpace AI-specific control, testing, and evidence language.
Specificity gap
Cross-vertical signal
Team-shaped requirements
The market prices one role while frequently describing team-level capability breadth.
Requirement compression
Compensation pressure
Systems reasoning shift
AI security language reflects a shift toward probabilistic systems reasoning and ambiguity tolerance.
Mindset signal
Role-language shift
Governance-to-execution gap
Governance language often appears before engineering evidence language such as eval outputs, telemetry, and remediation proof.
Evidence readiness
Execution hinge
Delegated action risk
Agent security is delegated action security; chatbot framing is insufficient.
Control architecture gap
Action authorization
Mid-market exposure gap
Some organizations are too small to hire the unicorn but too exposed to defer AI security.
Coverage gap
Internal staffing mismatch
Execution translation failure
Executive AI risk narratives often fail to translate into named controls, owners, and evidence artifacts.
Translation maturity
Operating model signal
Assessment maturity lag
The market asks for AI security engineering skills before it has standardized practical evaluation pathways.
Assessment gap
Hiring quality risk
Lifecycle control deficit
Model artifacts, provenance, and deployment gates remain under-specified in many role definitions.
Lifecycle coverage
Control omission risk
Talent supply crisis
AI Security Engineering is being invented at the top of the org chart. The market is hiring senior-only into an unproven discipline, with almost no junior pathways.
Pipeline risk
Future capacity threat
Role language confusion
"AI red team" is often used as a catch-all for governance reviews, product assessments, platform controls, and abuse testing rather than adversarial AI evaluation.
Label precision
Matching quality risk
Legacy framework dominance
Legacy compliance frameworks dominate AI-security hiring language by roughly 108:1 versus AI-native governance frameworks.
Legacy vs AI-governance ratio
108:1
Incumbent tooling lock-in
Compliance automation incumbents appear in hiring language far more than AI-native security testing and evaluation tooling.
Incumbent vs AI-native tools
30:1
Early but accelerating risk surface
Prompt-injection, function-calling, and tool-calling security signals are still under 0.3% of all postings, but rising quickly.
Agentic-surface concentration
0.24%
Academic vs market misalignment
What researchers study and what the market hires for are systematically misaligned. arXiv concentrates 45% of AI security papers in prompt/generation security and 11% in agentic action, yet hiring language concentrates in governance, compliance, and red-teaming. Emerging research terms (jailbreak, autonomous agent, tool call, guardrail) all appear with zero prior-period count. When hiring demand catches up to this research energy, there will be a rapid reskilling event.
Research concentration vs hiring signal
45% research → prompt/gen; market → governance
Codification lag crisis
3 Wikimedia pages exist for a field with 2,730 arXiv papers and 2,500 GitHub repos. There is no canonical public reference for 'AI security engineering' as a practice. The codification lag — time from practice emergence to public knowledge entry — is extreme and unlike any established security domain. You cannot onboard juniors into a discipline that has no Wikipedia. This is the structural explanation for why Entry-Level Extinction and Skills Validation Gap exist: there is no codified curriculum to build on.
Codification index
3 Wikimedia pages vs 2,730 arXiv papers
Open-source tooling gap
GHArchive tracking shows 99.4% of 2,500 tracked repos are unclassified — not AI-security-specific. Job descriptions demand 'AI-native security tooling,' but the open-source ecosystem barely exists. The Tool Incumbency Trap (30:1 legacy vs AI-native) isn't just preference or inertia: the alternative tools haven't been built yet. Practitioners are being hired to implement controls that don't have reference implementations. Incumbents stay dominant not by lock-in, but because the vacuum is real.
Classified AI security repos
0.6% of tracked repos
Media narrative vs operational reality
613K+ media items analyzed. AI model research captures 5.65% of all media volume — 34,686 items. Agent security: 762 items (0.12%). AI red teaming: 34 items (0.005%). MCP tool security: 12 items. Supply chain: 25 items. The ratio of capability coverage to security coverage in industry media is approximately 45:1. This is a leading indicator, not a steady state. When a high-profile AI security incident becomes a mainstream story, that attention will correct rapidly — and the talent market, which follows media narrative, will not be ready.
Capability vs security media ratio
45:1
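The media shares and the 45:1 ratio above can be re-derived from the item counts quoted in the finding. The sketch below is illustrative only and rests on two assumptions that the report does not state: the corpus total is back-derived from the 5.65% model-research share (the page only gives "613K+"), and the ratio compares model-research items with agent-security items. The variable and function names are hypothetical.

```python
# Counts quoted in the finding above.
MODEL_RESEARCH_ITEMS = 34_686
MODEL_RESEARCH_SHARE = 0.0565  # 5.65% of all media volume

SECURITY_THEME_ITEMS = {
    "agent_security": 762,
    "ai_red_teaming": 34,
    "supply_chain": 25,
    "mcp_tool_security": 12,
}

# Assumption: the corpus size is only published as "613K+", so the total is
# back-derived from the model-research count and its stated share.
estimated_total = MODEL_RESEARCH_ITEMS / MODEL_RESEARCH_SHARE


def share_pct(count: int, total: float) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


shares = {
    theme: share_pct(count, estimated_total)
    for theme, count in SECURITY_THEME_ITEMS.items()
}

# Assumption: the capability-to-security ratio compares model-research items
# with agent-security items; the report does not name the denominator.
capability_to_security = (
    MODEL_RESEARCH_ITEMS / SECURITY_THEME_ITEMS["agent_security"]
)

if __name__ == "__main__":
    print(f"estimated total items: {estimated_total:,.0f}")
    for theme, pct in shares.items():
        print(f"{theme}: {pct:.3f}%")
    print(f"capability vs security: {capability_to_security:.1f}:1")
```

Under these assumptions the derived total lands inside the "613K+" range and the ratio falls between 45:1 and 46:1, which is consistent with the rounded figure in the finding. A different choice of denominator (for example, all security themes combined) would give a different ratio.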
Governance-implementation structural failure
8 AI-native security frameworks tracked. 5 are document-only. Only 3 are machine-readable. There are 42 heuristic crosswalk rows across MITRE ATLAS, NIST AI RMF, and OWASP LLM Top 10, and none are natively integrated into CI/CD pipelines, security tooling, or automated evidence collection. The Compliance Reflex (108:1 legacy vs AI-native) is not stubbornness: it's structural. When most AI-native frameworks exist only as documents with no automation integration and no audit-trail format, practitioners implement what they can actually implement. The frameworks meant to displace legacy standards are still too immature to do so.
Machine-readable AI security frameworks
3 of 8 — rest are document-only
Detection and monitoring deficit
AI logging scores 1.4/5 in practitioner surveys, the single lowest-rated control category. arXiv puts only 2.45% of papers in detection and runtime monitoring, the second-lowest research bucket. In media, AI cyber defense barely registers. This is a three-signal convergence: AI monitoring and detection is simultaneously the lowest-maturity control, one of the least-researched academic areas, and one of the least-covered media topics. You cannot respond to incidents you cannot detect. The industry is building attack surface faster than it is building monitoring infrastructure.
AI logging control maturity (survey)
1.4/5 — lowest rated
Untapped talent supply
Adjacent engineers — platform, DevOps, ML engineers without security background — represent the most realistic near-term supply of AI security talent. But hiring filters are calibrated to 'security professionals who've added AI,' not 'AI professionals who've added security.' Survey data shows adjacent engineers have moderate confidence in AI security but face specific, navigable barriers: vocabulary gaps and credential expectations rather than capability gaps. The market is filtering out its most viable near-term talent supply.
Adjacent talent supply
Most viable near-term pipeline
Active exploitation, not theoretical risk
3 AI-relevant CVEs have reached CISA Known Exploited Vulnerabilities (KEV) status, meaning exploitation in the wild has been confirmed. The market debates AI security as a future risk while defenders are already remediating KEV-listed AI/ML vulnerabilities. The top vulnerability bucket is AI/ML framework and library vulnerabilities (378 of 1,458 CVEs). The tools practitioners are hired to use are the attack surface. AI security investment is not a strategic hedge; it addresses immediate operational exposure.
CISA Known Exploited AI CVEs
3 KEV entries (active exploitation confirmed)
Research lead vs hiring lag
Privacy-preserving ML and differential privacy are the top research terms in arXiv's AI security corpus — 67 and 55 papers respectively, both surging in the last 12 months. Yet privacy appears in hiring language primarily as a compliance checkbox bundled with GDPR and data protection, not as an engineering capability. There is a 5+ year research lead in privacy-preserving AI techniques that the hiring market has not operationalized. Organizations that hire specifically for privacy-preserving ML engineering skills have first-mover advantage.
Top arXiv AI security research term
#1: privacy-preserving (67 papers, surging)
Next step
Use the report as a diagnostic. Then move into the offer that matches the pressure you have.
Evidence: Academic vs Market
arXiv concentrates 45% of AI security research in prompt and generation security, plus 11% in agentic action security. Market hiring concentrates in governance, compliance, and red-team language. Emerging terms — jailbreak, autonomous agent, tool call, guardrail, evaluation framework — all have zero prior-period count. When research turns into hiring demand, reskilling will be fast.
Supports: Research-to-Hiring Chasm · Agentic Surface Emergence · Probability Pivot · Privacy Asymmetry
External Signals
Directional research momentum from seeded arXiv metadata slices.
Classification-bucket composition over time (% of annual seeded pull).
Scatter of matched-term counts in prior period vs last 12 months.
Number of unique GitHub actors contributing per classification bucket.
Evidence: Detection & Monitoring
AI logging scores 1.4/5 in practitioner surveys, the lowest-rated control. arXiv puts only 2.45% of papers in detection and runtime monitoring. Media barely covers AI cyber defense. Three independent signals point to the same gap: monitoring and detection are under-resourced.
Supports: Telemetry Blind Spot · Agentic Anarchy · Model Supply Chain Blind Spot
External Signals
Monthly volume of AI-relevant vulnerability disclosures (NVD, GHSA, OSV).
Distribution of AI-relevant vulnerabilities across taxonomy buckets.
Classified news items by AI security taxonomy bucket. Unclassified items excluded.
Evidence: Talent & Knowledge Supply
GHArchive: 99.4% of 2,500 tracked repos are unclassified, so the tooling layer is thin. Wikimedia: 3 concept pages against 2,730 arXiv papers, so the field vocabulary is not codified. Survey: adjacent engineers are the best near-term supply, but hiring filters screen them out.
Supports: Builder Vacuum · Knowledge Desert · Adjacent Reservoir · Entry-Level Extinction · Skills Validation Gap
External Signals
First-seen repository signal: builder ecosystem growth rate.
Dominant GitHub event types in the scoped stream.
Directional knowledge codification momentum by monthly page creation.
Annual codification composition across AI security taxonomy buckets.
Evidence: Active Risk
3 AI-relevant CVEs have reached CISA KEV status, meaning exploitation in the wild has been confirmed. The top vulnerability bucket is AI/ML framework and library flaws (378 of 1,458 CVEs), so daily tools are the attack surface. Media covers AI capabilities vs AI security at about 45:1. The correction, when it comes, will be sudden.
Supports: The Exploited Present · Attention Deficit · Model Supply Chain Blind Spot · Agentic Surface Emergence
External Signals
Monthly volume of AI-relevant vulnerability disclosures (NVD, GHSA, OSV).
Distribution of AI-relevant vulnerabilities across taxonomy buckets.
Aggregated CWE groups for AI-relevant software flaws.
Aggregated news and blog volume matching AI security taxonomy. Excludes partial month.
Evidence: ATLAS
ATLAS is the vocabulary layer for adversary tactics, techniques, mitigations, and case studies. It does not prove private maturity. It does anchor the report in a public threat model. The explorer exposes the matrix directly so practitioners can inspect technique maturity and mapped evidence.
Supports: Adversary Vocabulary · Technique Maturity · Public Case Studies · Control Mapping
MITRE ATLAS Explorer
Technique, sub-technique, and case-study coverage grouped by tactic.
Current maturity labels in the upstream ATLAS technique bundle.
How the upstream case studies split between incidents and exercises.
Evidence: Standards & Governance
8 AI-native security frameworks are tracked. 5 are document-only. Only 3 are machine-readable. 42 heuristic crosswalk rows exist, and none natively integrate with CI/CD or automated evidence collection. The Compliance Reflex (108:1) is not stubbornness. Practitioners implement what they can ship. AI-native governance frameworks must become machine-readable before they can replace legacy tooling.
Supports: Framework Paradox · Compliance Reflex · Evidence Gap · Privacy Asymmetry
External Signals
Days since latest detected source update or release date. Missing dates are shown as 999 days and should be treated as metadata gaps.
Flattened density summary of directional heuristic mappings between public frameworks.
Directional count of crosswalk mappings grouped into AI security domains.
Counts of machine-readable assets and document/page references by framework.
Distribution of mapping rows by mapping type classification.
Per-framework fetch status across clone/download/metadata operations.
Asset coverage by framework and family for machine-readable and document references.
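The freshness caption in this group notes that missing dates are plotted as 999 days and should be read as metadata gaps. A minimal sketch of that convention follows, assuming the gap is tracked from the missing date itself rather than from the plotted value, so that a source that is genuinely 999 days old is not mistaken for a gap. The framework names, dates, and function names are hypothetical placeholders, not the report's data or pipeline.

```python
from datetime import date
from statistics import mean
from typing import Optional

MISSING_DATE_SENTINEL = 999  # plotted value for sources with no known date


def days_since_update(last_update: Optional[date], today: date) -> int:
    """Days since the last detected update, or the sentinel if unknown."""
    if last_update is None:
        return MISSING_DATE_SENTINEL
    return (today - last_update).days


def summarize_freshness(sources: dict, today: date) -> dict:
    """Split sources into dated entries and metadata gaps.

    Gaps are detected from the missing date, not from the sentinel value,
    and are excluded from the average so they do not skew it.
    """
    plotted = {name: days_since_update(d, today) for name, d in sources.items()}
    gaps = sorted(name for name, d in sources.items() if d is None)
    dated = [days for name, days in plotted.items() if name not in gaps]
    return {
        "plotted_days": plotted,
        "metadata_gaps": gaps,
        "mean_days_dated_only": mean(dated) if dated else None,
    }


# Hypothetical inputs for illustration only.
example_sources = {
    "framework_a": date(2025, 12, 2),
    "framework_b": date(2025, 10, 3),
    "framework_c": None,  # no update or release date detected
}
example_summary = summarize_freshness(example_sources, today=date(2026, 1, 1))
```

Keeping the sentinel only in the plotted values, and reporting gaps separately, avoids averaging 999 into real staleness figures.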
Cross-Signal Convergence
Cross-signal convergence scoring matches top terms from arXiv research, industry media, GitHub builder activity, and vulnerability disclosures against the same AI security taxonomy. Topics that appear across multiple layers carry higher claim confidence than single-source signals. Composite scores are weighted: arXiv 30%, Media 25%, GitHub 25%, Vulnerabilities 20%.
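The weighting described above can be sketched as a plain weighted average. This is an illustration under stated assumptions, not the report's scoring code: it assumes each layer's strength is already normalized to the 0 to 1 range, and that a layer with no matched terms counts as 0 instead of being dropped. The function names and the example topic values are hypothetical.

```python
# Layer weights as stated in the scoring description.
WEIGHTS = {"arxiv": 0.30, "media": 0.25, "github": 0.25, "vulnerabilities": 0.20}


def composite_score(signals: dict) -> float:
    """Weighted average of per-layer signal strengths, each in [0, 1]."""
    for layer, value in signals.items():
        if layer not in WEIGHTS:
            raise KeyError(f"unknown signal layer: {layer}")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{layer} strength must be in [0, 1], got {value}")
    # Assumption: a layer with no matched terms contributes 0 rather than
    # being dropped with the remaining weights renormalized.
    weighted_sum = sum(w * signals.get(layer, 0.0) for layer, w in WEIGHTS.items())
    return weighted_sum / sum(WEIGHTS.values())


def layers_present(signals: dict, threshold: float = 0.0) -> int:
    """Count layers whose strength exceeds the threshold."""
    return sum(1 for value in signals.values() if value > threshold)


# Hypothetical topic; the strengths are invented for illustration only.
example_topic = {"arxiv": 0.8, "media": 0.2, "github": 0.4, "vulnerabilities": 0.6}
example_score = composite_score(example_topic)
example_layers = layers_present(example_topic)
```

With this choice, a topic seen only in arXiv can never score above 0.30, which matches the stated intent that multi-layer topics carry more confidence than single-source signals. Renormalizing over present layers would remove that property.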
Cross-Signal Analysis
Topics where multiple independent signal layers agree. Composite score = weighted average.
Top Cross-Signal Convergence Topics
No data available for this signal.
Cross-Signal Analysis
Per-topic signal strength across all four independent signal layers (0–1 normalized).
Signal Convergence Matrix
No data available for this signal.
Consistency
Configured findings
24
Findings index entries
15
Claim readiness rows
4
Missing claim rows
20
| Findings Index Entry | Claim readiness | Chart IDs |
|---|---|---|
| The Frankenstein Role [@gartner-ai-security-2026, @forrester-ai-risk-2026] | public_claim_with_caveat | chart_frankenstein_role_distribution, chart_frankenstein_role_by_industry, chart_role_archetype_by_industry |
| Skill Washing [@mckinsey-ai-state-2026, @idc-ai-future-2026] | public_claim_with_caveat | chart_skill_washing_scatter, chart_ai_specificity_by_role_family, chart_industry_specificity_score |
| The Unicorn Index [@isc2-workforce-2025, @stanford-hai-ai-index-2025] | public_claim_with_caveat | chart_unicorn_index_by_seniority, chart_role_breadth_by_role_family, chart_company_size_industry_heatmap |
| The Probability Pivot [@stanford-hai-human-centered-2026] | public_claim_with_caveat | chart_psychometric_radar, chart_probability_language_shift, chart_psychometric_fingerprint_by_industry |
| The Evidence Gap [@nist-ai-rmf-1, @nist-sp-800-218] | public_claim_with_caveat | chart_evidence_gap_framework_vs_evidence, chart_governance_evidence_artifacts, chart_governance_vs_engineering_matrix |
| Agentic Anarchy [@mitre-atlas-2026, @microsoft-digital-defense-2025] | public_claim_with_caveat | chart_agentic_control_gap, chart_agent_security_control_language, chart_attack_surface_by_industry |
| The vCISO Vacuum [@cisco-ai-readiness-2026] | internal_or_teaser_only | chart_company_size_ai_security_gap, chart_fractional_support_model_interest |
| Boardroom-to-Backlog Gap [@deloitte-ai-governance-2026, @ey-ai-governance-2026] | public_claim_with_caveat | chart_boardroom_to_backlog_gap, chart_evidence_artifact_frequency |
| Skills Validation Gap [@bugcrowd-ai-security-2026, @hackerone-ai-bounty-2026] | public_claim_with_caveat | chart_skills_validation_gap, chart_training_lab_market_map |
| Model Supply Chain Blind Spot [@snyk-state-oss-2025, @linux-foundation-oss-ai-2026] | public_claim_with_caveat | chart_model_supply_chain_signal_frequency, chart_model_supply_chain_control_gap |
| Entry-Level Extinction [@isc2-workforce-2025] | internal_or_teaser_only | chart_entry_level_extinction, chart_seniority_distribution_ai_vs_traditional |
| The Red Team Misnomer [@mitre-atlas-2026, @openai-red-teaming-report-2026] | public_claim_with_caveat | chart_red_team_misnomer_scatter, chart_ai_red_team_vs_actual_work |
| The Compliance Reflex [@nist-ai-rmf-1, @iso-42001-2023] | public_claim_ready | chart_compliance_reflex_framework_split, chart_framework_adoption_tier_comparison |
| The Tool Incumbency Trap [@google-cloud-ai-security-2026, @f5-labs-ai-application-security-2026] | public_claim_ready | chart_tool_incumbency_hierarchy, chart_ai_native_vs_incumbent_tools |
| The Agentic Surface Emergence [@unit42-genai-2026, @crowdstrike-ai-adversary-2026] | public_claim_with_caveat | chart_agentic_surface_map, chart_agentic_signal_share, chart_industry_specificity_score, chart_governance_vs_engineering_matrix, chart_role_archetype_by_industry, chart_attack_surface_by_industry, chart_psychometric_fingerprint_by_industry, chart_company_size_industry_heatmap, chart_employer_type_split |
Draft content
Executive Summary Draft
The AI Security Engineering market has entered its compression phase. Demand outran role architecture. Governance language outran engineering evidence. Title inflation outran hiring clarity. Organizations are now hiring into the gap they created, and the cost is becoming visible. This report draws from six independent data layers: 352,000+ job descriptions analyzed for skill signals, role-language patterns, and title inflation; 386 practitioner responses across four survey instruments; 2,730 AI security papers cla...
Full Draft Snapshot
**Primary concept:** The Frankenstein Role **Subtitle:** What 300,000+ job descriptions reveal about the impossible job companies are trying to hire for Default caveat: Based on analyzed job-description signals, not proof of any individual company's internal security maturity. AI Security Engineering is being built in public, under hiring pressure, without shared role architecture. The market has moved from experimentation to staffing demand faster than it has moved from language to operating model. The conseq...
Executive sections
**The central finding is architectural, not technical.** Organizations are describing team-shaped roles and budgeting for individual contributors. The Frankenstein Role, a requisition that bundles product security, AI system security, governance evidence, executive risk translation, and adversarial evaluation, has become the market default. It is not a talent shortage problem. It is a role-design failure. **Five external signals agree the discipline is accelerating.** arXiv's top AI security bu...
Each finding is a named market concept with supporting evidence, a claim-readiness classification, and audience-specific action implications:
1. **The Frankenstein Role**: Role language increasingly bundles five historically separate capability families into one requisition. One salary, five professions.
2. **Skill Washing**: AI-labeled titles frequently outpace AI-specific control, testing, and evidence language. Specificity gap is consistent across verticals.
3. **The Unicorn Index**: The...
AI security demand has spread from frontier AI labs to SaaS, finance, healthcare, public sector, cybersecurity vendors, and operational industries. The cross-signal convergence data shows this is not a leading-edge phenomenon anymore: it is a broad-market structural shift. But organizational operating models, role architecture, and hiring assessment pathways have not caught up. Organizations are scaling AI risk programs on unstable hiring assumptions — and the cost of that instability will compo...
**CISOs and Security Leaders**: Your survey peers rate AI logging and monitoring, AI red teaming, and AI incident response as the three weakest controls in their portfolios. These are not aspirational controls — they are foundational to any operational security posture for AI systems. Assign ownership, set evidence requirements, and measure closure, not just completion of policy language. **Hiring Managers**: A role that bundles AI product security, governance evidence, adversarial evaluation,...
All job-description findings reflect analyzed role-language signals, not proof of any individual organization's internal security maturity. Survey findings are aggregate directional signals from self-reported responses and should be treated as directional, not independently audited. Psychometric scores reflect role-language signals, not personality diagnoses. GitHub ecosystem references are activity signals, not product endorsements. Sponsor support does not influence methodology, scoring, findi...
Section draft
**Primary concept:** The Frankenstein Role
**Subtitle:** What 300,000+ job descriptions reveal about the impossible job companies are trying to hire for
Default caveat: Based on analyzed job-description signals, not proof of any individual company's internal security maturity.
AI Security Engineering is being built in public, under hiring pressure, without shared role architecture. The market has moved from experimentation to staffing demand faster than it has moved from language to operating model. The consequence is not subtle: companies are posting role language that implies team-shaped capability while budgeting and interviewing as if they are hiring one specialist contributor.
Claim: AI Security Engineer postings increasingly bundle multiple historically separate capability families into one role.
Why now: AI feature velocity, buyer scrutiny, and compliance pressure converged before organizational design matured.
Evidence basis: corpus-level role-language patterns, cross-validated with chart evidence and claim-readiness controls.
What leaders are misreading: They treat this as a talent shortage alone, when it is also a role-design failure.
Failure mode if ignored: Chronic mis-hiring, low tenure stability, and high re-org churn.
What this changes now:
- CISO: require explicit ownership boundaries before role approval.
- Hiring manager: define primary mission...
Claim: finance AI security hiring behaves like model risk plus product security, not classic AppSec.
Dominant hiring-signal pattern: control frameworks, risk management, validation, auditability, regulatory vocabulary.
Characteristic blind spot: weak specificity around practical AI testing workflows and tooling.
Operating-model implication: combine model risk and product security leadership with explicit engineering evidence ownership.
Hiring implication: prioritize Governance Evidence Lead plus AI Product Security Engineer pairing over generic AppSec lift-and-shift.
Board communication angle: show how governance requirements map to control telemetry and remediation proof.
What this ch...
Core mission: secure AI-enabled product capabilities from design through release and post-release operation.
Boundary conditions: does not own enterprise-wide governance strategy or full model lifecycle governance outside product scope.
Required partner functions: product management, platform engineering, legal/privacy, governance leadership.
Typical anti-patterns: overloaded with policy ownership and customer-assurance narrative without implementation authority.
First 90-day outputs: product threat model set, AI feature control backlog, release-gate checklist, customer assurance evidence pack.

Core mission: integrate AI abuse patterns and controls into secure SDLC practice.
Boundary c...
Common mistakes:
- assuming AI security ownership is implicit
- accepting policy language without evidence pathways
- treating one hire as full operating model
Three decisions in 90 days:
- assign explicit cross-functional ownership
- approve role architecture before requisitions
- require evidence artifacts in quarterly risk reporting
Minimum evidence artifacts:
- ownership matrix
- AI risk-to-backlog register
- control evidence scorecard
Early warning signal: risk narratives repeat while control evidence remains static.
What this changes now: ownership and evidence standards become gating criteria, not optional improvements.

Common mistakes:
- writing Chimera Specs
- mixing incomp...
Use only: public_claim_ready, public_claim_with_caveat, internal_or_teaser_only, do_not_claim. Never claim company-level maturity from job descriptions. Never use psychometric outputs as personality diagnosis. Never present sponsor support as methodology influence.
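The four allowed labels can be enforced with a small check before a finding is published. The sketch below is illustrative, not the report's tooling. The label values come from the rule above; the publication logic (only the two `public_` labels may be published, and `public_claim_with_caveat` needs non-empty caveat text) is an assumption inferred from the label names, and the class and function names are hypothetical.

```python
from enum import Enum


class ClaimReadiness(str, Enum):
    """The four claim-readiness labels permitted by the report."""

    PUBLIC_CLAIM_READY = "public_claim_ready"
    PUBLIC_CLAIM_WITH_CAVEAT = "public_claim_with_caveat"
    INTERNAL_OR_TEASER_ONLY = "internal_or_teaser_only"
    DO_NOT_CLAIM = "do_not_claim"


# Assumption: only these two labels permit public use of a finding.
PUBLIC_LABELS = {
    ClaimReadiness.PUBLIC_CLAIM_READY,
    ClaimReadiness.PUBLIC_CLAIM_WITH_CAVEAT,
}


def parse_label(raw: str) -> ClaimReadiness:
    """Reject any label outside the permitted set."""
    try:
        return ClaimReadiness(raw.strip())
    except ValueError:
        raise ValueError(f"unknown claim-readiness label: {raw!r}") from None


def can_publish(label: str, caveat: str = "") -> bool:
    """Return True if a finding with this label may be used publicly.

    Assumption: a caveated claim is publishable only when caveat text
    is actually attached.
    """
    readiness = parse_label(label)
    if readiness not in PUBLIC_LABELS:
        return False
    if readiness is ClaimReadiness.PUBLIC_CLAIM_WITH_CAVEAT:
        return bool(caveat.strip())
    return True


DEFAULT_CAVEAT = (
    "Based on analyzed job-description signals, not proof of any "
    "individual company's internal security maturity."
)
```

For example, `can_publish("public_claim_with_caveat", DEFAULT_CAVEAT)` passes, while `can_publish("internal_or_teaser_only")` does not, and a misspelled label raises an error instead of silently passing.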
The report must remain commercially useful, technically credible, and careful enough to withstand executive and public scrutiny.
Caveat