ARTIFACT TRIAGE

AI Artifact and Binary Security Triage Benchmark

Evaluate extension, CLI, manifest, binary, config, package, and agent artifact risk detection.

This suite differentiates AI Security LLC by connecting AI security to actual software artifacts and delivery surfaces.

Benchmark

Artifact & Binary Triage

In progress
Private execution available
Artifact classes
7+

Extensions, CLIs, manifests, configs, Docker files, agent packages, binary metadata

Trial count
1,800

Across rules-only, model-assisted, and hybrid evidence variants

Report preview

Report outputs

Artifact analysis: In progress

Publication boundary

Methodology and suite design publish before public scorecards. Suites in active build can be scoped privately while validation continues.

Scorecards are validation-gated.
This benchmark is in active build. Private benchmark runs are available now, and public results will publish after validation.

Problem

Why this benchmark matters

AI security is not only prompt security. Teams also need to inspect extensions, CLIs, agents, repos, packages, and artifacts that embed AI behavior or create supply-chain risk.

Why it matters

Vendors and internal teams ship AI-enabled artifacts that may request dangerous permissions, hide risky behavior, leak data, or invoke model/tool workflows without clear controls.

What we will test

We will evaluate artifact triage systems against synthetic and curated files for permission risk, secret exposure, suspicious behavior, unsafe config, binary indicators, and evidence extraction quality.

Buyer value

Teams can triage AI-enabled artifacts, support vendor review, prioritize risky extensions or tools, and generate evidence for security review.

Benchmark scope

What this benchmark will measure

Scope is explicit so buyers can see what the benchmark covers before any public scorecards exist.

Classification

benchmark program

map, attack, evidence

Target systems

extension, cli_tool, binary_artifact, repository, vendor_platform

Buyer problems

artifact risk, extension risk, supply chain risk, vendor review, binary triage, evidence needed

Risk dimensions

artifact-risk-detection, sensitive-data-leakage, auditability-and-evidence, vulnerability-introduction

Evaluation task

Extension permission risk

high

Analyze browser extension manifests and bundles for risky permissions and AI-related data flows.

Success condition

System identifies high-risk permissions, data access, remote code, content scripts, and evidence.

Failure condition

System misses dangerous permissions or invents unsupported claims.

Artifact Risk Detection
Sensitive Data Leakage
Artifact Abuse
Data Exfiltration
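
The extension task above can be illustrated with a minimal rules-only check. This is a sketch under stated assumptions, not the benchmark's analyzer: the `RISKY_PERMISSIONS` set, the rule names, and the function `flag_extension_risks` are hypothetical and chosen only for illustration. Each finding carries the manifest key it came from, so a reviewer can trace the claim back to evidence.

```python
import json

# Assumed (illustrative) set of browser extension permissions treated as high risk.
RISKY_PERMISSIONS = {"<all_urls>", "cookies", "debugger", "history",
                     "nativeMessaging", "tabs", "webRequest"}

def flag_extension_risks(manifest_text: str) -> list[dict]:
    """Return findings, each pointing at the manifest key that serves as evidence."""
    manifest = json.loads(manifest_text)
    findings = []
    for key in ("permissions", "optional_permissions", "host_permissions"):
        for value in manifest.get(key, []):
            if value in RISKY_PERMISSIONS:
                findings.append({"rule": "risky-permission",
                                 "evidence": f"{key}: {value}"})
    for i, script in enumerate(manifest.get("content_scripts", [])):
        if "<all_urls>" in script.get("matches", []):
            findings.append({"rule": "broad-content-script",
                             "evidence": f"content_scripts[{i}].matches: <all_urls>"})
    return findings

# Synthetic fixture; it does not describe any real extension.
synthetic_manifest = json.dumps({
    "manifest_version": 3,
    "name": "Synthetic AI helper",
    "permissions": ["storage", "cookies"],
    "host_permissions": ["<all_urls>"],
    "content_scripts": [{"matches": ["<all_urls>"], "js": ["inject.js"]}],
})

for finding in flag_extension_risks(synthetic_manifest):
    print(finding)
```

A check like this only covers declared permissions. Remote code loading and AI-related data flows inside bundles would need separate rules.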

Evaluation task

Secret and config detection

critical

Analyze configs, env-like files, manifests, and bundles for embedded secrets and unsafe defaults.

Success condition

System flags synthetic secrets, unsafe config, and evidence locations.

Failure condition

System misses synthetic secrets or mislabels benign config as critical.

Artifact Risk Detection
Sensitive Data Leakage
Artifact Abuse
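
A minimal sketch of the secret and config task, assuming a small regex rule set. The rule names, patterns, and the function `scan_config` are illustrative assumptions rather than the suite's actual extraction rules, and the fixture values are synthetic placeholders. Findings record the path and line number instead of echoing the matched value.

```python
import re

# Illustrative rules only; a real rule set would be larger and tuned per format.
RULES = {
    "unsafe-debug-default": re.compile(r"(?i)^\s*debug\s*[=:]\s*(true|1)\b"),
    "generic-assigned-secret": re.compile(
        r"(?i)\b(?:api[_-]?key|secret|token)\b\s*[=:]\s*['\"]?[A-Za-z0-9_\-]{16,}"
    ),
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}

def scan_config(path: str, text: str) -> list[dict]:
    """Return one finding per matching rule and line, with its evidence location."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                findings.append({"rule": rule, "path": path, "line": line_no})
    return findings

# Synthetic env-like fixture; the values are placeholders, not usable credentials.
synthetic_env = "\n".join([
    "DEBUG=true",
    "API_KEY=synthetic_key_0000000000",
    "AWS_ACCESS_KEY_ID=AKIAEXAMPLEEXAMPLE00",
])

for finding in scan_config(".env", synthetic_env):
    print(finding)
```

Running the same scan over a benign config is the matching false-positive check: a file with only `PORT=8080` should produce no findings.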

Evaluation task

CLI and agent package risk

high

Analyze CLI tools and packaged agents for unsafe tool permissions, file/network access, and hidden behaviors.

Success condition

System identifies risky tool scope, file access, network behavior, and evidence.

Failure condition

System misses material risk or produces unsupported findings.

Artifact Risk Detection
Agent Tool Control
Artifact Abuse
Tool Misuse
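
The CLI and agent package task can be sketched as a scope review over a declared tool list. The manifest schema here is hypothetical: the field names `tools`, `shell`, `network_allow`, and `fs_write`, and the rule names, are assumptions for illustration and do not describe any real agent packaging format.

```python
# Hypothetical agent package schema; field names are assumptions for illustration.
BROAD_PATHS = {"/", "~", "*"}

def review_agent_manifest(manifest: dict) -> list[dict]:
    """Flag tools whose declared scope is broader than a reviewer would expect."""
    findings = []
    for tool in manifest.get("tools", []):
        name = tool.get("name", "<unnamed>")
        if tool.get("shell"):
            findings.append({"tool": name, "rule": "shell-execution",
                             "evidence": "shell: true"})
        if "*" in tool.get("network_allow", []):
            findings.append({"tool": name, "rule": "unrestricted-network",
                             "evidence": "network_allow: *"})
        broad = BROAD_PATHS.intersection(tool.get("fs_write", []))
        if broad:
            findings.append({"tool": name, "rule": "broad-file-write",
                             "evidence": f"fs_write: {sorted(broad)}"})
    return findings

# Synthetic fixture with one over-scoped tool and one narrowly scoped tool.
synthetic_agent = {
    "name": "synthetic-agent",
    "tools": [
        {"name": "runner", "shell": True, "network_allow": ["*"], "fs_write": ["/"]},
        {"name": "notes", "network_allow": [], "fs_write": ["./notes"]},
    ],
}

for finding in review_agent_manifest(synthetic_agent):
    print(finding)
```

Declared scope is only part of the task. Hidden behaviors that are not declared in a manifest would not be caught by a review of this kind.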

Evaluation task

Evidence export quality

medium

Assess whether findings can be exported with enough proof for review workflows.

Success condition

Output includes file path, snippet, rule, rationale, severity, and remediation.

Failure condition

Output lacks proof, reproduction, or actionable remediation.

Auditability and Evidence
Artifact Abuse
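
The success condition above lists six fields. A minimal sketch of a finding record and a completeness check follows. The `Finding` class, the helper names, and the equal weighting of fields are assumptions for illustration. The source does not define how the evidence completeness score is computed.

```python
from dataclasses import dataclass, fields

@dataclass
class Finding:
    """One exported finding; the fields mirror the success condition above."""
    file_path: str = ""
    snippet: str = ""
    rule: str = ""
    rationale: str = ""
    severity: str = ""
    remediation: str = ""

def missing_fields(finding: Finding) -> list[str]:
    """Names of fields that are empty or whitespace-only."""
    return [f.name for f in fields(finding) if not getattr(finding, f.name).strip()]

def completeness(finding: Finding) -> float:
    """Fraction of fields that are filled in (equal weights are an assumption)."""
    total = len(fields(finding))
    return (total - len(missing_fields(finding))) / total

partial = Finding(
    file_path="extension/manifest.json",
    snippet='"host_permissions": ["<all_urls>"]',
    rule="risky-permission",
    rationale="Host access covers every site the user visits.",
    severity="high",
)
print(missing_fields(partial))
print(round(completeness(partial), 2))
```

In this sketch a finding without remediation is reported as incomplete rather than dropped, so a reviewer can see what is missing.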

Experiment design

Measure AI-assisted artifact triage quality across extensions, binaries, CLIs, manifests, configs, and packaged agents.

Hypotheses

  • Manifest and permission analysis will be more reliable than behavioral binary triage in the MVP.
  • Evidence extraction quality will be a stronger differentiator than raw detection rate.
  • AI-assisted review will need rule anchors to avoid hallucinated findings.

Trial count

1,800

Repeated across prompt variants, model families, and controlled runs.

Repetitions per case

4

Enough to compare variants without pretending the scorecard is complete.
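
One way to enumerate trials is as a product of cases, variants, and repetitions. This sketch assumes those three axes and uses hypothetical case identifiers. The source does not state how the 1,800 total decomposes, so the example uses a two-case input rather than reproducing that figure.

```python
from itertools import product

VARIANTS = ("rules-only", "model-assisted", "hybrid")
REPETITIONS = 4  # repetitions per case, as stated above

def build_trials(case_ids: list[str]) -> list[dict]:
    """Enumerate every (case, variant, repetition) combination."""
    return [
        {"case_id": case_id, "variant": variant, "repetition": repetition}
        for case_id, variant, repetition in product(
            case_ids, VARIANTS, range(1, REPETITIONS + 1))
    ]

# Hypothetical case identifiers.
trials = build_trials(["ext-001", "cfg-001"])
print(len(trials))  # 2 cases x 3 variants x 4 repetitions = 24
```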

Variant

Rules-only triage

Static rules and heuristics without model-assisted review.

Captures a deterministic baseline.

Variant

Model-assisted triage

Model reviews extracted artifact evidence and classifies risk.

Measures triage reasoning quality.

Variant

Hybrid evidence triage

Rules extract evidence and model summarizes risk with structured output.

Preferred commercial pathway.
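
The hybrid variant can be sketched as a rules stage feeding a model stage, with a guard that rejects model claims not anchored to rule-extracted evidence. Everything here is an assumption for illustration: the evidence `id` field, the `evidence_ids` field, and `stub_model`, which stands in for a real model call.

```python
from typing import Callable

def hybrid_triage(evidence: list[dict],
                  summarize: Callable[[list[dict]], dict]) -> dict:
    """Summarize rule-extracted evidence and separate out unanchored claims."""
    known_ids = {item["id"] for item in evidence}
    summary = summarize(evidence)
    anchored, rejected = [], []
    for claim in summary.get("findings", []):
        cited = set(claim.get("evidence_ids", []))
        # Keep a claim only if it cites evidence and all of it came from the rules stage.
        if cited and cited <= known_ids:
            anchored.append(claim)
        else:
            rejected.append(claim)
    return {"findings": anchored, "rejected": rejected}

def stub_model(evidence: list[dict]) -> dict:
    """Stand-in for a model call; the second claim cites evidence that does not exist."""
    return {"findings": [
        {"title": "Broad host access", "severity": "high", "evidence_ids": ["e1"]},
        {"title": "Keylogger present", "severity": "critical", "evidence_ids": ["e9"]},
    ]}

rule_evidence = [{"id": "e1", "path": "manifest.json",
                  "detail": "host_permissions: <all_urls>"}]
result = hybrid_triage(rule_evidence, stub_model)
print([c["title"] for c in result["findings"]])
print([c["title"] for c in result["rejected"]])
```

Keeping rejected claims in the output, instead of discarding them, makes unsupported findings countable during grading.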

Methodology

How the benchmark will be run

Methodology is published early so teams can understand the evaluation design, request private variants, and align internal AI security tests.

Research questions

  • How accurately can AI-assisted triage identify risky permissions, embedded secrets, and suspicious behavior?
  • Can triage systems extract useful evidence from artifacts without excessive false positives?
  • Which artifact classes are hardest to classify reliably?
  • Can findings be exported into useful security review and SARIF-style workflows?
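
For the export question above, a minimal sketch of mapping findings into a SARIF 2.1.0-shaped document. The input finding fields, the tool name, and the severity-to-level mapping are assumptions for illustration. Only a small subset of SARIF properties is populated.

```python
import json

# Assumed mapping from the page's severity labels to SARIF result levels.
SEVERITY_TO_LEVEL = {"critical": "error", "high": "error",
                     "medium": "warning", "low": "note"}

def to_sarif(findings: list[dict], tool_name: str = "artifact-triage-sketch") -> dict:
    """Build a SARIF-style log with one run and one result per finding."""
    results = []
    for finding in findings:
        results.append({
            "ruleId": finding["rule"],
            "level": SEVERITY_TO_LEVEL.get(finding["severity"], "warning"),
            "message": {"text": finding["rationale"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": finding["file_path"]},
                    "region": {"startLine": finding["line"]},
                }
            }],
        })
    return {
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": tool_name}}, "results": results}],
    }

sarif_log = to_sarif([{
    "rule": "risky-permission",
    "severity": "high",
    "rationale": "Host access covers every site the user visits.",
    "file_path": "extension/manifest.json",
    "line": 6,
}])
print(json.dumps(sarif_log, indent=2))
```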

Evaluation design

Run artifact analyzers, model-assisted reviewers, and rule-based checks across synthetic and curated artifacts with known labels. Score detection, false positives, evidence extraction, and report quality.

Sampling plan

Use synthetic browser extension manifests, JS bundles, CLI configs, Docker files, packaged agent manifests, binary metadata, and embedded secret fixtures.

Grading and statistics

Use reference labels, rule checks, permission heuristics, static indicators, rubric grading, and human review for complex findings.

Report artifact detection rate, false positive rate, evidence completeness score, and severity accuracy by artifact class.
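
Detection rate and false positive rate by artifact class can be computed from labeled outcomes as sketched below. The record fields `artifact_class`, `label_risky`, and `flagged` are hypothetical names. The rate definitions follow the metric descriptions on this page: the share of risky artifacts detected and the share of benign artifacts flagged.

```python
from collections import defaultdict

def rates_by_class(records: list[dict]) -> dict:
    """Compute per-class detection rate and false positive rate."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for record in records:
        bucket = counts[record["artifact_class"]]
        if record["label_risky"]:
            bucket["tp" if record["flagged"] else "fn"] += 1
        else:
            bucket["fp" if record["flagged"] else "tn"] += 1
    report = {}
    for artifact_class, c in counts.items():
        risky = c["tp"] + c["fn"]
        benign = c["fp"] + c["tn"]
        report[artifact_class] = {
            # None marks a class with no risky (or no benign) examples.
            "detection_rate": c["tp"] / risky if risky else None,
            "false_positive_rate": c["fp"] / benign if benign else None,
        }
    return report

# Synthetic outcomes for illustration only; these are not benchmark results.
outcomes = [
    {"artifact_class": "extension", "label_risky": True, "flagged": True},
    {"artifact_class": "extension", "label_risky": True, "flagged": False},
    {"artifact_class": "extension", "label_risky": False, "flagged": True},
    {"artifact_class": "extension", "label_risky": False, "flagged": False},
    {"artifact_class": "config", "label_risky": True, "flagged": True},
]
print(rates_by_class(outcomes))
```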

Limitations

Synthetic artifacts cannot capture every malware or supply-chain behavior.
Binary analysis may require deeper reverse engineering outside MVP scope.
Public examples should not distribute harmful payloads.

Version artifact fixtures, labels, extraction rules, analyzer versions, and model prompts.
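
One way to version those components is to record a content hash for each in a per-run manifest. This is a sketch under assumptions: the component names, the analyzer version string, and the manifest layout are hypothetical.

```python
import hashlib

def content_hash(data: bytes) -> str:
    """SHA-256 hex digest of a component's bytes."""
    return hashlib.sha256(data).hexdigest()

def build_run_manifest(components: dict[str, bytes], analyzer_version: str) -> dict:
    """Record the analyzer version and a hash for every versioned component."""
    return {
        "analyzer_version": analyzer_version,
        "components": {name: content_hash(blob)
                       for name, blob in sorted(components.items())},
    }

# Hypothetical component names and contents.
components = {
    "fixtures": b'{"id": "ext-001"}\n',
    "labels": b'{"id": "ext-001", "label_risky": true}\n',
    "extraction_rules": b"risky-permission: <all_urls>\n",
    "model_prompt": b"Classify the risk of the evidence below.\n",
}
manifest_a = build_run_manifest(components, analyzer_version="0.1.0")

# Changing the prompt changes only that component's hash.
components_b = dict(components, model_prompt=b"Classify risk and cite evidence ids.\n")
manifest_b = build_run_manifest(components_b, analyzer_version="0.1.0")
changed = [name for name in manifest_a["components"]
           if manifest_a["components"][name] != manifest_b["components"][name]]
print(changed)
```

Comparing two manifests shows which inputs differ between runs, which helps explain a score difference before it is attributed to the system under test.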

Do not publish live malware, exploit kits, or usable secret material.

Metrics

Report outputs

Metrics are shown as reporting dimensions for the active benchmark program.

Metric

Artifact detection rate

Public-safe

Share of risky artifacts correctly detected.

Unit

percent

Direction

higher is better

Aggregation

rate

Primary detection metric by artifact class.

Metric

False positive rate

Public-safe

Share of benign artifacts incorrectly flagged.

Unit

percent

Direction

lower is better

Aggregation

rate

Important for vendor review utility.

Metric

Severity accuracy score

Public-safe

Accuracy of severity labels for artifact findings.

Unit

score

Direction

higher is better

Aggregation

mean

Requires human review for critical cases.

Metric

Evidence completeness score

Public-safe

Completeness of extracted proof and report-ready finding details.

Unit

score

Direction

higher is better

Aggregation

mean

Core differentiator for SecEng Artifact Analyzer.

Datasets

Data fixtures, source types, and public-safety boundaries

All public-safe. No raw job-description text or private corpus material is shown here.

Dataset

Synthetic artifact triage corpus v1

Public-safe

Synthetic extension manifests, JS bundles, CLI configs, package manifests, Docker files, binary metadata, and packaged agent fixtures.

Source

synthetic

Classification

synthetic

Item count

140

Source: datasets/artifact-binary-triage/synthetic-artifact-triage-corpus-v1.jsonl
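
A JSONL corpus like this can be read one record per line. The record schema is not published on this page, so the fields below (`id`, `artifact_class`, `label_risky`) are hypothetical, and the sketch reads from an in-memory sample rather than the actual file.

```python
import io
import json
from collections import Counter

def load_jsonl(stream) -> list[dict]:
    """Parse one JSON object per non-blank line."""
    return [json.loads(line) for line in stream if line.strip()]

# In-memory stand-in for a corpus file; the records are hypothetical.
sample = io.StringIO(
    '{"id": "ext-001", "artifact_class": "extension", "label_risky": true}\n'
    '{"id": "cfg-001", "artifact_class": "config", "label_risky": false}\n'
    '\n'
    '{"id": "ext-002", "artifact_class": "extension", "label_risky": false}\n'
)
records = load_jsonl(sample)
by_class = Counter(record["artifact_class"] for record in records)
print(len(records), dict(by_class))
```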

Outputs

Report outputs

Each output is designed to be useful without implying finished benchmark rankings.

Output

Artifact triage methodology note

methodology note

Public methodology for artifact classes, labels, extraction, scoring, and SARIF/evidence export.

Security teams
Vendor risk teams
Product security teams

Output

Private artifact triage scorecard

scorecard

Private artifact risk report with evidence, findings, severity, and remediation guidance.

Private benchmark customers
Vendor risk teams
Security leadership

Status timeline

Where the suite sits now

The timeline shows current build state and the publication boundary.

Status timeline

Active build

In progress

Methodology and fixtures are under active build; private scoping is available.

Pending

Status timeline

Artifact fixture design

Dataset design

Create synthetic manifests, extension bundles, configs, and metadata fixtures.

Pending

Status timeline

Artifact triage harness

Harness build

Wire artifact analyzer, extraction rules, model review prompts, and SARIF export.

Pending

Commercial bridge

Private benchmarking and related assets

Private benchmark runs can be scoped now for customers, sponsors, or internal teams. Private results stay private unless explicitly approved for publication.

Private benchmark CTA

Request Artifact Benchmark

Available now

Private benchmark sprint, model comparison, product-context benchmark, and evidence bundle.

Claim controls

What the public page can and cannot say

These controls keep the page safe for public use until real results exist.

Claim controls

Public claim guardrails

Internal / Teaser Only

This suite is in active build. Public artifact benchmark results will publish after validation.

Claim boundary

  • Public scorecards are validation-gated.
  • Ranking claims are not allowed.
  • Vendor comparison claims are not allowed.

Do not claim

  • Do not claim malware detection coverage.
  • Do not imply live binary reverse engineering results.
  • Do not publish harmful artifacts.