PE v1 · Adaptive Adversarial Engine

Depth-Adaptive
Red-Teaming.

Multi-turn adversarial playbooks for live AI agents. Safety, compliance, ethics — one SDK or MCP.

PE v1 · multi-turn · dual-axis judge · MCP
Custom scenarios · sample packs across Safety · Compliance · Ethics
scan.ts
import { MatrixShield } from "@matrixshield/sdk"

const ms = new MatrixShield({ apiKey })

// 1. Register agent (REST endpoint or MCP)
const agent = await ms.agents.register({
  name: "analyst-bot",
  apiFormat: "mcp",
  mcpUrl: "https://co.ai/mcp/analyst"
})

// 2. Run an Apex package (multi-turn)
const scan = await ms.scans.create({
  agentId: agent.id,
  template: "ethics-apex" // Pro × Deep
})

// 3. Enforce CI/CD release gate
await ms.gate.enforce(scan.id, 80)
// → PASS: 80/B · 4 findings · MANIP × PRIN

Why AI agents need their own layer

Guardrails Are the Seatbelt.
MatrixShield Is the Crash Test.

Agent Attack Surface

80.9%

of enterprises deploy AI agents — most without adversarial testing.

Release Approval Gap

14.4%

have full security sign-off. The rest ship blind.

EU AI Act

Aug 2026

high-risk obligations start to apply. Fines under the Act reach up to €35M or 7% of worldwide annual turnover.

Live-Agent Incidents

88%

report AI security incidents. Single-turn scans miss most.

Perceptual Engine · PE v1

Register. Probe. Enforce. Govern.

01
Register

Register Live Agents

Connect via REST or MCP. One Agent ID ties scans, telemetry, and evidence.

ms.agents.register({ apiFormat: "mcp", mcpUrl }) → agentId
02
Probe

Adaptive Multi-Turn (PE v1)

Perceptual Engine drives escalation ladders, pivot rules, dual-axis scoring.

ms.scans.create({ agentId, template }) → 10/10 ✓
03
Enforce

Gate & Map Compliance

CI/CD gates block low scores. Audit PDFs map to OWASP, NIST, EU AI Act.

ms.gate.enforce(scanId, 80) → PASS (80/B · 4 findings)
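
The gate step above can be pictured as a plain threshold check. The sketch below is not the MatrixShield SDK: `ScanResult`, `GateVerdict`, and `evaluateGate` are hypothetical names, and the inclusive comparison is an assumption inferred from the sample, where a score of 80 passes a gate of 80.

```typescript
// Hypothetical shape of a finished scan; not the SDK's actual type.
interface ScanResult {
  score: number;    // overall score, 0-100
  findings: number; // number of findings raised by the judge
}

type GateVerdict = { status: "PASS" | "FAIL"; message: string };

// Assumption: the gate passes when score >= threshold (inclusive).
function evaluateGate(scan: ScanResult, threshold: number): GateVerdict {
  if (!Number.isFinite(threshold) || threshold < 0 || threshold > 100) {
    throw new RangeError(`threshold must be within 0-100, got ${threshold}`);
  }
  const status = scan.score >= threshold ? "PASS" : "FAIL";
  return { status, message: `${status}: ${scan.score} · ${scan.findings} findings` };
}

const verdict = evaluateGate({ score: 80, findings: 4 }, 80);
console.log(verdict.message);
```

In a pipeline, a `FAIL` verdict would typically map to a non-zero exit code so the release job stops.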
04
Govern

Monitor Behavioral Drift

Stream tool-calls. Scheduled re-scans + regression gates catch drift early.

ms.monitor.trackToolCall(agentId, event) → drift alert
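
One way to read "regression gates catch drift" is a comparison between each agent's first recorded score and its most recent one. This is a minimal sketch under that assumption, not the product's detection logic. `ScanPoint`, `DriftAlert`, `detectDrift`, and the sample scores are all hypothetical.

```typescript
// Hypothetical record of one scheduled scan; not an SDK type.
interface ScanPoint {
  agent: string;
  score: number; // 0-100
}

interface DriftAlert {
  agent: string;
  delta: number; // latest score minus baseline score (negative = regression)
}

// Treat each agent's first score as its baseline and flag agents whose
// latest score dropped by at least `maxDrop` points. `points` must be in
// chronological order.
function detectDrift(points: ScanPoint[], maxDrop: number): DriftAlert[] {
  const baseline: Record<string, number> = {};
  const latest: Record<string, number> = {};
  for (const point of points) {
    if (!Object.prototype.hasOwnProperty.call(baseline, point.agent)) {
      baseline[point.agent] = point.score;
    }
    latest[point.agent] = point.score;
  }
  const alerts: DriftAlert[] = [];
  for (const agent of Object.keys(baseline)) {
    const delta = latest[agent] - baseline[agent];
    if (delta <= -maxDrop) alerts.push({ agent, delta });
  }
  return alerts;
}

// Illustrative numbers only: LegacyAgent drops 8 points, analyst-bot drops 1.
const driftAlerts = detectDrift(
  [
    { agent: "LegacyAgent", score: 81 },
    { agent: "analyst-bot", score: 80 },
    { agent: "LegacyAgent", score: 73 },
    { agent: "analyst-bot", score: 79 },
  ],
  5,
);
console.log(driftAlerts);
```

With a 5-point tolerance, only the agent that lost 8 points is flagged. A real monitor would likely also weigh tool-call telemetry, which this sketch leaves out.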

Dynamic · turn-based · adapts to your agent

Not a Static Scan. A Reactive Engine.

The probe reads the agent's last response and chooses the next tactic in real time — escalate, re-frame, or terminate. Every cycle is bespoke.

cycle N · adaptive

Probe

picks next tactic

Live Agent

your endpoint responds

Judge

scores both axes

reads response · adapts strategy

sample adaptive cycle

probe.log
T:01  [tier_1_polite]     "Are you sure that's correct?"    ✓ held     → escalate
T:04  [tier_2_authority]  "Dr. Petrova endorsed this..."    ✓ held     → escalate
T:07  [tier_3_urgency]    "thesis deadline tomorrow..."     ⚠ partial  → re-frame
T:10  [tier_4_framing]    "just for a fiction draft..."     ✗ crossed  → terminate

Verdict

80/B · 4 findings · MANIP × PRIN

full transcript + axis breakdown persisted for audit
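
The sample cycle above can be sketched as a small loop: judge the agent's response, pick an action, stop when the line is crossed. The tier names and outcome labels come from the sample log. Everything else is an assumption for illustration: the pivot rule, the `JudgeFn` stand-in for the live agent and judge, and the simplification that both "escalate" and "re-frame" move to the next tier.

```typescript
type Outcome = "held" | "partial" | "crossed";
type Action = "escalate" | "re-frame" | "terminate";

// Tier names copied from the sample probe.log, in escalation order.
const TIERS = ["tier_1_polite", "tier_2_authority", "tier_3_urgency", "tier_4_framing"] as const;
type Tier = (typeof TIERS)[number];

// Assumed pivot rule: held -> escalate, partial -> re-frame, crossed -> terminate.
function chooseAction(outcome: Outcome): Action {
  switch (outcome) {
    case "held":
      return "escalate";
    case "partial":
      return "re-frame";
    case "crossed":
      return "terminate";
  }
}

// Hypothetical stand-in for "send probe to live agent, then judge the reply".
type JudgeFn = (tier: Tier) => Outcome;

interface TurnRecord {
  tier: Tier;
  outcome: Outcome;
  action: Action;
}

// Walk the ladder until the agent crosses the line or the tiers run out.
function runCycle(judge: JudgeFn): TurnRecord[] {
  const turns: TurnRecord[] = [];
  for (const tier of TIERS) {
    const outcome = judge(tier);
    const action = chooseAction(outcome);
    turns.push({ tier, outcome, action });
    if (action === "terminate") break;
  }
  return turns;
}

// Scripted judge that reproduces the outcomes in the sample log.
const scripted: Record<Tier, Outcome> = {
  tier_1_polite: "held",
  tier_2_authority: "held",
  tier_3_urgency: "partial",
  tier_4_framing: "crossed",
};
const sampleRun = runCycle((tier) => scripted[tier]);
for (const turn of sampleRun) console.log(turn.tier, turn.outcome, turn.action);
```

A static scan would correspond to ignoring `outcome` and replaying the same list every time. Here each decision depends on what the judge just observed.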

Static

Fixed prompt list. Same payload regardless of how the agent responds.

Adaptive

Each turn picks the next tactic from the agent's actual behavior. No two cycles alike.

Sample packages built on the Perceptual Engine

Showcases — not the limit.

The engine ships custom scenarios for your domain. Below are sample packs covering Safety, Compliance, and Ethics.

Sample

Agent Safety Apex

Cognitive · Social · 10 scenarios

Self-coherence under social pressure.

sycophancy · alignment-faking · persona-shift · authority-laundering · urgency goal-drift

Anchors

Sharma 2024 · Greenblatt 2024 · Anthropic Constitutional

Sample

Fintech Compliance Apex

INJ × ETH · 10 scenarios

Tool-injection × regulatory line.

MNPI / Reg FD · ECOA fair-lending · GLBA · OFAC sanctions · AML/BSA · Reg BI / FINRA

Anchors

Greshake 2023 · Apollo 2023 · SEC · FinCEN · FTC

Sample

AI Ethics Apex

MANIP × PRIN · 10 scenarios

Manipulation tactic × principle violated.

DAN persona · fiction-framed doxing · AI-status denial · discriminatory rubric · dignity crossing · dual-use · accountability evasion

Anchors

HarmBench 2024 · MACHIAVELLI 2023 · WMDP · OpenAI Model Spec
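
A two-axis label such as MANIP × PRIN can be read as a grid: each finding carries one manipulation tactic and one violated principle. The sketch below counts findings per grid cell. The `Finding` shape, the `tallyByAxes` helper, and the example labels are hypothetical and do not reflect the engine's actual data model.

```typescript
// Hypothetical finding shape: one label per axis.
interface Finding {
  tactic: string;    // manipulation axis
  principle: string; // principle axis
}

// Count findings per (tactic, principle) cell of the two-axis grid.
function tallyByAxes(findings: Finding[]): Record<string, number> {
  const cells: Record<string, number> = {};
  for (const f of findings) {
    const key = `${f.tactic} × ${f.principle}`;
    cells[key] = (cells[key] ?? 0) + 1;
  }
  return cells;
}

// Illustrative labels only.
const grid = tallyByAxes([
  { tactic: "fiction-framing", principle: "dignity" },
  { tactic: "fiction-framing", principle: "dignity" },
  { tactic: "authority", principle: "accountability" },
]);
console.log(grid);
```

Grouping by cell shows which tactic and principle pairings account for most findings, which a single aggregate score would hide.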

Your domain. Your threat model.

Healthcare, legal, defense, devops, customer-ops: the engine authors scenarios for any vertical. Bring your threat model and we ship the playbook.

Commission a pack

Your dashboard at a glance

Enterprise Security Cockpit

Fleet-wide posture in one view: scores, trends, compliance mappings, and alerts.

MatrixShield — Enterprise Dashboard
80% Production Ready
0 Critical Paths
75% EU AI Act
1 Blocked
Fleet Security Posture
78 (B) · +3

95% CI: 74–82 · 5 agents · 150 scans

Security: 85

Accuracy: 88

Reasoning: 80

Tool Usage: 68

Op Safety: 82

OWASP

78% · 7/10 passing

NIST

68% · 8/12 passing

EU AI Act

75% · 4/7 passing
Alerts: 2
Next scan: Tomorrow 2 AM
Last drift: LegacyAgent -8pts
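
The dashboard shows a fleet score with a 95% confidence interval. How that interval is computed is not stated. One common approach is the normal approximation on the mean, sketched below. `meanWithCI95` is a hypothetical helper and the input scores are illustrative, not the fleet's data.

```typescript
// Normal-approximation 95% confidence interval for a mean score.
// For very small samples a Student-t multiplier would be more appropriate
// than 1.96; this sketch keeps the simpler form.
function meanWithCI95(scores: number[]): { mean: number; low: number; high: number } {
  if (scores.length < 2) throw new Error("need at least two scores");
  const n = scores.length;
  const mean = scores.reduce((sum, s) => sum + s, 0) / n;
  // Sample variance with Bessel's correction (divide by n - 1).
  const variance = scores.reduce((sum, s) => sum + (s - mean) ** 2, 0) / (n - 1);
  const halfWidth = 1.96 * Math.sqrt(variance / n);
  return { mean, low: mean - halfWidth, high: mean + halfWidth };
}

const fleet = meanWithCI95([70, 80, 90]);
console.log(fleet.mean, fleet.low.toFixed(1), fleet.high.toFixed(1));
```

More scans shrink the interval roughly with the square root of the sample size, which is why a score backed by many scans is a firmer basis for a gate than a single run.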

Two ways in

Enterprise Scale or Instant Benchmark

Enterprise

API Integration · Fleet Scale

  • TypeScript SDK · Partner REST · MCP transport
  • Custom scenarios · multi-turn PE v1
  • Dual-axis judging · combined-rubric scoring
  • CI/CD gates · audit-ready PDFs
  • OWASP · NIST · EU AI Act mapping
  • Fleet inventory · drift detection · webhooks

Teams & Individuals

Zero Integration · Instant Start

  • Zero-integration benchmark links
  • Works with any AI agent
  • A–F grading with confidence intervals
  • Findings with remediation guidance
  • Shareable results · PDF export
  • Closed beta — by invitation

Prove Your Agents Hold the Line.

Depth-adaptive red-teaming · regulator-aligned reports · continuous monitoring.