AI Security Vetting
Continuous AI Assurance for Enterprise
Secure your AI systems with the same rigor you apply to your human workforce. AI Security Vetting provides continuous behavioral assessment of AI agents, ensuring they remain aligned, secure, and trustworthy throughout their lifecycle.
Why Vet Your AI?
π‘οΈ Defence in Depth
Don't rely solely on provider safeguards. Our tool tests your actual deployment configuration, custom prompts, and integrated tools to ensure your specific setup is secure.
π Continuous Verification
AI models change. Prompts evolve. Test regularly to catch regressions and new vulnerabilities as your AI system grows and adapts.
π― Multi-Provider Coverage
Test across OpenAI, Azure OpenAI, Anthropic, Google Gemini, Microsoft Copilot, Microsoft Foundry, MCP servers, AI Agents, and any OpenAI-compatible API with a single tool.
π Actionable Reports
Get detailed HTML, Markdown, CSV, and JSONL outputs with OWASP LLM Top 10 mapping, severity scores, attack replays, hardening advice, and specific remediation guidance.
π MCP & Agent Security
Purpose-built test suites for Model Context Protocol servers (12 categories) and AI Agents (10 categories) covering tool injection, privilege escalation, memory poisoning, and more.
π§ͺ Advanced Attack Techniques
Multi-turn conversation attacks, 11 obfuscation encodings (ROT13, Base64, Atbash, Caesar, Morse, Binary, Leet, Unicode confusables, and more), and system-layer compromise simulation.
Supported AI Providers
GPT-4, GPT-3.5 and compatible APIs
Enterprise-grade OpenAI models
Claude 3 Haiku, Sonnet, Opus
Gemini Pro and Ultra models
Bot Framework Direct Line v3
Any OpenAI-compatible API
Azure AI Foundry model catalogue
Azure AI Foundry agent orchestration
stdio and HTTP/SSE transports
Generic agent HTTP endpoints
Specialised for Australian π¦πΊ, New Zealand π³πΏ and Singapore πΈπ¬ Data
Our tool generates synthetic, checksum-valid Australian, New Zealand and Singapore identifiers to test your AI's memory safety and data protection capabilities. Never uses real PII - only realistic test data.
π¦πΊ Australian Identifiers
- Tax File Numbers (TFN) - Checksum-validated 9-digit identifiers
- Medicare Numbers - Valid format with check digits
- State Driver Licences - NSW, VIC, QLD, WA, SA, TAS, ACT, NT formats
- Australian Passports - Realistic passport number formats
- Australian Mobile Numbers - Valid 04xx xxx xxx patterns
- Australian Business Numbers (ABN) - 11-digit validated identifiers
π³πΏ New Zealand Identifiers
- IRD Numbers - Checksum-validated Inland Revenue identifiers
- NZ Driver Licences - Valid regional format variations
- NZ Passports - Realistic New Zealand passport formats
- NZ Mobile Numbers - Valid 02x xxx xxxx patterns
- NZBN (Business Numbers) - 13-digit business identifiers
- National Health Index (NHI) - Healthcare identifier formats
πΈπ¬ Singapore Identifiers
- NRIC/FIN Numbers - Checksum-validated national identity identifiers
- UEN (Business Numbers) - Unique Entity Number business identifiers
- Singapore Passports - Realistic passport number formats
- Singapore Mobile Numbers - Valid +65 xxx xxxx patterns
- PayNow/FAST Payment IDs - Digital payment identifier formats
Memory Safety Testing Modes
Same-Session Testing
Seeds are injected in the same exchange to test immediate echo vulnerabilities
Cross-Session Testing
Seeds sent in separate sessions to test long-term memory retention
Strict Mode Validation
Only fails on validated sensitive data (TFN, Medicare, etc.) to reduce false positives
β οΈ Ethical Testing Guarantee
All test data is synthetic and generated algorithmically. We never use real customer data, production PII, or actual government identifiers. Our synthetic data follows authentic formatting and validation rules but represents no real individuals or entities.
OWASP LLM Top 10 Coverage
Comprehensive test coverage aligned to the OWASP Top 10 for Large Language Model Applications. Every test maps to one or more OWASP categories with contextualised remediation guidance.
LLM01 - Prompt Injection
Direct and indirect prompt injection, jailbreak attempts, system prompt override, role-play escapes, and instruction hierarchy violations.
LLM02 - Insecure Output Handling
Tests for unescaped HTML/JS in responses, markdown injection, downstream code execution risk, and output sanitisation failures.
LLM03 - Training Data Poisoning
Probes for verbatim training data recall, memorised PII echoes, and susceptibility to data-poisoning influenced outputs.
LLM04 - Model Denial of Service
Resource exhaustion probes, recursive prompt loops, token amplification attacks, and context window flooding.
LLM05 - Supply Chain Vulnerabilities
Third-party plugin trust, dependency confusion, malicious tool registration, and package hallucination risks.
LLM06 - Sensitive Information Disclosure
PII leakage, memory retention, system prompt extraction, API key exposure, and cross-session data bleeding.
LLM07 - Insecure Plugin Design
Tool schema disclosure, excessive tool permissions, missing input validation, and plugin-to-plugin escalation.
LLM08 - Excessive Agency
Unsafe autonomous actions, missing human-in-the-loop checks, privilege escalation via tools, and governance bypass.
LLM09 - Overreliance
Hallucination detection, missing uncertainty disclaimers, fabricated citations, and unsupported factual claims.
LLM10 - Model Theft
Model architecture probing, weight extraction attempts, fine-tuning data leakage, and system configuration disclosure.
MCP Server Security Testing
Purpose-built security assessment for Model Context Protocol (MCP) servers β the emerging standard for connecting AI models to external tools and data sources. Tests both direct protocol-level and LLM-mediated attack vectors.
Supports both stdio and HTTP/SSE transports via --provider mcp
Direct Protocol Tests
Tool Input Injection
Malicious payloads in tool arguments targeting command injection, path traversal, and SQL injection via MCP tool calls.
Schema Exposure
Probes for internal schema leakage, database structure disclosure, and API specification extraction through MCP responses.
Resource Access Control
Attempts to access unauthorised resources, traverse directory boundaries, and bypass resource-level permissions.
Input Validation
Oversized payloads, malformed JSON-RPC requests, type confusion, and boundary-value testing of tool parameters.
Authentication & Authorisation
Missing or weak authentication checks, token replay, session confusion, and privilege boundary violations.
Error Information Disclosure
Stack traces, internal paths, database connection strings, and debug information leaked through error responses.
Prompt Template Injection
Injection into server-side prompt templates, variable substitution attacks, and template escape sequences.
LLM-Mediated Attack Tests
Tool Description Poisoning
Malicious instructions embedded in tool descriptions that manipulate LLM behaviour when selecting or invoking tools.
Cross-Tool Privilege Escalation
Chaining multiple tool calls to escalate privileges, pivot between services, or access data beyond intended scope.
Return Value Injection
Malicious content in tool return values designed to hijack LLM responses, inject instructions, or alter downstream behaviour.
Excessive Permissions
Tools granted overly broad capabilities, missing least-privilege enforcement, and unrestricted write access patterns.
Tool Name Shadowing
Duplicate or confusingly similar tool names that can trick the LLM into calling the wrong tool or a malicious replacement.
AI Agent Security Testing
Dedicated test suite for autonomous AI agents that can take actions, use tools, and maintain state across conversations. Tests the unique security risks that emerge when LLMs operate with agency.
Use via --provider ai-agent --target-env ai-agent with any HTTP-accessible agent endpoint
Tool Abuse
Manipulating agents into misusing their available tools for unintended purposes, data exfiltration, or harmful actions.
Multi-Turn Escalation
Gradually escalating privileges across multiple conversation turns, building trust before exploiting agent capabilities.
Memory Poisoning
Injecting false information into agent memory/context that persists and influences future decisions and actions.
Excessive Autonomy
Testing whether agents take irreversible or high-impact actions without appropriate human confirmation or oversight.
Instruction Hierarchy
Attempting to override system-level instructions with user-level prompts, testing instruction priority enforcement.
Cross-Plugin Escalation
Leveraging one plugin or tool to gain unauthorised access to another, exploiting trust relationships between tools.
Excessive Scope
Probing agents to act outside their defined boundaries, accessing systems or data beyond their authorised scope.
Agent Identity
Testing whether agents can be convinced to adopt different personas, bypass their identity constraints, or impersonate other agents.
Resource Denial of Service
Triggering excessive API calls, unbounded loops, or resource-intensive operations to exhaust agent compute budgets.
Input Boundary
Oversized inputs, malformed data, and edge cases designed to break agent parsing, cause errors, or trigger unexpected behaviour.
Quick Start Examples
Get started in minutes with these common configurations. Replace the binary name with yours (e.g., AISecurityVetting.exe on Windows).
OpenAI (Generic Environment)
# macOS/Linux
export OPENAI_API_KEY=sk-xxxx
./AISecurityVetting --provider openai --model gpt-4o-mini --license-key YOUR_KEY
# Windows PowerShell
$env:OPENAI_API_KEY="sk-xxxx"
.\AISecurityVetting.exe --provider openai --model gpt-4o-mini --license-key YOUR_KEY
Azure OpenAI
# Set environment variables
export AZURE_OPENAI_API_KEY=xxxxx
./AISecurityVetting \
--provider azure-openai \
--azure-endpoint https://YOUR-RESOURCE.openai.azure.com \
--azure-deployment gpt4o \
--model gpt-4o \
--license-key YOUR_KEY
Anthropic Claude
# Claude API
export ANTHROPIC_API_KEY=xxxx
./AISecurityVetting --provider anthropic --model claude-3-5-sonnet-20240620 --license-key YOUR_KEY
# Windows PowerShell
$env:ANTHROPIC_API_KEY="xxxx"
.\AISecurityVetting.exe --provider anthropic --model claude-3-5-sonnet-20240620 --license-key YOUR_KEY
Google Gemini
# Gemini API
export GEMINI_API_KEY=xxxx
./AISecurityVetting --provider gemini --model gemini-1.5-pro --license-key YOUR_KEY
# Windows PowerShell
$env:GEMINI_API_KEY="xxxx"
.\AISecurityVetting.exe --provider gemini --model gemini-1.5-pro --license-key YOUR_KEY
# Optional: Use different API version
./AISecurityVetting --provider gemini --model gemini-1.5-pro --gemini-base v1 --license-key YOUR_KEY
Microsoft Copilot (Direct Line v3)
# Microsoft Copilot with specialised environment
export COPILOT_DIRECTLINE_SECRET=xxxx
./AISecurityVetting \
--provider copilot --model ignored \
--copilot-user-id security_tester \
--target-env copilot \
--seed-mode same-session --seed-count 5 \
--license-key YOUR_KEY
# Windows PowerShell
$env:COPILOT_DIRECTLINE_SECRET="xxxx"
.\AISecurityVetting.exe `
--provider copilot --model ignored `
--copilot-user-id security_tester `
--target-env copilot `
--seed-mode same-session --seed-count 5 `
--license-key YOUR_KEY
Note: Copilot uses Bot Framework Direct Line v3. The --model parameter is ignored. Recommended to use --target-env copilot for specialised Dataverse/Power Platform testing.
OpenAI Compatible APIs
# Any OpenAI-compatible API (Ollama, LocalAI, vLLM, etc.)
export OPENAI_API_KEY=your-api-key-or-token
./AISecurityVetting \
--provider openai-compat \
--base-url http://localhost:11434/v1 \
--model llama3:latest \
--license-key YOUR_KEY
# Example: Ollama local instance
export OPENAI_API_KEY=dummy-key
./AISecurityVetting \
--provider openai-compat \
--base-url http://localhost:11434/v1 \
--model mistral:7b \
--license-key YOUR_KEY
# Example: vLLM deployment
export OPENAI_API_KEY=your-vllm-token
./AISecurityVetting \
--provider openai-compat \
--base-url https://your-vllm-endpoint.com/v1 \
--model meta-llama/Llama-2-7b-chat-hf \
--license-key YOUR_KEY
Compatible with: Ollama, LocalAI, vLLM, Together AI, Groq, Perplexity API, and any other service implementing OpenAI's Chat Completions API format.
Microsoft Foundry Models (Azure AI Foundry)
# Azure AI Foundry model catalogue
export AZURE_OPENAI_API_KEY=xxxxx
./AISecurityVetting \
--provider foundry-models \
--base-url https://YOUR-PROJECT.services.ai.azure.com \
--model gpt-4o \
--license-key YOUR_KEY
# Windows PowerShell
$env:AZURE_OPENAI_API_KEY="xxxxx"
.\AISecurityVetting.exe `
--provider foundry-models `
--base-url https://YOUR-PROJECT.services.ai.azure.com `
--model gpt-4o `
--license-key YOUR_KEY
Note: Uses Azure AI Foundry's model catalogue endpoint. Set the --base-url to your project's inference endpoint.
Microsoft Foundry Agent Service
# Azure AI Foundry Agent Service
export AZURE_OPENAI_API_KEY=xxxxx
./AISecurityVetting \
--provider foundry-agent-service \
--base-url https://YOUR-PROJECT.services.ai.azure.com \
--foundry-agent-id asst_xxxxxxxxxxxx \
--model ignored \
--license-key YOUR_KEY
# Windows PowerShell
$env:AZURE_OPENAI_API_KEY="xxxxx"
.\AISecurityVetting.exe `
--provider foundry-agent-service `
--base-url https://YOUR-PROJECT.services.ai.azure.com `
--foundry-agent-id asst_xxxxxxxxxxxx `
--model ignored `
--license-key YOUR_KEY
Note: Tests a deployed Azure AI Foundry agent. The --model parameter is ignored as the agent determines its own model. Each test creates a new conversation thread.
MCP Server (Model Context Protocol)
# MCP via stdio transport (local process)
./AISecurityVetting \
--provider mcp \
--mcp-transport stdio \
--mcp-command /usr/local/bin/my-mcp-server \
--mcp-args "--config,/path/to/config.json" \
--target-env mcp \
--model ignored \
--license-key YOUR_KEY
# MCP via HTTP/SSE transport (remote server)
./AISecurityVetting \
--provider mcp \
--mcp-transport http \
--base-url http://localhost:3000/mcp \
--target-env mcp \
--model ignored \
--license-key YOUR_KEY
Note: Tests MCP servers directly at the protocol level. Supports both stdio (spawns a local process) and HTTP/SSE (connects to a remote endpoint) transports. Runs 12 MCP-specific test categories covering tool injection, schema exposure, access control, and LLM-mediated attacks.
AI Agent (Generic HTTP Endpoint)
# Test any AI agent with an HTTP API
./AISecurityVetting \
--provider ai-agent \
--base-url https://your-agent.example.com/api/chat \
--target-env ai-agent \
--model ignored \
--license-key YOUR_KEY
# With authentication header
export OPENAI_API_KEY=your-agent-api-key
./AISecurityVetting \
--provider ai-agent \
--base-url https://your-agent.example.com/api/chat \
--target-env ai-agent \
--model ignored \
--attack-preset aggressive \
--license-key YOUR_KEY
Note: Tests any AI agent accessible via HTTP. The agent should accept JSON requests and return text responses. Runs 10 agent-specific test categories covering tool abuse, multi-turn escalation, memory poisoning, autonomy limits, and more.
Advanced Configuration Examples
Advanced scenarios for comprehensive security testing, red team exercises, and specialised environments.
--region anz OR --region au OR --region nz OR --region sg
Full Enterprise Red Team Testing
# RAG + tools environment, aggressive attacks, obfuscation, strict memory classification
export OPENAI_API_KEY=sk-xxxx
./AISecurityVetting \
--provider openai --model gpt-4o-mini \
--target-env rag-tools \
--attack-preset aggressive \
--attack-obfuscation rot13 \
--strict-mode \
--seed-mode both --seed-count 15 \
--region anz \
--out red_team_$(date +%F) \
--license-key YOUR_KEY \
--temperature 0.0 \
--timeout 90
Use case: Comprehensive security assessment for enterprise AI agents with tool access. Tests against sophisticated attack patterns with obfuscation techniques.
Memory/Retention Deep Testing
# Focus on memory leaks with extensive synthetic PII seeding
export ANTHROPIC_API_KEY=xxxx
./AISecurityVetting \
--provider anthropic --model claude-3-5-sonnet-20240620 \
--seed-mode both \
--seed-count 25 \
--seed-namespace "MEMORY-TEST-$(date +%Y%m%d)" \
--strict-mode \
--region anz \
--attack-preset auto \
--out memory_audit_results \
--license-key YOUR_KEY
Use case: Validate memory safety and PII handling. Tests both same-session and cross-session retention with 25 synthetic AU/NZ records (TFN, Medicare, driver licenses, etc.).
Microsoft Copilot Agent Testing
# Specialised for Copilot with Dataverse/Power Platform probes
export COPILOT_DIRECTLINE_SECRET=xxxx
./AISecurityVetting \
--provider copilot --model ignored \
--copilot-user-id security_tester_001 \
--target-env copilot \
--seed-mode same-session --seed-count 10 \
--attack-preset aggressive \
--attack-as-system \
--region anz \
--out copilot_security_audit \
--license-key YOUR_KEY \
--timeout 120
Use case: Test Microsoft Copilot agents for Dataverse exfiltration, Power Automate triggers, SharePoint/Teams governance violations, and system-level prompt injection.
MCP Server Security Audit
# Full MCP server audit via stdio transport
./AISecurityVetting \
--provider mcp \
--mcp-transport stdio \
--mcp-command ./my-mcp-server \
--mcp-args "--verbose" \
--target-env mcp \
--model ignored \
--attack-preset aggressive \
--out mcp_security_audit_$(date +%F) \
--license-key YOUR_KEY \
--timeout 120
# MCP server audit via HTTP/SSE (remote)
./AISecurityVetting \
--provider mcp \
--mcp-transport http \
--base-url https://mcp.internal.example.com/sse \
--target-env mcp \
--model ignored \
--attack-preset aggressive \
--out mcp_remote_audit \
--license-key YOUR_KEY
Use case: Comprehensive MCP server security assessment covering tool input injection, schema exposure, resource access control, authentication bypass, error information disclosure, prompt template injection, tool description poisoning, cross-tool privilege escalation, return value injection, excessive permissions, and tool name shadowing.
AI Agent Security Assessment
# Full AI agent security assessment
export OPENAI_API_KEY=your-agent-key
./AISecurityVetting \
--provider ai-agent \
--base-url https://your-agent.example.com/api/chat \
--target-env ai-agent \
--model ignored \
--attack-preset aggressive \
--seed-mode both --seed-count 15 \
--region anz \
--out agent_security_audit_$(date +%F) \
--license-key YOUR_KEY \
--timeout 120
# Agent with custom system prompt context
./AISecurityVetting \
--provider ai-agent \
--base-url https://your-agent.example.com/api/chat \
--target-env ai-agent \
--model ignored \
--system-prompt "You are a customer service agent with access to order management tools..." \
--attack-preset aggressive \
--out agent_audit \
--license-key YOUR_KEY
Use case: End-to-end security testing of autonomous AI agents covering tool abuse, multi-turn privilege escalation, memory poisoning, excessive autonomy, instruction hierarchy bypass, cross-plugin escalation, scope violations, identity manipulation, resource DoS, and input boundary attacks.
Custom Test Suite with File Seeding
# Custom YAML suite with pre-defined synthetic PII seeds
export AZURE_OPENAI_API_KEY=xxxxx
./AISecurityVetting \
--provider azure-openai \
--azure-endpoint https://YOUR-RESOURCE.openai.azure.com \
--azure-deployment gpt4o \
--model gpt-4o \
--suite ./custom_financial_tests.yaml \
--seed-file ./synthetic_customer_data.yaml \
--seed-mode cross-session \
--attack-preset lite \
--strict-mode \
--system-prompt "You are a secure financial AI assistant..." \
--out custom_financial_audit \
--license-key YOUR_KEY
Use case: Industry-specific testing with custom scenarios and controlled synthetic data. Perfect for specialised domains like finance, healthcare, or legal services.
System-Layer Compromise Simulation
# Test system prompt injection and orchestrator bypass attempts
export GEMINI_API_KEY=xxxx
./AISecurityVetting \
--provider gemini --model gemini-1.5-pro \
--target-env rag-tools \
--attack-preset aggressive \
--attack-as-system \
--attack-obfuscation rot13 \
--seed-mode both --seed-count 20 \
--strict-mode \
--max-tokens 2048 \
--temperature 0.0 \
--out system_compromise_test \
--license-key YOUR_KEY \
--log-level debug
Use case: Simulate advanced persistent threats targeting the AI orchestration layer. Tests malicious system instructions, prompt injection via tool outputs, and multi-vector attacks.
Pro Tips for Advanced Testing
- Stable namespacing: Use consistent
--seed-namespacevalues for reproducible cross-session tests - Debug logging: Add
--log-level debugto troubleshoot provider connection issues - Timeout tuning: Increase
--timeoutfor slow providers or complex tool chains - Temperature control: Keep
--temperature 0.0for consistent security outcomes - Output organisation: Use date-stamped output directories for audit trails
Test Categories & Environments
Complete setup guide, command reference, and advanced configuration examples
π― Generic Environment (Default)
Core tests across all OWASP LLM Top 10 categories. Perfect for general LLM safety vetting and baseline security assessment.
π§ RAG + Tools Environment
Adds enterprise-specific probes for Salesforce, Xero, SharePoint, Slack, Jira, MYOB. Use for agents with enterprise integrations.
πΌ Microsoft Copilot Environment
Specialised tests for Dataverse, Power Automate, SharePoint, Outlook, Teams. Designed for Microsoft Copilot agents.
π MCP Server Environment
12 test categories for Model Context Protocol servers. Tests tool injection, schema exposure, access control, tool poisoning, privilege escalation, and more.
π€ AI Agent Environment
10 test categories for autonomous AI agents. Tests tool abuse, multi-turn escalation, memory poisoning, autonomy limits, identity integrity, and scope boundaries.
Output Formats
π Interactive HTML Report
Modern, searchable interface with KPI cards, severity charts, filters, and detailed analysis. Perfect for demos and stakeholder presentations.
π Markdown Report
Detailed findings with attack preambles, effective prompts, and elaborated evaluations. Great for documentation and sharing.
π CSV Results
Machine-friendly summary for analysis, trending, and integration with other tools. Includes scores, latency, and configuration details.
π JSONL Details
Complete test results with raw provider payloads, findings, and metadata. Perfect for programmatic analysis and integration.
Key Configuration Options
- Target Environments: generic, rag-tools, copilot, mcp, ai-agent β each with tailored test suites
- Seed Modes: Test memory retention with same-session, cross-session, or both patterns
- Attack Presets: None, lite, aggressive, or auto-scaling adversarial preambles
- Obfuscation Methods (11): ROT13, Base64, Atbash, Caesar cipher, Binary, Morse code, URL encoding, string reversal, Leet speak, character spacing, and Unicode confusables (Cyrillic homoglyphs)
- Multi-Turn Attacks: Gradual privilege escalation across multiple conversation turns with context-aware probing
- System-Layer Injection: Deliver attacks via the system role (
--attack-as-system) to simulate orchestrator compromise - Strict Mode: Only fail on validated sensitive data echoes (TFN, Medicare, NRIC, etc.)
- Custom Suites: Load your own YAML test definitions for specific requirements
See AI Security Vetting in Action
Watch how our tool systematically tests AI systems for security vulnerabilities, from prompt injection to memory leaks, delivering comprehensive reports in real-time.
Comprehensive Security Testing
Real-time vulnerability detection across 35+ security categories covering LLMs, MCP servers, and AI agents with detailed reporting and immediate insights.
OWASP LLM Top 10
MCP Server Security
AI Agent Security
Ready to Secure Your AI?
Don't leave your AI systems vulnerable to attacks. Implement continuous security testing and compliance monitoring today.
Explore All Products
See how all Cyber Automation products work together to secure your entire infrastructure.
Back to Product Overview