Harombe Security Architecture Whitepaper

Executive Summary

Harombe implements a defense-in-depth security architecture for AI agent tool execution. Unlike other open-source agent frameworks that rely on prompt-level safeguards or protocol-level trust, Harombe enforces security at the infrastructure layer using container isolation, network egress filtering, credential vaults, and human-in-the-loop approval gates.

Key insight (February 2026 security research): MCP (Model Context Protocol) cannot enforce security at the protocol level, because an agent that can send arbitrary JSON-RPC messages can bypass any protocol-level restriction. All security must therefore be enforced at the infrastructure layer.

Threat Model

Threats We Address

| Threat | Mitigation |
|---|---|
| Agent executes malicious shell commands | Container isolation + HITL approval for dangerous tools |
| Credential leakage via tool output | Secret scanning + audit log redaction |
| Network exfiltration of sensitive data | Per-container egress filtering with allowlists |
| Prompt injection causing tool misuse | Risk classification + HITL gates for high-risk operations |
| Unauthorized access to credentials | Vault-based secret management, never in config files |
| Lateral movement between tools | Separate containers per capability, no shared filesystem |
| Audit trail tampering | Append-only SQLite with WAL mode |

Threats We Do Not Address (v0.1.0)

  • Compromised host OS (assumes trusted host)
  • Supply chain attacks on container images (planned for v2)
  • Side-channel attacks on shared hardware
  • Sophisticated evasion of content filters

Architecture

┌─────────────────────────────────────────────────────┐
│  Agent Container                                     │
│  - ReAct loop + LLM                                 │
│  - Can ONLY communicate with MCP Gateway             │
│  - No direct network, filesystem, or credential access│
└───────────────┬─────────────────────────────────────┘
                │ JSON-RPC 2.0
┌─────────────────────────────────────────────────────┐
│  MCP Gateway                                         │
│  - Authentication + authorization                    │
│  - Audit logging (every request/response)            │
│  - Secret scanning (block credential leakage)        │
│  - HITL gates (approval for dangerous operations)    │
│  - Request routing to capability containers          │
└──────┬──────────┬──────────┬───────────┬────────────┘
       │          │          │           │
       ▼          ▼          ▼           ▼
  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌────────────┐
  │Browser  │ │Files    │ │Code     │ │Web Search  │
  │Container│ │Container│ │Container│ │Container   │
  │         │ │         │ │         │ │            │
  │Pre-auth │ │Scoped   │ │gVisor   │ │Allowlisted │
  │cookies  │ │volumes  │ │sandbox  │ │egress      │
  └─────────┘ └─────────┘ └─────────┘ └────────────┘
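
The request path through the gateway can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Harombe's actual implementation: the `Gateway` class, its field names, and the method names `files.read` and `code.run` are hypothetical, authentication is omitted, and the secret check is a placeholder. The error codes follow JSON-RPC 2.0 (`-32601` for an unknown method, `-32000` from the server-defined range for a denial).

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Gateway:
    """Hypothetical gateway: every tool call passes through the same checks."""
    routes: dict[str, Callable[[dict], str]]      # method name -> capability container client
    needs_approval: Callable[[str, dict], bool]   # HITL policy hook
    approve: Callable[[str, dict], bool]          # asks a human; True means approved
    contains_secret: Callable[[str], bool]        # secret scanner hook
    audit_log: list[tuple[str, str]] = field(default_factory=list)

    def handle(self, request: dict) -> dict:
        rid = request.get("id")
        method = request.get("method", "")
        params = request.get("params", {})
        self.audit_log.append(("request", method))

        handler = self.routes.get(method)
        if handler is None:
            return self._error(rid, -32601, "Method not found")
        if self.needs_approval(method, params) and not self.approve(method, params):
            return self._error(rid, -32000, "Denied by approval gate")

        output = handler(params)
        # Scan tool output before it is returned to the agent.
        if self.contains_secret(output):
            return self._error(rid, -32000, "Response blocked: possible credential")
        self.audit_log.append(("response", method))
        return {"jsonrpc": "2.0", "id": rid, "result": output}

    def _error(self, rid, code: int, message: str) -> dict:
        self.audit_log.append(("denied", message))
        return {"jsonrpc": "2.0", "id": rid, "error": {"code": code, "message": message}}


# Illustrative wiring with stub capability handlers.
gateway = Gateway(
    routes={
        "files.read": lambda params: "hello",
        "code.run": lambda params: "token=AKIAABCDEFGHIJKLMNOP",
    },
    needs_approval=lambda method, params: method == "code.run",
    approve=lambda method, params: True,
    contains_secret=lambda text: "AKIA" in text,
)
```

Because the agent container can reach only the gateway, a check placed here applies to every message the agent sends, whatever its JSON-RPC content.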

Security Layers

1. Container Isolation

Every tool capability runs in its own Docker container with:

  • Resource limits: CPU, memory, PID caps
  • Non-root execution: UID 1000
  • Capability dropping: Minimal Linux capabilities
  • No shared filesystem: Explicit volume mounts only
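
As a rough illustration, the constraints above map onto standard `docker run` flags. The helper below is a hypothetical sketch, not Harombe's launcher: the function name and the default limits are assumptions, and `--security-opt no-new-privileges` and `--read-only` are common hardening flags added here that the list above does not mention.

```python
def docker_run_args(image: str, *, cpus: float = 1.0, memory: str = "512m",
                    pids: int = 128, mounts: tuple[str, ...] = ()) -> list[str]:
    """Build a `docker run` argument list for one capability container."""
    args = [
        "docker", "run", "--rm",
        "--cpus", str(cpus),            # CPU cap
        "--memory", memory,             # memory cap
        "--pids-limit", str(pids),      # PID cap
        "--user", "1000:1000",          # non-root execution
        "--cap-drop", "ALL",            # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # assumption: extra hardening
        "--read-only",                  # assumption: read-only root filesystem
    ]
    for mount in mounts:
        # Explicit volume mounts only, in Docker's source:target[:mode] form.
        args += ["--volume", mount]
    return args + [image]


args = docker_run_args(
    "harombe/filesystem:latest",
    mounts=("/home/user/documents:/data:ro",),
)
```

The resulting list can be passed to `subprocess.run(args)` on a host with Docker installed.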

2. Network Egress Filtering

Each container has an independent network namespace with:

  • Default deny: No outbound traffic unless explicitly allowed
  • Domain allowlists: Wildcard support (e.g., *.github.com)
  • DNS filtering: Queries logged and filtered
  • iptables rules: Enforced at the kernel level
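
The allowlist decision itself is simple to sketch. The function below is an assumption about how wildcard matching might behave, not Harombe's code, and it covers only the policy decision; the enforcement described above happens in iptables. Note that under this matching, `*.github.com` covers subdomains but not the bare `github.com`.

```python
from fnmatch import fnmatchcase


def egress_allowed(host: str, allowlist: list[str]) -> bool:
    """Default deny: a host passes only if it matches an allowlist pattern."""
    host = host.lower().rstrip(".")  # normalise case and a trailing root dot
    return any(fnmatchcase(host, pattern.lower()) for pattern in allowlist)


allow = ["*.github.com"]
print(egress_allowed("api.github.com", allow))   # True
print(egress_allowed("github.com", allow))       # False: the bare domain needs its own entry
print(egress_allowed("api.github.com", []))      # False: an empty list denies everything
```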

3. Credential Management

Secrets never appear in configuration files:

  • HashiCorp Vault: Production-grade dynamic secrets
  • SOPS: Encrypted files for team environments
  • Environment injection: Secrets delivered at container startup, cleaned on stop
  • Secret scanning: Detect credentials in tool output before returning to agent
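
A secret scanner that combines known patterns with an entropy check might look like the sketch below. The two patterns, the 32-character token length, and the 4.0-bit threshold are illustrative assumptions; the source does not list Harombe's actual rules.

```python
import math
import re
from collections import Counter

# Illustrative patterns only.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}
# Candidate tokens for the entropy check: long runs of key-like characters.
TOKEN = re.compile(r"[A-Za-z0-9+/=_\-]{32,}")


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    counts = Counter(text)
    return -sum(n / len(text) * math.log2(n / len(text)) for n in counts.values())


def find_secrets(output: str, entropy_threshold: float = 4.0) -> list[str]:
    """Return the names of the rules that fired on a piece of tool output."""
    findings = [name for name, rx in PATTERNS.items() if rx.search(output)]
    if any(shannon_entropy(t) >= entropy_threshold for t in TOKEN.findall(output)):
        findings.append("high_entropy_token")
    return findings
```

A gateway would block or redact the response whenever `find_secrets` returns a non-empty list. The entropy check catches random-looking tokens that match no known pattern, at the cost of some false positives on hashes and encoded data.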

4. Audit Logging

Every operation is logged to an append-only SQLite database:

  • Event types: Requests, responses, tool calls, security decisions
  • Sensitive data redaction: API keys, passwords, and JWTs automatically scrubbed
  • Correlation IDs: Track requests across the entire pipeline
  • Retention policies: Configurable cleanup (default 90 days)
  • Performance: WAL mode, <1 ms writes
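
One way to approximate an append-only table in SQLite is a trigger that rejects updates. The sketch below makes several assumptions: the schema, the `audit_events` table name, and the redaction regex are hypothetical, and deletes are left possible so that retention cleanup can run. A trigger constrains only clients that go through SQL; a process with write access to the database file can still alter it, which is consistent with the trusted-host assumption in the threat model.

```python
import re
import sqlite3

# Illustrative redaction rule for key=value style secrets.
REDACT = re.compile(r"(?i)\b(password|api[_-]?key|token)=\S+")


def open_audit_db(path: str) -> sqlite3.Connection:
    """Open (or create) the audit database in WAL mode with an update guard."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS audit_events (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            ts TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),
            correlation_id TEXT NOT NULL,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL
        );
        CREATE TRIGGER IF NOT EXISTS audit_no_update
        BEFORE UPDATE ON audit_events
        BEGIN
            SELECT RAISE(ABORT, 'audit_events is append-only');
        END;
    """)
    return conn


def log_event(conn: sqlite3.Connection, correlation_id: str,
              event_type: str, payload: str) -> None:
    """Redact the payload, then append one event."""
    clean = REDACT.sub(lambda m: f"{m.group(1)}=[REDACTED]", payload)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO audit_events (correlation_id, event_type, payload) "
            "VALUES (?, ?, ?)",
            (correlation_id, event_type, clean),
        )
```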

5. Human-in-the-Loop Gates

Risk-based approval system for dangerous operations:

  • Risk levels: LOW, MEDIUM, HIGH, CRITICAL
  • Auto-deny on timeout: 60 s by default, which prevents unattended execution
  • CLI and API interfaces: Rich terminal prompts or webhook notifications
  • Audit trail: Every approval/denial decision logged
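
The timeout behaviour can be sketched with a queue that carries the human's decision. Everything here is illustrative: the `Risk` enum mirrors the four levels above, but the rule that operations below `HIGH` run without a prompt is an assumption, as is delivering decisions through a `queue.Queue` (in practice they would come from the CLI prompt or a webhook).

```python
import queue
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def request_approval(risk: Risk, decisions: "queue.Queue[bool]",
                     timeout: float = 60.0, threshold: Risk = Risk.HIGH) -> bool:
    """Return True if the operation may proceed."""
    if risk < threshold:
        return True  # assumption: below the threshold no prompt is needed
    try:
        # Block until a human decision arrives or the timeout expires.
        return decisions.get(timeout=timeout)
    except queue.Empty:
        return False  # auto-deny: nobody answered in time
```

Denying on timeout is the fail-closed choice: an agent left running unattended cannot wait out the approver.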

6. Browser Security

Pre-authenticated browser automation with:

  • Credential injection before agent access: Agent never sees raw passwords
  • Accessibility-based interaction: Structured semantic tree, not raw DOM
  • HttpOnly cookies: Protected from script access
  • Password field protection: Auto-deny typing into password/secret inputs
  • 16 risk classification rules: Covering navigation, form submission, downloads
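
The source does not enumerate the 16 rules, so the sketch below shows only the general shape such a classifier could take. The `BrowserAction` fields, the hint list, and the risk level assigned to each action kind are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrowserAction:
    kind: str             # e.g. "navigate", "type", "click", "submit", "download"
    name: str = ""        # accessible name of the target element
    input_type: str = ""  # e.g. "text", "password"


SENSITIVE_HINTS = ("password", "secret", "token", "passcode")


def classify(action: BrowserAction) -> str:
    """Map a browser action to DENY or a risk level (illustrative rules)."""
    label = f"{action.input_type} {action.name}".lower()
    # Typing into password or secret inputs is refused outright.
    if action.kind == "type" and any(hint in label for hint in SENSITIVE_HINTS):
        return "DENY"
    if action.kind in {"submit", "download"}:
        return "HIGH"
    if action.kind == "navigate":
        return "MEDIUM"
    return "LOW"
```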

Comparison with Competitors

| Feature | Harombe | CrewAI | LangGraph | AutoGen | OpenClaw |
|---|---|---|---|---|---|
| Container isolation | Per-tool | None | None | None | None |
| Network egress filtering | Per-container | None | None | None | None |
| Credential vault | Vault/SOPS/env | None | None | None | None |
| Audit logging | SQLite + redaction | None | LangSmith (cloud) | None | None |
| HITL approval gates | Risk-based | None | Human-in-the-loop node | None | None |
| Secret scanning | Pattern + entropy | None | None | None | None |
| Browser pre-auth | Cookie injection | None | None | None | None |

Configuration Example

security:
  enabled: true
  isolation: docker               # one container per tool capability

  gateway:
    host: 127.0.0.1               # loopback only
    port: 8100

  audit:
    enabled: true
    database: ~/.harombe/audit.db
    retention_days: 90            # default retention window
    redact_sensitive: true        # scrub API keys, passwords, JWTs

  credentials:
    method: vault
    vault_addr: http://localhost:8200

  containers:
    browser:
      image: harombe/browser:latest
      egress_allow:               # everything not listed is denied
        - "*.github.com"
        - "*.google.com"
    filesystem:
      image: harombe/filesystem:latest
      egress_allow: []            # no outbound traffic
      mounts:
        - /home/user/documents:ro # read-only

  hitl:
    enabled: true
    timeout: 60                   # seconds before auto-deny
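
To show how the `containers` section above might be consumed, the sketch below turns an already-parsed config dictionary into typed policy objects. It is an assumption about how a loader could work, not Harombe's own: `ContainerPolicy` and `parse_containers` are hypothetical names, and the dictionary is written inline rather than read from YAML so that the example needs no third-party parser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerPolicy:
    image: str
    egress_allow: tuple[str, ...] = ()  # empty means deny all outbound traffic
    mounts: tuple[str, ...] = ()


def parse_containers(security: dict) -> dict[str, ContainerPolicy]:
    """Convert the `containers` mapping into one policy object per capability."""
    policies = {}
    for name, spec in security.get("containers", {}).items():
        if "image" not in spec:
            raise ValueError(f"container {name!r} has no image")
        policies[name] = ContainerPolicy(
            image=spec["image"],
            egress_allow=tuple(spec.get("egress_allow", ())),  # missing key -> deny all
            mounts=tuple(spec.get("mounts", ())),
        )
    return policies


security = {
    "containers": {
        "browser": {
            "image": "harombe/browser:latest",
            "egress_allow": ["*.github.com", "*.google.com"],
        },
        "filesystem": {
            "image": "harombe/filesystem:latest",
            "egress_allow": [],
            "mounts": ["/home/user/documents:ro"],
        },
    }
}
policies = parse_containers(security)
```

Treating a missing `egress_allow` key the same as an empty list keeps the default-deny behaviour when a container entry is incomplete.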

Limitations and Future Work

Experimental Features (not production-validated)

  • Zero-knowledge proofs: Protocol models implemented, not integrated end-to-end
  • Hardware security modules: Software simulation only; requires TPM/SGX/SEV-SNP hardware
  • Compliance reporting: Heuristic templates, not audit-grade

Planned Improvements

  • Container image signing and verification
  • Runtime network enforcement (beyond declarative permissions)
  • Plugin sandboxing with resource quotas
  • Independent security audit engagement