Human-in-the-Loop (HITL) Gates - Phase 4.5

Status: Design Complete
Implementation: Phase 4.5
Dependencies: Phase 4.1-4.4 (MCP Gateway, Audit Logging)

Overview

Human-in-the-Loop (HITL) gates provide a safety mechanism that requires explicit user approval before executing potentially dangerous or irreversible operations. This prevents AI agents from performing destructive actions without human oversight.

Goals

Prevent accidental damage - Block destructive operations by default
Enable informed decisions - Show user what will happen before execution
Maintain audit trail - Log all approval/denial decisions
Flexible configuration - Per-tool and per-action rules
Timeout safety - Auto-deny if user doesn't respond

Architecture

Request Flow

┌─────────────────────────────────────────────────────────┐
│ 1. Agent sends tool call request                        │
│    POST /mcp with {"method": "tools/call", ...}        │
└──────────────────┬──────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────────────────┐
│ 2. MCP Gateway receives request                         │
│    - Parses tool name and parameters                    │
│    - Checks HITL configuration                          │
└──────────────────┬──────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────────────────┐
│ 3. HITL Gate checks if approval required                │
│    - Risk classification (low/medium/high/critical)     │
│    - Match against HITL rules                           │
│    - Check if user approval needed                      │
└──────────────────┬──────────────────────────────────────┘
                   │
         ┌─────────┴─────────┐
         │                   │
         ▼                   ▼
    ┌─────────┐         ┌─────────┐
    │ No HITL │         │ HITL    │
    │ Required│         │ Required│
    └────┬────┘         └────┬────┘
         │                   │
         │                   ▼
         │          ┌─────────────────────────┐
         │          │ 4. Prompt user          │
         │          │    - Show operation     │
         │          │    - Show parameters    │
         │          │    - Show risk level    │
         │          │    - Wait for response  │
         │          └────┬────────────────────┘
         │               │
         │               ▼
         │          ┌─────────────────────────┐
         │          │ 5. User decision        │
         │          │    - Approve (a)        │
         │          │    - Deny (d)           │
         │          │    - Timeout (auto-deny)│
         │          └────┬────────────────────┘
         │               │
         │         ┌─────┴─────┐
         │         │           │
         │         ▼           ▼
         │    ┌─────────┐ ┌─────────┐
         │    │Approved │ │ Denied  │
         │    └────┬────┘ └────┬────┘
         │         │           │
         ▼         ▼           ▼
    ┌──────────────────────────────────┐
    │ 6. Log decision to audit trail   │
    │    - Decision (approve/deny)     │
    │    - User who decided            │
    │    - Timestamp                   │
    │    - Reason (if provided)        │
    └──────────────┬───────────────────┘
                   │
         ┌─────────┴─────────┐
         │                   │
         ▼                   ▼
    ┌─────────┐         ┌─────────┐
    │Execute  │         │ Return  │
    │Tool     │         │ Denied  │
    └────┬────┘         └────┬────┘
         │                   │
         ▼                   ▼
    ┌──────────────────────────────────┐
    │ 7. Return result to agent        │
    └──────────────────────────────────┘
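
The flow above can be condensed into a single coroutine. The following is a minimal sketch, not the gateway's actual middleware: `handle_tool_call`, the four injected callables, and the dictionary shapes are hypothetical stand-ins for the components described in the rest of this document.

```python
from typing import Any, Awaitable, Callable

# Hypothetical stand-ins: each callable represents one box in the diagram.
Classify = Callable[[str, dict], str]                # step 3: returns a risk level
Prompt = Callable[[str, dict, str], Awaitable[str]]  # steps 4-5: "approve"/"deny"/"timeout"
Audit = Callable[[dict], None]                       # step 6: write to the audit trail
Execute = Callable[[str, dict], Awaitable[Any]]      # run the tool


async def handle_tool_call(
    tool_name: str,
    params: dict,
    classify: Classify,
    prompt: Prompt,
    audit: Audit,
    execute: Execute,
    approval_required: frozenset = frozenset({"medium", "high", "critical"}),
) -> dict:
    """Gate one tool call: classify, prompt if needed, audit, then execute or deny."""
    risk = classify(tool_name, params)
    if risk in approval_required:
        decision = await prompt(tool_name, params, risk)
    else:
        decision = "approve"  # no HITL required for this risk level

    # Every outcome is logged, including calls that needed no prompt.
    audit({"tool_name": tool_name, "risk_level": risk, "decision": decision})

    if decision != "approve":  # "deny" and "timeout" both block execution
        return {"status": "denied", "reason": decision}
    return {"status": "ok", "result": await execute(tool_name, params)}
```

Passing the components in as callables keeps the sketch testable without a real prompt or audit database; the actual wiring is left to the gateway integration phase.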

Core Components

1. HITLGate

Central class that manages approval requests:

class HITLGate:
    """Manages human-in-the-loop approval for operations."""

    async def check_approval(
        self,
        operation: Operation,
        context: RequestContext
    ) -> ApprovalDecision:
        """Check if operation requires approval and get user decision."""

    async def prompt_user(
        self,
        operation: Operation,
        timeout: int = 60
    ) -> ApprovalDecision:
        """Prompt user for approval with timeout."""

    def classify_risk(
        self,
        operation: Operation
    ) -> RiskLevel:
        """Classify operation risk level."""

2. RiskClassifier

Analyzes operations and assigns risk levels:

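from enum import Enum

class RiskLevel(Enum):
    LOW = "low"           # Read-only operations, safe actions
    MEDIUM = "medium"     # Modifications with easy undo
    HIGH = "high"         # Destructive operations, hard to undo
    CRITICAL = "critical" # Irreversible operations, data loss

# Tools known to be read-only (same set as the low-risk rule in Configuration)
READ_ONLY_TOOLS = {"read_file", "list_files", "web_search"}

class RiskClassifier:
    """Classifies operation risk based on rules."""

    def classify(self, operation: Operation) -> RiskLevel:
        """Determine risk level for operation."""

        # Check operation type
        if operation.tool_name == "send_email":
            return RiskLevel.HIGH

        if operation.tool_name == "delete_file":
            # Deleting a system file is treated as irreversible;
            # is_system_file is a helper assumed to be defined elsewhere
            if is_system_file(operation.params["path"]):
                return RiskLevel.CRITICAL
            return RiskLevel.HIGH

        if operation.tool_name in READ_ONLY_TOOLS:
            return RiskLevel.LOW

        # Default: unknown operations are HIGH risk (see Security Considerations)
        return RiskLevel.HIGH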

3. ApprovalPrompt

Handles user interaction:

class ApprovalPrompt:
    """Manages user approval prompts."""

    async def prompt_cli(
        self,
        operation: Operation,
        risk_level: RiskLevel,
        timeout: int
    ) -> ApprovalDecision:
        """Show CLI prompt with timeout."""

    async def prompt_api(
        self,
        operation: Operation,
        risk_level: RiskLevel,
        timeout: int
    ) -> ApprovalDecision:
        """Create pending approval for API clients."""

4. ApprovalDecision

Result of approval request:

from dataclasses import dataclass
from datetime import datetime
from typing import Literal, Optional

@dataclass
class ApprovalDecision:
    """Result of approval request."""

    decision: Literal["approve", "deny", "timeout"]
    user: str  # Who made the decision
    timestamp: datetime
    reason: Optional[str] = None
    timeout_seconds: Optional[int] = None

Configuration

HITL Rules

Define which operations require approval:

security:
  hitl:
    enabled: true
    default_timeout: 60 # seconds

    # Rules for requiring approval
    rules:
      # Always require approval for these tools
      - tools: [send_email, delete_file, execute_sql]
        risk: high
        require_approval: true
        timeout: 60

      # Require approval for writes to system paths
      - tools: [write_file]
        conditions:
          - param: path
            matches: "^/etc/.*|^/sys/.*|^/root/.*"
        risk: critical
        require_approval: true
        timeout: 30

      # No approval for read-only operations
      - tools: [read_file, list_files, web_search]
        risk: low
        require_approval: false
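
To make the rule semantics concrete, the sketch below evaluates rules shaped like the YAML above against a tool call. It assumes first-match-wins ordering and that every condition in a rule must match; the configuration does not state either, and `find_rule` is a hypothetical helper name.

```python
import re
from typing import Optional

# The rules above as they might look after parsing the YAML.
RULES = [
    {"tools": ["send_email", "delete_file", "execute_sql"],
     "risk": "high", "require_approval": True, "timeout": 60},
    {"tools": ["write_file"],
     "conditions": [{"param": "path", "matches": r"^/etc/.*|^/sys/.*|^/root/.*"}],
     "risk": "critical", "require_approval": True, "timeout": 30},
    {"tools": ["read_file", "list_files", "web_search"],
     "risk": "low", "require_approval": False},
]


def find_rule(tool_name: str, params: dict, rules: list) -> Optional[dict]:
    """Return the first rule whose tool list and conditions all match, else None."""
    for rule in rules:
        if tool_name not in rule["tools"]:
            continue
        conditions = rule.get("conditions", [])
        # Assumption: a missing parameter never satisfies a condition.
        if all(
            cond["param"] in params
            and re.search(cond["matches"], str(params[cond["param"]]))
            for cond in conditions
        ):
            return rule
    return None
```

Under these assumptions a `write_file` call outside the listed system paths matches no rule and returns `None`, leaving the caller to fall back to a risk-based policy.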

Risk-Based Approval

Configure approval based on risk level:

security:
  hitl:
    enabled: true

    # Risk-based policies
    policies:
      low:
        require_approval: false

      medium:
        require_approval: true
        timeout: 120 # 2 minutes
        allow_skip: true # User can choose "always allow"

      high:
        require_approval: true
        timeout: 60
        allow_skip: false

      critical:
        require_approval: true
        timeout: 30
        allow_skip: false
        require_reason: true # Must provide reason
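
The policies above could be consumed roughly as follows. This is a sketch under stated assumptions: the `Policy` dataclass, `policy_for`, and `approval_is_valid` are hypothetical names, and falling back to the strictest policy for an unrecognised risk level is a choice made here, not something the configuration specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    require_approval: bool
    timeout: Optional[int] = None  # seconds
    allow_skip: bool = False
    require_reason: bool = False


# Mirrors the YAML policies above.
POLICIES = {
    "low": Policy(require_approval=False),
    "medium": Policy(require_approval=True, timeout=120, allow_skip=True),
    "high": Policy(require_approval=True, timeout=60),
    "critical": Policy(require_approval=True, timeout=30, require_reason=True),
}


def policy_for(risk_level: str) -> Policy:
    """Look up the policy; unknown levels get the strictest one (assumption)."""
    return POLICIES.get(risk_level, POLICIES["critical"])


def approval_is_valid(risk_level: str, decision: str, reason: Optional[str]) -> bool:
    """An approval counts only if the policy's reason requirement is met."""
    if decision != "approve":
        return False
    policy = policy_for(risk_level)
    return not policy.require_reason or bool(reason and reason.strip())
```

For example, `approval_is_valid("critical", "approve", None)` is false because the critical policy sets `require_reason: true`, while the same approval at the high level is accepted.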

User Experience

CLI Approval Prompt

┌─────────────────────────────────────────────────────────┐
│ [!] APPROVAL REQUIRED                                   │
├─────────────────────────────────────────────────────────┤
│                                                         │
│ The agent wants to perform a HIGH RISK operation:      │
│                                                         │
│ Tool: send_email                                        │
│ Action: Send email message                             │
│                                                         │
│ Parameters:                                             │
│   to: user@example.com                                  │
│   subject: "Project Update"                             │
│   body: "The project is complete..."                    │
│                                                         │
│ Risk: HIGH - This operation cannot be easily undone    │
│                                                         │
│ [a] Approve  [d] Deny  [v] View full details           │
│                                                         │
│ Auto-deny in 60 seconds...                             │
└─────────────────────────────────────────────────────────┘

API Approval Flow

For API clients (web UI, mobile apps):

1. Gateway returns 202 Accepted with a pending approval ID
2. Client polls /hitl/pending/{approval_id} for status
3. User approves or denies via /hitl/decide/{approval_id}
4. Once decided, the operation executes or is rejected, and the client fetches the outcome from /hitl/result/{approval_id}

# API endpoint
POST /hitl/decide/{approval_id}
{
  "decision": "approve",
  "reason": "Reviewed email, looks good"
}

# Response
{
  "status": "approved",
  "approved_by": "admin@example.com",
  "approved_at": "2026-02-09T15:30:45Z"
}
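
The state behind these endpoints can be pictured as a small registry of pending approvals. The sketch below is illustrative only and uses no HTTP framework: `PendingApprovals`, its method names, and the stored field names are assumptions, and a real implementation would also need persistence and authentication.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


class PendingApprovals:
    """In-memory registry backing the pending/decide endpoints (illustrative only)."""

    def __init__(self) -> None:
        self._items: dict = {}

    def create(self, tool_name: str, params: dict) -> str:
        """Register a pending approval; the ID is what a 202 response would carry."""
        approval_id = uuid.uuid4().hex
        self._items[approval_id] = {
            "status": "pending", "tool_name": tool_name, "params": params,
        }
        return approval_id

    def status(self, approval_id: str) -> Optional[dict]:
        """What a poll of /hitl/pending/{approval_id} would report; None if unknown."""
        return self._items.get(approval_id)

    def decide(self, approval_id: str, decision: str, user: str,
               reason: Optional[str] = None) -> dict:
        """Record a decision exactly once; unknown or already-decided IDs are rejected."""
        item = self._items.get(approval_id)
        if item is None:
            raise KeyError(f"unknown approval id: {approval_id}")
        if item["status"] != "pending":
            raise ValueError("approval already decided")
        if decision not in ("approve", "deny"):
            raise ValueError(f"invalid decision: {decision}")
        item.update(
            status="approved" if decision == "approve" else "denied",
            decided_by=user,
            decided_at=datetime.now(timezone.utc).isoformat(),
            reason=reason,
        )
        return item
```

Rejecting a second decision for the same ID keeps a late or replayed request from overturning an earlier denial.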

Audit Integration

All HITL decisions are logged to the audit database:

# Audit log entry
{
  "event_type": "hitl_decision",
  "correlation_id": "req-12345",
  "timestamp": "2026-02-09T15:30:45Z",
  "decision": "approve",
  "operation": {
    "tool_name": "send_email",
    "params": {
      "to": "user@example.com",
      "subject": "Project Update"
    }
  },
  "risk_level": "high",
  "user": "admin@example.com",
  "reason": "Reviewed email, looks good",
  "timeout_seconds": 60
}
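
An entry of this shape could be assembled by a small helper. The sketch below reuses the field names from the example above; `build_audit_entry` itself and its `now` parameter (which makes the timestamp testable) are hypothetical, and `now` is assumed to be timezone-aware when supplied.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_audit_entry(
    correlation_id: str,
    decision: str,
    tool_name: str,
    params: dict,
    risk_level: str,
    user: str,
    reason: Optional[str] = None,
    timeout_seconds: Optional[int] = None,
    now: Optional[datetime] = None,
) -> dict:
    """Assemble a hitl_decision audit record using the field names shown above."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "event_type": "hitl_decision",
        "correlation_id": correlation_id,
        "timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),  # UTC, ISO 8601
        "decision": decision,
        "operation": {"tool_name": tool_name, "params": params},
        "risk_level": risk_level,
        "user": user,
        "reason": reason,
        "timeout_seconds": timeout_seconds,
    }


if __name__ == "__main__":
    entry = build_audit_entry(
        "req-12345", "approve", "send_email",
        {"to": "user@example.com", "subject": "Project Update"},
        "high", "admin@example.com",
        reason="Reviewed email, looks good", timeout_seconds=60,
    )
    print(json.dumps(entry, indent=2))
```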

Query approval history:

# Get all denied operations
harombe audit query --event-type=hitl_decision --filter='decision=deny'

# Get critical operations
harombe audit query --event-type=hitl_decision --filter='risk_level=critical'

Security Considerations

Default Deny

All timeouts result in DENY - Never auto-approve
Unknown operations default to HIGH risk - Require approval
Configuration errors result in DENY - Fail-safe
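
These fail-safe rules can be expressed as a single decision function. The following is a minimal sketch: `gate_action`, its three return values, and the `lookup` callable are hypothetical, and a rule is assumed to waive approval only when it sets `require_approval` to false explicitly.

```python
from typing import Callable, Optional


def gate_action(tool_name: str, lookup: Callable[[str], Optional[dict]]) -> str:
    """Return "allow", "prompt", or "deny" for a tool call, failing closed."""
    try:
        rule = lookup(tool_name)
    except Exception:
        return "deny"    # configuration error: fail safe, never execute
    if rule is None:
        return "prompt"  # unknown operation: treated as HIGH risk
    if rule.get("require_approval", True) is False:
        return "allow"   # only an explicit opt-out skips the prompt
    return "prompt"
```

The timeout case is not shown here; it applies once a prompt has been issued and likewise resolves to a denial.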

Bypass Prevention

No programmatic bypass - Agent cannot approve itself
Audit all decisions - Even when HITL is disabled
Require authentication - Verify user identity for approvals

Privilege Escalation

Per-user rules - Some users can approve critical operations
Role-based access - Admin vs. standard user approval rights
Approval delegation - Support approval workflows
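
One way to picture role-based approval rights is a ceiling per role: the highest risk level that role may approve. The role names and the `ROLE_MAX_RISK` table below are hypothetical; only the four risk levels come from this document.

```python
RISK_ORDER = ["low", "medium", "high", "critical"]

# Hypothetical role table: the highest risk level each role may approve.
ROLE_MAX_RISK = {"standard": "high", "admin": "critical"}


def can_approve(role: str, risk_level: str) -> bool:
    """True if the role's ceiling is at or above the operation's risk level."""
    ceiling = ROLE_MAX_RISK.get(role)
    if ceiling is None or risk_level not in RISK_ORDER:
        return False  # unknown roles or risk levels cannot approve anything
    return RISK_ORDER.index(risk_level) <= RISK_ORDER.index(ceiling)
```

With this table a standard user can approve a high-risk operation but not a critical one, which would have to go to an admin.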

Performance Considerations

Timeout Handling

Non-blocking waits - Use async/await for timeout
Graceful timeout - Clear error message on timeout
Configurable defaults - Per-operation timeout overrides
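
A non-blocking wait with a per-operation override might look like the sketch below, which relies on the standard library's `asyncio.wait_for`. The function name `wait_for_decision` and its parameters are assumptions; `response` stands for whatever awaitable eventually yields the user's answer.

```python
import asyncio
from typing import Awaitable, Optional


async def wait_for_decision(
    response: Awaitable[str],
    timeout: Optional[float] = None,
    default_timeout: float = 60.0,
) -> str:
    """Await the user's answer without blocking the event loop."""
    # A per-operation timeout overrides the configured default.
    effective = default_timeout if timeout is None else timeout
    try:
        answer = await asyncio.wait_for(response, timeout=effective)
    except asyncio.TimeoutError:
        return "timeout"  # reported distinctly, but never executed
    # Anything other than an explicit approval is a denial.
    return "approve" if answer == "approve" else "deny"
```

Because `asyncio.wait_for` cancels the pending awaitable when the timeout expires, the prompt does not linger after the auto-deny.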

Caching

"Always allow" cache - User can skip future prompts for specific operations
Cache expiration - Clear cache after N hours
Per-session cache - Don't persist across sessions by default
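
The three caching properties fit in a small class. This sketch is an assumption-laden illustration: `AlwaysAllowCache` is a hypothetical name, entries are keyed by session and tool name only (ignoring parameters), and the injectable `clock` exists so that expiry can be tested without waiting.

```python
import time
from typing import Callable


class AlwaysAllowCache:
    """Per-session "always allow" entries that expire after a fixed TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires: dict = {}  # (session_id, tool_name) -> expiry time

    def allow(self, session_id: str, tool_name: str) -> None:
        """Remember that the user chose to skip future prompts for this tool."""
        self._expires[(session_id, tool_name)] = self._clock() + self._ttl

    def is_allowed(self, session_id: str, tool_name: str) -> bool:
        key = (session_id, tool_name)
        expiry = self._expires.get(key)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._expires[key]  # expired: the next call prompts again
            return False
        return True

    def end_session(self, session_id: str) -> None:
        """Drop a session's entries so nothing persists across sessions."""
        for key in [k for k in self._expires if k[0] == session_id]:
            del self._expires[key]
```

The caller is expected to invoke `allow` only for risk levels whose policy permits skipping, so high and critical operations never enter the cache.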

Rate Limiting

Max pending approvals - Limit to N simultaneous pending approvals
Approval queue - Queue additional requests
Request deduplication - Detect duplicate approval requests
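
The cap, the queue, and deduplication can be combined as sketched below. `ApprovalLimiter` and `operation_key` are hypothetical names, duplicates are detected by hashing the tool name and parameters (which assumes the parameters are JSON-serializable), and queued requests are promoted in arrival order.

```python
import hashlib
import json
from collections import deque


def operation_key(tool_name: str, params: dict) -> str:
    """Stable fingerprint of an operation, used to spot duplicate requests."""
    payload = json.dumps({"tool": tool_name, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ApprovalLimiter:
    """Caps simultaneous pending approvals and queues the overflow."""

    def __init__(self, max_pending: int) -> None:
        self._max = max_pending
        self._pending: set = set()
        self._queue: deque = deque()

    def submit(self, tool_name: str, params: dict) -> str:
        """Return "pending", "queued", or "duplicate" for a new request."""
        key = operation_key(tool_name, params)
        if key in self._pending or key in self._queue:
            return "duplicate"
        if len(self._pending) < self._max:
            self._pending.add(key)
            return "pending"
        self._queue.append(key)
        return "queued"

    def resolve(self, tool_name: str, params: dict) -> None:
        """Mark a pending request decided and promote the oldest queued one."""
        self._pending.discard(operation_key(tool_name, params))
        if self._queue and len(self._pending) < self._max:
            self._pending.add(self._queue.popleft())
```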

Implementation Phases

Phase 1: Core Implementation (Days 1-2)

Implement HITLGate class
Implement RiskClassifier with basic rules
Implement CLI approval prompt
Implement timeout handling
Add audit logging integration

Phase 2: Gateway Integration (Day 3)

Add HITL middleware to MCP Gateway
Update configuration schema
Implement approval decision storage
Test with existing tools

Phase 3: API Support (Day 4)

Add API endpoints for pending approvals
Add approval decision endpoint
Implement polling mechanism
Add WebSocket support for real-time updates

Phase 4: Testing & Documentation (Day 5)

Testing Strategy

Unit Tests

import pytest

# Assumptions: `context` is a pytest fixture that provides a RequestContext,
# and the async tests run under the pytest-asyncio plugin.

@pytest.mark.asyncio
async def test_approval_required(context):
    """Test that high-risk operations are not approved without user input."""
    gate = HITLGate()
    operation = Operation(tool_name="send_email", params={...})

    # No approval is given, so the outcome is a denial or a timeout
    decision = await gate.check_approval(operation, context)
    assert decision.decision in ("deny", "timeout")

@pytest.mark.asyncio
async def test_timeout_denies(context):
    """Test that timeout results in deny."""
    gate = HITLGate(timeout=1)
    operation = Operation(tool_name="delete_file", params={...})

    # Don't provide approval, let it timeout
    decision = await gate.check_approval(operation, context)
    assert decision.decision == "timeout"

@pytest.mark.asyncio
async def test_low_risk_auto_approved(context):
    """Test that low-risk operations auto-approve."""
    gate = HITLGate()
    operation = Operation(tool_name="read_file", params={...})

    decision = await gate.check_approval(operation, context)
    assert decision.decision == "approve"

Integration Tests

async def test_gateway_blocks_without_approval():
    """Test that gateway blocks high-risk operations."""
    # Send tool call to gateway
    response = await client.post("/mcp", json={
        "method": "tools/call",
        "params": {
            "name": "send_email",
            "arguments": {...}
        }
    })

    # Should return pending approval
    assert response.status_code == 202
    assert "approval_id" in response.json()

async def test_approval_flow():
    """Test full approval flow."""
    # 1. Submit operation
    response = await client.post("/mcp", json={...})
    approval_id = response.json()["approval_id"]

    # 2. Approve operation
    await client.post(f"/hitl/decide/{approval_id}", json={
        "decision": "approve"
    })

    # 3. Original request should complete
    result = await client.get(f"/hitl/result/{approval_id}")
    assert result.status_code == 200

Future Enhancements

Phase 4.6+

Approval templates - Pre-configured approval rules
Approval workflows - Multi-level approvals
Approval analytics - Track approval rates, common denials
Smart suggestions - Learn from past decisions
Batch approvals - Approve multiple operations at once

References

Audit Logging - Audit trail integration
MCP Gateway Design - Gateway architecture
Security Network - Network isolation patterns