Best Practices - AgentWarden

Overview

Follow these best practices to build secure, reliable, and maintainable AI agent systems with AgentWarden.

Architecture Patterns

1. Single Responsibility Agents

Create separate agents for different purposes rather than one “super agent”.

# Customer support agent
support_agent = Agent(
    name="customer-support-bot",
    permissions=["stripe.refund", "email.send", "ticket.update"]
)

# Data processing agent
data_agent = Agent(
    name="data-processing-bot",
    permissions=["database.read", "s3.upload", "api.fetch"]
)

# DevOps agent
devops_agent = Agent(
    name="devops-bot",
    permissions=["deploy.staging", "deploy.production"]
)

Benefits:

Better security isolation
Clearer audit trails
Easier to debug
Simpler permission management

2. Check-Execute-Log Pattern

Always follow this pattern for agent actions:

def execute_agent_action(agent_id: str, action: str, context: dict, 
                        execute_fn: callable) -> dict:
    """
    Standard pattern for all agent actions
    
    1. Check permission
    2. Execute if allowed
    3. Log the outcome
    """
    # 1. CHECK
    result = guard.check(agent_id, action, context)
    
    if not result.allowed:
        # Log denial
        guard.log(agent_id, action, "denied", context)
        
        if result.requires_approval:
            return {"status": "pending_approval", "approval_id": result.approval_id}
        else:
            return {"status": "denied", "reason": result.reason}
    
    # 2. EXECUTE
    try:
        outcome = execute_fn()
        
        # 3. LOG SUCCESS
        guard.log(agent_id, action, "success", {**context, "result": outcome})
        return {"status": "success", "result": outcome}
        
    except Exception as e:
        # 3. LOG FAILURE
        guard.log(agent_id, action, "failed", {**context, "error": str(e)})
        return {"status": "failed", "error": str(e)}

Never skip any step:

❌ Don’t execute without checking
❌ Don’t check but forget to log
❌ Don’t log only successes
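
To exercise all three outcomes without calling the real service, the pattern can be run against an in-memory stand-in. The sketch below is an illustration only: `CheckResult`, `FakeGuard`, and `run_action` are hypothetical names, and they mimic just the `check`/`log` calls and result attributes used in this guide, not the actual SDK.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class CheckResult:
    """Hypothetical result object with the attributes used in this guide."""
    allowed: bool
    requires_approval: bool = False
    approval_id: Optional[str] = None
    reason: Optional[str] = None


@dataclass
class FakeGuard:
    """In-memory stand-in for the guard; not the real AgentWarden client."""
    allowed_actions: set
    logs: list = field(default_factory=list)

    def check(self, agent_id: str, action: str, context: dict) -> CheckResult:
        if action in self.allowed_actions:
            return CheckResult(allowed=True)
        return CheckResult(allowed=False, reason="action not permitted")

    def log(self, agent_id: str, action: str, status: str, context: dict) -> None:
        self.logs.append((agent_id, action, status))


def run_action(guard: FakeGuard, agent_id: str, action: str, context: dict,
               execute_fn: Callable[[], Any]) -> dict:
    """Check, execute, and log; every path writes exactly one log entry."""
    result = guard.check(agent_id, action, context)
    if not result.allowed:
        guard.log(agent_id, action, "denied", context)
        return {"status": "denied", "reason": result.reason}
    try:
        outcome = execute_fn()
    except Exception as exc:
        guard.log(agent_id, action, "failed", {**context, "error": str(exc)})
        return {"status": "failed", "error": str(exc)}
    guard.log(agent_id, action, "success", {**context, "result": outcome})
    return {"status": "success", "result": outcome}
```

Because the stand-in records every log call, a test can assert that no code path returns without logging.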

3. Tiered Permissions

Use tiered permissions for different risk levels:

# Setup in AgentWarden dashboard:
# 
# Action: stripe.refund.small
# Max Amount: $50
# Requires Approval: No
#
# Action: stripe.refund.medium  
# Max Amount: $200
# Requires Approval: No
#
# Action: stripe.refund.large
# Max Amount: $1000
# Requires Approval: Yes

def process_refund(amount: float):
    """Automatically route to appropriate permission tier"""
    if amount <= 50:
        action = "stripe.refund.small"
    elif amount <= 200:
        action = "stripe.refund.medium"
    else:
        action = "stripe.refund.large"
    
    result = guard.check(AGENT_ID, action, {"amount": amount})
    
    # Handle based on result...

Benefits:

Automatic escalation for high-risk actions
Reduce approval bottlenecks for low-risk actions
Clear risk boundaries
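
The tier routing can also be kept in a table so the limits live in one place. This is a minimal sketch that assumes the three limits from the dashboard setup above; `REFUND_TIERS` and `select_refund_action` are hypothetical names. Unlike an open-ended `else`, it returns `None` for amounts above the highest tier so the caller can reject them explicitly.

```python
from typing import Optional

# (upper bound in dollars, action name), mirroring the dashboard setup above
REFUND_TIERS = [
    (50.0, "stripe.refund.small"),
    (200.0, "stripe.refund.medium"),
    (1000.0, "stripe.refund.large"),
]


def select_refund_action(amount: float) -> Optional[str]:
    """Return the lowest tier whose limit covers the amount, or None."""
    if amount <= 0:
        raise ValueError("refund amount must be positive")
    for limit, action in REFUND_TIERS:
        if amount <= limit:
            return action
    return None  # above every tier: no action applies
```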

Permission Management

4. Use Descriptive Action Names

Action names should clearly indicate what they do:

"stripe.refund"
"stripe.subscription.cancel"
"database.users.delete"
"email.marketing.send"
"deploy.production"
"api.external.call"

Naming convention:

service.resource.operation

Examples:

stripe.refund - Stripe service, refund operation
database.users.delete - Database service, users resource, delete operation
api.sendgrid.email.send - API integration, SendGrid, email resource, send operation
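
A naming convention is easier to keep when it is checked in code. The sketch below assumes lowercase, dot-separated segments and allows two to four of them, which covers the examples above; the exact rules are an assumption for illustration, not something AgentWarden is known to enforce, and `is_valid_action_name` is a hypothetical helper.

```python
import re

# Lowercase dot-separated segments; 2 to 4 segments covers the examples above.
ACTION_NAME = re.compile(r"[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*){1,3}")


def is_valid_action_name(name: str) -> bool:
    """True if the name follows the assumed service.resource.operation shape."""
    return ACTION_NAME.fullmatch(name) is not None
```

Running this in a unit test or pre-commit hook over the action names used in the codebase catches typos and wildcards before they reach the dashboard.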

5. Environment-Specific Permissions

Use different permissions for different environments:

import os

env = os.getenv("ENVIRONMENT", "production")

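# Staging: no approvals needed
if env == "staging":
    action = "deploy.staging"

# Production: requires approval
elif env == "production":
    action = "deploy.production"

# Any other value: stop rather than leave `action` undefined
else:
    raise ValueError(f"Unknown ENVIRONMENT: {env!r}")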

result = guard.check(AGENT_ID, action, {"version": "v2.1.0"})

Dashboard setup:

Action: deploy.staging
Requires Approval: No

Action: deploy.production
Requires Approval: Yes

6. Regularly Review Permissions

Set up a quarterly permission audit:

# Script to audit permissions
from datetime import datetime, timedelta, timezone

def audit_permissions():
    """Review all agent permissions and flag unused ones"""
    # Assumes last_used is a timezone-aware UTC datetime, or None if never used
    cutoff = datetime.now(timezone.utc) - timedelta(days=90)
    agents = get_all_agents()
    
    for agent in agents:
        permissions = get_agent_permissions(agent.id)
        
        # Check last usage
        for perm in permissions:
            last_used = get_last_log_for_action(agent.id, perm.action)
            
            if not last_used:
                print(f"⚠️ {agent.name}: '{perm.action}' never used")
            elif last_used < cutoff:
                print(f"⚠️ {agent.name}: '{perm.action}' not used in 90 days")

# Run quarterly
audit_permissions()

Context and Logging

7. Provide Rich Context

Always include comprehensive context for permission checks and logs:

result = guard.check(
    agent_id=AGENT_ID,
    action="stripe.refund",
    context={
        "amount": 50.00,
        "customer_id": "cus_ABC123",
        "customer_email": "john@example.com",
        "order_id": "ord_XYZ789",
        "order_date": "2024-02-15",
        "reason": "defective_product",
        "ticket_id": "SUPPORT-12345",
        "agent_version": "v2.1.0"
    }
)

Why it matters:

Approvers can make informed decisions
Better debugging when issues occur
Compliance and audit requirements
Analytics and reporting

8. Structured Error Context

When logging failures, include structured error information:

import traceback
from datetime import datetime

try:
    process_refund(amount)
    
except StripeError as e:
    guard.log(
        agent_id=AGENT_ID,
        action="stripe.refund",
        status="failed",
        context={
            "amount": amount,
            "error_type": type(e).__name__,
            "error_code": getattr(e, 'code', None),
            "error_message": str(e),
            "stripe_request_id": getattr(e, 'request_id', None),
            "retry_count": retry_count,
            "timestamp": datetime.utcnow().isoformat()
        }
    )
    
except Exception as e:
    guard.log(
        agent_id=AGENT_ID,
        action="stripe.refund",
        status="failed",
        context={
            "amount": amount,
            "error_type": type(e).__name__,
            "error_message": str(e),
            "traceback": traceback.format_exc()
        }
    )

9. Log Denials Too

Even if permission was denied, log it:

result = guard.check(agent_id, action, context)

if not result.allowed:
    # Always log denials
    guard.log(
        agent_id=agent_id,
        action=action,
        status="denied",
        context={
            **context,
            "denial_reason": result.reason,
            "requires_approval": result.requires_approval
        }
    )
    
    # Handle the denial
    notify_user(result.reason)

Why log denials:

Track attempted unauthorized actions
Identify misconfigured permissions
Security monitoring
Detect potential abuse

Security

10. Never Expose API Keys

Keep API keys secure:

import os
from agentwarden import AgentWarden

# Read from environment
guard = AgentWarden(api_key=os.getenv('AGENTWARDEN_API_KEY'))

Best practices:

Store in environment variables
Use secrets managers (AWS Secrets Manager, HashiCorp Vault)
Never commit to version control
Rotate keys regularly (every 90 days)
Use different keys per environment
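
`os.getenv` returns `None` when the variable is missing, so a misconfigured deployment may only fail later, at the first API call. A small loader that fails at startup makes the problem visible immediately. This is a sketch; `load_api_key` is a hypothetical helper, not part of the SDK.

```python
import os


def load_api_key(var_name: str = "AGENTWARDEN_API_KEY") -> str:
    """Return the key from the environment, or raise if it is missing."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; refusing to start without it")
    return key
```

The error message names the variable but never includes the key's value, so it is safe to log.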

11. Fail-Safe Defaults

When in doubt, deny:

import logging

logger = logging.getLogger(__name__)

def safe_check(agent_id: str, action: str, context: dict) -> bool:
    """
    Wrapper that fails safely
    """
    try:
        result = guard.check(agent_id, action, context)
        return result.allowed
        
    except Exception as e:
        # If AgentWarden is unreachable, deny by default
        logger.error(f"Permission check failed: {e}")
        return False  # Fail-safe: deny

# Usage
if safe_check(agent_id, "dangerous.action", context):
    execute_dangerous_action()
else:
    # Denied - safe!
    pass

Exception: Only allow fail-open for truly low-risk actions that you’ve explicitly whitelisted.
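
One way to keep that exception explicit is a hard-coded allowlist consulted only when the check itself errors. The sketch below is an assumption-laden illustration: `FAIL_OPEN_ACTIONS`, its example entry `metrics.read`, and `check_with_policy` are all hypothetical. A denial returned by the service is never overridden; only an exception triggers the fallback.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)

# Hypothetical allowlist: actions judged safe to run if the check itself errors.
FAIL_OPEN_ACTIONS = frozenset({"metrics.read"})


def check_with_policy(check_fn: Callable[[], bool], action: str) -> bool:
    """Run check_fn; if it raises, allow only allowlisted actions."""
    try:
        return bool(check_fn())
    except Exception as exc:
        fail_open = action in FAIL_OPEN_ACTIONS
        logger.error("Permission check failed for %s (fail_open=%s): %s",
                     action, fail_open, exc)
        return fail_open
```

A call might look like `check_with_policy(lambda: guard.check(agent_id, action, context).allowed, action)`.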

12. Least Privilege Principle

Grant only the minimum permissions needed:

# ✅ Good - Specific permissions
permissions = [
    "email.send",           # Can send emails
    "database.users.read",  # Can read user data
    "api.sendgrid.call"     # Can call SendGrid API
]

# ❌ Bad - Overly broad permissions
permissions = [
    "email.*",         # Can do anything with email
    "database.*",      # Can do anything with database
    "api.*"            # Can call any API
]

Start restrictive, then relax:

Begin with minimal permissions
Monitor for denied actions
Add permissions as needed
Remove unused permissions
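
Broad grants are easy to flag automatically during a review. This sketch assumes wildcards are written as a `*` segment, as in the "Bad" example above; `find_broad_permissions` is a hypothetical helper.

```python
from typing import Iterable, List


def find_broad_permissions(permissions: Iterable[str]) -> List[str]:
    """Return permissions that contain a wildcard segment."""
    return [p for p in permissions if "*" in p.split(".")]
```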

Performance

13. Reuse AgentWarden Instance

Create one instance and reuse it:

# app/services/guard.py
from agentwarden import AgentWarden
import os

# Create once
_guard_instance = None

def get_guard():
    global _guard_instance
    if _guard_instance is None:
        _guard_instance = AgentWarden(
            api_key=os.getenv('AGENTWARDEN_API_KEY')
        )
    return _guard_instance

# Use everywhere
from app.services.guard import get_guard

guard = get_guard()
result = guard.check(agent_id, action, context)

14. Implement Caching (Carefully)

Cache permission checks for identical actions:

from functools import lru_cache
import json
import time

@lru_cache(maxsize=1000)
def cached_check(agent_id: str, action: str, context_json: str, ttl_bucket: int):
    """
    Cache permission checks, keyed by agent, action, context, and time bucket
    
    ⚠️ Use with caution - permissions can change!
    """
    # ttl_bucket is only part of the cache key; it changes once per TTL
    # window, so entries from earlier windows are no longer hit
    context = json.loads(context_json)
    return guard.check(agent_id, action, context)

def check_with_cache(agent_id: str, action: str, context: dict, ttl: int = 60):
    """
    Check permission with caching (entries live for at most `ttl` seconds)
    """
    # Serialize context deterministically so equal contexts share a key
    context_json = json.dumps(context, sort_keys=True)
    
    # Time bucket that rolls over every `ttl` seconds
    ttl_bucket = int(time.time() // ttl)
    
    return cached_check(agent_id, action, context_json, ttl_bucket)

⚠️ Caching caveats:

Only cache for short periods (30-60 seconds)
Clear cache when permissions change
Don’t cache approval-required actions
Monitor cache hit rates
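
The caveats above can be built into the cache itself. The following is a sketch of a small TTL cache that wraps any check function, skips caching results that require approval, exposes hit and miss counters, and can be cleared when permissions change. `TTLCheckCache` is a hypothetical class, and it assumes the result object has a `requires_approval` attribute as used elsewhere in this guide.

```python
import json
import time
from typing import Any, Callable, Dict, Tuple


class TTLCheckCache:
    """Small TTL cache for check results; a sketch, not part of the SDK."""

    def __init__(self, check_fn: Callable[[str, str, dict], Any],
                 ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._check_fn = check_fn
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[Tuple[str, str, str], Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def check(self, agent_id: str, action: str, context: dict) -> Any:
        key = (agent_id, action, json.dumps(context, sort_keys=True))
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        self.misses += 1
        result = self._check_fn(agent_id, action, context)
        # Only cache plain allow/deny; approval-required results stay uncached
        if not getattr(result, "requires_approval", False):
            self._entries[key] = (now, result)
        return result

    def clear(self) -> None:
        """Drop all entries, e.g. after permissions change."""
        self._entries.clear()
```

It could be wired up as `cache = TTLCheckCache(guard.check, ttl=30)`. Expired entries are overwritten on the next miss but never evicted, so a long-running process with many distinct contexts would also need a size bound.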

15. Batch Operations

For bulk operations, check once and execute many:

# ✅ Good - Check once for batch
result = guard.check(
    agent_id=AGENT_ID,
    action="email.send_bulk",
    context={"recipient_count": len(recipients)}
)

if result.allowed:
    for recipient in recipients:
        send_email(recipient)
    
    # Log once for the batch
    guard.log(
        agent_id=AGENT_ID,
        action="email.send_bulk",
        status="success",
        context={"recipients_sent": len(recipients)}
    )

# ❌ Bad - Check for each recipient
for recipient in recipients:
    result = guard.check(AGENT_ID, "email.send", {"recipient": recipient})
    if result.allowed:
        send_email(recipient)

Testing

16. Test Permission Logic

Write tests for permission scenarios:

import pytest
from unittest.mock import Mock, patch

def test_refund_allowed_small_amount():
    """Test that small refunds are auto-approved"""
    with patch('agentwarden.AgentWarden') as MockGuard:
        mock_guard = MockGuard.return_value
        mock_guard.check.return_value = Mock(
            allowed=True,
            requires_approval=False
        )
        
        result = process_refund(amount=25.00)
        
        assert result['status'] == 'approved'
        mock_guard.check.assert_called_once()

def test_refund_requires_approval_large_amount():
    """Test that large refunds require approval"""
    with patch('agentwarden.AgentWarden') as MockGuard:
        mock_guard = MockGuard.return_value
        mock_guard.check.return_value = Mock(
            allowed=False,
            requires_approval=True,
            approval_id="apr_123"
        )
        
        result = process_refund(amount=500.00)
        
        assert result['status'] == 'pending_approval'
        assert 'approval_id' in result

17. Use Test API Keys

AgentWarden provides test API keys for development:

# .env.development
AGENTWARDEN_API_KEY=test_1234567890abcdef

# .env.production  
AGENTWARDEN_API_KEY=ak_production_key_here

Test keys:

Prefix: test_
Don’t count against plan limits
Isolated test data
Can be shared with developers

Monitoring

18. Track Key Metrics

Monitor these metrics:

from prometheus_client import Counter, Gauge, Histogram

# Permission checks
permission_checks = Counter(
    'agentwarden_checks_total',
    'Total permission checks',
    ['agent_id', 'action', 'result']
)

# Check latency
check_duration = Histogram(
    'agentwarden_check_duration_seconds',
    'Permission check duration'
)

# Approvals
approvals_pending = Gauge(
    'agentwarden_approvals_pending',
    'Number of pending approvals'
)

# Usage
with check_duration.time():
    result = guard.check(agent_id, action, context)

permission_checks.labels(
    agent_id=agent_id,
    action=action,
    result='allowed' if result.allowed else 'denied'
).inc()

Key metrics to track:

Permission check latency
Denial rate by action
Approval response time
Error rate
Rate limit hits

19. Set Up Alerts

Alert on anomalies:

# Alert on high denial rate
if denial_rate > 0.3:  # More than 30% denials
    send_alert("High denial rate detected", severity="warning")

# Alert on API errors
if error_rate > 0.05:  # More than 5% errors
    send_alert("High error rate from AgentWarden", severity="critical")

# Alert on pending approvals
if pending_approvals > 10:
    send_alert("Approval backlog building up", severity="warning")
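
The rates used in these conditions have to be derived from raw counts somewhere. A minimal sketch of that step, using the same thresholds as above, follows; `evaluate_alerts` is a hypothetical function and the choice of denominators (decisions for denial rate, all attempts for error rate) is an assumption.

```python
from typing import List, Tuple


def evaluate_alerts(allowed: int, denied: int, errors: int,
                    pending_approvals: int) -> List[Tuple[str, str]]:
    """Return (message, severity) pairs using the thresholds shown above."""
    alerts: List[Tuple[str, str]] = []
    decided = allowed + denied      # checks that returned a decision
    attempted = decided + errors    # all checks, including failed calls
    if decided and denied / decided > 0.3:
        alerts.append(("High denial rate detected", "warning"))
    if attempted and errors / attempted > 0.05:
        alerts.append(("High error rate from AgentWarden", "critical"))
    if pending_approvals > 10:
        alerts.append(("Approval backlog building up", "warning"))
    return alerts
```

Guarding against a zero denominator keeps a quiet period from raising `ZeroDivisionError` or firing a false alert.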

Documentation

20. Document Your Agents

Keep documentation for each agent:

"""
Agent: customer-support-bot
Purpose: Automated customer support with refund capabilities
Owner: support-team@company.com
Created: 2024-01-15

Permissions:
- stripe.refund.small (≤ $50) - Auto-approved
- stripe.refund.medium (≤ $200) - Auto-approved  
- stripe.refund.large (≤ $1000) - Requires approval
- email.send - Auto-approved
- zendesk.ticket.update - Auto-approved

Last Audit: 2024-02-15
Notes: 
- Large refunds (>$200) escalate to finance team
- Average approval time: 15 minutes
- Handles ~500 requests/day
"""

Checklist

Use this checklist for every new agent:

Quick Reference

Practice	Do	Don’t
Agent Design	One agent per purpose	One super agent
Permissions	Least privilege, specific actions	Broad wildcards
Security	Environment variables, fail-safe	Hardcoded keys, fail-open
Context	Rich, detailed context	Minimal context
Logging	Log everything (success, fail, deny)	Only log successes
Errors	Handle gracefully, retry with backoff	Ignore errors
Testing	Test all permission scenarios	No tests
Monitoring	Track metrics, set alerts	No monitoring
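
The "retry with backoff" entry in the Errors row can be sketched as follows. This is a generic exponential backoff with jitter, not an AgentWarden feature; `retry_with_backoff` and its parameters are hypothetical.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(fn: Callable[[], T], attempts: int = 3,
                       base_delay: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on any exception with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("attempts must be at least 1")
```

Only exceptions trigger a retry. A denial is a normal return value from the check, so it is returned on the first call and never retried.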

Next Steps

Integration Examples - See complete working examples
Error Handling - Handle errors properly
Troubleshooting - Solve common problems
Python SDK - Full SDK reference
