Hoody AI

Access 300+ AI models from your containers. Hoody AI is a self-hosted gateway running on YOUR server that gives containers simple HTTP access to Claude, GPT, Gemini, Llama, and hundreds of other models.

The simplicity: Use container-X to authenticate from any container. Fully OpenAI-compatible API—works with any library or tool that supports OpenAI’s format (OpenAI SDK, LangChain, hoody-agent, Claude Code, Cline, Cursor, etc.). Zero configuration needed.

The security: No real API keys in containers—just container-X tokens that only work from your infrastructure. Safe freelancer onboarding, vibe coding, and AI-generated code without key exposure risk.

Transparent pricing: Hoody AI adds a 5% markup on model provider costs. See Models & Pricing for current model pricing and cost optimization strategies.

What It Is

Hoody AI is an AI gateway that runs on the HOST (your bare metal server), not inside containers.

Architecture:

Your Server
├── Hoody AI Gateway (Host Only)
│   ├── URL: https://ai.hoody.icu/api/v1
│   ├── Credits: Your Hoody AI credits
│   └── Accessible: Only from containers on this server
│
└── Container 1, 2, 3...
    └── Auth: "container-1" (proves container identity)

Privacy by architecture: Hoody AI runs on YOUR server (the host). Traffic flows from your containers through the gateway on that host, then out to AI providers (OpenAI, Anthropic, Google, Meta, etc.). No Hoody-operated platform servers sit between your gateway and the inference providers. Provider costs are billed with the 5% markup. For complete control, you can proxy to your own AI providers via hoody-exec.

How It Works

1. Container Gets Magic API Key

When you create a container, it automatically gets access to Hoody AI via a container identity token:

API Key: container-{containerName}

Example: a container named dev-env has the identity token container-dev-env. Containers created without a custom name are addressable by their numbered form (container-1, container-2, …). Both the name-derived and numbered forms are accepted and identify the container itself; neither is an API key that works when copied elsewhere.
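As a minimal sketch, the naming rule above can be written as two small helpers. The function names containerToken and authHeaders are illustrative and are not part of any Hoody SDK.

```javascript
// Illustrative helpers (hypothetical names, not part of a Hoody SDK).

// Build the identity token for a container, following the
// container-{containerName} rule described above.
function containerToken(containerName) {
  if (typeof containerName !== 'string' || containerName.length === 0) {
    throw new Error('containerName must be a non-empty string');
  }
  return `container-${containerName}`;
}

// Build the HTTP headers for a JSON request to the gateway.
function authHeaders(containerName) {
  return {
    Authorization: `Bearer ${containerToken(containerName)}`,
    'Content-Type': 'application/json',
  };
}
```

Keeping the token construction in one place avoids typos in the prefix, which would otherwise surface as a 401 response.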

2. Works with Everything

Hoody AI is fully OpenAI-compatible, so it works with:

Any OpenAI SDK (Python, Node.js, Go, etc.)
AI frameworks (LangChain, LlamaIndex, etc.)
AI coding tools (Cursor, Windsurf, Claude Code, Cline, Continue.dev)
hoody-agent - Native integration for container orchestration

# hoody-agent can use Hoody AI directly:
# set the base URL and key in ai_config
curl -X POST "https://{projectId}-{containerId}-workspaces-1.{node}.containers.hoody.icu/api/tasks" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Build a todo app",
    "ai_config": {
      "base_url": "https://ai.hoody.icu/api/v1",
      "api_key": "container-dev-env",
      "model": "anthropic/claude-sonnet-4.5"
    }
  }'

Direct AI request from a terminal:

POST https://ai.hoody.icu/api/v1/chat/completions

Settings:

Base URL: https://ai.hoody.icu/api/v1
API Key: container-{containerName}
Provider: Custom (OpenAI-compatible)

Works immediately. No key rotation needed.

AI Settings:

Custom endpoint: https://ai.hoody.icu/api/v1
API Key: container-{containerName}

All AI features work without exposing real keys.

3. Make an AI Request

Once your container is running, call Hoody AI directly:

SDK (JavaScript), followed by the HTTP equivalent:

import { HoodyClient } from '@hoody-ai/hoody-sdk';

const client = new HoodyClient({ baseURL: 'https://api.hoody.icu', token: process.env.HOODY_TOKEN });

// List available AI models
const models = await client.api.ai.listModels();
console.log(models.data.models);

// For chat completions, call the AI gateway directly from your container:
const response = await fetch('https://ai.hoody.icu/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer container-dev-env',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'anthropic/claude-sonnet-4.5',
    messages: [{ role: 'user', content: 'Hello!' }]
  })
});
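// Fail loudly on HTTP errors (for example 401 or 429) before reading the body
if (!response.ok) {
  throw new Error(`Hoody AI request failed with status ${response.status}`);
}
const data = await response.json();
console.log(data.choices[0].message.content);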

# List available AI models (from the gateway)
curl "https://ai.hoody.icu/api/v1/models" \
  -H "Authorization: Bearer container-dev-env"

# Chat completion
curl -X POST "https://ai.hoody.icu/api/v1/chat/completions" \
  -H "Authorization: Bearer container-dev-env" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.5",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
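The request shown above can be wrapped in a reusable helper. This is a sketch under the assumptions of this guide (the gateway URL and Bearer container token); the names buildChatRequest and chat are illustrative, and the code relies on the global fetch available in Node.js 18 or later.

```javascript
// Gateway base URL as given in this guide.
const BASE_URL = 'https://ai.hoody.icu/api/v1';

// Build the arguments for fetch(). Kept separate from the network call so
// the request can be inspected or logged before it is sent.
function buildChatRequest(containerName, model, messages) {
  return {
    url: `${BASE_URL}/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer container-${containerName}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Send the request and return the text of the first choice,
// throwing on any non-2xx HTTP status.
async function chat(containerName, model, messages) {
  const { url, init } = buildChatRequest(containerName, model, messages);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Hoody AI request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

A call such as chat('dev-env', 'anthropic/claude-sonnet-4.5', [{ role: 'user', content: 'Hello!' }]) would then resolve to the reply text when run from inside the container.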

4. Automatic Container Access

AI access is enabled by default for all containers. Each container can immediately start making AI requests using its container-X authentication token.

Why Hoody AI

Universal Model Access

300+ models from 15+ AI inference providers through one API. Your server communicates directly with these providers through Hoody AI’s gateway (with our 5% markup):

Major Inference Providers:

Anthropic - Claude Opus 4.1, Sonnet 4.5, Haiku 4.0
OpenAI - GPT-4o, GPT-4 Turbo, GPT-3.5
Google (Vertex AI) - Gemini 2.5 Pro Exp, Gemini 1.5 Pro, Gemini Flash
Meta (via providers) - Llama 3.3 70B, Llama 3.1 405B, Llama 3.1 70B
Mistral AI - Mistral Large, Mistral Medium, Mixtral 8x7B
Deepseek - Deepseek V3, Deepseek Coder V2
Qwen (Alibaba) - Qwen 2.5 72B, QwQ 32B Preview
Cohere - Command R+, Command R, Embed models
xAI - Grok 2, Grok 2 Vision
Perplexity AI - Sonar Pro, Sonar models
Together AI - Hosting platform for open models
Fireworks AI - Optimized inference for open models
And more providers…

Browse the complete model list: See Models & Pricing for all available models, current pricing, and provider-specific capabilities.

No provider management needed: One API, hundreds of models. No separate accounts or SDKs.

Bring Your Own Provider — 75+ Options, No Lock-In

You’re never forced into Hoody AI’s gateway. Connect any provider directly: set ANTHROPIC_API_KEY, OPENAI_API_KEY, or any provider’s environment variable inside a container and call them straight from your code. Point hoody-agent, Cursor, or Cline at any OpenAI-compatible endpoint — local Ollama, Azure OpenAI, Together AI, an enterprise proxy — and the entire stack works without modification. Switch models mid-conversation by swapping the active profile. A/B test Claude versus GPT-4o across two containers simultaneously. Today Claude Sonnet, tomorrow your own fine-tuned Llama, next week whatever model wins the benchmarks. It’s a config change, not a migration.

# Direct provider access — set in container environment
ANTHROPIC_API_KEY=sk-ant-...   # Anthropic direct
OPENAI_API_KEY=sk-...          # OpenAI direct
OPENAI_BASE_URL=http://localhost:11434/v1  # Local Ollama

# Or route through Hoody AI for key-less container auth
# API key: container-{containerName}  → works from this server only
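To illustrate the "config change, not a migration" idea, here is a sketch of a profile table that resolves to a base URL and key for any OpenAI-compatible client. The profile names, the PROFILES structure, and the CONTAINER_NAME environment variable are assumptions made for this example; the Hoody AI and Ollama URLs are the ones used in this guide.

```javascript
// Hypothetical profile table. CONTAINER_NAME is an assumed environment
// variable, not one this guide says Hoody sets for you.
const PROFILES = {
  hoody: {
    baseURL: 'https://ai.hoody.icu/api/v1',
    keyFrom: (env) => env.CONTAINER_NAME && `container-${env.CONTAINER_NAME}`,
  },
  openai: {
    baseURL: 'https://api.openai.com/v1',
    keyFrom: (env) => env.OPENAI_API_KEY,
  },
  ollama: {
    baseURL: 'http://localhost:11434/v1',
    // Ollama ignores the key, but OpenAI clients expect a non-empty value.
    keyFrom: () => 'ollama',
  },
};

// Resolve a profile name to the two settings an OpenAI-compatible client needs.
function resolveProfile(name, env) {
  if (!Object.hasOwn(PROFILES, name)) {
    throw new Error(`Unknown profile: ${name}`);
  }
  const profile = PROFILES[name];
  const apiKey = profile.keyFrom(env);
  if (!apiKey) {
    throw new Error(`Profile ${name} has no API key available`);
  }
  return { baseURL: profile.baseURL, apiKey };
}
```

Switching providers then means changing the profile name passed to resolveProfile, for example resolveProfile('ollama', process.env), while the calling code stays the same.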

Container-Native Integration

Built specifically for container-based workflows:

Each container gets automatic AI access
Use container-X format for authentication
Works immediately with hoody-agent, AI coding tools
No environment variable management
No credential rotation needed

Security as a Benefit

Container-restricted authentication means:

No real API keys stored in containers
Access automatically tied to container lifecycle
Safe freelancer/contractor onboarding
Protection against key leakage in AI-generated code

Intercept & Control AI

The game-changer: Because Hoody AI requests flow through HTTP, you can intercept and modify everything using hoody-exec as a MITM (Man-In-The-Middle) proxy.

The simplicity: Deploy a MITM script once, then just change the URL in your AI client.

Without MITM: https://ai.hoody.icu/api/v1
With MITM:    https://your-project-container-exec-1.node-us.containers.hoody.icu/api/v1

Switch on-demand. No code changes. Complete control.

What you gain:

🎛️ Tool Call Tampering - Intercept and modify AI tool calls (redirect file writes, block dangerous commands, modify paths)
👤 Human-in-the-Loop - Pause AI for approval on high-stakes operations (deployments, deletions, payments)
🤖 Agent Cascades - Trigger other agent instances via HTTP, building multi-agent systems
💰 Cost Optimization - Compress prompts, cache responses, route to cheaper models (40-70% savings)
🧠 Context Injection - Auto-enhance AI with your knowledge base, company policies, codebase docs
📊 Complete Observability - Log every prompt, response, and decision for debugging and compliance
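As one illustration of the caching idea in the list above, a proxy could keep an in-memory cache keyed by the request fields that determine the output. This is a sketch, not a Hoody feature: the names cacheKey and ResponseCache are hypothetical, and it only helps when identical prompts repeat.

```javascript
// Derive a cache key from the fields that determine the model's output.
// Other fields (such as stream) are deliberately ignored.
function cacheKey(body) {
  return JSON.stringify({
    model: body.model,
    messages: body.messages,
    temperature: body.temperature ?? null,
  });
}

// Minimal in-memory cache for chat completion responses.
class ResponseCache {
  constructor() {
    this.entries = new Map();
  }

  // Return the cached response for this request body, or undefined.
  get(body) {
    return this.entries.get(cacheKey(body));
  }

  // Store a response under the key derived from the request body.
  set(body, response) {
    this.entries.set(cacheKey(body), response);
  }
}
```

Inside a proxy script, the cache would be consulted before forwarding the request to the gateway and filled after a successful response. Caching is most defensible for requests sent with temperature 0, where repeated calls are expected to return similar output anyway.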

Quick example - Sandbox all AI file operations in 30 lines:

// /api/ai-proxy.js in hoody-exec
// @mode worker

// Forward the incoming request body to the Hoody AI gateway unchanged.
const response = await fetch('https://ai.hoody.icu/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer container-1',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify(req.body)
});

const data = await response.json();

// Intercept and redirect file operations to the sandbox.
// The ?? [] fallbacks guard against error responses that carry no choices.
for (const choice of data.choices ?? []) {
  for (const call of choice.message?.tool_calls ?? []) {
    if (call.function.name === 'write_file') {
      const args = JSON.parse(call.function.arguments);
      // Drop empty, "." and ".." segments so the path cannot climb out of /sandbox
      const safe = String(args.path).split('/').filter(s => s && s !== '.' && s !== '..');
      args.path = '/sandbox/' + safe.join('/');  // Force sandbox
      call.function.arguments = JSON.stringify(args);
    }
  }
}

return res.json(data);

AI can code freely while every write_file call is redirected into /sandbox.
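The same interception point can support a human-in-the-loop step. The sketch below separates tool calls that may proceed from those that should wait for approval. The tool names in NEEDS_APPROVAL and the function name partitionToolCalls are hypothetical; how a held call is actually surfaced to a person is left out.

```javascript
// Tool names that should wait for a human decision (hypothetical list).
const NEEDS_APPROVAL = new Set(['deploy', 'delete_file', 'charge_payment']);

// Split the tool calls in an OpenAI-format chat completion response into
// those that may proceed and those that should be held for approval.
function partitionToolCalls(data) {
  const allowed = [];
  const held = [];
  for (const choice of data.choices ?? []) {
    for (const call of choice.message?.tool_calls ?? []) {
      if (NEEDS_APPROVAL.has(call.function.name)) {
        held.push(call);
      } else {
        allowed.push(call);
      }
    }
  }
  return { allowed, held };
}
```

A proxy script could return the response immediately when held is empty, and otherwise pause until a reviewer accepts or rejects each held call.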

See the full guide: Intercept & Control AI → covers tool call interception, agent cascade orchestration, human-in-the-loop workflows, stalling patterns, context injection, cost optimization, and more.

Use Cases

1. Safe Freelancer Onboarding

Give contractors access to a container and its container-X API key. They can use AI tools (Cursor, Windsurf, Claude Code) without ever seeing your real keys. Delete the container when the project ends for instant revocation.

2. Consumer SaaS with AI

Build applications that use AI without embedding real API keys. Users can’t extract keys even with full source access. Keys only work from your infrastructure.

3. Vibe-Coded Apps

Let AI generate entire applications. Even if generated code tries to log/exfiltrate API keys, container-X is useless outside your server.

4. Multi-Tenant AI Access

Each client gets their own container with isolated AI access. Enable/disable per client. Track usage per container.
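One way to track usage per container is to tally the token counts that OpenAI-format responses report in their usage object. This is a sketch of do-it-yourself accounting, for example inside a hoody-exec proxy, and not a description of a built-in Hoody report; recordUsage is a hypothetical name.

```javascript
// Add the tokens reported by one response to a running per-container total.
// `totals` is a Map from container name to token count.
function recordUsage(totals, containerName, response) {
  // Responses without a usage object (for example errors) count as zero.
  const used = response.usage?.total_tokens ?? 0;
  totals.set(containerName, (totals.get(containerName) ?? 0) + used);
  return totals;
}
```

Calling recordUsage after every completed request yields a Map that can be exported or compared against a per-client budget.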

5. Development Environments

Developers use AI coding assistants (Cursor, Cline) with container-dev key. Production uses container-prod. No key sharing between environments.

Best Practices

Container Naming Strategy

Use descriptive container names—they become part of the API key:

container-prod-frontend (clear purpose)
container-dev-alice (per-developer)
container-client-acme (per-client)

Use Specific Models

Specify exact models in AI requests to control costs:

{
  "model": "anthropic/claude-haiku-4.0"
}

Use cheaper models (Haiku) for simple tasks, more capable models (Opus) for complex ones.
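A simple way to apply this is to choose the model per request. The sketch below routes by prompt length, which is a deliberately crude stand-in for task complexity; the function name pickModel and the 500-character default are assumptions, and the model IDs should be checked against the live model list.

```javascript
// Pick a model ID based on prompt length.
// `cheap` and `capable` are model IDs supplied by the caller;
// `maxCheapChars` is an arbitrary cut-off chosen for illustration.
function pickModel(prompt, { cheap, capable, maxCheapChars = 500 }) {
  return prompt.length <= maxCheapChars ? cheap : capable;
}

// Example configuration using model IDs that appear in this guide.
const tiers = {
  cheap: 'anthropic/claude-haiku-4.0',
  capable: 'anthropic/claude-sonnet-4.5',
};
```

With this configuration, pickModel('Summarize this sentence.', tiers) selects the cheaper model, while a prompt longer than the cut-off selects the more capable one.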

Monitor Container AI Usage

Monitor which containers have AI access enabled:

# List all containers and their AI status
hoody containers list -o json | jq '.[] | {id, name, ai, status}'

import { HoodyClient } from '@hoody-ai/hoody-sdk';

const client = new HoodyClient({ baseURL: 'https://api.hoody.icu', token: process.env.HOODY_TOKEN });

const containers = await client.api.containers.list();
containers.data.containers.forEach(c => {
  console.log(c.name, c.ai, c.status);
});

# List all containers and their AI status
curl "https://api.hoody.icu/api/v1/containers" \
  -H "Authorization: Bearer $HOODY_TOKEN" \
  | jq '.data.containers[] | {id, name, ai, status}'

Snapshot Before AI-Heavy Operations

Create a snapshot before letting AI generate large amounts of code. Restore it if the results are undesirable.

Useful Questions

Can I use my own API keys or other AI gateways?

Absolutely. Because it’s YOUR infrastructure, you’re completely free to use any AI provider or gateway you want:

Set environment variables in your containers and use providers directly (OpenAI, Anthropic, etc.)
Use other AI gateways, Together AI, or self-hosted models
Proxy through hoody-exec to any external service (see MITM section above)
Mix multiple providers with custom routing logic

Trade-offs to consider:

Using raw API keys in containers exposes them to container processes (security risk if containers are compromised)
Hoody AI’s container-X authentication provides key isolation—keys only work from your infrastructure
With custom proxies via hoody-exec, you control everything but manage your own security

The freedom: It’s your bare metal server, your containers, your choice of AI provider. Hoody AI is convenient and secure, but not required.

What happens if I copy the `container-X` key elsewhere?

It won’t work. Keys are cryptographically bound to the specific container and your server. Copied keys fail authentication.

Can containers access each other’s AI requests?

No. Each container’s AI authentication is isolated. Container A cannot see or intercept Container B’s AI traffic.

Does this work with local models (Ollama, LM Studio)?

Hoody AI is for cloud providers. For local models, run them directly in containers and access via localhost.

What models are supported?

See Models & Pricing for the complete list. Hoody AI supports 300+ models from providers including Anthropic, OpenAI, Google, Meta, Mistral, and more.

Troubleshooting

“Unauthorized” Error

Problem: AI requests return 401 Unauthorized

Solutions:

Check API key format: container-{containerName} (exact match)
Confirm request is coming from the correct container
Ensure container is running (stopped containers can’t access AI)
Verify Hoody AI service is running on your server

“Model Not Found”

Problem: Requested model doesn’t exist

Solution: Model IDs pass through from Hoody AI’s upstream catalog, so the live list is authoritative. Pull it from your gateway and use the exact ID returned:

curl -s https://ai.hoody.icu/api/v1/models -H "Authorization: Bearer container-$NAME" | jq -r '.data[].id'

{"model": "anthropic/claude-sonnet-4.5"}  // ✅ exact ID returned by /models
{"model": "claude-sonnet"}                 // ❌ short names are not valid IDs
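The same check can be done in code before sending a request. This sketch takes the list of IDs returned by the models endpoint and reports whether a requested ID is an exact match, with suggestions when it is not. The function name checkModelId is hypothetical, and the IDs in the usage note are sample data.

```javascript
// Check a requested model ID against the IDs returned by GET /models.
// Returns whether it is an exact match, plus any IDs that contain the
// requested text (useful when a short name was used by mistake).
function checkModelId(availableIds, requested) {
  if (availableIds.includes(requested)) {
    return { valid: true, suggestions: [] };
  }
  const needle = requested.toLowerCase();
  const suggestions = availableIds.filter((id) => id.toLowerCase().includes(needle));
  return { valid: false, suggestions };
}
```

For a list containing 'anthropic/claude-sonnet-4.5', the short name 'claude-sonnet' is reported as invalid and the full ID is offered as a suggestion.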

Rate Limit Exceeded

Problem: Too many requests from container

Solutions:

Implement request throttling in application code
Check your Hoody AI credits balance
Distribute workload across multiple containers
Use more efficient models (e.g., Haiku instead of Opus)
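Request throttling in application code often takes the form of retrying with exponential backoff. The sketch below assumes the gateway signals rate limiting with HTTP status 429, which is the usual convention but is not confirmed by this guide; the function names and delay values are illustrative.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt,
// capped at maxMs.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 8000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Call `send` (a function returning a Promise of a fetch Response) and
// retry while it returns 429, waiting longer after each attempt.
// The last response is returned even if it is still a 429.
async function withRetry(send, maxAttempts = 4) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await send();
    if (response.status !== 429 || attempt === maxAttempts - 1) {
      return response;
    }
    await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
  }
}
```

With the defaults, the waits between attempts are 500 ms, 1000 ms, and 2000 ms. The caller still needs to check the status of the returned response.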

Connection Timeout

Problem: AI requests timing out

Solutions:

Verify container has network access (firewall rules)
Check if Hoody AI service is running on host
Ensure container isn’t being rate-limited at network level
Try with simpler prompt to isolate issue

What’s Next

Learn More:

Usage Guide → - Complete examples and integration patterns
Security Model → - How key-less operation protects you
Models & Pricing → - Browse available AI models and pricing
Intercept & Control → - MITM proxy patterns with hoody-exec

Use AI in Apps:

hoody-exec → - Turn scripts into AI-powered APIs
Claude Code/Cline Setup → - Use AI IDEs securely

Related Concepts:

The HTTP Revolution → - Why HTTP enables AI superpowers
Security Principles → - How Hoody AI fits into overall security
Container Management → - Managing container AI permissions
