Overview

The Warm Pool is a container management system that keeps a set of pre-warmed, ready-to-use containers for each supported runtime. Serving requests from these containers reduces cold start latency from seconds to tens of milliseconds.

How It Works

┌─────────────────────────────────────────────────────────────┐
│                      Warm Pool Manager                       │
├─────────────────────────────────────────────────────────────┤
│  ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐ │
│  │  Python   │  │   Node    │  │    Go     │  │   Java    │ │
│  │  Pool     │  │   Pool    │  │   Pool    │  │   Pool    │ │
│  │  [3]      │  │   [3]     │  │   [3]     │  │   [3]     │ │
│  └───────────┘  └───────────┘  └───────────┘  └───────────┘ │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                     Health Monitor                           │
├─────────────────────────────────────────────────────────────┤
│  - Periodic health checks    - Auto-replacement              │
│  - Container TTL management  - Pool size maintenance         │
└─────────────────────────────────────────────────────────────┘

Performance Benefits

Metric               Without Warm Pool   With Warm Pool   Improvement
Cold Start           2-5 seconds         50-100 ms        20-50x faster
First Request        ~3 seconds          ~80 ms           37x faster
Container Overhead   High                Minimal          Eliminated

Configuration

Environment Variables

# Enable warm pool (default: true)
WARM_POOL_ENABLED=true

# Containers per runtime (default: 3)
WARM_POOL_SIZE_PER_RUNTIME=3

# Container time-to-live in seconds (default: 3600)
WARM_POOL_CONTAINER_TTL=3600

# Health check interval in seconds (default: 30)
WARM_POOL_HEALTH_CHECK_INTERVAL=30
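
As a rough illustration, an application could read these settings as sketched below. The `WarmPoolConfig` class and `load_config` helper are hypothetical names, and the boolean parsing rule is an assumption; only the variable names and defaults come from the list above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WarmPoolConfig:
    enabled: bool = True
    size_per_runtime: int = 3
    container_ttl: int = 3600         # seconds
    health_check_interval: int = 30   # seconds


def load_config(env=None) -> WarmPoolConfig:
    """Build a config from environment variables, using the documented defaults."""
    env = os.environ if env is None else env
    return WarmPoolConfig(
        # Assumption: only the literal string "true" (any case) enables the pool.
        enabled=env.get("WARM_POOL_ENABLED", "true").strip().lower() == "true",
        size_per_runtime=int(env.get("WARM_POOL_SIZE_PER_RUNTIME", "3")),
        container_ttl=int(env.get("WARM_POOL_CONTAINER_TTL", "3600")),
        health_check_interval=int(env.get("WARM_POOL_HEALTH_CHECK_INTERVAL", "30")),
    )
```

Passing a plain dict as `env` makes the loader easy to exercise without touching the real process environment.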

Pool Size Recommendations

Use Case                    Pool Size   Notes
Development                 1-2         Lower resource usage
Production (Low Traffic)    3-5         Good balance
Production (High Traffic)   10+         Scale as needed

Container Lifecycle

1. Container Creation

Containers are pre-created and warmed during startup:
┌─────────────────┐
│  Pool Manager   │
│    Startup      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐     ┌─────────────────┐
│  Create         │────▶│  Initialize     │
│  Container      │     │  Runtime        │
└─────────────────┘     └────────┬────────┘
                                 │
                                 ▼
                        ┌─────────────────┐
                        │  Add to Pool    │
                        │  (Ready)        │
                        └─────────────────┘
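
The create, initialize, and add-to-pool flow in the diagram can be sketched in a few lines. This is a minimal in-memory model, not the actual implementation: `Container`, `create_container`, `initialize_runtime`, and `fill_pools` are hypothetical names, and the two helper functions stand in for real container startup and runtime warm-up.

```python
from collections import deque
from dataclasses import dataclass
from itertools import count

_ids = count(1)


@dataclass
class Container:
    runtime: str
    container_id: int
    ready: bool = False


def create_container(runtime: str) -> Container:
    # Stand-in for starting a real container.
    return Container(runtime=runtime, container_id=next(_ids))


def initialize_runtime(container: Container) -> Container:
    # Stand-in for warming the interpreter or VM inside the container.
    container.ready = True
    return container


def fill_pools(runtimes, size_per_runtime=3):
    """Create, initialize, and enqueue containers for every runtime."""
    pools = {}
    for runtime in runtimes:
        pools[runtime] = deque(
            initialize_runtime(create_container(runtime))
            for _ in range(size_per_runtime)
        )
    return pools
```

Calling `fill_pools(["python", "go"])` at startup would produce two pools of three ready containers each under these assumptions.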

2. Container Acquisition

When a request arrives:
# Pseudocode for container acquisition
language = "python"
container = warm_pool.acquire(language=language)
if container is None:
    # Pool miss: fall back to creating a new container (cold start)
    container = create_new_container(language)
# Pool hit or fallback, the execution path is the same
result = container.execute(code)
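
A self-contained version of this acquire-or-fall-back logic might look like the following. The `WarmPool` class here is a toy in-memory stand-in, and `EchoContainer` and `run` are hypothetical helpers added only so the example executes; the hit and miss counters mirror the metrics described under Monitoring.

```python
from collections import deque


class WarmPool:
    """Toy in-memory pool: one queue of ready containers per runtime."""

    def __init__(self, ready):
        # ready maps a runtime name to an iterable of pre-warmed containers.
        self._pools = {runtime: deque(items) for runtime, items in ready.items()}
        self.hits = 0
        self.misses = 0

    def acquire(self, language):
        """Return a warm container, or None if none is available for this runtime."""
        pool = self._pools.get(language)
        if pool:  # runtime is known and its deque is non-empty
            self.hits += 1
            return pool.popleft()
        self.misses += 1
        return None


class EchoContainer:
    """Stand-in container whose execute() just reports what it was asked to run."""

    def __init__(self, language, warm):
        self.language = language
        self.warm = warm

    def execute(self, code):
        return {"language": self.language, "warm": self.warm, "code": code}


def run(pool, language, code):
    container = pool.acquire(language)
    if container is None:
        # Pool miss: pay the cold start cost and create a container on demand.
        container = EchoContainer(language, warm=False)
    return container.execute(code)
```

This sketch is single-threaded; a real pool serving concurrent requests would need locking or a thread-safe queue around `acquire`.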

3. Container Reset & Return

After execution, containers are reset and returned to the pool:
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Execution     │────▶│   Reset State   │────▶│  Return to      │
│   Complete      │     │   Clean Env     │     │  Pool           │
└─────────────────┘     └─────────────────┘     └─────────────────┘
The reset process ensures:
  • No data leakage between executions
  • Clean environment variables
  • Fresh filesystem state
  • Reset resource counters
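
The reset-then-return step can be modeled as below. This is a sketch under stated assumptions: the `Container` fields (`env`, `files`, `cpu_seconds`) are hypothetical placeholders for environment variables, filesystem state, and resource counters, and discarding a container whose reset fails is one possible policy, not a documented one.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Container:
    runtime: str
    env: dict = field(default_factory=dict)
    files: dict = field(default_factory=dict)
    cpu_seconds: float = 0.0

    def reset(self):
        """Drop all per-execution state so the next caller starts clean."""
        self.env.clear()
        self.files.clear()
        self.cpu_seconds = 0.0


def release(pool: deque, container: Container) -> bool:
    """Reset the container, then return it to the pool.

    Returns False and leaves the pool untouched if the reset raises,
    so a container in an unknown state is never handed out again.
    """
    try:
        container.reset()
    except Exception:
        return False
    pool.append(container)
    return True
```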

Health Monitoring

The warm pool checks container health periodically, at the interval set by WARM_POOL_HEALTH_CHECK_INTERVAL (30 seconds by default).

Health Check Types

  1. Liveness Check: Container is running
  2. Readiness Check: Runtime is initialized
  3. Resource Check: Memory/CPU within limits
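
The three checks can be expressed as a single function that reports the first one to fail. The `ContainerStatus` fields and the memory and CPU limits below are illustrative assumptions; the source does not state the actual thresholds.

```python
from dataclasses import dataclass


@dataclass
class ContainerStatus:
    running: bool
    runtime_initialized: bool
    memory_mb: float
    cpu_percent: float


def check_health(status, max_memory_mb=512.0, max_cpu_percent=90.0):
    """Return the name of the first failing check, or None if the container is healthy."""
    if not status.running:
        return "liveness"
    if not status.runtime_initialized:
        return "readiness"
    if status.memory_mb > max_memory_mb or status.cpu_percent > max_cpu_percent:
        return "resource"
    return None
```

Ordering the checks this way means a stopped container is reported as a liveness failure rather than a readiness one.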

Auto-Replacement

Unhealthy containers are automatically replaced:
┌─────────────────┐     ┌─────────────────┐
│  Health Check   │────▶│   Container     │
│     Failed      │     │   Unhealthy     │
└─────────────────┘     └────────┬────────┘
                                 │
         ┌───────────────────────┴───────────────────────┐
         │                                               │
         ▼                                               ▼
┌─────────────────┐                             ┌─────────────────┐
│   Remove from   │                             │  Create New     │
│   Pool          │                             │  Container      │
└─────────────────┘                             └─────────────────┘
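
A minimal sketch of the remove-and-replace step follows. The `replace_unhealthy` function and its callback parameters are hypothetical; it assumes the pool is a plain list and that one new container is created for each one removed, which keeps the pool at its configured size.

```python
def replace_unhealthy(pool, is_healthy, create_container):
    """Remove unhealthy containers from pool (a list) in place and
    append one freshly created container per removal.

    Returns the number of containers that were replaced.
    """
    healthy = [c for c in pool if is_healthy(c)]
    removed = len(pool) - len(healthy)
    healthy.extend(create_container() for _ in range(removed))
    pool[:] = healthy
    return removed
```

A health monitor could call this once per check interval for each runtime's pool.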

Advanced Configuration

Per-Runtime Settings

Configure pool size per language:
WARM_POOL_PYTHON_SIZE=5
WARM_POOL_JAVASCRIPT_SIZE=3
WARM_POOL_GO_SIZE=2
WARM_POOL_JAVA_SIZE=2
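
The source does not say how these per-runtime variables interact with WARM_POOL_SIZE_PER_RUNTIME. The sketch below assumes the most common precedence, where a per-runtime value overrides the global one; `pool_size_for` is a hypothetical helper.

```python
import os


def pool_size_for(runtime, env=None, default=3):
    """Resolve the pool size for one runtime.

    Assumed precedence: WARM_POOL_<RUNTIME>_SIZE, then
    WARM_POOL_SIZE_PER_RUNTIME, then the documented default of 3.
    """
    env = os.environ if env is None else env
    specific = env.get(f"WARM_POOL_{runtime.upper()}_SIZE")
    if specific is not None:
        return int(specific)
    return int(env.get("WARM_POOL_SIZE_PER_RUNTIME", default))
```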

TTL Management

Containers are recycled after TTL expires to prevent resource drift:
# Recycle containers every hour
WARM_POOL_CONTAINER_TTL=3600

# Grace period before forced termination
WARM_POOL_GRACEFUL_SHUTDOWN=30
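
TTL-based recycling amounts to comparing each container's age with the configured TTL. The sketch below uses hypothetical names (`PooledContainer`, `split_by_ttl`) and does not model the graceful shutdown period; it only decides which containers are due for recycling.

```python
import time
from dataclasses import dataclass, field


@dataclass
class PooledContainer:
    name: str
    # Monotonic timestamp taken when the container was created.
    created_at: float = field(default_factory=time.monotonic)


def split_by_ttl(containers, ttl_seconds=3600, now=None):
    """Partition containers into (fresh, expired) by age against the TTL."""
    now = time.monotonic() if now is None else now
    fresh, expired = [], []
    for container in containers:
        age = now - container.created_at
        (expired if age >= ttl_seconds else fresh).append(container)
    return fresh, expired
```

Using a monotonic clock rather than wall-clock time avoids premature or delayed recycling when the system clock is adjusted.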

Monitoring

Metrics

The warm pool exposes the following metrics:
Metric                      Description
warm_pool_size              Current pool size per runtime
warm_pool_hits              Requests served from pool
warm_pool_misses            Requests requiring new containers
warm_pool_health_failures   Health check failures

API Endpoint

Check warm pool status:
curl -X GET http://localhost:8002/api/v1/warm-pool/status \
  -H "X-API-Key: your-api-key"
Response:
{
  "enabled": true,
  "pools": {
    "python": {
      "size": 3,
      "available": 2,
      "in_use": 1
    },
    "javascript": {
      "size": 3,
      "available": 3,
      "in_use": 0
    }
  },
  "total_hits": 1523,
  "total_misses": 12,
  "hit_rate": 0.992
}
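
The same request can be made from Python with the standard library. The endpoint, header, and response fields are taken from the example above; `fetch_status` and `pool_utilization` are hypothetical helper names, and the API key is a placeholder.

```python
import json
import urllib.request


def fetch_status(base_url="http://localhost:8002", api_key="your-api-key"):
    """GET the warm pool status and return the decoded JSON body."""
    request = urllib.request.Request(
        f"{base_url}/api/v1/warm-pool/status",
        headers={"X-API-Key": api_key},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


def pool_utilization(status):
    """Map each runtime to the fraction of its containers currently in use."""
    return {
        name: (pool["in_use"] / pool["size"] if pool["size"] else 0.0)
        for name, pool in status["pools"].items()
    }
```

For the sample response above, `pool_utilization` reports one third of the python pool in use and none of the javascript pool.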

Best Practices

  • Monitor your traffic patterns and adjust the pool size to match. Too small a pool causes misses; too large a pool wastes resources.
  • Keep health checks running so that containers handed out by the pool are ready for requests.
  • Choose a TTL (WARM_POOL_CONTAINER_TTL) that balances container freshness against startup cost. 1-4 hours is typical.
  • Aim for a hit rate of 95% or higher. If it is lower, increase the pool size or investigate traffic patterns.
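
The hit rate target can be checked from the hit and miss counters. A small sketch, with hypothetical function names, treating a pool that has served no traffic yet as meeting the target:

```python
def hit_rate(hits, misses):
    """Fraction of requests served from the pool."""
    total = hits + misses
    # With no traffic yet, nothing has missed, so report a perfect rate.
    return hits / total if total else 1.0


def meets_target(hits, misses, target=0.95):
    """True when the observed hit rate is at or above the target."""
    return hit_rate(hits, misses) >= target
```

With the counters from the status example (1523 hits, 12 misses) the rate rounds to 0.992, comfortably above the 95% target.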

Troubleshooting

High Miss Rate

If experiencing many pool misses:
  1. Increase WARM_POOL_SIZE_PER_RUNTIME
  2. Check for bursty traffic patterns
  3. Verify health checks aren’t failing

Memory Issues

If containers are using too much memory:
  1. Reduce WARM_POOL_SIZE_PER_RUNTIME
  2. Lower WARM_POOL_CONTAINER_TTL
  3. Check for memory leaks in executed code