Overview
The Skill Worker is a standalone service that executes skills in isolated environments. It runs separately from the main backend, ensuring that skill code cannot interfere with the platform or other tenants. The worker listens for tasks on a Redis queue, executes them using a configurable executor, and returns results.

In development, the Skill Worker runs as part of the backend process. For production, deploy it as a separate container with its own scaling policy.
Architecture
The Skill Worker sits between the backend API and the actual skill execution environment. When a workflow invokes a skill, the backend’s SkillWorkerService dispatches a task to the worker via Redis. The worker picks up the task, runs it in a sandbox, and posts the result back.
Key Components
| Component | Responsibility |
|---|---|
| SkillWorkerService | Backend service that validates permissions, resolves skill paths, and dispatches execution tasks |
| Redis Queue | Task queue (nadoo:skill:tasks) decoupling the backend from worker processes |
| Executor Factory | Selects the appropriate executor based on configuration and system capabilities |
| SubprocessExecutor | Default executor using OS-level process isolation with resource limits |
| GVisorExecutor | Production-grade executor with kernel-level sandboxing via gVisor’s runsc |
Execution Flow
When an AI agent invokes a skill, the following sequence occurs:

1. Validation: The SkillWorkerService verifies the skill is APPROVED, validates that all declared permissions are allowed by the execution context, and resolves the skill’s filesystem path.
2. Concurrency Control: A per-workspace semaphore limits concurrent executions to 10 by default. If the workspace is at capacity, the request waits up to 30 seconds before returning a RESOURCE_LIMIT error.
3. Task Dispatch: The task (skill path, entry point, parameters, permissions) is pushed to the Redis queue nadoo:skill:tasks.
4. Executor Selection: The worker pops the task and routes it to the configured executor: SubprocessExecutor by default, or GVisorExecutor when gVisor is enabled and the runsc binary is available.
5. Isolated Execution: The executor runs the skill’s entry point (default: skill.py) in a sandboxed environment with enforced resource limits (memory, CPU, timeout). Parameters are passed via stdin as JSON.

Executor Types
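The dispatch step above can be sketched as the JSON payload a dispatcher would push to nadoo:skill:tasks (field names here are illustrative assumptions, not the canonical task schema):

```python
import json

QUEUE = "nadoo:skill:tasks"

def build_task(skill_path, entry_point, parameters, permissions):
    """Serialize an execution task for the Redis queue.

    Field names are assumed for illustration only.
    """
    return json.dumps({
        "skill_path": skill_path,
        "entry_point": entry_point,
        "parameters": parameters,
        "permissions": permissions,
    })

# A dispatcher would LPUSH the payload; a worker would BRPOP it:
#   redis_client.lpush(QUEUE, build_task(...))
task = build_task("/skills/hello", "skill.py", {"name": "Ada"}, ["network"])
```

The queue name decouples producers from consumers: any number of workers can pop from the same list without coordinating with the backend.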
The worker supports two executor implementations with different isolation guarantees.

Subprocess Executor (Default)
Uses Python’s subprocess module to run skills in a separate OS process.

Isolation features:
- Separate process with its own memory space
- Configurable memory limit (RLIMIT_AS)
- CPU time limit enforcement
- Execution timeout with automatic kill
- Restricted filesystem access via allowed_paths
- Network access controlled by permission flag
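The core of this approach can be sketched with the standard library alone. This is a minimal, Unix-only sketch (RLIMIT_AS and preexec_fn are not available on Windows), not the actual executor implementation:

```python
import json
import resource
import subprocess
import sys

def run_skill(skill_path: str, params: dict,
              max_memory_mb: int = 512, timeout_s: int = 300):
    """Run a skill script in a child process with an address-space cap
    (RLIMIT_AS) and a hard timeout. Parameters are passed via stdin as
    JSON, as described above; output is read from stdout."""
    def limit_memory():
        cap = max_memory_mb * 1024 * 1024
        # Applied in the child after fork, before exec.
        resource.setrlimit(resource.RLIMIT_AS, (cap, cap))

    proc = subprocess.run(
        [sys.executable, skill_path],
        input=json.dumps(params).encode(),
        capture_output=True,
        timeout=timeout_s,          # raises TimeoutExpired -> TIMEOUT status
        preexec_fn=limit_memory,
    )
    return proc.returncode, proc.stdout.decode()
```

A real executor would additionally chroot or bind-mount allowed_paths and apply CPU limits (RLIMIT_CPU); this sketch shows only the memory and timeout mechanics.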
gVisor Executor (Production)
Uses Google’s gVisor runtime (runsc) to execute skills inside a user-space kernel sandbox.

Isolation features:
- Syscall filtering — only allows a safe subset of system calls
- Network namespace isolation — skills cannot reach the host network unless explicitly allowed
- Filesystem containment — skills see only their own directory
- Memory and CPU cgroup enforcement
- Kernel-level separation from the host OS
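Whether this executor is used follows the selection rule described earlier: gVisor only when it is both enabled via configuration and the runsc binary is actually present. A minimal sketch of that check (function name is an assumption, not the factory's real API):

```python
import shutil

def select_executor(gvisor_enabled: bool) -> str:
    """Mirror the documented selection rule: use gVisor only when it is
    enabled via config AND the runsc binary is found on PATH; otherwise
    fall back to the subprocess executor."""
    if gvisor_enabled and shutil.which("runsc"):
        return "gvisor"
    return "subprocess"
```

Falling back silently to subprocess keeps workers usable on hosts without gVisor, at the cost of weaker isolation.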
The runsc binary must be installed and accessible on the worker host.

Best for: Production deployments handling untrusted or community-contributed skills.

ExecutionResult
Every skill execution produces an ExecutionResult with the following fields:
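The original field listing is missing from this page; a plausible shape, with field names assumed purely for illustration (only the five status values below come from the documentation):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional

class ExecutionStatus(Enum):
    # These five outcomes are documented; the string values are assumed.
    SUCCESS = "success"
    ERROR = "error"
    TIMEOUT = "timeout"
    PERMISSION_DENIED = "permission_denied"
    RESOURCE_LIMIT = "resource_limit"

@dataclass
class ExecutionResult:
    # Field names are illustrative assumptions, not the canonical schema.
    status: ExecutionStatus
    output: Any = None
    error: Optional[str] = None
    execution_time_ms: float = 0.0
```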
The ExecutionStatus enum covers five possible outcomes:
| Status | Description |
|---|---|
| SUCCESS | Skill completed normally, output is available |
| ERROR | Skill raised an exception or returned a non-zero exit code |
| TIMEOUT | Execution exceeded the configured timeout |
| PERMISSION_DENIED | Required permissions were not allowed by the execution context |
| RESOURCE_LIMIT | Memory, CPU, or concurrency limits were exceeded |
Permission Enforcement
The worker enforces a two-level permission model before executing any skill.

- Skill-level: Each skill declares its required permissions in the SKILL.md manifest (e.g., network, file_read, shell).
- Context-level: The ExecutionContext may specify allowed_permissions to restrict what a particular invocation can do. If set to None, all declared permissions are allowed.
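The two-level check reduces to simple set logic. A minimal sketch (the function name is an assumption; the semantics follow the rules above):

```python
from typing import Optional, Set

def check_permissions(declared: Set[str],
                      allowed: Optional[Set[str]]) -> Set[str]:
    """Return the declared permissions NOT covered by the context.

    An empty result means execution may proceed; allowed=None means the
    context imposes no restriction, so every declared permission passes.
    A non-empty result maps to the PERMISSION_DENIED status.
    """
    if allowed is None:
        return set()
    return declared - allowed
```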
Configuration Reference
All configuration is managed through environment variables with the NADOO_ prefix.
| Variable | Default | Description |
|---|---|---|
| NADOO_WORKER_ID | worker-1 | Unique identifier for this worker instance |
| NADOO_SKILL_WORKER_TIMEOUT | 300 | Max execution time per skill (seconds) |
| NADOO_SKILL_WORKER_MAX_MEMORY_MB | 512 | Max memory per execution (MB) |
| NADOO_SKILL_WORKER_MAX_CPU_PERCENT | 100 | Max CPU usage percentage |
| NADOO_SKILL_WORKER_NETWORK_ENABLED | false | Allow outbound network (requires network permission) |
| NADOO_SKILL_WORKER_GVISOR_ENABLED | false | Use gVisor sandbox instead of subprocess |
| NADOO_REDIS_HOST | localhost | Redis server hostname |
| NADOO_REDIS_PORT | 6379 | Redis server port |
| NADOO_REDIS_PASSWORD | — | Redis password (optional) |
| NADOO_REDIS_DB | 0 | Redis database number |
| NADOO_BACKEND_URL | http://localhost:8000 | Backend API URL for skill registry |
| NADOO_BACKEND_API_KEY | — | API key for backend authentication |
| NADOO_SKILL_CACHE_DIR | System temp dir | Cache directory for cloned Git repos |
| NADOO_SKILL_WORKER_TEMP_DIR | System temp dir | Temp directory for execution artifacts |
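Reading these variables with their documented defaults might look like the following sketch (a subset of the table; the worker_config name and dict shape are assumptions):

```python
import os

def worker_config(env=os.environ) -> dict:
    """Load a subset of the NADOO_-prefixed settings, applying the
    documented defaults when a variable is unset."""
    return {
        "worker_id": env.get("NADOO_WORKER_ID", "worker-1"),
        "timeout": int(env.get("NADOO_SKILL_WORKER_TIMEOUT", "300")),
        "max_memory_mb": int(env.get("NADOO_SKILL_WORKER_MAX_MEMORY_MB", "512")),
        "gvisor_enabled": env.get(
            "NADOO_SKILL_WORKER_GVISOR_ENABLED", "false").lower() == "true",
        "redis_host": env.get("NADOO_REDIS_HOST", "localhost"),
        "redis_port": int(env.get("NADOO_REDIS_PORT", "6379")),
    }
```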
Deployment
Docker
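The original snippet for this section is missing from this page. A minimal run command, assuming a published image name (nadoo/skill-worker is hypothetical):

```shell
# Image name and tag are assumptions; substitute your registry's image.
docker run -d \
  --name skill-worker \
  -e NADOO_REDIS_HOST=redis.internal \
  -e NADOO_BACKEND_URL=http://backend:8000 \
  -e NADOO_SKILL_WORKER_GVISOR_ENABLED=false \
  nadoo/skill-worker:latest
```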
Docker Compose
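The original compose file is missing from this page. A minimal sketch wiring the worker to Redis (service and image names are assumptions):

```yaml
# Hypothetical compose file: image names are assumptions.
services:
  redis:
    image: redis:7
  skill-worker:
    image: nadoo/skill-worker:latest
    environment:
      NADOO_REDIS_HOST: redis
      NADOO_BACKEND_URL: http://backend:8000
    depends_on:
      - redis
```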
Scaling
The worker is stateless and horizontally scalable. Each instance pulls tasks from the same Redis queue, so adding more workers increases throughput linearly.

Scaling on Kubernetes
Deploy workers as a Kubernetes Deployment. Use Horizontal Pod Autoscaler (HPA) to scale based on Redis queue depth or CPU utilization.

For gVisor isolation, configure the pods with the gvisor RuntimeClass:
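The original manifest is missing from this page; the standard RuntimeClass definition for gVisor looks like this (the worker pod spec reference is a sketch):

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc
---
# Then reference it in the worker pod spec:
# spec:
#   runtimeClassName: gvisor
```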
Scaling on AWS Fargate
Run workers as Fargate tasks behind an ECS Service with auto-scaling policies. Fargate provides built-in task isolation via Firecracker microVMs, adding another layer of security even in subprocess mode.
Concurrency Limits
The backend enforces a per-workspace concurrency limit (default: 10 concurrent executions). This prevents any single workspace from monopolizing worker capacity. The limit is configurable via SkillWorkerService.MAX_CONCURRENT_PER_WORKSPACE.

If a workspace hits its limit, additional requests wait up to 30 seconds before returning a RESOURCE_LIMIT error.

Monitoring
The worker exposes status information through worker.get_status():
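The sample payload is missing from this page; a plausible shape, with every key an assumption for illustration:

```python
# Illustrative shape of worker.get_status() output; key names are
# assumptions, not the real API's schema.
status = {
    "worker_id": "worker-1",
    "executor": "subprocess",
    "active_executions": 2,
    "queue": "nadoo:skill:tasks",
    "healthy": True,
}
```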
Executions are also recorded as SKILL_EXECUTED and SKILL_EXECUTION_FAILED events, including execution time, error details, and the triggering user/workflow context.