
System Architecture

Nadoo AI is a full-stack platform built as a monorepo with a Next.js frontend, FastAPI backend, LangGraph workflow engine, and supporting infrastructure services. The architecture is designed for async-first processing, horizontal scalability, and extensibility.

High-Level Architecture

Component Details

Next.js Frontend

The frontend is a Next.js application built with React and TypeScript. It provides the primary user interface for all platform interactions.
| Component | Technology | Purpose |
|---|---|---|
| Visual Workflow Editor | React + Canvas API | Drag-and-drop graph editor for building workflows |
| Chat Interface | React + SSE | Real-time conversational UI with streaming responses |
| Knowledge Base UI | React | Document upload, management, and search testing |
| Admin Dashboard | React | Workspace management, user roles, API keys, analytics |
| State Management | Zustand | Client-side state management |
| UI Framework | Tailwind CSS + Radix UI | Component library and styling |

FastAPI Backend

The backend is a FastAPI server running on Uvicorn with async request handling. It serves as the API gateway for all platform operations.
| Service | Responsibility |
|---|---|
| Auth Service | JWT-based authentication, RBAC authorization, API key management |
| Workflow Service | CRUD operations for workflows; triggers LangGraph execution |
| Knowledge Service | Document ingestion, chunking, embedding, and hybrid search |
| Channel Service | Manages messaging platform integrations and webhook routing |
| Chat Service | Handles conversation sessions, message history, and SSE streaming |
| Application Service | Application lifecycle management across Chat, Workflow, and Channel types |

LangGraph Engine

The workflow execution engine is built on LangGraph, a framework for building stateful, multi-step AI applications as graphs.
LangGraph provides the runtime for executing workflow graphs with support for cycles (loops), conditional branching, parallel execution, and persistent state management. Each workflow defined in the visual editor is compiled into a LangGraph graph at execution time.
Key capabilities:
  • Executes directed graphs with nodes and conditional edges
  • Maintains execution state across multi-step workflows
  • Supports 6 AI agent strategies: Standard, Chain of Thought (CoT), ReAct, Function Calling, Reflection, Tree of Thoughts
  • Handles streaming output via Server-Sent Events
  • Integrates with 12+ AI model providers through a unified interface
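
The graph-execution model can be illustrated with a minimal, self-contained sketch. This is plain Python standing in for LangGraph, not its actual API: nodes are functions over a shared state dict, and conditional edges pick the next node (or terminate).

```python
# Simplified stand-in for a stateful graph with conditional edges.
# Not the LangGraph API; node and edge names are illustrative.

def run_graph(nodes, edges, state, start="start", max_steps=100):
    """Execute nodes until an edge returns None (terminal)."""
    current = start
    for _ in range(max_steps):
        state = nodes[current](state)     # node updates shared state
        current = edges[current](state)   # conditional edge picks next node
        if current is None:
            return state
    raise RuntimeError("max steps exceeded (possible infinite loop)")

# A toy ReAct-style loop: "reason" until the counter reaches 3, then answer.
nodes = {
    "start": lambda s: {**s, "steps": 0},
    "reason": lambda s: {**s, "steps": s["steps"] + 1},
    "answer": lambda s: {**s, "done": True},
}
edges = {
    "start": lambda s: "reason",
    "reason": lambda s: "reason" if s["steps"] < 3 else "answer",
    "answer": lambda s: None,
}

result = run_graph(nodes, edges, {})  # → {"steps": 3, "done": True}
```

The cycle from "reason" back to itself is exactly the capability a plain DAG executor lacks and the reason LangGraph supports loops natively.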

Celery Workers

Background task processing is handled by Celery with Redis as the message broker. Common background tasks:
  • Document chunking and embedding generation for knowledge bases
  • Large file processing and format conversion
  • Scheduled workflow execution
  • Channel webhook processing
  • Analytics aggregation
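
The broker pattern these tasks rely on can be sketched with an in-memory queue standing in for Redis (real workers are separate Celery processes, not threads):

```python
# Producers enqueue task messages; workers consume them asynchronously.
# queue.Queue stands in for the Redis broker in this illustration.
import queue
import threading

broker = queue.Queue()
results = {}

def worker():
    while True:
        task_id, payload = broker.get()
        if task_id is None:        # sentinel: shut the worker down
            break
        # e.g. chunk a document and generate embeddings
        results[task_id] = f"processed:{payload}"
        broker.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

# The API request returns immediately after enqueueing,
# analogous to calling .delay() on a Celery task.
broker.put(("task-1", "document.pdf"))
broker.put((None, None))
t.join()
```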

Skill Worker

A dedicated service that executes Skills imported from Git repositories in an isolated environment. Execution flow:
  1. Skill is imported from a Git repository (clone/pull)
  2. Dependencies are installed in an isolated environment
  3. Skill execution requests are received via Redis queue
  4. Results are returned to the calling workflow

PostgreSQL + pgvector

PostgreSQL serves as the primary data store with the pgvector extension enabling vector similarity search. Stored data:
  • Workspaces, users, and RBAC permissions
  • Applications, workflows, and node configurations
  • Knowledge base documents and text chunks
  • Vector embeddings (via pgvector) for semantic search
  • Chat sessions and message history
  • Channel configurations and credentials
  • Celery task results
pgvector enables storing and querying high-dimensional vector embeddings directly in PostgreSQL, eliminating the need for a separate vector database. Combined with BM25 full-text search indexes, this powers the hybrid retrieval pipeline.
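
A hybrid retrieval query along these lines might look as follows. This is a sketch: the table and column names are hypothetical, `ts_rank` is Postgres's built-in full-text score (the platform's actual BM25 scoring may come from a different extension), and the 0.7/0.3 weights are arbitrary.

```python
# Hypothetical hybrid-search SQL combining pgvector cosine similarity
# (<=> is pgvector's cosine *distance* operator) with full-text ranking.
HYBRID_SEARCH_SQL = """
SELECT id, content,
       1 - (embedding <=> %(query_vec)s) AS vector_score,
       ts_rank(content_tsv, plainto_tsquery(%(query_text)s)) AS text_score
FROM chunks
WHERE workspace_id = %(workspace_id)s
ORDER BY 0.7 * (1 - (embedding <=> %(query_vec)s))
       + 0.3 * ts_rank(content_tsv, plainto_tsquery(%(query_text)s)) DESC
LIMIT 10;
"""
```

Note how the `workspace_id` filter, the vector score, and the text score all live in one SQL statement, which is the practical payoff of keeping embeddings inside PostgreSQL.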

Redis

Redis handles caching, session management, and message brokering. Uses:
  • Celery message broker and result backend
  • Workflow execution state caching
  • Session data and rate limiting
  • Real-time pub/sub for streaming responses
  • Skill Worker task queue

Data Flow

A typical request flows through the system as follows:
1. User Request

A user sends a message through the web chat interface, a messaging channel (e.g., Slack), or a direct API call. The request reaches the FastAPI backend.
2. Authentication and Routing

The Auth Service validates the JWT token or API key, checks RBAC permissions, and routes the request to the appropriate service (Workflow, Chat, or Channel).
3. Workflow Execution

The Workflow Service loads the application’s workflow graph and passes it to the LangGraph Engine for execution. The engine processes nodes sequentially, following edges and conditional branches.
4. Node Processing

Each node in the workflow executes its specific operation:
  • Search Knowledge queries PostgreSQL (pgvector + BM25) for relevant document chunks
  • AI Agent calls an AI provider (OpenAI, Anthropic, etc.) with the assembled prompt and context
  • Condition evaluates logic and determines the next branch
  • HTTP Request calls external APIs
  • Code Executor runs custom code in a sandboxed environment
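
Dispatching a node to its handler can be sketched as a lookup table keyed by node type. The handler names and return shapes here are illustrative, not the platform's actual implementations:

```python
# Illustrative node dispatch: each node type maps to a handler that
# takes (config, state) and returns an updated state dict.
def search_knowledge(config, state):
    return {**state, "chunks": ["chunk-a", "chunk-b"]}

def condition(config, state):
    branch = "true" if state.get("chunks") else "false"
    return {**state, "branch": branch}

NODE_HANDLERS = {
    "search_knowledge": search_knowledge,
    "condition": condition,
}

def execute_node(node, state):
    handler = NODE_HANDLERS[node["type"]]
    return handler(node.get("config", {}), state)

state = execute_node({"type": "search_knowledge"}, {})
state = execute_node({"type": "condition"}, state)
```
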
5. Response Streaming

As the AI Agent node generates output, tokens are streamed back to the client via Server-Sent Events (SSE). The complete response is persisted to the chat session history in PostgreSQL.
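
On the wire, each streamed token is a `data:` line followed by a blank line, per the SSE format. The JSON field name below is illustrative:

```python
# Server-Sent Events wire format for token streaming.
import json

def sse_event(token: str) -> str:
    # Each SSE event is "data: <payload>" terminated by a blank line.
    return f"data: {json.dumps({'token': token})}\n\n"

frames = [sse_event(t) for t in ["Hel", "lo"]]
```
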
6. Channel Delivery

If the request originated from a messaging channel, the Channel Service formats the response according to the platform’s requirements (Slack blocks, Discord embeds, etc.) and sends it back to the user.
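
Per-platform formatting amounts to translating one response into each platform's payload shape. The Slack structure below follows Block Kit's section/mrkdwn layout; the dispatch function itself is an illustrative sketch:

```python
# Illustrative per-channel response formatting.
def format_for_channel(platform: str, text: str) -> dict:
    if platform == "slack":
        # Slack Block Kit: a section block with mrkdwn text
        return {"blocks": [{"type": "section",
                            "text": {"type": "mrkdwn", "text": text}}]}
    if platform == "discord":
        # Discord embed with the response as its description
        return {"embeds": [{"description": text}]}
    return {"text": text}  # plain fallback for other channels

msg = format_for_channel("slack", "Hello from Nadoo AI")
```
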

Monorepo Structure

The codebase is organized as a monorepo with the following top-level structure:
| Directory | Description | Technology |
|---|---|---|
| packages/backend/ | FastAPI server with all backend services, API routes, models, and business logic | Python 3.11+, FastAPI, SQLAlchemy, Alembic |
| packages/frontend/ | Next.js web application with the visual editor, chat UI, and admin dashboard | Next.js, React, TypeScript, Tailwind CSS |
| packages/nadoo-plugin-sdk/ | Python SDK for building custom plugins that extend the platform | Python, Pydantic |
| packages/official-plugins/ | 6 official plugins included with the platform (Calculator, Web Search, etc.) | Python (Plugin SDK) |
| packages/skill-worker/ | Service for executing Skills imported from Git repositories | Python |
| infrastructure/ | Docker Compose configurations, Dockerfiles, and deployment scripts | Docker, Docker Compose |

Backend Package Structure

packages/backend/
├── src/
│   ├── api/              # FastAPI route handlers
│   ├── models/           # SQLAlchemy ORM models
│   ├── services/         # Business logic layer
│   ├── workflow/         # LangGraph workflow engine
│   │   ├── nodes/        # Node type implementations
│   │   └── strategies/   # AI agent strategy implementations
│   ├── knowledge/        # RAG pipeline (chunking, embedding, search)
│   ├── channels/         # Messaging platform integrations
│   └── core/             # Shared utilities, auth, config
├── migrations/           # Alembic database migrations
├── tests/                # Test suite
└── alembic.ini           # Alembic configuration

Frontend Package Structure

packages/frontend/
├── src/
│   ├── app/              # Next.js App Router pages
│   ├── components/       # React components
│   │   ├── workflow/     # Visual workflow editor components
│   │   ├── chat/         # Chat interface components
│   │   └── knowledge/    # Knowledge base UI components
│   ├── stores/           # Zustand state stores
│   ├── hooks/            # Custom React hooks
│   ├── lib/              # Utility functions and API client
│   └── types/            # TypeScript type definitions
├── public/               # Static assets
└── next.config.js        # Next.js configuration

Key Architectural Decisions

Async-First Backend

The backend is built entirely on async Python using FastAPI and Uvicorn. All I/O operations — database queries, AI provider calls, external API requests — use async/await for non-blocking execution. This enables high throughput with minimal resource consumption.
@router.post("/chat/messages")
async def create_message(request: MessageRequest):
    # Non-blocking database query
    session = await chat_service.get_session(request.session_id)

    # Wrap the non-blocking AI provider call so FastAPI streams it
    async def stream():
        async for chunk in workflow_engine.execute_stream(session, request.message):
            yield chunk

    return StreamingResponse(stream(), media_type="text/event-stream")

Workflow Execution via LangGraph

Workflows are executed using LangGraph rather than a custom execution engine. LangGraph provides battle-tested support for:
  • Cycles and loops — Essential for iterative AI reasoning (e.g., ReAct loops)
  • Conditional branching — Route execution based on AI output or custom conditions
  • State management — Maintain and update state across workflow steps
  • Streaming — Stream intermediate results during execution
  • Checkpointing — Resume workflows from saved state
This decision avoids maintaining a bespoke execution engine while leveraging a well-maintained open-source foundation.

Pluggable Vector Storage

Vector storage uses a pluggable factory pattern (VectorStoreFactory) supporting multiple backends:
  • pgvector (default) — PostgreSQL extension, no additional infrastructure
  • Milvus — High-performance distributed vector database (planned)
  • Qdrant — Cloud-native vector search engine (planned)
Why pgvector as default:
  • Operational simplicity — One database to manage instead of two
  • Transactional consistency — Vector data participates in the same transactions as relational data
  • Hybrid search — Combine vector similarity with SQL filtering and BM25 full-text search in a single query
  • Distance metrics — Cosine, Euclidean, Dot Product supported
  • Sufficient scale — HNSW and IVFFlat indexes handle millions of vectors
Embedding providers (9+): OpenAI, HuggingFace, Local, Azure OpenAI, AWS Bedrock, Google AI Studio, Google Vertex AI, vLLM, Ollama
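
For intuition, the cosine metric behind pgvector's `<=>` operator can be computed directly (`<=>` returns cosine *distance*, i.e. 1 − similarity):

```python
# Cosine similarity between two embedding vectors.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

same = cosine_similarity([1.0, 0.0], [1.0, 0.0])        # identical direction
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # unrelated direction
```
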

Background Processing with Celery

Long-running operations (document processing, embedding generation, scheduled tasks) are offloaded to Celery workers with Redis as the broker. Why Celery:
  • Mature and battle-tested task queue for Python
  • Supports task retries, rate limiting, and priority queues
  • Redis broker is already part of the infrastructure
  • Flower dashboard for monitoring worker health and task status
  • Scales horizontally by adding more workers

Monorepo with Git Submodules

The project uses a monorepo structure with Git submodules for the Plugin SDK and official plugins. Benefits:
  • Unified development — Frontend, backend, and plugins in one repository
  • Atomic changes — Cross-package changes in a single commit
  • Shared tooling — Common linting, testing, and CI/CD configuration
  • Independent versioning — Plugin SDK and plugins can be versioned separately via submodules

Workspace-Based Multi-Tenancy

Multi-tenancy is implemented at the application level using workspace-scoped database queries rather than separate databases per tenant. How it works:
  • Every database table includes a workspace_id foreign key
  • All queries are automatically filtered by the authenticated user’s workspace
  • RBAC rules are evaluated per-workspace
  • API keys and credentials are encrypted and scoped to workspaces
  • This approach balances isolation with operational simplicity
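
The scoping rule above can be sketched as a filter applied to every query. The rows and field names are illustrative stand-ins for the actual SQLAlchemy models:

```python
# Illustrative workspace scoping: every lookup is filtered by the
# authenticated workspace_id (a WHERE workspace_id = :ws clause in SQL).
WORKFLOWS = [
    {"id": 1, "workspace_id": "ws-a", "name": "support-bot"},
    {"id": 2, "workspace_id": "ws-b", "name": "sales-bot"},
]

def list_workflows(workspace_id: str):
    return [w for w in WORKFLOWS if w["workspace_id"] == workspace_id]

visible = list_workflows("ws-a")  # ws-a never sees ws-b's workflows
```
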

Technology Stack Summary

| Technology | Version | Purpose |
|---|---|---|
| Python | 3.11+ | Primary backend language |
| FastAPI | 0.115+ | Web framework and API server |
| Uvicorn | Latest | ASGI server |
| SQLAlchemy | 2.0 | ORM and database toolkit |
| Alembic | Latest | Database migrations |
| Pydantic | v2 | Data validation and serialization |
| Celery | 5.4 | Background task queue |
| LangGraph | Latest | Workflow execution engine |
| python-jose | Latest | JWT authentication |

Deployment Architecture

The platform supports multiple deployment models:

Local Development

Run everything locally with npm run start. Docker Compose manages PostgreSQL and Redis, while the backend and frontend run as development servers with hot reload.

Docker Compose (Production)

Deploy all services as Docker containers with production configurations, persistent volumes, and environment-based settings.

Kubernetes

Deploy to Kubernetes with Helm charts for horizontal scaling, rolling updates, and health monitoring.

Cloud Platforms

Deploy to AWS, GCP, or Azure using container services (ECS, Cloud Run, AKS) with managed database instances.

Next Steps