
Overview

Nadoo AI uses Server-Sent Events (SSE) as the primary protocol for streaming real-time data from the server to the client. As an AI agent generates tokens, calls tools, retrieves documents, and reasons through steps, each action is emitted as a typed event that clients render incrementally. This provides a transparent, responsive experience where users see the agent’s work as it happens. SSE is a standard HTTP-based protocol in which the client opens a single long-lived connection and the server pushes a sequence of typed events. It is simpler than WebSocket, reconnects automatically when used via the browser’s EventSource API, and works reliably through HTTP proxies and load balancers.

Connecting to the SSE Stream

Chat Streaming

POST /api/v1/chat/completions
Content-Type: application/json
Authorization: Bearer {token}

{
  "session_id": "session-uuid",
  "message": "Summarize the Q4 report",
  "stream": true
}

Workflow Execution Streaming

POST /api/v1/workflows/{workflow_id}/execute/stream
Content-Type: application/json
Authorization: Bearer {token}

{
  "inputs": {
    "query": "Analyze sales trends"
  }
}

Both endpoints return a text/event-stream response. Each event follows the SSE format:
event: <event_type>
data: <json_payload>

Set "stream": true in the chat completions request body to receive an SSE stream. Without this flag, the endpoint returns a standard JSON response after the full completion is generated.
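To make the wire format concrete, here is a minimal sketch that parses one SSE frame into its event type and decoded JSON payload. parseSseFrame is an illustrative helper, not part of the Nadoo AI SDK:

```javascript
// Parse a single SSE frame of the form:
//   event: <event_type>
//   data: <json_payload>
// Returns { event, data }, or null if the frame is incomplete.
function parseSseFrame(frame) {
  let event = null;
  let data = null;
  for (const line of frame.split('\n')) {
    if (line.startsWith('event: ')) {
      event = line.slice('event: '.length);
    } else if (line.startsWith('data: ')) {
      data = JSON.parse(line.slice('data: '.length));
    }
  }
  return event && data !== null ? { event, data } : null;
}
```

The full client loop in Client Implementation below applies the same idea while buffering partial frames across network chunks.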

Event Types Reference

Nadoo AI defines 19 SSE event types organized into four categories. Each event carries a JSON payload with context-specific data.

Workflow Events

Emitted during workflow execution to mark the lifecycle of the entire workflow run. These events are only present when the application is a Workflow App.

workflow_start

Emitted when a workflow execution begins.
{
  "workflow_id": "wf-uuid-123",
  "execution_id": "exec-uuid-456"
}
Field          Type     Description
workflow_id    string   The workflow being executed
execution_id   string   Unique ID for this execution run

workflow_end

Emitted when a workflow execution completes successfully.
{
  "execution_id": "exec-uuid-456",
  "duration_ms": 3421
}
Field          Type      Description
execution_id   string    The execution run ID
duration_ms    integer   Total execution time in milliseconds

workflow_error

Emitted when a workflow execution fails with an error.
{
  "execution_id": "exec-uuid-456",
  "error": "Node 'ai-agent-1' failed: Context window exceeded"
}
Field          Type     Description
execution_id   string   The execution run ID
error          string   Human-readable error description

Node Events

Emitted as individual nodes within a workflow execute. Each node (AI Agent, Condition, Search Knowledge, etc.) produces start, end, and optionally error events.

node_start

Emitted when a node begins execution.
{
  "node_id": "ai-agent-1",
  "node_type": "ai_agent",
  "node_name": "Classify Intent"
}
Field       Type     Description
node_id     string   Unique node identifier within the workflow
node_type   string   The node type (e.g., ai_agent, condition, search_knowledge)
node_name   string   Human-readable node label

node_end

Emitted when a node finishes execution successfully.
{
  "node_id": "ai-agent-1",
  "output": {
    "response": "The user is asking about sales data.",
    "confidence": 0.95
  },
  "duration_ms": 1230
}
Field         Type      Description
node_id       string    The node that completed
output        object    The node’s output data
duration_ms   integer   Node execution time in milliseconds

node_error

Emitted when a node encounters an error during execution.
{
  "node_id": "search-kb-1",
  "error": "Knowledge base 'kb-123' not found"
}
Field     Type     Description
node_id   string   The node that failed
error     string   Error description

Agent Events

Emitted by the AI Agent node to provide visibility into the agent’s reasoning process, tool usage, and self-reflection behavior. These events are essential for building transparent AI interfaces.

agent_iteration

Emitted at the start of each reasoning iteration in ReAct, Reflection, or Tree of Thoughts modes. Tracks how many iterations the agent has performed.
{
  "iteration": 1,
  "max_iterations": 5
}
Field            Type      Description
iteration        integer   Current iteration number (1-based)
max_iterations   integer   Maximum allowed iterations

agent_tool_call

Emitted when the agent decides to invoke a tool.
{
  "tool_name": "web_search",
  "arguments": {
    "query": "Q4 2024 revenue figures"
  }
}
Field       Type     Description
tool_name   string   Name of the tool being called
arguments   object   Arguments passed to the tool

agent_tool_result

Emitted when a tool returns its result to the agent.
{
  "tool_name": "web_search",
  "result": "Q4 revenue was $12.3M, up 15% year-over-year..."
}
Field       Type     Description
tool_name   string   Name of the tool that completed
result      string   The tool’s output

agent_reflection

Emitted in Reflection mode when the agent critiques and refines its own output.
{
  "critique": "The summary lacks specific revenue numbers.",
  "refinement": "I should include the exact figures from the knowledge base."
}
Field        Type     Description
critique     string   The agent’s self-critique
refinement   string   The planned improvement

agent_thinking

Emitted when the agent enters a reasoning phase, providing visibility into its thought process.
{
  "content": "Let me analyze the quarterly data step by step. First, I need to compare Q3 and Q4 figures."
}
Field     Type     Description
content   string   The agent’s reasoning text

cot_step

Emitted in Chain of Thought mode for each discrete reasoning step.
{
  "step": 1,
  "thought": "First, I need to identify the key metrics from the Q4 report."
}
Field     Type      Description
step      integer   Step number (1-based)
thought   string    The reasoning content for this step

System Events

Emitted to provide metadata, resource usage information, and system-level error reporting throughout the streaming session.

token_usage

Reports token consumption for the current LLM call or response.
{
  "prompt_tokens": 850,
  "completion_tokens": 62,
  "total_tokens": 912
}
Field               Type      Description
prompt_tokens       integer   Tokens in the input prompt
completion_tokens   integer   Tokens in the generated response
total_tokens        integer   Sum of prompt and completion tokens

cost_update

Reports the estimated monetary cost for the current response based on the model’s pricing.
{
  "cost_usd": 0.0048,
  "model": "gpt-4o"
}
Field      Type     Description
cost_usd   float    Estimated cost in USD
model      string   The model used for this call
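A client that wants running session totals can fold successive token_usage and cost_update payloads into an accumulator. A minimal sketch, with a totals shape of our own choosing:

```javascript
// Fold token_usage and cost_update events into running session totals.
// The totals shape here is illustrative; the event payloads match the
// reference above.
function updateTotals(totals, eventType, data) {
  if (eventType === 'token_usage') {
    return { ...totals, tokens: totals.tokens + data.total_tokens };
  }
  if (eventType === 'cost_update') {
    return { ...totals, costUsd: totals.costUsd + data.cost_usd };
  }
  return totals; // other event types leave the totals unchanged
}

const initialTotals = { tokens: 0, costUsd: 0 };
```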

context_trimmed

Emitted when the conversation context has been trimmed to fit within the model’s context window.
{
  "original_tokens": 12000,
  "trimmed_tokens": 8000
}
Field             Type      Description
original_tokens   integer   Token count before trimming
trimmed_tokens    integer   Token count after trimming
Context trimming uses the strategy configured on the application: truncate (drop oldest messages), summarize (condense history), or error (fail if context exceeds limit). The default strategy is truncate with a 4096-token limit.
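The truncate strategy can be sketched as dropping the oldest messages until the history fits the limit. countTokens is a stand-in here; real counts come from the model’s tokenizer:

```javascript
// Drop oldest messages until the estimated token count fits maxTokens.
// countTokens is a placeholder for a real tokenizer-based count.
function truncateHistory(messages, maxTokens, countTokens) {
  const kept = [...messages];
  let total = kept.reduce((sum, m) => sum + countTokens(m), 0);
  // Always keep at least the most recent message.
  while (kept.length > 1 && total > maxTokens) {
    total -= countTokens(kept.shift()); // remove the oldest message
  }
  return kept;
}
```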

llm_call_start

Emitted when an LLM API call is initiated.
{
  "model": "gpt-4o",
  "provider": "openai"
}
Field      Type     Description
model      string   The model being called
provider   string   The AI provider (openai, anthropic, google, etc.)

llm_call_end

Emitted when an LLM API call completes.
{
  "model": "gpt-4o",
  "latency_ms": 890
}
Field        Type      Description
model        string    The model that was called
latency_ms   integer   Round-trip latency in milliseconds

memory_update

Emitted when the conversation memory is updated (e.g., messages trimmed from the buffer, summary regenerated).
{
  "type": "buffer",
  "messages_retained": 20
}
Field               Type      Description
type                string    Memory type: buffer, summary, or knowledge_graph
messages_retained   integer   Number of messages kept in memory

error

Emitted when a system-level error occurs during streaming.
{
  "code": "rate_limit",
  "message": "OpenAI rate limit exceeded. Retrying in 5 seconds."
}
Field     Type     Description
code      string   Machine-readable error code
message   string   Human-readable error description
See the Error Handling page for the full list of error codes.

Event Summary Table

#    Event               Category   Description
1    workflow_start      Workflow   Workflow execution has begun
2    workflow_end        Workflow   Workflow completed successfully
3    workflow_error      Workflow   Workflow failed with an error
4    node_start          Node       A node has started executing
5    node_end            Node       A node has finished executing
6    node_error          Node       A node encountered an error
7    agent_iteration     Agent      An agent reasoning iteration occurred
8    agent_tool_call     Agent      The agent is invoking a tool
9    agent_tool_result   Agent      A tool returned its result
10   agent_reflection    Agent      The agent is reflecting on its output
11   agent_thinking      Agent      The agent is in a reasoning phase
12   cot_step            Agent      A chain-of-thought reasoning step
13   token_usage         System     Token consumption for the response
14   cost_update         System     Estimated cost for the response
15   context_trimmed     System     Context was trimmed to fit the model window
16   llm_call_start      System     An LLM API call has been initiated
17   llm_call_end        System     An LLM API call has completed
18   memory_update       System     Conversation memory was updated
19   error               System     A system-level error occurred

Example SSE Stream

Below is a complete SSE stream from a Chat App with tool use and knowledge base retrieval:
event: llm_call_start
data: {"model": "gpt-4o", "provider": "openai"}

event: agent_tool_call
data: {"tool_name": "search_knowledge", "arguments": {"query": "Q4 revenue figures"}}

event: agent_tool_result
data: {"tool_name": "search_knowledge", "result": "Q4 revenue was $12.3M, up 15% YoY..."}

event: agent_thinking
data: {"content": "The knowledge base contains the Q4 figures. Let me summarize the key points."}

event: text_chunk
data: {"content": "Based on the Q4 report, "}

event: text_chunk
data: {"content": "revenue reached $12.3 million, "}

event: text_chunk
data: {"content": "representing a 15% year-over-year increase."}

event: llm_call_end
data: {"model": "gpt-4o", "latency_ms": 1240}

event: token_usage
data: {"prompt_tokens": 850, "completion_tokens": 62, "total_tokens": 912}

event: cost_update
data: {"cost_usd": 0.0048, "model": "gpt-4o"}

event: done
data: {}

Client Implementation

async function streamChat(sessionId, message, token) {
  const response = await fetch('/api/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${token}`,
    },
    body: JSON.stringify({
      session_id: sessionId,
      message: message,
      stream: true,
    }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // Keep incomplete line in buffer

    let eventType = '';
    for (const line of lines) {
      if (line.startsWith('event: ')) {
        eventType = line.slice(7);
      } else if (line.startsWith('data: ') && eventType) {
        const data = JSON.parse(line.slice(6));
        handleEvent(eventType, data);
        eventType = '';
      }
    }
  }
}

function handleEvent(eventType, data) {
  switch (eventType) {
    case 'text_chunk':
      appendToResponse(data.content);
      break;

    case 'agent_tool_call':
      showToolCall(data.tool_name, data.arguments);
      break;

    case 'agent_tool_result':
      showToolResult(data.tool_name, data.result);
      break;

    case 'agent_thinking':
      showThinkingStep(data.content);
      break;

    case 'cot_step':
      showReasoningStep(data.step, data.thought);
      break;

    case 'agent_reflection':
      showReflection(data.critique, data.refinement);
      break;

    case 'node_start':
      updateNodeStatus(data.node_id, 'running');
      break;

    case 'node_end':
      updateNodeStatus(data.node_id, 'completed');
      break;

    case 'token_usage':
      updateTokenCounter(data);
      break;

    case 'cost_update':
      updateCostDisplay(data.cost_usd);
      break;

    case 'error':
      showError(data.code, data.message);
      break;

    case 'done':
      finalizeResponse();
      break;
  }
}

Events by Agent Mode

Different agent modes emit different combinations of events. Use this table to understand which events to expect based on your application’s configuration.
Event                Standard   Chain of Thought   ReAct   Function Calling   Reflection   Tree of Thoughts
agent_thinking       -          Yes                Yes     -                  Yes          Yes
cot_step             -          Yes                -       -                  -            -
agent_iteration      -          -                  Yes     -                  Yes          Yes
agent_tool_call      -          -                  Yes     Yes                -            -
agent_tool_result    -          -                  Yes     Yes                -            -
agent_reflection     -          -                  -       -                  Yes          -
llm_call_start/end   Yes        Yes                Yes     Yes                Yes          Yes
token_usage          Yes        Yes                Yes     Yes                Yes          Yes
cost_update          Yes        Yes                Yes     Yes                Yes          Yes

Error Handling

Reconnection Strategy

SSE connections may drop due to network issues. The browser’s native EventSource API automatically reconnects, but when using fetch for POST-based SSE streams, implement reconnection manually:
1. Detect connection loss: monitor the fetch stream for read errors or unexpected stream termination.
2. Wait with exponential backoff: start at 1 second and double the wait time on each retry, up to a maximum of 30 seconds.
3. Reconnect: re-establish the SSE connection. If your server supports Last-Event-ID, include it in the request header to resume from the last received event.
4. Inform the user: display a reconnection indicator in the UI so the user knows the stream is temporarily interrupted.
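The steps above can be sketched as a wrapper around the connection routine. connect and onRetry are hypothetical callbacks supplied by the caller, not SDK functions:

```javascript
// Reconnect with exponential backoff: 1 s, 2 s, 4 s, ... capped at 30 s.
// connect(lastEventId, onEventId) opens the stream, calls onEventId for
// each event id received, resolves on clean end, and rejects on failure.
// onRetry lets the UI show a reconnection indicator between attempts.
async function streamWithReconnect(connect, onRetry,
    { baseDelayMs = 1000, maxDelayMs = 30000 } = {}) {
  let delayMs = baseDelayMs;
  let lastEventId = null;
  while (true) {
    try {
      // If the server supports Last-Event-ID, passing lastEventId lets it
      // resume from the last event the client actually received.
      await connect(lastEventId, (id) => { lastEventId = id; });
      return; // stream finished normally
    } catch (err) {
      onRetry(err, delayMs);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      delayMs = Math.min(delayMs * 2, maxDelayMs); // double, capped at max
    }
  }
}
```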

Common Error Codes

Code               Description                                Recommended Action
rate_limit         AI provider rate limit exceeded            Wait and retry (server may auto-retry)
context_overflow   Input exceeded model’s context window      Reduce message history or input size
tool_error         A tool invocation failed                   Check tool configuration and inputs
timeout            Request or LLM call timed out              Retry with a simpler prompt or lower max_tokens
provider_error     AI provider returned an unexpected error   Check provider status; retry after a delay
auth_error         Authentication failure mid-stream          Refresh token and reconnect
When using SSE for workflow execution, a workflow_error event indicates that the entire workflow failed. Check the node_error events emitted before it to identify which specific node caused the failure.
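One way for a client to act on these codes is to group them by the recommended actions above; the action names here are illustrative:

```javascript
// Map machine-readable codes from `error` events to a client action.
// The code set matches the table above; the action names are our own.
function actionForError(code) {
  switch (code) {
    case 'rate_limit':
    case 'timeout':
    case 'provider_error':
      return 'retry';         // transient: back off and retry
    case 'context_overflow':
      return 'reduce_input';  // shrink message history or input size
    case 'auth_error':
      return 'refresh_token'; // refresh credentials, then reconnect
    case 'tool_error':
    default:
      return 'surface';       // show to the user; check configuration
  }
}
```

A stream handler can then branch on the returned action rather than on raw codes scattered across the codebase.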

Frontend Integration

Nadoo AI provides built-in React components for SSE integration.

useAgentSSE Hook

The useAgentSSE hook manages the SSE connection lifecycle, parses events, and exposes reactive state:
import { useAgentSSE } from '@/hooks/useAgentSSE';

function ChatPanel({ sessionId }) {
  const {
    sendMessage,
    response,            // Accumulated response text
    isStreaming,         // Whether a response is in progress
    toolCalls,           // List of tool calls with results
    tokenUsage,          // { prompt_tokens, completion_tokens, total_tokens }
    suggestedQuestions,  // Follow-up suggestions
    error,               // Current error state
  } = useAgentSSE(sessionId);

  return (
    <div>
      <MessageList messages={response} />
      {isStreaming && <StreamingIndicator />}
      {toolCalls.map(tc => <ToolCallCard key={tc.id} {...tc} />)}
      {suggestedQuestions && (
        <SuggestionChips questions={suggestedQuestions} />
      )}
      <MessageInput onSend={sendMessage} disabled={isStreaming} />
    </div>
  );
}

AgentExecutionMonitor Component

For Workflow Apps, the AgentExecutionMonitor renders a visual timeline of workflow execution:
import { AgentExecutionMonitor } from '@/components/AgentExecutionMonitor';

function WorkflowChat({ sessionId }) {
  return (
    <div>
      <ChatPanel sessionId={sessionId} />
      <AgentExecutionMonitor
        sessionId={sessionId}
        showNodeDetails={true}
        showTokenUsage={true}
        collapsible={true}
      />
    </div>
  );
}
The monitor displays:
  • Node execution timeline with status indicators (running, completed, failed)
  • Expandable node details showing input, output, and duration
  • Tool call cards with arguments and results
  • Thinking/reasoning steps from CoT and ReAct strategies
  • Token usage and cost per node and overall

Next Steps