Overview

The Chat Application is the simplest application type in Nadoo AI. It wraps a single AI model with optional knowledge bases and tools into a conversational agent that responds to user messages in real time. Chat Apps are ideal for customer support bots, internal Q&A assistants, and any use case where a direct conversation with an LLM is the core interaction.
User Message → [Memory Recall] → [Knowledge Retrieval] → [LLM + Tools] → Streaming Response

Creating a Chat App

1. Navigate to Applications
   Open your workspace and click New Application from the dashboard.

2. Select Chat Type
   Choose Chat as the application type. Give it a name and an optional description.

3. Configure the Model
   Select an AI model and tune generation parameters to match your use case.

4. Write a System Prompt
   Define the agent's persona, instructions, and behavioral guidelines in the system prompt.

5. Attach Knowledge Bases (Optional)
   Connect one or more knowledge bases to enable retrieval-augmented generation.

6. Test in Sandbox
   Use the built-in chat sandbox to test and iterate on your configuration.

Model Configuration

The model configuration controls how the LLM generates responses. These settings are available in the application’s Model tab.
Parameter | Default | Description
Model | - | The AI model to use (e.g., gpt-4o, claude-3.5-sonnet, gemini-pro)
System Prompt | - | Instructions that define the agent's persona, tone, and behavior
Temperature | 0.7 | Controls randomness: 0 = deterministic, 1 = highly creative
Max Tokens | 4096 | Maximum number of tokens the model can generate per response
Top P | 1.0 | Nucleus sampling threshold; lower values restrict the token pool for more focused output
Start with temperature 0.3-0.5 for factual Q&A applications and 0.7-0.9 for creative or conversational agents.
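The interplay between Temperature and Top P can be illustrated with a minimal sampling sketch. This is purely illustrative of how the two parameters shape token selection, not Nadoo AI's internal implementation:

```python
import math
import random

def sample_token(logits, temperature=0.7, top_p=1.0):
    """Sample a token index from raw logits using temperature scaling
    followed by nucleus (top-p) filtering."""
    if temperature == 0:
        # Deterministic: always pick the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature scaling: lower values sharpen the distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Nucleus filtering: keep the smallest set of tokens whose
    # cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    kept_total = sum(probs[i] for i in kept)
    return random.choices(kept, weights=[probs[i] / kept_total for i in kept])[0]
```

Lowering temperature concentrates probability on the top tokens, while lowering top_p cuts low-probability tokens out of the pool entirely.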

System Prompt

The system prompt is the most important configuration for a Chat App. It defines who the agent is, what it should do, and how it should behave.
You are a helpful customer support agent for Acme Corp.

Rules:
- Always be polite and professional
- Only answer questions about Acme products
- If you don't know the answer, say so and offer to connect the user with a human agent
- Use the knowledge base to look up product information before answering
Best practices for system prompts:
  • Be specific — clearly state the agent’s role, domain, and boundaries
  • Set guardrails — define what the agent should and should not do
  • Reference tools — if you attach tools, explain when the agent should use them
  • Structure with sections — use headers and bullet points for complex instructions

Conversation Memory

Memory allows the agent to recall earlier messages in the conversation. Nadoo AI supports two memory strategies for Chat Apps.

Buffer Memory

Retains the last N messages in their original form. Simple and effective for short to medium-length conversations.
memory_type: "buffer"
window_size: 20
The agent receives the 20 most recent messages as context for each response.

Summary Memory

Condenses older messages into a running summary while keeping recent messages intact. Effective for long conversations where full history would exceed the model’s context window.
memory_type: "summary"
window_size: 10
Messages beyond the window are summarized by the LLM and prepended as context.
Choose Buffer Memory for most use cases. Switch to Summary Memory when conversations routinely exceed 30 turns or when you need to preserve context over long sessions.
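The two strategies can be sketched as simple list transformations. This is a simplified model; the `summarize` parameter is a hypothetical stand-in for the LLM summarization call:

```python
def buffer_memory(messages, window_size=20):
    """Buffer Memory: keep only the last `window_size` messages verbatim."""
    return messages[-window_size:]

def summary_memory(messages, window_size=10,
                   summarize=lambda msgs: f"[summary of {len(msgs)} messages]"):
    """Summary Memory: condense messages beyond the window into a running
    summary, then prepend it to the recent messages kept intact."""
    older, recent = messages[:-window_size], messages[-window_size:]
    if not older:
        return recent
    return [summarize(older)] + recent
```

The trade-off is visible in the shapes: Buffer Memory discards older turns outright, while Summary Memory spends one extra LLM call to compress them into a single context item.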

Knowledge Base Integration

Attach one or more knowledge bases to enable retrieval-augmented generation (RAG). When a user sends a message, the system retrieves relevant documents from the knowledge base and includes them in the LLM’s context before generating a response.

How RAG Works in Chat Apps

User Message
  → Embedding generation
  → Similarity search across attached knowledge bases
  → Top-K relevant chunks retrieved
  → Chunks injected into LLM context alongside system prompt and memory
  → LLM generates a grounded response
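The similarity-search step in the pipeline above can be sketched as a cosine-similarity ranking. This is illustrative only: the query is assumed to be already embedded, and the parameters mirror Top K and Score Threshold:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, chunks, top_k=5, score_threshold=0.5):
    """Score each (text, vector) chunk against the query embedding, drop
    chunks below the threshold, and return the top-k texts by similarity."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored = [(s, t) for s, t in scored if s >= score_threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]
```

The retrieved texts are then concatenated into the prompt alongside the system prompt and conversation memory before the LLM is called.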

Attaching a Knowledge Base

  1. Go to the Chat App’s Knowledge tab
  2. Click Add Knowledge Base
  3. Select one or more knowledge bases from your workspace
  4. Configure retrieval parameters:
Parameter | Default | Description
Top K | 5 | Number of document chunks to retrieve per query
Score Threshold | 0.5 | Minimum similarity score required to include a chunk
Search Mode | Hybrid | vector, keyword, or hybrid (vector + BM25)
Attaching too many knowledge bases or setting Top K too high can consume a large portion of the model’s context window. Monitor token usage and adjust retrieval parameters accordingly.

Tool Integration

Attach tools (plugins) that the agent can invoke during conversation. Tools extend the agent beyond text generation — enabling web search, API calls, calculations, database queries, and more.

Attaching Tools

  1. Go to the Chat App’s Tools tab
  2. Click Add Tool
  3. Select from available plugins in your workspace (e.g., Web Search, Calculator, custom plugins)
The agent decides when to invoke a tool based on the conversation context and tool descriptions. Tool calls and results are streamed in real time via SSE events.
Tools work best with models that support function calling, such as GPT-4o, Claude 3.5 Sonnet, and Gemini Pro.
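The invoke-and-feed-back loop can be sketched as follows. Note that `call_model` and the tool registry are hypothetical stand-ins, not Nadoo AI APIs; the point is the control flow, where tool results are appended to the conversation until the model produces a final answer:

```python
def run_agent_turn(call_model, tools, messages, max_steps=5):
    """Let the model either answer directly or request a tool call.
    Tool results are fed back into the context until the model
    produces a plain-text answer (or the step budget runs out)."""
    for _ in range(max_steps):
        reply = call_model(messages)   # e.g. {"tool": "calculator", "args": {...}}
        if "tool" not in reply:        # plain text answer: the turn is done
            return reply["text"]
        result = tools[reply["tool"]](**reply["args"])
        messages = messages + [
            {"role": "assistant", "tool_call": reply},
            {"role": "tool", "content": str(result)},
        ]
    raise RuntimeError("agent exceeded max tool-call steps")
```

The `max_steps` cap guards against the model looping on tool calls indefinitely, which is a standard safeguard in agent loops of this shape.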

Streaming Responses

Chat Apps deliver responses in real time using Server-Sent Events (SSE). As the LLM generates tokens, they are streamed to the client immediately — providing a responsive, typewriter-style experience. Key SSE events for Chat Apps:
Event | Description
message_start | Response generation has begun
text_chunk | A chunk of the response text
text_end | Text generation is complete
tool_call_start | The model is invoking a tool
tool_result | Result returned from the tool
retrieval_result | Documents retrieved from the knowledge base
suggested_questions | AI-generated follow-up questions
usage | Token usage statistics
done | Stream is complete
See Real-time Streaming for the full event reference.
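On the client side, consuming the stream amounts to parsing `event:`/`data:` line pairs. A minimal parser sketch, assuming the standard SSE wire format with JSON data payloads:

```python
import json

def parse_sse(raw_stream):
    """Yield (event, data) pairs from raw SSE text.
    Events are separated by blank lines; data payloads are JSON."""
    event, data_lines = None, []
    for line in raw_stream.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and event is not None:
            yield event, json.loads("\n".join(data_lines))
            event, data_lines = None, []
```

Collecting the response text then reduces to filtering on the event name, e.g. `"".join(d["text"] for e, d in parse_sse(raw) if e == "text_chunk")`.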

Suggested Questions

Enable AI-powered follow-up question generation to guide users toward productive next interactions. After each response, the agent generates 2-3 suggested questions based on the conversation context.
event: suggested_questions
data: {"questions": ["How do I configure this?", "What are the pricing details?", "Can you show me an example?"]}
Enable this feature in the application settings under Chat > Suggested Questions.

Testing in the Chat Sandbox

Every Chat App includes a built-in sandbox for testing. The sandbox provides:
  • Real-time chat with your configured agent
  • Message history to review conversation flow
  • Token usage display showing prompt and completion token counts
  • Debug panel with raw SSE events, tool calls, and retrieved documents
  • Parameter overrides to temporarily adjust temperature, max tokens, and other settings
Use the sandbox to iterate quickly on your system prompt. Try edge cases, test tool invocations, and verify knowledge base retrieval before deploying.

API Access

Chat Apps are accessible through the REST API for programmatic integration:
# Send a message with streaming
POST /api/v1/chat/completions
{
  "session_id": "session-uuid",
  "message": "What products do you offer?",
  "stream": true
}
# Send a message without streaming
POST /api/v1/chat/completions
{
  "session_id": "session-uuid",
  "message": "What products do you offer?",
  "stream": false
}
See the API Reference for full endpoint documentation.
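A minimal Python client for the non-streaming call might look like this. This is a sketch: the base URL and the `Authorization` header scheme are assumptions; consult the API Reference for the actual host and authentication details:

```python
import json
import urllib.request

API_BASE = "https://api.example.com"   # placeholder: substitute your Nadoo AI host
API_KEY = "your-api-key"               # placeholder credential

def send_message(session_id, message, stream=False):
    """POST a chat message and return the parsed JSON response body."""
    body = json.dumps(
        {"session_id": session_id, "message": message, "stream": stream}
    ).encode()
    req = urllib.request.Request(
        f"{API_BASE}/api/v1/chat/completions",
        data=body,
        headers={"Authorization": f"Bearer {API_KEY}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For streaming calls (`"stream": true`), read the response incrementally and parse the SSE events instead of decoding a single JSON body.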

Next Steps