Overview
The Nadoo AI Knowledge Base provides a complete Retrieval-Augmented Generation (RAG) pipeline that connects your documents to your AI agent workflows. Instead of relying solely on what a language model was trained on, you can ground its responses in your own data — company documents, product manuals, research papers, or any text corpus. The pipeline has four stages: Document Processing, Vector Storage, Retrieval, and Integration.
RAG Pipeline
Document Processing
Upload files and convert them into searchable chunks.
- Upload — Drag and drop files or provide URLs
- Parse — Extract text from PDF, DOCX, TXT, Markdown, Excel, and web pages
- Chunk — Split documents into overlapping segments (default: 1000 characters with 200-character overlap)
- Extract metadata — Capture titles, headings, page numbers, and custom metadata for filtering
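The chunking step above can be sketched as a simple sliding window over characters, using the documented defaults (1000-character chunks, 200-character overlap). The function name and signature are illustrative, not the product's actual API:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap  # each window starts `step` chars after the last
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the final window already reached the end of the text
    return chunks
```

The overlap means the tail of each chunk is repeated at the head of the next, so a sentence that straddles a boundary is still retrievable as a whole.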
Vector Storage
Generate embeddings and store them for fast similarity search.
- Generate embeddings — Convert each chunk into a vector using a configurable embedding model (OpenAI, HuggingFace, Azure, Bedrock, Google, vLLM, Ollama, Local)
- Store in vector database — Persist vectors via pluggable VectorStore (pgvector default, Milvus/Qdrant planned). Distance metrics: cosine, euclidean, dot product
- Index — Build HNSW or IVFFlat indexes for approximate nearest neighbor search at scale
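The storage contract can be sketched as an in-memory store with a linear cosine scan. This is a hypothetical stand-in for the pluggable VectorStore interface, not its real implementation; production uses pgvector with HNSW or IVFFlat indexes instead of a full scan:

```python
import math

class InMemoryVectorStore:
    """Minimal sketch of a vector store: add vectors, search by cosine similarity."""

    def __init__(self):
        self._rows = []  # (chunk_id, vector, metadata)

    def add(self, chunk_id, vector, metadata=None):
        self._rows.append((chunk_id, vector, metadata or {}))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def search(self, query_vector, top_k=5):
        # Score every stored vector, then keep the top_k highest scores.
        scored = [(self._cosine(query_vector, v), cid) for cid, v, _ in self._rows]
        scored.sort(reverse=True)
        return scored[:top_k]
```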
Retrieval
Find the most relevant chunks for a given query.
- Query embedding — Convert the user’s question into a vector using the same embedding model
- Similarity search — Find the closest vectors in the index
- Rerank — Optionally re-score results with a cross-encoder reranker for higher precision
- Context assembly — Combine the top chunks into a context window, respecting token limits
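The last two steps, filtering by similarity score and assembling a bounded context window, can be sketched as follows. Names and parameters are illustrative, and a real pipeline would count model tokens rather than characters:

```python
def assemble_context(scored_chunks, score_threshold=0.5, max_chars=3000):
    """Join top chunks into one context string (hypothetical helper).

    scored_chunks: list of (score, text) pairs, highest score first.
    """
    parts, used = [], 0
    for score, text in scored_chunks:
        if score < score_threshold:
            continue  # drop weakly related chunks
        if used + len(text) > max_chars:
            break  # respect the context budget
        parts.append(text)
        used += len(text)
    return "\n\n".join(parts)
```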
Integration
Inject retrieved context into your AI agent’s prompt.
- Prompt injection — Insert retrieved chunks into the system prompt or user message
- Citation tracking — Record which documents contributed to the response for transparency
- Feedback loop — Use user feedback to improve retrieval quality over time
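Prompt injection with citation tracking can be sketched like this. The message shape follows the common chat-completions convention; the function and the `[source_id]` tagging scheme are illustrative assumptions, not the product's actual format:

```python
def build_prompt(question, retrieved,
                 system="Answer using only the provided context."):
    """Inject retrieved chunks into the system prompt and record citations.

    retrieved: list of (source_id, chunk_text) pairs.
    """
    # Tag each chunk with its source id so answers can cite it.
    context = "\n\n".join(f"[{sid}] {text}" for sid, text in retrieved)
    citations = [sid for sid, _ in retrieved]
    messages = [
        {"role": "system", "content": f"{system}\n\nContext:\n{context}"},
        {"role": "user", "content": question},
    ]
    return messages, citations
```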
Supported Document Formats
| Format | Extensions | Notes |
|---|---|---|
| PDF | .pdf | OCR support for scanned documents |
| Microsoft Word | .docx, .doc | Preserves heading structure |
| Plain Text | .txt | Direct ingestion |
| Markdown | .md, .mdx | Preserves heading hierarchy |
| Excel | .xlsx, .xls | Each sheet processed separately |
| Web Pages | URL | Fetches and parses HTML content |
Search Modes
The knowledge base supports three search modes that you can configure per query or per knowledge base.
- Vector Search — Semantic similarity. Finds documents whose meaning is closest to the query, even if the exact words differ. Uses cosine similarity on embedding vectors. Best for natural language questions where the user's phrasing may not match the document's exact wording.
- BM25 (Keyword) — Ranks documents by exact term matches using the BM25 scoring function. Best for queries containing specific identifiers, names, or jargon.
- Hybrid — Combines vector and BM25 results into a single ranking.
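Hybrid mode must merge two differently scaled ranked lists. Reciprocal rank fusion is one common strategy for this; the document does not specify which fusion method the knowledge base actually uses, so treat this as an illustrative sketch:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank + 1) per list it appears in;
    k=60 is the conventional damping constant from the RRF literature.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

Because only ranks are used, the vector scores (cosine, 0 to 1) and BM25 scores (unbounded) never need to be normalized against each other.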
Configuration
Embedding Model
Choose the embedding model used to generate vectors. The model must be consistent between indexing and querying. Embedding providers include OpenAI, HuggingFace, Local models, Azure OpenAI, AWS Bedrock, Google AI Studio, Google Vertex AI, vLLM, and Ollama. The embedding model is set at the knowledge base level and applies to all documents within it.
Chunking
Control how documents are split into segments.

| Parameter | Default | Description |
|---|---|---|
| chunk_size | 1000 | Maximum number of characters per chunk |
| chunk_overlap | 200 | Number of overlapping characters between consecutive chunks |
| separator | \n\n | Primary split boundary (falls back to sentence/word boundaries) |
Retrieval Settings
Fine-tune how documents are fetched at query time.

| Parameter | Default | Description |
|---|---|---|
| top_k | 5 | Number of chunks to retrieve |
| score_threshold | 0.5 | Minimum similarity score (0.0 to 1.0) |
| reranking | false | Enable cross-encoder reranking for higher precision |
| rerank_model | — | Model to use for reranking (e.g., cohere-rerank-v3) |
| rerank_top_k | 3 | Number of chunks to keep after reranking |
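As a plain configuration object, these settings might look like the following. The key names simply mirror the table; the actual configuration surface (keys, file format, API) is not specified here:

```python
# Hypothetical per-query retrieval settings mirroring the documented defaults.
retrieval_settings = {
    "top_k": 5,
    "score_threshold": 0.5,
    "reranking": False,
    "rerank_model": None,   # e.g. "cohere-rerank-v3" when reranking is enabled
    "rerank_top_k": 3,
}
```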
Advanced Features
Contextual Retrieval
Enhance each chunk with a brief AI-generated summary of its context within the full document. This improves retrieval accuracy by embedding each chunk with awareness of its surrounding content.
Knowledge Graphs
Extract entities and relationships from documents to build a knowledge graph. This enables graph-based queries that traverse relationships rather than relying solely on text similarity.
Multi-Hop Reasoning
For complex questions that require information from multiple documents, multi-hop reasoning chains together several retrieval steps:
- Retrieve initial context for the question
- Identify follow-up sub-questions based on the initial context
- Retrieve additional context for each sub-question
- Synthesize all retrieved information into a comprehensive answer
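The loop above can be sketched with caller-supplied callables for retrieval, follow-up generation, and synthesis. All names are illustrative; in practice the follow-up and synthesis steps would be LLM calls:

```python
def multi_hop_answer(question, retrieve, propose_followups, synthesize, max_hops=2):
    """Chain retrieval steps: fetch context, ask sub-questions, fetch more, synthesize.

    retrieve(q) -> list of context chunks
    propose_followups(q, context) -> list of sub-questions ([] to stop)
    synthesize(q, context) -> final answer
    """
    context = list(retrieve(question))  # step 1: initial retrieval
    for _ in range(max_hops):
        followups = propose_followups(question, context)  # step 2: sub-questions
        if not followups:
            break
        for sub in followups:
            context.extend(retrieve(sub))  # step 3: retrieve per sub-question
    return synthesize(question, context)  # step 4: synthesize the answer
```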