Overview
Azure OpenAI Service provides enterprise-grade access to OpenAI models through Microsoft Azure. It offers the same GPT-4 and GPT-3.5 models as OpenAI’s direct API, but with Azure’s security, compliance, and regional data residency guarantees. This makes Azure OpenAI the preferred choice for organizations that require enterprise-level data governance.

Key differences from direct OpenAI:
- Deployment-based — Models are deployed to named endpoints rather than selected by model ID
- Regional hosting — Choose the Azure region where your data is processed
- Enterprise security — Azure AD authentication, private endpoints, and managed identity support
- Compliance — SOC 2, HIPAA, GDPR, and other certifications through Azure
Setup
Create an Azure OpenAI Resource
In the Azure Portal, create an Azure OpenAI resource. Select your subscription, resource group, and region.
Azure OpenAI requires approval. If you do not have access, apply at aka.ms/oai/access.
Deploy Models
In Azure OpenAI Studio, go to Deployments and create deployments for the models you need. Each deployment gets a unique name that you will use in Nadoo.

Common deployment setup:
| Deployment Name | Model | Use Case |
|---|---|---|
| gpt-4o-deploy | gpt-4o | General chat and reasoning |
| gpt-35-deploy | gpt-3.5-turbo | Cost-efficient tasks |
| embedding-deploy | text-embedding-ada-002 | Vector embeddings for RAG |
Get Credentials
From your Azure OpenAI resource in the Azure Portal, navigate to Keys and Endpoint. Copy:
- Endpoint URL (e.g., https://your-resource.openai.azure.com/)
- API Key (Key 1 or Key 2)
Configure in Nadoo
Go to Admin > Model Providers > Azure OpenAI and enter:
| Field | Required | Description |
|---|---|---|
| Endpoint URL | Yes | Your Azure OpenAI endpoint (e.g., https://your-resource.openai.azure.com/) |
| API Key | Yes | One of your Azure OpenAI API keys |
| API Version | Yes | The API version to use (e.g., 2024-02-15-preview) |
| Deployment Names | Yes | Comma-separated list of your model deployment names |
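To show how these four fields fit together at request time, the sketch below builds the URL and headers for Azure's deployment-based REST endpoint. The endpoint, key, version, and deployment values are placeholders; the URL shape (`/openai/deployments/{name}/chat/completions?api-version=...` with an `api-key` header) is the standard Azure OpenAI pattern.

```python
# Sketch: how the configured fields map onto an Azure OpenAI REST call.
# All values below are placeholders; substitute your own resource details.

def build_chat_request(endpoint: str, api_key: str, api_version: str,
                       deployment: str) -> tuple[str, dict]:
    """Return the URL and headers for a chat-completions call."""
    url = (f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
           f"/chat/completions?api-version={api_version}")
    headers = {"api-key": api_key, "Content-Type": "application/json"}
    return url, headers

url, headers = build_chat_request(
    "https://your-resource.openai.azure.com/",
    "YOUR_API_KEY",
    "2024-02-15-preview",
    "gpt-4o-deploy",
)
```

Note that the deployment name, not the base model name, appears in the URL path; this is the key structural difference from direct OpenAI.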
Available Models
Azure OpenAI supports the same models as direct OpenAI, deployed under your custom deployment names.

Chat / LLM
| Model | Context Window | Best For |
|---|---|---|
| gpt-4o | 128K tokens | General-purpose, multimodal, fastest GPT-4 class |
| gpt-4o-mini | 128K tokens | Cost-efficient tasks, high-volume workloads |
| gpt-4-turbo | 128K tokens | Complex reasoning with large context |
| gpt-3.5-turbo | 16K tokens | Simple tasks, lowest cost |
Embedding
| Model | Dimensions | Best For |
|---|---|---|
| text-embedding-ada-002 | 1536 | Standard embeddings for RAG |
| text-embedding-3-small | 1536 | Cost-efficient embeddings |
| text-embedding-3-large | 3072 | Highest quality embeddings |
Model availability varies by Azure region. Check Azure OpenAI model availability for the latest regional support.
Capabilities
Chat Completion
Conversational AI with streaming, function calling, and JSON mode — identical capabilities to direct OpenAI.
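JSON mode is requested the same way as on direct OpenAI, via the `response_format` field of the chat-completions body. A minimal sketch of such a request body (the messages are placeholders; the schema is the standard chat-completions format):

```python
import json

# Minimal chat-completions body with JSON mode and streaming enabled.
# The same body is sent to an Azure deployment endpoint as to direct OpenAI.
payload = {
    "messages": [
        {"role": "system", "content": "Reply with a JSON object."},
        {"role": "user", "content": "Name two Azure regions."},
    ],
    "response_format": {"type": "json_object"},  # JSON mode
    "stream": True,                              # token-by-token streaming
}
body = json.dumps(payload)
```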
Embeddings
Generate vector representations of text for semantic search and RAG knowledge bases.
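Embedding vectors are typically compared with cosine similarity for semantic search. A minimal sketch with toy 3-dimensional vectors (real vectors have the dimensions listed in the table above, e.g. 1536 for text-embedding-ada-002):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity in [-1, 1] between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Toy vectors for illustration only; identical vectors score 1.0,
# orthogonal vectors score 0.0.
score = cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0])
```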
Enterprise Security
Azure AD, managed identity, private endpoints, and virtual network integration.
Regional Compliance
Data stays in your chosen Azure region for regulatory compliance.
Deployment Configuration
Unlike direct OpenAI, where you select a model by name, Azure OpenAI uses deployment names. When configuring an application in Nadoo, you select your Azure deployment instead of the base model name.

Multiple Deployments
You can create multiple deployments of the same model for different purposes:
- Production deployment with a higher tokens-per-minute (TPM) quota
- Development deployment with lower quota for testing
- Embedding deployment dedicated to knowledge base indexing
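A simple way to organize this split is a purpose-to-deployment mapping in your application code. The deployment names below are assumptions for illustration and must match the names you created in Azure OpenAI Studio:

```python
# Sketch: routing requests to different deployments of the same model.
# Deployment names are hypothetical; use your own Azure deployment names.
DEPLOYMENTS = {
    "production": "gpt-4o-prod",       # higher TPM quota
    "development": "gpt-4o-dev",       # lower quota for testing
    "embedding": "embedding-deploy",   # knowledge base indexing
}

def deployment_for(purpose: str) -> str:
    """Return the Azure deployment name to use for a given purpose."""
    return DEPLOYMENTS[purpose]
```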
Environment Variables
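This page does not list Nadoo's exact variable names, so the names below are assumptions for illustration only; a typical self-hosted configuration exports the same endpoint, key, API version, and deployment values entered in the admin panel:

```shell
# Hypothetical variable names; check your Nadoo deployment's documentation
# for the exact keys it reads.
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_API_KEY="your-api-key"
export AZURE_OPENAI_API_VERSION="2024-02-15-preview"
export AZURE_OPENAI_DEPLOYMENTS="gpt-4o-deploy,embedding-deploy"
```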
When self-hosting, configure Azure OpenAI via environment variables rather than the admin panel.

Recommended Models by Use Case
| Use Case | Recommended Deployment | Reason |
|---|---|---|
| General chatbot | gpt-4o | Best balance of speed, quality, and cost |
| High-volume support | gpt-4o-mini or gpt-3.5-turbo | Low cost with strong performance |
| Document analysis | gpt-4o | Long context window with vision support |
| Knowledge base search | text-embedding-3-small | High quality at low cost |
| Premium RAG pipeline | text-embedding-3-large | Maximum embedding quality |
Rate Limits and Quotas
Azure OpenAI uses Tokens Per Minute (TPM) quotas that are configured per deployment. You can adjust quotas in the Azure Portal under your deployment’s settings.

Troubleshooting
404 Resource Not Found
The endpoint URL or deployment name is incorrect. Verify both values in the Azure Portal under your OpenAI resource’s Keys and Endpoint and Deployments sections.
401 Unauthorized
The API key is invalid or belongs to a different resource. Verify the key in the Azure Portal under Keys and Endpoint, or regenerate it and update the value in Nadoo.
429 Rate Limit Exceeded
You have exceeded your deployment’s TPM quota. Increase the quota in the Azure Portal or distribute traffic across multiple deployments.
Model not available in region
Not all models are available in every Azure region. Check model availability and consider creating a resource in a supported region.