AI Assistant API
REST API endpoints for interacting with the AI chat assistant programmatically.
Overview
The AI Assistant API gives you programmatic access to the same chat assistant that powers the search widget on your documentation site. Send natural language questions and receive streaming responses grounded in your documentation content. The assistant uses agentic RAG with three tools to search, retrieve, and synthesize answers.
Base path: https://api.holydocs.com/api/v1/assistant/:projectId
Send Message
Send a message to the AI assistant and receive a streaming response. The assistant searches your documentation, retrieves relevant pages, and synthesizes an answer.
```bash
POST /api/v1/assistant/:projectId/chat
```
```bash
curl -N -X POST "https://api.holydocs.com/api/v1/assistant/$PROJECT_ID/chat" \
  -H "Content-Type: application/json" \
  -d '{"message":"How do I set up a custom domain?","visitorId":"v_anon_abc"}'
```
```bash
holydocs api post "/assistant/$PROJECT_ID/chat" \
  --json '{"message":"How do I set up a custom domain?","visitorId":"cli"}'
```
```ts
import { HolyDocs } from '@holydocs/sdk';

const client = new HolyDocs(); // public endpoint, no API key needed

for await (const chunk of client.assistant.chat(projectId, {
  message: 'How do I set up a custom domain?',
  visitorId: 'v_anon_abc',
})) {
  process.stdout.write(chunk.token ?? '');
}
```
Path Parameters
| Parameter | Type | Description |
|---|---|---|
| projectId | string | Project ID or project slug |
Request Body
```json
{
  "message": "How do I set up a custom domain?",
  "visitorId": "v_anon_abc123",
  "conversationId": "conv_xyz789"
}
```
| Field | Type | Required | Description |
|---|---|---|---|
| message | string | Yes | The user's question (1-2000 characters) |
| visitorId | string | Yes | Unique identifier for the visitor (used for rate limiting and conversation tracking) |
| conversationId | string | No | Continue an existing conversation. Omit to start a new one. |
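The field rules above can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of any HolyDocs SDK: the helper name `build_chat_body` is hypothetical, and it assumes the 1-2000 limit counts characters the way Python's `len` does, which the server may not.

```python
MAX_MESSAGE_CHARS = 2000  # upper bound taken from the field table


def build_chat_body(message, visitor_id, conversation_id=None):
    """Return a JSON-ready dict for POST /chat, or raise ValueError.

    Hypothetical helper: the server remains the source of truth for validation.
    """
    if not isinstance(message, str) or not 1 <= len(message) <= MAX_MESSAGE_CHARS:
        raise ValueError("message must be a string of 1-2000 characters")
    if not isinstance(visitor_id, str) or not visitor_id:
        raise ValueError("visitorId is required")

    body = {"message": message, "visitorId": visitor_id}
    # conversationId is optional: leave it out entirely to start a new conversation.
    if conversation_id is not None:
        body["conversationId"] = conversation_id
    return body
```

Rejecting bad input locally avoids spending one of the visitor's rate-limited requests on a call that would return VALIDATION_ERROR.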
Response
The response is an SSE (Server-Sent Events) stream. Each event is a newline-delimited message with a data: prefix.
```text
HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive

data: {"type":"start","conversationId":"conv_xyz789","messageId":"msg_abc123"}
data: {"type":"tool_call","tool":"search_docs","input":{"query":"custom domain setup"}}
data: {"type":"tool_result","tool":"search_docs","results":3}
data: {"type":"tool_call","tool":"get_page","input":{"path":"/custom-domains"}}
data: {"type":"tool_result","tool":"get_page","title":"Custom Domains"}
data: {"type":"text","content":"To set up a custom domain"}
data: {"type":"text","content":" for your HolyDocs project, follow"}
data: {"type":"text","content":" these steps:\n\n1. Go to **Settings"}
data: {"type":"text","content":" > Domain** in your dashboard\n2."}
data: {"type":"text","content":" Enter your domain (e.g., `docs.yourcompany.com`)\n3."}
data: {"type":"text","content":" Add a CNAME record pointing to `proxy.holydocs.com`\n4."}
data: {"type":"text","content":" Wait for DNS propagation and SSL provisioning\n\n"}
data: {"type":"citations","sources":[{"path":"/custom-domains","title":"Custom Domains"},{"path":"/api/domains","title":"Domains API"}]}
data: {"type":"done","tokensUsed":342}
```
SSE Event Types
| Type | Description | Fields |
|---|---|---|
| start | Stream opened, conversation initialized | conversationId, messageId |
| tool_call | Assistant is invoking a tool | tool, input |
| tool_result | Tool returned results | tool, summary fields |
| text | Content token(s) | content (text fragment) |
| citations | Source pages the answer is grounded in | sources (array of {path, title}) |
| done | Stream complete | tokensUsed |
| error | An error occurred | code, message |
```javascript
async function askAssistant(projectId, message, visitorId) {
  const response = await fetch(
    `https://api.holydocs.com/api/v1/assistant/${projectId}/chat`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ message, visitorId })
    }
  );

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let fullText = '';
  let buffer = ''; // holds a partial line when an event spans two reads

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // the last piece is incomplete until its newline arrives

    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;
      const event = JSON.parse(line.slice(6));

      switch (event.type) {
        case 'text':
          fullText += event.content;
          process.stdout.write(event.content);
          break;
        case 'citations':
          console.log('\nSources:', event.sources.map(s => s.title).join(', '));
          break;
        case 'error':
          throw new Error(event.message);
      }
    }
  }

  return fullText;
}
```
```python
import requests
import json

def ask_assistant(project_id, message, visitor_id):
    response = requests.post(
        f"https://api.holydocs.com/api/v1/assistant/{project_id}/chat",
        json={"message": message, "visitorId": visitor_id},
        stream=True
    )

    full_text = ""
    for line in response.iter_lines():
        if not line:
            continue
        decoded = line.decode("utf-8")
        if not decoded.startswith("data: "):
            continue

        event = json.loads(decoded[6:])

        if event["type"] == "text":
            full_text += event["content"]
            print(event["content"], end="", flush=True)
        elif event["type"] == "citations":
            sources = [s["title"] for s in event["sources"]]
            print(f"\nSources: {', '.join(sources)}")
        elif event["type"] == "error":
            raise Exception(event["message"])

    return full_text
```
```bash
curl -N -X POST "https://api.holydocs.com/api/v1/assistant/proj_abc123/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "How do I set up a custom domain?",
    "visitorId": "v_anon_test"
  }'
```
Use the -N flag with cURL to disable output buffering, which is required to see SSE events in real time.
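When the caller wants a single structured result instead of printed output, the event types listed above can be folded into one dictionary. This is a sketch under assumptions: `collect_answer` is a hypothetical name, the input is any iterable of already-decoded text lines, and only the fields documented in the event table are read.

```python
import json


def collect_answer(lines):
    """Fold an iterable of SSE text lines into one result dict."""
    result = {"conversationId": None, "text": "", "sources": [], "tokensUsed": None}
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank separators and any non-data lines
        event = json.loads(line[len("data: "):])
        kind = event.get("type")
        if kind == "start":
            result["conversationId"] = event["conversationId"]
        elif kind == "text":
            result["text"] += event["content"]
        elif kind == "citations":
            result["sources"] = event["sources"]
        elif kind == "done":
            result["tokensUsed"] = event["tokensUsed"]
        elif kind == "error":
            raise RuntimeError(f"{event['code']}: {event['message']}")
        # tool_call and tool_result events are ignored in this sketch
    return result
```

Because the function takes plain lines, it can be fed from a live response (for example, decoded output of `response.iter_lines()`) or from a recorded stream in a test.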
Agentic RAG Flow
The assistant does not simply embed the query and return a vector match. It follows an agentic Retrieval-Augmented Generation (RAG) flow, autonomously deciding which tools to call and how many rounds of retrieval to perform before answering.
Query Analysis
The assistant receives the user's message along with any prior conversation history. It analyzes the query to determine what information is needed.
Tool Selection
Based on the query, the assistant selects one or more tools to invoke. For a simple factual question, it may call search_docs once. For a complex question spanning multiple topics, it may chain several tool calls.
Retrieval
Each tool call retrieves information from the documentation. search_docs runs hybrid search, combining keyword and semantic results with Reciprocal Rank Fusion (RRF). get_page retrieves full page content. list_pages returns the navigation tree.
Synthesis
With retrieved context, the assistant synthesizes a grounded answer. It cites specific pages and avoids hallucinating information not present in the documentation.
Citation
The response concludes with a citations event listing all source pages, so readers can verify and explore further.
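The flow described above happens server-side, but it is observable from the client through the tool_call events in the stream. The sketch below reconstructs the sequence of tool invocations for one answer; `tool_trace` is a hypothetical helper and it assumes the events have already been parsed from JSON into dictionaries.

```python
def tool_trace(events):
    """Return [(tool, input)] in call order from parsed SSE events."""
    return [
        (event["tool"], event.get("input", {}))
        for event in events
        if event.get("type") == "tool_call"
    ]


# Events taken from the sample response stream in this document.
events = [
    {"type": "start", "conversationId": "conv_xyz789", "messageId": "msg_abc123"},
    {"type": "tool_call", "tool": "search_docs", "input": {"query": "custom domain setup"}},
    {"type": "tool_result", "tool": "search_docs", "results": 3},
    {"type": "tool_call", "tool": "get_page", "input": {"path": "/custom-domains"}},
    {"type": "tool_result", "tool": "get_page", "title": "Custom Domains"},
]
trace = tool_trace(events)
```

Logging this trace per question is one way to see how many retrieval rounds the assistant needed before answering.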
Assistant Tools
The assistant has access to three tools. It invokes them automatically during the agentic loop; you do not call them directly.
search_docs performs hybrid search (keyword + semantic) across the documentation and returns the top matching pages with snippets.
Input:
```json
{
  "query": "custom domain DNS setup",
  "limit": 5
}
```
Behavior: Runs both keyword and semantic search in parallel, merges results via RRF, and returns the top matches. This is the most frequently used tool, invoked in approximately 74% of conversations.
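For readers unfamiliar with the merge step, Reciprocal Rank Fusion (RRF) scores each result by summing 1 / (k + rank) over every ranked list it appears in. The sketch below illustrates the general technique only. It is not the HolyDocs implementation: the function name, the constant k = 60 (a commonly used default), and the alphabetical tie-break are all assumptions.

```python
from collections import defaultdict


def rrf_merge(rankings, k=60, limit=5):
    """Merge ranked lists of page paths with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first (e.g. keyword, semantic).
    k: smoothing constant; 60 is a conventional choice, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, path in enumerate(ranking, start=1):
            scores[path] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    ordered = sorted(scores, key=lambda path: (-scores[path], path))
    return ordered[:limit]


keyword_hits = ["/a", "/b", "/c"]
semantic_hits = ["/b", "/d", "/a"]
merged = rrf_merge([keyword_hits, semantic_hits])
```

A page that ranks reasonably well in both lists ("/b" here) outscores one that tops a single list, which is why RRF suits combining keyword and semantic results whose raw scores are not comparable.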
get_page retrieves the full content of a specific documentation page by path. It is used when the assistant needs detailed information from a page identified by search.
Input:
```json
{
  "path": "/custom-domains"
}
```
Behavior: Fetches the complete page content from KV cache or R2 storage. Returns the title, content body, and metadata. Used in approximately 50% of conversations, typically after search_docs identifies a relevant page.
list_pages returns the full navigation tree of the documentation site. It is used when the assistant needs to understand the overall structure or find pages by section.
Input:
```json
{}
```
Behavior: Returns the complete navigation configuration including all sections, pages, and their paths. Used in approximately 16% of conversations, typically for broad "what can I do" or "where is" questions.
List Conversations
Retrieve a list of AI assistant conversations for analytics and review. Requires authentication.
```bash
GET /api/v1/assistant/:projectId/conversations
```
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| page | number | No | Page number (default: 1) |
| perPage | number | No | Results per page (default: 20, max: 100) |
| sortBy | createdAt \| messageCount | No | Sort field (default: createdAt) |
| order | asc \| desc | No | Sort order (default: desc) |
Response
```json
{
  "data": [
    {
      "id": "conv_abc123",
      "visitorId": "v_anon_xyz",
      "messageCount": 4,
      "tokensUsed": 1240,
      "firstMessage": "How do I set up a custom domain?",
      "satisfied": true,
      "createdAt": "2026-04-10T14:23:00Z",
      "lastMessageAt": "2026-04-10T14:25:30Z"
    },
    {
      "id": "conv_def456",
      "visitorId": "v_anon_abc",
      "messageCount": 7,
      "tokensUsed": 3100,
      "firstMessage": "What authentication methods are supported?",
      "satisfied": null,
      "createdAt": "2026-04-10T12:10:00Z",
      "lastMessageAt": "2026-04-10T12:18:45Z"
    }
  ],
  "meta": {
    "total": 1240,
    "page": 1,
    "perPage": 20,
    "totalPages": 62
  }
}
```
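To walk every conversation rather than a single page, a client can keep requesting pages until `meta.totalPages` is reached. The sketch below separates that loop from the HTTP call so it can be tested offline. `iter_conversations` and the `fetch_page` callable are hypothetical; `fetch_page(page, per_page)` is assumed to perform the authenticated GET with the `page` and `perPage` query parameters and return the decoded JSON body.

```python
def iter_conversations(fetch_page, per_page=100):
    """Yield every conversation, following meta.totalPages.

    fetch_page(page, per_page) must return the decoded response body,
    shaped like {"data": [...], "meta": {"totalPages": N, ...}}.
    """
    page = 1
    while True:
        body = fetch_page(page, per_page)
        yield from body["data"]
        if page >= body["meta"]["totalPages"]:
            break
        page += 1


# Offline stand-in for the real endpoint, used only to exercise the loop.
_FAKE = [{"id": f"conv_{i}"} for i in range(5)]


def fake_fetch(page, per_page):
    start = (page - 1) * per_page
    total_pages = -(-len(_FAKE) // per_page)  # ceiling division
    return {
        "data": _FAKE[start:start + per_page],
        "meta": {"total": len(_FAKE), "page": page,
                 "perPage": per_page, "totalPages": total_pages},
    }


all_ids = [c["id"] for c in iter_conversations(fake_fetch, per_page=2)]
```

Using the maximum `perPage` of 100 keeps the number of requests, and therefore the pressure on the plan-based rate limit, as low as possible.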
Usage Statistics
Retrieve aggregated AI assistant usage for billing and monitoring. Requires authentication.
```bash
GET /api/v1/assistant/:projectId/usage
```
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| period | 7d \| 30d \| 90d \| billing | No | Time period (default: billing, which uses current billing cycle) |
Response
```json
{
  "data": {
    "period": "billing",
    "startDate": "2026-04-01T00:00:00Z",
    "endDate": "2026-04-30T23:59:59Z",
    "totalConversations": 1240,
    "totalMessages": 5680,
    "totalTokens": 2450000,
    "tokensLimit": 5000000,
    "tokensRemaining": 2550000,
    "usageByDay": [
      { "date": "2026-04-10", "conversations": 45, "messages": 198, "tokens": 89000 },
      { "date": "2026-04-11", "conversations": 38, "messages": 165, "tokens": 74000 }
    ],
    "model": "anthropic/claude-sonnet-4",
    "avgResponseTime": 2.3,
    "satisfactionRate": 0.87
  }
}
```
```bash
curl "https://api.holydocs.com/api/v1/assistant/proj_abc123/usage?period=billing" \
  -H "Authorization: Bearer hd_a1b2c3d4e5f67890a1b2c3d4e5f67890"
```
```javascript
const response = await fetch(
  'https://api.holydocs.com/api/v1/assistant/proj_abc123/usage?period=billing',
  {
    headers: { 'Authorization': 'Bearer hd_a1b2c3d4e5f67890a1b2c3d4e5f67890' }
  }
);

const { data } = await response.json();
const pctUsed = ((data.totalTokens / data.tokensLimit) * 100).toFixed(1);
console.log(`Token usage: ${pctUsed}% (${data.tokensRemaining.toLocaleString()} remaining)`);
```
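For monitoring, the same response can drive a simple early warning before the token limit is hit. This is an illustrative sketch: `summarize_usage` and the 80% default threshold are assumptions, and it treats a missing or zero `tokensLimit` (which might occur on custom plans, though the source does not say) as "no limit known".

```python
def summarize_usage(data, warn_at=0.8):
    """Summarize the "data" object of a usage response.

    warn_at is a hypothetical alert threshold, expressed as a fraction of the limit.
    """
    used = data["totalTokens"]
    limit = data.get("tokensLimit")
    days = data.get("usageByDay", [])

    fraction = used / limit if limit else None
    daily_average = sum(d["tokens"] for d in days) / len(days) if days else 0.0
    return {
        "fractionUsed": fraction,
        "dailyAverageTokens": daily_average,
        "warn": fraction is not None and fraction >= warn_at,
    }


# Values from the sample usage response in this document.
sample_usage = {
    "totalTokens": 2450000,
    "tokensLimit": 5000000,
    "usageByDay": [
        {"date": "2026-04-10", "conversations": 45, "messages": 198, "tokens": 89000},
        {"date": "2026-04-11", "conversations": 38, "messages": 165, "tokens": 74000},
    ],
}
summary = summarize_usage(sample_usage)
```

With the sample data the project has used just under half of its limit, so no warning is raised at the default threshold.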
Token Limits by Plan
| Plan | Monthly Token Limit | Model |
|---|---|---|
| Free | 100,000 | anthropic/claude-sonnet-4 |
| Starter | 500,000 | anthropic/claude-sonnet-4 |
| Pro | 2,000,000 | anthropic/claude-sonnet-4 |
| Business | 5,000,000 | anthropic/claude-sonnet-4 |
| Enterprise | Custom | Configurable |
Token usage is metered on the LLM's input and output tokens combined. Tool calls (search, page retrieval) do not count against your token limit, but the content injected into the LLM context from those tool results does.
Embedding the Assistant
You can embed the AI assistant in your own application using the public chat endpoint:
Generate a visitor ID
Create a unique visitor ID for each user session. This can be a UUID, a hash of the user's session, or any unique string. It is used for rate limiting and conversation continuity.
```javascript
const visitorId = `v_${crypto.randomUUID()}`;
```
Open a chat stream
POST to the chat endpoint with the visitor's message. Parse the SSE stream to display tokens in real time.
Continue the conversation
Pass the conversationId from the start event in subsequent requests to maintain context across multiple turns.
```javascript
let conversationId = null;

async function sendMessage(message) {
  const response = await fetch(
    `https://api.holydocs.com/api/v1/assistant/${projectId}/chat`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      // JSON.stringify drops undefined properties, so conversationId is
      // omitted on the first turn, as the request body table specifies.
      body: JSON.stringify({
        message,
        visitorId,
        conversationId: conversationId ?? undefined
      })
    }
  );
  // Parse SSE stream, capture conversationId from 'start' event
}
```
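The state that carries a conversation across turns is small: remember the `conversationId` from the first start event and send it back on later requests. A minimal Python sketch of that bookkeeping follows; the `ChatSession` class is hypothetical and leaves the HTTP call and stream parsing to the caller.

```python
class ChatSession:
    """Tracks one visitor's conversation across multiple chat requests."""

    def __init__(self, visitor_id):
        self.visitor_id = visitor_id
        self.conversation_id = None  # unknown until the first 'start' event

    def next_body(self, message):
        """Build the request body for the next turn."""
        body = {"message": message, "visitorId": self.visitor_id}
        if self.conversation_id is not None:
            body["conversationId"] = self.conversation_id
        return body

    def observe(self, event):
        """Feed every parsed SSE event here; only 'start' changes state."""
        if event.get("type") == "start":
            self.conversation_id = event["conversationId"]


session = ChatSession("v_anon_abc123")
first = session.next_body("How do I set up a custom domain?")
session.observe({"type": "start", "conversationId": "conv_xyz789", "messageId": "msg_abc123"})
second = session.next_body("Does that work with apex domains?")
```

The first body has no `conversationId`, so the server starts a new conversation; the second carries the ID captured from the stream.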
Display citations
When you receive the citations event, render links to the source pages so readers can verify the answer.
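Rendering citations amounts to turning each `{path, title}` pair into a link. The sketch below produces Markdown links and assumes that `path` is relative to the root of your documentation site; both the helper name `render_citations` and the example base URL are hypothetical.

```python
def render_citations(sources, site_base_url):
    """Return one Markdown link per cited source.

    Assumes each source path is relative to site_base_url.
    """
    base = site_base_url.rstrip("/")
    return [f"[{source['title']}]({base}{source['path']})" for source in sources]


# Sources taken from the sample citations event in this document.
links = render_citations(
    [
        {"path": "/custom-domains", "title": "Custom Domains"},
        {"path": "/api/domains", "title": "Domains API"},
    ],
    "https://docs.example.com/",
)
```

If your UI renders HTML rather than Markdown, the same loop applies; escape the title before inserting it into markup.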
Rate Limits
| Endpoint | Authentication | Rate Limit |
|---|---|---|
| POST /chat | Public | 10 messages/minute per visitorId |
| GET /conversations | Required | Plan-based (see Authentication) |
| GET /usage | Required | Plan-based |
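An embedding application can avoid most 429 responses on the chat endpoint by throttling per visitor before sending. The sketch below uses a sliding window of 10 messages per 60 seconds. Whether the server counts with a sliding or a fixed window is not documented, so treat this as a conservative client-side approximation; the class name and the injectable `clock` parameter are assumptions made for testability.

```python
import time
from collections import defaultdict, deque


class VisitorRateLimiter:
    """Client-side sliding-window limiter keyed by visitorId."""

    def __init__(self, max_messages=10, window_seconds=60.0, clock=time.monotonic):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self.clock = clock
        self._sent = defaultdict(deque)  # visitorId -> send timestamps

    def try_acquire(self, visitor_id):
        """Return True and record the send if the visitor is under the limit."""
        now = self.clock()
        sent = self._sent[visitor_id]
        # Drop timestamps that have aged out of the window.
        while sent and now - sent[0] >= self.window_seconds:
            sent.popleft()
        if len(sent) >= self.max_messages:
            return False
        sent.append(now)
        return True
```

Call `try_acquire(visitor_id)` before each POST and show a "please wait" state in the UI when it returns False, rather than letting the request fail.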
Error Codes
| Code | Status | Description |
|---|---|---|
| NOT_FOUND | 404 | Project not found or AI assistant is not enabled |
| VALIDATION_ERROR | 400 | Invalid request body (missing message, visitorId too long) |
| LIMIT_EXCEEDED | 429 | Rate limit or monthly token limit exceeded |
| AI_UNAVAILABLE | 503 | LLM provider is temporarily unavailable |
| CONTENT_FILTERED | 400 | Message was flagged by content safety filters |
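Only some of these codes are worth retrying. The sketch below encodes one plausible policy: retry LIMIT_EXCEEDED and AI_UNAVAILABLE with exponential backoff, and fail fast on the rest. The policy, the function name `retry_delay`, and the delay values are assumptions, not documented HolyDocs behavior.

```python
# Codes treated as transient in this sketch (an assumption, see lead-in).
RETRYABLE_CODES = {"LIMIT_EXCEEDED", "AI_UNAVAILABLE"}


def retry_delay(code, attempt, base_seconds=1.0, max_seconds=30.0):
    """Return seconds to wait before retry number `attempt` (1-based).

    Returns None when the error code should not be retried.
    """
    if code not in RETRYABLE_CODES:
        return None
    # Exponential backoff: base, 2x base, 4x base, ... capped at max_seconds.
    return min(max_seconds, base_seconds * 2 ** (attempt - 1))
```

Note that LIMIT_EXCEEDED covers both the per-minute rate limit and the monthly token limit; backoff helps with the former only, so cap the number of attempts rather than retrying indefinitely.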