Overview

The AI Assistant API gives you programmatic access to the same chat assistant that powers the search widget on your documentation site. Send natural language questions and receive streaming responses grounded in your documentation content. The assistant uses agentic RAG with three tools to search, retrieve, and synthesize answers.

Base path: https://api.holydocs.com/api/v1/assistant/:projectId

Public Chat

The chat endpoint is public. Embed it in your own interfaces without exposing API keys.

SSE Streaming

Responses stream via Server-Sent Events for low-latency, token-by-token delivery.

Send Message

Send a message to the AI assistant and receive a streaming response. The assistant searches your documentation, retrieves relevant pages, and synthesizes an answer.

```bash
POST /api/v1/assistant/:projectId/chat
```

```bash
curl -N -X POST "https://api.holydocs.com/api/v1/assistant/$PROJECT_ID/chat" \
  -H "Content-Type: application/json" \
  -d '{"message":"How do I set up a custom domain?","visitorId":"v_anon_abc"}'
```

Path Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `projectId` | string | Project ID or project slug |

Request Body

```json
{
  "message": "How do I set up a custom domain?",
  "visitorId": "v_anon_abc123",
  "conversationId": "conv_xyz789"
}
```
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `message` | string | Yes | The user's question (1-2000 characters) |
| `visitorId` | string | Yes | Unique identifier for the visitor (used for rate limiting and conversation tracking) |
| `conversationId` | string | No | Continue an existing conversation. Omit to start a new one. |

Response

The response is an SSE (Server-Sent Events) stream. Each event is a newline-delimited `data:` line carrying a JSON payload.

```text
HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive

data: {"type":"start","conversationId":"conv_xyz789","messageId":"msg_abc123"}

data: {"type":"tool_call","tool":"search_docs","input":{"query":"custom domain setup"}}

data: {"type":"tool_result","tool":"search_docs","results":3}

data: {"type":"tool_call","tool":"get_page","input":{"path":"/custom-domains"}}

data: {"type":"tool_result","tool":"get_page","title":"Custom Domains"}

data: {"type":"text","content":"To set up a custom domain"}

data: {"type":"text","content":" for your HolyDocs project, follow"}

data: {"type":"text","content":" these steps:\n\n1. Go to **Settings"}

data: {"type":"text","content":" > Domain** in your dashboard\n2."}

data: {"type":"text","content":" Enter your domain (e.g., `docs.yourcompany.com`)\n3."}

data: {"type":"text","content":" Add a CNAME record pointing to `proxy.holydocs.com`\n4."}

data: {"type":"text","content":" Wait for DNS propagation and SSL provisioning\n\n"}

data: {"type":"citations","sources":[{"path":"/custom-domains","title":"Custom Domains"},{"path":"/api/domains","title":"Domains API"}]}

data: {"type":"done","tokensUsed":342}
```

SSE Event Types

| Type | Description | Fields |
| --- | --- | --- |
| `start` | Stream opened, conversation initialized | `conversationId`, `messageId` |
| `tool_call` | Assistant is invoking a tool | `tool`, `input` |
| `tool_result` | Tool returned results | `tool`, summary fields |
| `text` | Content token(s) | `content` (text fragment) |
| `citations` | Source pages the answer is grounded in | `sources` (array of `{path, title}`) |
| `done` | Stream complete | `tokensUsed` |
| `error` | An error occurred | `code`, `message` |
```javascript
async function askAssistant(projectId, message, visitorId) {
  const response = await fetch(
    `https://api.holydocs.com/api/v1/assistant/${projectId}/chat`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ message, visitorId })
    }
  );

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  let fullText = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // An event may be split across chunks: keep the trailing partial
    // line in the buffer and only process complete lines.
    const lines = buffer.split('\n');
    buffer = lines.pop();

    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;
      const event = JSON.parse(line.slice(6));
      switch (event.type) {
        case 'text':
          fullText += event.content;
          process.stdout.write(event.content);
          break;
        case 'citations':
          console.log('\nSources:', event.sources.map(s => s.title).join(', '));
          break;
        case 'error':
          throw new Error(event.message);
      }
    }
  }
  return fullText;
}
```

Use the `-N` flag with cURL to disable output buffering, which is required to see SSE events in real time.

Agentic RAG Flow

The assistant does not simply embed the query and return a vector match. It follows an agentic Retrieval-Augmented Generation (RAG) flow, autonomously deciding which tools to call and how many rounds of retrieval to perform before answering.

1. Query Analysis

The assistant receives the user's message along with any prior conversation history. It analyzes the query to determine what information is needed.

2. Tool Selection

Based on the query, the assistant selects one or more tools to invoke. For a simple factual question, it may call search_docs once. For a complex question spanning multiple topics, it may chain several tool calls.

3. Retrieval

Each tool call retrieves information from the documentation. search_docs uses hybrid search (keyword + semantic via RRF). get_page retrieves full page content. list_pages returns the navigation tree.

4. Synthesis

With retrieved context, the assistant synthesizes a grounded answer. It cites specific pages and avoids hallucinating information not present in the documentation.

5. Citation

The response concludes with a citations event listing all source pages, so readers can verify and explore further.
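The steps above can be sketched as a loop in JavaScript. Note that `llm` and `tools` here are hypothetical stand-ins for the model and tool implementations, not part of the HolyDocs API; this is a minimal illustration of the agentic pattern, not the actual implementation.

```javascript
// Sketch of the agentic RAG loop: the model repeatedly chooses between
// calling a tool and answering, up to a fixed tool-call budget.
// `llm` and `tools` are hypothetical interfaces for illustration.
async function agenticAnswer(question, llm, tools, maxRounds = 4) {
  const context = [];
  for (let round = 0; round < maxRounds; round++) {
    // Steps 1-2: analyze the query plus gathered context, pick an action.
    const decision = await llm.decide(question, context);
    if (decision.type === 'answer') {
      // Steps 4-5: synthesize a grounded answer and cite retrieved pages.
      return { text: decision.text, citations: context.map(c => c.path) };
    }
    // Step 3: run the requested tool and feed the result back as context.
    const result = await tools[decision.tool](decision.input);
    context.push(result);
  }
  return { text: 'Unable to answer within the tool-call budget.', citations: [] };
}
```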

Assistant Tools

The assistant has access to three tools. These are invoked automatically during the agentic loop; you do not call them directly.

search_docs

Performs hybrid search (keyword + semantic) across the documentation. Returns the top matching pages with snippets.

Input:

```json
{
  "query": "custom domain DNS setup",
  "limit": 5
}
```

Behavior: Runs both keyword and semantic search in parallel, merges results via RRF, and returns the top matches. This is the most frequently used tool, invoked in approximately 74% of conversations.
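To illustrate the merge step, here is a minimal Reciprocal Rank Fusion sketch. The function and its inputs are illustrative, and `k = 60` is the conventional RRF damping constant; the actual fusion parameters used by the service are not documented.

```javascript
// Reciprocal Rank Fusion: each result list contributes 1 / (k + rank)
// per document, and documents ranked well in both lists rise to the top.
function rrfMerge(keywordHits, semanticHits, k = 60) {
  const scores = new Map();
  for (const list of [keywordHits, semanticHits]) {
    list.forEach((path, rank) => {
      // rank is 0-based, so the top hit scores 1 / (k + 1).
      scores.set(path, (scores.get(path) || 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([path]) => path);
}
```

A page found by both keyword and semantic search (like `/b` below) outranks a page that tops only one list:

```javascript
rrfMerge(['/a', '/b', '/c'], ['/b', '/d']);
// '/b' ranks first: it appears in both lists
```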

get_page

Retrieves the full content of a specific documentation page by path. Used when the assistant needs detailed information from a page identified by search.

Input:

```json
{
  "path": "/custom-domains"
}
```

Behavior: Fetches the complete page content from KV cache or R2 storage. Returns the title, content body, and metadata. Used in approximately 50% of conversations, typically after search_docs identifies a relevant page.

list_pages

Returns the full navigation tree of the documentation site. Used when the assistant needs to understand the overall structure or find pages by section.

Input:

```json
{}
```

Behavior: Returns the complete navigation configuration including all sections, pages, and their paths. Used in approximately 16% of conversations, typically for broad "what can I do" or "where is" questions.
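If you work with a navigation tree like this on the client side, it can be flattened into path/title pairs for lookups. The `{sections, pages}` shape below is an assumption for illustration; the exact schema of the navigation configuration is not specified here.

```javascript
// Flattens a nested navigation tree (assumed {sections, pages} shape)
// into an array of [path, title] pairs via depth-first traversal.
function flattenNav(node, out = []) {
  for (const page of node.pages || []) out.push([page.path, page.title]);
  for (const section of node.sections || []) flattenNav(section, out);
  return out;
}
```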

List Conversations

Retrieve a list of AI assistant conversations for analytics and review. Requires authentication.

```bash
GET /api/v1/assistant/:projectId/conversations
```

Query Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `page` | number | No | Page number (default: 1) |
| `perPage` | number | No | Results per page (default: 20, max: 100) |
| `sortBy` | `createdAt` \| `messageCount` | No | Sort field (default: `createdAt`) |
| `order` | `asc` \| `desc` | No | Sort order (default: `desc`) |

Response

```json
{
  "data": [
    {
      "id": "conv_abc123",
      "visitorId": "v_anon_xyz",
      "messageCount": 4,
      "tokensUsed": 1240,
      "firstMessage": "How do I set up a custom domain?",
      "satisfied": true,
      "createdAt": "2026-04-10T14:23:00Z",
      "lastMessageAt": "2026-04-10T14:25:30Z"
    },
    {
      "id": "conv_def456",
      "visitorId": "v_anon_abc",
      "messageCount": 7,
      "tokensUsed": 3100,
      "firstMessage": "What authentication methods are supported?",
      "satisfied": null,
      "createdAt": "2026-04-10T12:10:00Z",
      "lastMessageAt": "2026-04-10T12:18:45Z"
    }
  ],
  "meta": {
    "total": 1240,
    "page": 1,
    "perPage": 20,
    "totalPages": 62
  }
}
```
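To export every conversation, walk the pages using the response's `meta.totalPages` field. This sketch takes an injected `fetchPage` callback so it stays transport-agnostic; in practice that callback would issue the authenticated `GET /conversations?page=N` request.

```javascript
// Collects all conversations across pages, driven by meta.totalPages.
// `fetchPage(page)` is a hypothetical callback returning one parsed
// response body of GET /conversations.
async function listAllConversations(fetchPage) {
  const all = [];
  let page = 1;
  let totalPages = 1;
  do {
    const body = await fetchPage(page);
    all.push(...body.data);
    totalPages = body.meta.totalPages;
    page++;
  } while (page <= totalPages);
  return all;
}
```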

Usage Statistics

Retrieve aggregated AI assistant usage for billing and monitoring. Requires authentication.

```bash
GET /api/v1/assistant/:projectId/usage
```

Query Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `period` | `7d` \| `30d` \| `90d` \| `billing` | No | Time period (default: `billing`, which uses the current billing cycle) |

Response

```json
{
  "data": {
    "period": "billing",
    "startDate": "2026-04-01T00:00:00Z",
    "endDate": "2026-04-30T23:59:59Z",
    "totalConversations": 1240,
    "totalMessages": 5680,
    "totalTokens": 2450000,
    "tokensLimit": 5000000,
    "tokensRemaining": 2550000,
    "usageByDay": [
      { "date": "2026-04-10", "conversations": 45, "messages": 198, "tokens": 89000 },
      { "date": "2026-04-11", "conversations": 38, "messages": 165, "tokens": 74000 }
    ],
    "model": "anthropic/claude-sonnet-4",
    "avgResponseTime": 2.3,
    "satisfactionRate": 0.87
  }
}
```
```bash
curl "https://api.holydocs.com/api/v1/assistant/proj_abc123/usage?period=billing" \
  -H "Authorization: Bearer hd_a1b2c3d4e5f67890a1b2c3d4e5f67890"
```
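A usage response like the one above is enough to drive a simple quota alert. Here is a small helper for that; the 80% warning threshold is an arbitrary choice, not something the API defines.

```javascript
// Computes token headroom from the usage response's `data` object and
// flags when consumption crosses a warning threshold (80% by default,
// an arbitrary choice for this sketch).
function tokenHeadroom(usage, warnAt = 0.8) {
  const usedFraction = usage.totalTokens / usage.tokensLimit;
  return {
    usedFraction,
    remaining: usage.tokensLimit - usage.totalTokens,
    warn: usedFraction >= warnAt
  };
}
```

With the example numbers above (2,450,000 of 5,000,000 tokens), this reports 49% used with 2,550,000 tokens remaining and no warning.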

Token Limits by Plan

| Plan | Monthly Token Limit | Model |
| --- | --- | --- |
| Free | 100,000 | `anthropic/claude-sonnet-4` |
| Starter | 500,000 | `anthropic/claude-sonnet-4` |
| Pro | 2,000,000 | `anthropic/claude-sonnet-4` |
| Business | 5,000,000 | `anthropic/claude-sonnet-4` |
| Enterprise | Custom | Configurable |

Token usage is metered on the LLM's input and output tokens combined. Tool calls (search, page retrieval) do not count against your token limit, but the content injected into the LLM context from those tool results does.

Embedding the Assistant

You can embed the AI assistant in your own application using the public chat endpoint:

1. Generate a visitor ID

Create a unique visitor ID for each user session. This can be a UUID, a hash of the user's session, or any unique string. It is used for rate limiting and conversation continuity.

```javascript
const visitorId = `v_${crypto.randomUUID()}`;
```
2. Open a chat stream

POST to the chat endpoint with the visitor's message. Parse the SSE stream to display tokens in real time.

3. Continue the conversation

Pass the conversationId from the start event in subsequent requests to maintain context across multiple turns.

```javascript
let conversationId = null;

async function sendMessage(message) {
  const response = await fetch(
    `https://api.holydocs.com/api/v1/assistant/${projectId}/chat`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ message, visitorId, conversationId })
    }
  );
  // Parse SSE stream, capture conversationId from 'start' event
}
```
4. Display citations

When you receive the citations event, render links to the source pages so readers can verify the answer.
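One way to render a `citations` event is as a list of links rooted at your docs site. The base URL below is a placeholder for your own documentation domain.

```javascript
// Renders the `sources` array from a citations event as markdown links.
// `baseUrl` is a placeholder; substitute your documentation site's URL.
function renderCitations(sources, baseUrl = 'https://docs.example.com') {
  return sources
    .map(s => `[${s.title}](${baseUrl}${s.path})`)
    .join(' · ');
}
```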

Rate Limits

| Endpoint | Authentication | Rate Limit |
| --- | --- | --- |
| `POST /chat` | Public | 10 messages/minute per `visitorId` |
| `GET /conversations` | Required | Plan-based (see Authentication) |
| `GET /usage` | Required | Plan-based |

Error Codes

| Code | Status | Description |
| --- | --- | --- |
| `NOT_FOUND` | 404 | Project not found or AI assistant is not enabled |
| `VALIDATION_ERROR` | 400 | Invalid request body (missing `message`, `visitorId` too long) |
| `LIMIT_EXCEEDED` | 429 | Rate limit or monthly token limit exceeded |
| `AI_UNAVAILABLE` | 503 | LLM provider is temporarily unavailable |
| `CONTENT_FILTERED` | 400 | Message was flagged by content safety filters |
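Since `LIMIT_EXCEEDED` covers the per-visitor rate limit, a client can retry such failures with exponential backoff. This is a sketch under the assumption that your `send` function rejects with an error object carrying the `code` field from the response; delays and retry counts are illustrative.

```javascript
// Retries `send` on LIMIT_EXCEEDED with exponential backoff
// (baseDelayMs, 2x baseDelayMs, 4x baseDelayMs, ...). Any other error,
// or exhausting the retry budget, is rethrown to the caller.
async function sendWithBackoff(send, maxRetries = 3, baseDelayMs = 1000) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await send();
    } catch (err) {
      if (err.code !== 'LIMIT_EXCEEDED' || attempt >= maxRetries) throw err;
      await new Promise(r => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
}
```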