AI Search

HolyDocs provides hybrid AI-powered search combining vector similarity and keyword matching for fast, accurate documentation search.

Overview

HolyDocs includes a built-in search system that combines traditional keyword matching with AI-powered semantic search. This hybrid approach ensures readers find what they need, whether they search using exact terms or natural language questions.

Readers access search by pressing Cmd+K (Mac) or Ctrl+K (Windows), or by clicking the search icon in the header. The search modal provides:

Instant results as you type
Highlighted matching text in results
Keyboard navigation (arrow keys + Enter)
Page title, description, and section headings in results

How Hybrid Search Works

Keyword Search

The keyword search engine runs entirely on the edge using a pre-built search index:

During each deployment, page titles, headings, and content are indexed
The index is stored on the edge as JSON for instant lookup
Queries are scored against titles (highest weight), headings, and body content
Results are ranked by relevance score and returned instantly
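The weighting described above can be sketched in a few lines. This is a minimal illustration, not the actual HolyDocs scorer: the numeric weights, the `Page` fields, and the `score_page` and `keyword_search` functions are all assumptions made up for the example. Only the ordering (titles above headings above body) comes from the description above.

```python
from dataclasses import dataclass, field

# Hypothetical field weights: titles highest, then headings, then body.
WEIGHTS = {"title": 3.0, "headings": 2.0, "body": 1.0}


@dataclass
class Page:
    path: str
    title: str
    headings: list[str] = field(default_factory=list)
    body: str = ""


def score_page(page: Page, query: str) -> float:
    """Add the field weight for every field in which a query term occurs."""
    fields = {
        "title": page.title.lower(),
        "headings": " ".join(page.headings).lower(),
        "body": page.body.lower(),
    }
    score = 0.0
    for term in query.lower().split():
        for name, text in fields.items():
            if term in text:
                score += WEIGHTS[name]
    return score


def keyword_search(pages: list[Page], query: str, limit: int = 10) -> list[tuple[str, float]]:
    """Return (path, score) pairs for matching pages, best first."""
    scored = [(p.path, score_page(p, query)) for p in pages]
    matches = [item for item in scored if item[1] > 0]
    return sorted(matches, key=lambda item: item[1], reverse=True)[:limit]
```

With these assumed weights, a page whose title contains both query terms outranks a page that only mentions one term in its body.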

Semantic Search

Semantic search uses vector embeddings to understand the meaning behind a query:

The search query is embedded into a 1536-dimensional vector using text-embedding-3-small
The vector is compared against your documentation embeddings in our managed vector index
The most similar content chunks are returned, even if they do not contain the exact query terms
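The comparison step can be illustrated with cosine similarity, a common choice for comparing embeddings. The text above only says "vector similarity", so the metric here is an assumption, as are the function names. The sketch uses tiny hand-written vectors instead of real 1536-dimensional embeddings and does not call any embedding API.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_chunks(query_vec: list[float], chunks: dict[str, list[float]], top_k: int = 3) -> list[str]:
    """Return the IDs of the top_k chunks most similar to the query vector."""
    ranked = sorted(
        chunks,
        key=lambda chunk_id: cosine_similarity(query_vec, chunks[chunk_id]),
        reverse=True,
    )
    return ranked[:top_k]
```

A managed vector index would typically use an approximate nearest-neighbor structure rather than this brute-force scan, but the ranking idea is the same.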

Reciprocal Rank Fusion (RRF)

The keyword and semantic results are merged using Reciprocal Rank Fusion with k=60. This ensures that:

Pages ranked highly by both methods appear at the top
Semantic results are not drowned out by exact keyword matches
Keyword results still surface when a query uses precise technical terminology
The final ranking benefits from both approaches without requiring manual weight tuning

The fused results are returned to the search modal in under 100 ms for most documentation sites.
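Reciprocal Rank Fusion gives each result a score of 1 / (k + rank) in every list where it appears and sums those scores across lists. The following is a minimal sketch of that standard formula with k=60 and 1-based ranks. The function name and the sample page paths are illustrative, not part of the HolyDocs API.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of IDs into one list of (id, score), best first."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        # Ranks are 1-based: the top result in a list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


keyword_results = ["/custom-domains", "/ssl", "/dns"]
semantic_results = ["/dns", "/custom-domains", "/deploy"]
fused = reciprocal_rank_fusion([keyword_results, semantic_results])
```

In this example `/custom-domains` (ranked 1st and 2nd) and `/dns` (ranked 3rd and 1st) land above the pages that appear in only one list. Because only ranks are used, the raw keyword scores and similarity values never need to be put on a common scale, which is why no manual weight tuning is required.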

Configuration

Customize the search experience in docs.json:

json
{
  "search": {
    "prompt": "Search our documentation...",
    "hotkey": "k",
    "placeholder": "Search documentation..."
  }
}

Search Options

Option	Type	Default	Description
`prompt`	string	—	Custom prompt text for the search modal
`hotkey`	string	`"k"`	Keyboard shortcut key (used with Cmd/Ctrl)
`placeholder`	string	`"Search documentation..."`	Placeholder text in the search input

Search API

Use the search API for custom search integrations:

Keyword Search

bash
curl "https://api.holydocs.com/api/v1/docs/PROJECT_ID/search?q=custom+domains&limit=10"

Response:

json
{
  "data": {
    "results": [
      {
        "title": "Custom Domains",
        "path": "/custom-domains",
        "description": "Configure a custom domain for your documentation site",
        "score": 95,
        "highlights": ["...configure a <mark>custom domain</mark> with automatic SSL..."]
      }
    ]
  }
}
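For a custom integration, the request URL and the response shape shown above can be handled with the Python standard library alone. This is a sketch under assumptions: the helper names `build_search_url` and `parse_results` are hypothetical, and any authentication headers your project requires are not covered here.

```python
import json
from urllib.parse import urlencode

API_BASE = "https://api.holydocs.com/api/v1/docs"


def build_search_url(project_id: str, query: str, limit: int = 10) -> str:
    """Build the keyword search URL; urlencode turns spaces into '+'."""
    return f"{API_BASE}/{project_id}/search?" + urlencode({"q": query, "limit": limit})


def parse_results(body: str) -> list[dict]:
    """Extract title, path, and score from a search response body."""
    payload = json.loads(body)
    return [
        {"title": r["title"], "path": r["path"], "score": r["score"]}
        for r in payload.get("data", {}).get("results", [])
    ]
```

The URL returned by `build_search_url("PROJECT_ID", "custom domains")` matches the curl example above and can be fetched with any HTTP client before passing the body to `parse_results`.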

Semantic Search

bash
curl "https://api.holydocs.com/api/v1/docs/PROJECT_ID/search/semantic?q=how+do+I+add+my+own+domain"

Semantic search matches on intent: the query "how do I add my own domain" matches the "Custom Domains" page even if those exact words do not appear in its content.

Content Indexing

Search indices are built automatically after each production deployment:

Page Processing

Each MDX page is processed to extract the title, description, headings, and body text.

Chunking

Content is split into semantic chunks optimized for embedding. Each chunk preserves context about which page and section it belongs to.
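One simple way to keep page and section context with each chunk is to split on headings. The sketch below assumes Markdown-style `##` and `###` headings and a hypothetical `chunk_page` function. The actual HolyDocs chunker is not described here and may also split by length or other signals.

```python
def chunk_page(path: str, title: str, text: str) -> list[dict]:
    """Split page text at h2/h3 headings, tagging each chunk with page and section."""
    chunks: list[dict] = []
    section = title  # text before the first heading belongs to the page itself
    buffer: list[str] = []

    def flush() -> None:
        # Emit the buffered lines as one chunk, skipping empty sections.
        content = "\n".join(buffer).strip()
        if content:
            chunks.append(
                {"page": path, "section": section, "index": len(chunks), "text": content}
            )
        buffer.clear()

    for line in text.splitlines():
        if line.startswith(("## ", "### ")):
            flush()
            section = line.lstrip("#").strip()
        else:
            buffer.append(line)
    flush()
    return chunks
```

Each chunk carries its page path, section name, and position, which is the context an embedding or a result snippet would need to point back to the right place.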

Embedding

Chunks are converted to vector embeddings via OpenAI's text-embedding-3-small model (1536 dimensions).

Indexing

Embeddings are upserted into our managed vector index with deterministic IDs (projectId:pagePath:chunkIndex) for efficient updates.

Keyword Index

A separate keyword search index is built and stored on the edge for instant text-based search.

Content indexing is differential: only chunks whose SHA-256 checksum has changed since the last indexing run are re-embedded. This keeps indexing fast even for large documentation sites.
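The combination of deterministic IDs and checksums can be sketched as follows. The ID format `projectId:pagePath:chunkIndex` and the use of SHA-256 come from the description above. The function names, the shape of the stored checksum map, and the handling of stale IDs are assumptions for illustration.

```python
import hashlib


def chunk_id(project_id: str, page_path: str, chunk_index: int) -> str:
    """Deterministic ID in the projectId:pagePath:chunkIndex format."""
    return f"{project_id}:{page_path}:{chunk_index}"


def checksum(text: str) -> str:
    """SHA-256 hex digest of the chunk text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_index_update(
    previous: dict[str, str], current_chunks: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Decide which chunks to re-embed and which stale IDs to drop.

    previous maps chunk ID -> checksum from the last run (hypothetical store);
    current_chunks maps chunk ID -> chunk text from this deployment.
    """
    to_embed = [
        cid for cid, text in current_chunks.items() if previous.get(cid) != checksum(text)
    ]
    # Assumption: IDs that no longer exist are removed from the vector index.
    to_delete = [cid for cid in previous if cid not in current_chunks]
    return to_embed, to_delete
```

Because the ID is stable across deployments, an unchanged chunk maps to the same ID and checksum and is skipped, while an edited chunk overwrites its previous embedding through an upsert.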

Search with Auth-Protected Content

If your documentation uses content authentication, search results respect access controls:

Public pages appear in search for all users
Protected pages only appear for authenticated users with the appropriate group membership
The search API respects the same JWT-based auth used for page access

Auth-enabled projects skip search result caching entirely to prevent protected content from leaking to unauthorized users.
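The access rule above amounts to filtering results by group membership before they are returned. The sketch below is illustrative only: the `groups` field on a result, the `visible_results` function, and the rule that membership in any one listed group is sufficient are all assumptions. It also assumes the caller's groups were already extracted from a verified JWT.

```python
from typing import Optional


def visible_results(results: list[dict], user_groups: Optional[set[str]]) -> list[dict]:
    """Keep public results, plus protected ones the user's groups allow.

    user_groups is None for anonymous readers.
    """
    visible = []
    for result in results:
        required = set(result.get("groups", []))
        if not required:
            visible.append(result)  # public page: visible to everyone
        elif user_groups is not None and required & user_groups:
            visible.append(result)  # protected page: user shares a group
    return visible
```

Filtering on the server for every request, instead of serving a shared cached response, is what keeps protected pages out of results for readers who cannot open them.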

Tuning Search Quality

If search results are not meeting expectations, use these techniques to improve quality:

Write descriptive page titles and descriptions

The keyword index weights page titles highest. A clear, descriptive title like "Configuring Custom Domains with SSL" will rank better than "Domains" for relevant queries. The description frontmatter field is also indexed and influences rankings.

Use section headings strategically

Headings (h2, h3) are indexed with higher weight than body text. Structure your pages with clear, descriptive headings that match the terms readers are likely to search for.

Add frontmatter keywords

You can add a keywords field to your page frontmatter to boost discoverability for specific terms:

yaml
---
title: Authentication
description: Set up API key authentication
keywords: ["api key", "bearer token", "auth header", "credentials"]
---

Check zero-result queries

The analytics dashboard shows queries that returned no results. These are direct signals of content gaps or terminology mismatches. Either create content for those queries or add the terms as keywords to existing pages.

Keep content focused per page

Pages that cover too many topics produce noisy embeddings. If a page is a catch-all, consider splitting it into focused pages, each of which will produce cleaner embeddings and better search results.

Search Analytics

Monitor search performance from the Analytics > Search tab in the dashboard:

Metric	Description	Action
Top queries	Most frequently searched terms	Ensure top queries lead to high-quality results
Zero-result queries	Searches with no results	Create content or add keywords for these terms
Click-through rate	% of searches where a result was clicked	A low CTR suggests results are not relevant; review ranking
Time to first click	How long readers take to click a result	A long time suggests too many results or poor ranking
Queries per session	Average searches per reader visit	High values may indicate navigation issues

Set up a weekly routine to review zero-result queries. Each one represents a reader who could not find what they needed, so addressing them systematically improves documentation quality over time.
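If you export search events for your own review, two of the metrics above are straightforward to compute. The sketch assumes a hypothetical event format with `query`, `result_count`, and `clicked` fields. The dashboard's actual export format is not described here.

```python
from collections import Counter


def summarize_searches(events: list[dict]) -> dict:
    """Compute click-through rate and zero-result queries from search events."""
    total = len(events)
    clicked = sum(1 for e in events if e["clicked"])
    # Normalize case and whitespace so "SSO" and "sso " count as one query.
    zero = Counter(
        e["query"].strip().lower() for e in events if e["result_count"] == 0
    )
    return {
        "click_through_rate": clicked / total if total else 0.0,
        "zero_result_queries": zero.most_common(),
    }
```

Sorting zero-result queries by frequency gives a natural priority order for the weekly review.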
