RetrievalService
RetrievalService is the primary interface for knowledge retrieval in the RAG pipeline. It wraps GeminiFileSearchStores.search() with store name resolution, LRU caching (both search results and store names), and structured source extraction from Gemini's grounding metadata.
Import
import { RetrievalService } from "@modernpath/agent-framework";
Constructor
constructor(
  fileSearchStores: GeminiFileSearchStores,
  opts?: {
    searchCacheSize?: number; // default: 100
    storeCacheSize?: number;  // default: 10
    defaultTTLms?: number;    // default: 300000 (5 min)
  },
)
Constructor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| fileSearchStores | GeminiFileSearchStores | required | The File Search Stores client, obtained via client.fileSearchStores(). |
| opts.searchCacheSize | number | 100 | Maximum number of cached search results (LRU eviction). |
| opts.storeCacheSize | number | 10 | Maximum number of cached store name resolutions. |
| opts.defaultTTLms | number | 300000 | Default TTL for search result cache entries (5 minutes). Store name cache uses a fixed 1-hour TTL. |
Creating a retrieval service
import { GeminiClient, RetrievalService } from "@modernpath/agent-framework";
const client = new GeminiClient({ apiKey: process.env.GEMINI_API_KEY! });
const stores = client.fileSearchStores();
const retrieval = new RetrievalService(stores, {
  searchCacheSize: 200,
  defaultTTLms: 600_000, // 10 minutes
});
Methods
search()
Execute a semantic search query against a File Search Store.
async search(
  query: string,
  storeDisplayName: string,
  options?: KnowledgeSearchOptions,
): Promise<KnowledgeAnswer>
| Parameter | Type | Description |
|---|---|---|
| query | string | Natural language search query. |
| storeDisplayName | string | Store display name, short name, or fully-qualified resource name. |
| options | KnowledgeSearchOptions | Optional search parameters (see below). |
Behavior:
- Resolves storeDisplayName to a fully-qualified store name (with caching).
- Checks the search cache for an identical query + store + options combination.
- If not cached, calls GeminiFileSearchStores.search().
- Extracts structured KnowledgeSource[] from the grounding metadata.
- Caches the result and returns the KnowledgeAnswer.
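The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the service's actual implementation: `StoreBackend`, `RawResult`, `searchWithCache`, and the plain `Map` cache are hypothetical names, the cache key follows the format given in the Caching Strategy table, and TTL expiry and LRU eviction are left out for brevity.

```typescript
// Hypothetical types for illustration; the real framework types may differ.
interface Source { documentName?: string; chunk?: string }
interface Answer { text: string; sources: Source[] }
interface RawResult { text: string; chunks: { title?: string; text?: string }[] }

// Hypothetical stand-in for the underlying store client.
interface StoreBackend {
  resolve(displayName: string): Promise<string>;
  search(storeName: string, query: string, topK?: number): Promise<RawResult>;
}

const searchCache = new Map<string, Answer>();

async function searchWithCache(
  backend: StoreBackend,
  query: string,
  storeDisplayName: string,
  options: { topK?: number; metadataFilter?: string } = {},
): Promise<Answer> {
  // 1. Resolve the display name to a fully-qualified store name.
  const storeName = await backend.resolve(storeDisplayName);

  // 2. Look for an identical store + query + options combination.
  const key = `${storeName}:${query}:${options.metadataFilter ?? ""}:${options.topK ?? ""}`;
  const cached = searchCache.get(key);
  if (cached) return cached;

  // 3. Cache miss: call the underlying store search.
  const raw = await backend.search(storeName, query, options.topK);

  // 4. Convert the raw chunks into structured sources.
  const sources = raw.chunks.map((c) => ({ documentName: c.title, chunk: c.text }));

  // 5. Cache the answer and return it.
  const answer: Answer = { text: raw.text, sources };
  searchCache.set(key, answer);
  return answer;
}
```

A second call with the same query, store, and options returns the cached answer without reaching the backend.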
Basic search
const answer = await retrieval.search(
  "What is the procedure for handling cold complaints?",
  "hvac-knowledge-base",
);
console.log(answer.text);
console.log(`Found ${answer.sources.length} sources`);
for (const source of answer.sources) {
  console.log(` - ${source.documentName} (score: ${source.score})`);
}
Search with options
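A minimal sketch of building an options object, assuming the property names listed in the KnowledgeSearchOptions table in the Types section. The local interface only mirrors that table so the snippet type-checks on its own; in a real project the type would come from the framework. The filter expression and instruction text are hypothetical values.

```typescript
// Local mirror of the KnowledgeSearchOptions table on this page (assumption:
// in a real project this type is provided by the framework).
interface KnowledgeSearchOptions {
  topK?: number;
  metadataFilter?: string;
  cacheTTLms?: number;
  systemInstruction?: string;
  maxOutputTokens?: number;
}

const options: KnowledgeSearchOptions = {
  topK: 5,
  // Hypothetical metadata key and value; use the keys attached to your documents.
  metadataFilter: 'category = "maintenance"',
  // Cache this result for 1 minute instead of the service default.
  cacheTTLms: 60_000,
  systemInstruction: "Answer in two sentences and name the procedure.",
  maxOutputTokens: 512,
};

// With the `retrieval` instance from the constructor example:
// const answer = await retrieval.search(
//   "What is the procedure for handling cold complaints?",
//   "hvac-knowledge-base",
//   options,
// );
```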
resolveStoreName()
Resolve a store display name or short name to a fully-qualified resource name.
Resolution order:
- Check the store name cache.
- List all stores and match by displayName.
- Match by exact name.
- Match by pattern fileSearchStores/{storeDisplayName}.
- Fall back to fileSearchStores/{storeDisplayName} if no match found.
Results are cached for 1 hour.
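The resolution order can be illustrated with a small standalone function. This is a sketch under assumptions rather than the service's code: `StoreInfo` and `resolveFromList` are hypothetical names, the store list is passed in instead of being fetched from the API, and the 1-hour cache expiry is omitted.

```typescript
// Hypothetical store shape: `name` is the resource name, `displayName` the label.
interface StoreInfo { name: string; displayName?: string }

const storeNameCache = new Map<string, string>();

function resolveFromList(input: string, stores: StoreInfo[]): string {
  // 1. Return a previously resolved name if one is cached.
  const cached = storeNameCache.get(input);
  if (cached !== undefined) return cached;

  const prefixed = `fileSearchStores/${input}`;
  const match =
    stores.find((s) => s.displayName === input) ?? // 2. match by display name
    stores.find((s) => s.name === input) ??        // 3. match by exact resource name
    stores.find((s) => s.name === prefixed);       // 4. match by prefixed short name

  // 5. Fall back to the prefixed form when nothing matched.
  const resolved = match?.name ?? prefixed;
  storeNameCache.set(input, resolved);
  return resolved;
}
```

Because the fallback always yields a name, an unknown store only surfaces as an error when the later search call uses it.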
Flexible naming
You can pass any of these forms to search() or resolveStoreName():
- Display name: "My HVAC Knowledge Base"
- Short name: "hvac-kb"
- Fully-qualified: "fileSearchStores/abc123def"
The service will resolve all forms correctly.
Types
KnowledgeSearchOptions
| Property | Type | Default | Description |
|---|---|---|---|
| topK | number | -- | Maximum number of retrieval chunks to return. |
| metadataFilter | string | -- | Metadata filter expression (Gemini filter syntax). |
| cacheTTLms | number | defaultTTLms | Override cache TTL for this specific search. |
| systemInstruction | string | -- | System instruction for the file-search LLM call. |
| maxOutputTokens | number | 256 | Max output tokens for the synthesized answer. |
KnowledgeAnswer
The result of a retrieval search.
interface KnowledgeAnswer {
  text: string;               // Synthesized answer from the model
  sources: KnowledgeSource[]; // Extracted source references
  groundingMetadata?: any;    // Raw Gemini grounding metadata
}
KnowledgeSource
A single source reference extracted from grounding metadata.
interface KnowledgeSource {
  documentId?: string;   // Document resource ID
  documentName?: string; // Document display name (title)
  uri?: string;          // Source URI
  chunk?: string;        // Retrieved text chunk content
  score?: number;        // Relevance score
  storeName?: string;    // File Search Store name (e.g. "fileSearchStores/xyz")
}
Source extraction
Sources are extracted from Gemini's groundingMetadata.groundingChunks array. The service handles multiple response formats:
- retrievedContext (File Search / RAG)
- web (Google Search grounding)
- Legacy chunk / chunkData structures
All chunks are preserved, including duplicates from the same document, since each chunk contains different retrieved content.
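A rough sketch of this extraction for the first two formats is shown below. The chunk field names (`title`, `uri`, `text`) are assumptions about the grounding payload, `extractSources` is a hypothetical helper name, and the legacy chunk / chunkData shapes are not covered because their layout is not described on this page.

```typescript
// Subset of the KnowledgeSource interface from this page.
interface KnowledgeSource { documentName?: string; uri?: string; chunk?: string }

// Assumed chunk shapes for the two documented grounding formats.
interface GroundingChunk {
  retrievedContext?: { title?: string; uri?: string; text?: string };
  web?: { title?: string; uri?: string };
}

function extractSources(
  groundingMetadata?: { groundingChunks?: GroundingChunk[] },
): KnowledgeSource[] {
  const chunks = groundingMetadata?.groundingChunks ?? [];
  const sources: KnowledgeSource[] = [];
  for (const c of chunks) {
    if (c.retrievedContext) {
      // File Search / RAG chunk: carries the retrieved text.
      const { title, uri, text } = c.retrievedContext;
      sources.push({ documentName: title, uri, chunk: text });
    } else if (c.web) {
      // Google Search grounding chunk: title and URI only.
      sources.push({ documentName: c.web.title, uri: c.web.uri });
    }
    // Legacy chunk / chunkData shapes are intentionally skipped in this sketch.
  }
  // No de-duplication: two chunks from one document stay as two sources.
  return sources;
}
```

Callers that want one entry per document can group the returned array by documentName themselves.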
Caching Strategy
The service implements two separate LRU caches:
| Cache | Key | TTL | Size | Purpose |
|---|---|---|---|---|
| Search cache | {storeName}:{query}:{metadataFilter}:{topK} | 5 min (configurable) | 100 (configurable) | Avoid redundant API calls for identical queries. |
| Store name cache | {storeDisplayName} | 1 hour (fixed) | 10 (configurable) | Avoid repeated store listing API calls. |
When a cache is full, its least recently used entry is evicted.
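An LRU cache with per-entry TTL can be sketched with a plain `Map`, which iterates keys in insertion order. This is one possible approach, not necessarily how the service implements its caches: `TtlLruCache` and `searchCacheKey` are hypothetical names, and rendering a missing filter or topK as an empty string is an assumption.

```typescript
// Minimal LRU cache with per-entry TTL, built on Map's insertion order.
class TtlLruCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxSize: number;
  private readonly defaultTTLms: number;

  constructor(maxSize: number, defaultTTLms: number) {
    this.maxSize = maxSize;
    this.defaultTTLms = defaultTTLms;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(key); // expired entries count as misses
      return undefined;
    }
    // Re-insert so the key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V, ttlMs: number = this.defaultTTLms, now: number = Date.now()): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: now + ttlMs });
  }
}

// Key format from the table above.
function searchCacheKey(storeName: string, query: string, metadataFilter = "", topK?: number): string {
  return `${storeName}:${query}:${metadataFilter}:${topK ?? ""}`;
}
```

The optional `ttlMs` argument plays the role of the per-search cacheTTLms override, and the `now` parameter exists only to make expiry easy to test.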
Debug Logging
Enable with:
Logs include:
- Store name resolution steps and cache hits/misses
- Search API call timing and response sizes
- Source extraction details
- Cache key generation
Related Pages
- Knowledge Base Overview -- full RAG pipeline
- Ingestion -- uploading documents into stores
- Canonical Grounding -- resolving full documents from search sources
- Grounding Policy -- controlling what reaches the LLM
- GeminiFileSearchStores -- underlying store API