RetrievalService

RetrievalService is the primary interface for knowledge retrieval in the RAG pipeline. It wraps GeminiFileSearchStores.search() with store name resolution, LRU caching (both search results and store names), and structured source extraction from Gemini's grounding metadata.

Import

import { RetrievalService } from "@modernpath/agent-framework";

Constructor

constructor(
  fileSearchStores: GeminiFileSearchStores,
  opts?: {
    searchCacheSize?: number;  // default: 100
    storeCacheSize?: number;   // default: 10
    defaultTTLms?: number;     // default: 300000 (5 min)
  },
)

Constructor Parameters

Parameter	Type	Default	Description
`fileSearchStores`	`GeminiFileSearchStores`	required	The File Search Stores client, obtained via `client.fileSearchStores()`.
`opts.searchCacheSize`	`number`	`100`	Maximum number of cached search results (LRU eviction).
`opts.storeCacheSize`	`number`	`10`	Maximum number of cached store name resolutions.
`opts.defaultTTLms`	`number`	`300000`	Default TTL for search result cache entries (5 minutes). Store name cache uses a fixed 1-hour TTL.

Creating a retrieval service

import { GeminiClient, RetrievalService } from "@modernpath/agent-framework";

const client = new GeminiClient({ apiKey: process.env.GEMINI_API_KEY! });
const stores = client.fileSearchStores();

const retrieval = new RetrievalService(stores, {
  searchCacheSize: 200,
  defaultTTLms: 600_000, // 10 minutes
});

Methods

`search()`

Execute a semantic search query against a File Search Store.

async search(
  query: string,
  storeDisplayName: string,
  options?: KnowledgeSearchOptions,
): Promise<KnowledgeAnswer>

Parameter	Type	Description
`query`	`string`	Natural language search query.
`storeDisplayName`	`string`	Store display name, short name, or fully-qualified resource name.
`options`	`KnowledgeSearchOptions`	Optional search parameters (see below).

Behavior:

1. Resolves storeDisplayName to a fully-qualified store name (with caching).
2. Checks the search cache for an identical query + store + options combination.
3. If not cached, calls GeminiFileSearchStores.search().
4. Extracts structured KnowledgeSource[] from the grounding metadata.
5. Caches the result and returns the KnowledgeAnswer.
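The steps above follow a cache-aside pattern. The following is a minimal sketch of that flow, not the framework's implementation: `makeCachedSearch`, `SketchAnswer`, `StoreResolver`, and `SearchBackend` are hypothetical names, the cache key is simplified to store plus query, and source extraction is assumed to happen inside the backend callback.

```typescript
// Hypothetical, simplified shapes; the real types live in the framework.
interface SketchAnswer {
  text: string;
  sources: { documentName?: string; chunk?: string }[];
}

type StoreResolver = (displayName: string) => Promise<string>;
type SearchBackend = (storeName: string, query: string) => Promise<SketchAnswer>;

// Wraps a backend search function with name resolution and a TTL cache.
function makeCachedSearch(
  resolve: StoreResolver,
  backend: SearchBackend,
  ttlMs = 300_000, // mirrors the documented 5-minute default
) {
  const cache = new Map<string, { value: SketchAnswer; expiresAt: number }>();

  return async function search(
    query: string,
    storeDisplayName: string,
  ): Promise<SketchAnswer> {
    // Step 1: resolve the display name to a fully-qualified store name.
    const storeName = await resolve(storeDisplayName);

    // Step 2: return an unexpired cached entry if one exists.
    const key = `${storeName}:${query}`;
    const hit = cache.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.value;

    // Steps 3-4: call the backend (assumed to return extracted sources).
    const answer = await backend(storeName, query);

    // Step 5: cache the result and return it.
    cache.set(key, { value: answer, expiresAt: Date.now() + ttlMs });
    return answer;
  };
}
```

In this sketch, a repeated call with the same query and store within the TTL returns the cached object without invoking the backend again.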

Basic search

const answer = await retrieval.search(
  "What is the procedure for handling cold complaints?",
  "hvac-knowledge-base",
);

console.log(answer.text);
console.log(`Found ${answer.sources.length} sources`);

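for (const source of answer.sources) {
  // documentName and score are optional on KnowledgeSource, so fall back to placeholders.
  console.log(`  - ${source.documentName ?? "(unnamed)"} (score: ${source.score ?? "n/a"})`);
}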

Search with options

const answer = await retrieval.search(
  "escalation procedure for unresolved temperature issues",
  "hvac-knowledge-base",
  {
    topK: 10,
    metadataFilter: "source_type = 's3'",
    cacheTTLms: 60_000, // cache for 1 minute
    systemInstruction: "Answer concisely in one paragraph.",
    maxOutputTokens: 512,
  },
);

`resolveStoreName()`

Resolve a store display name or short name to a fully-qualified resource name.

async resolveStoreName(storeDisplayName: string): Promise<string>

Resolution order:

1. Check the store name cache.
2. List all stores and match by displayName.
3. Match by exact resource name.
4. Match by the pattern fileSearchStores/{storeDisplayName}.
5. Fall back to fileSearchStores/{storeDisplayName} if no match is found.

Results are cached for 1 hour.
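The matching steps in the resolution order can be sketched as a pure function over an already-fetched store list. This is an illustration under stated assumptions: `StoreRecord` and `matchStoreName` are hypothetical names, the store name cache (the first step) is omitted, and the fallback is applied literally as documented.

```typescript
// Hypothetical minimal store record; only the two fields the steps mention.
interface StoreRecord {
  name: string;         // e.g. "fileSearchStores/abc123def"
  displayName?: string; // e.g. "My HVAC Knowledge Base"
}

const STORE_PREFIX = "fileSearchStores/";

// Applies the documented matching order to a list of stores.
function matchStoreName(input: string, stores: StoreRecord[]): string {
  // Match by display name.
  const byDisplayName = stores.find((s) => s.displayName === input);
  if (byDisplayName) return byDisplayName.name;

  // Match by exact resource name.
  const byExactName = stores.find((s) => s.name === input);
  if (byExactName) return byExactName.name;

  // Match by the prefixed pattern.
  const prefixed = `${STORE_PREFIX}${input}`;
  const byPrefixedName = stores.find((s) => s.name === prefixed);
  if (byPrefixedName) return byPrefixedName.name;

  // Fallback: nothing matched, so return the prefixed form unverified.
  return prefixed;
}
```

This page does not say whether the real service guards against prefixing an input that is already fully qualified but absent from the store list; the sketch does not.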

Flexible naming

You can pass any of these forms to search() or resolveStoreName():

Display name: "My HVAC Knowledge Base"
Short name: "hvac-kb"
Fully-qualified: "fileSearchStores/abc123def"

Each form is resolved to a fully-qualified resource name using the resolution order above.

Types

KnowledgeSearchOptions

Property	Type	Default	Description
`topK`	`number`	--	Maximum number of retrieval chunks to return.
`metadataFilter`	`string`	--	Metadata filter expression (Gemini filter syntax).
`cacheTTLms`	`number`	`defaultTTLms`	Override cache TTL for this specific search.
`systemInstruction`	`string`	--	System instruction for the file-search LLM call.
`maxOutputTokens`	`number`	`256`	Max output tokens for the synthesized answer.
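For orientation, the table corresponds roughly to the interface below, together with one way the two documented defaults could be applied. The names `KnowledgeSearchOptionsSketch` and `applyDefaults` are hypothetical; the framework's actual declaration and default handling may differ.

```typescript
// Hypothetical mirror of the options table; every property is optional.
interface KnowledgeSearchOptionsSketch {
  topK?: number;              // max retrieval chunks
  metadataFilter?: string;    // Gemini filter expression
  cacheTTLms?: number;        // per-search cache TTL override
  systemInstruction?: string; // instruction for the file-search call
  maxOutputTokens?: number;   // cap on the synthesized answer
}

// Fills in the two defaults the table documents; other fields pass through.
function applyDefaults(
  options: KnowledgeSearchOptionsSketch = {},
  defaultTTLms = 300_000,
) {
  return {
    ...options,
    cacheTTLms: options.cacheTTLms ?? defaultTTLms,
    maxOutputTokens: options.maxOutputTokens ?? 256,
  };
}
```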

KnowledgeAnswer

The result of a retrieval search.

interface KnowledgeAnswer {
  text: string;                // Synthesized answer from the model
  sources: KnowledgeSource[];  // Extracted source references
  groundingMetadata?: any;     // Raw Gemini grounding metadata
}

KnowledgeSource

A single source reference extracted from grounding metadata.

interface KnowledgeSource {
  documentId?: string;    // Document resource ID
  documentName?: string;  // Document display name (title)
  uri?: string;           // Source URI
  chunk?: string;         // Retrieved text chunk content
  score?: number;         // Relevance score
  storeName?: string;     // File Search Store name (e.g. "fileSearchStores/xyz")
}

Source extraction

Sources are extracted from Gemini's groundingMetadata.groundingChunks array. The service handles multiple response formats:

retrievedContext (File Search / RAG)
web (Google Search grounding)
Legacy chunk / chunkData structures

All chunks are preserved, including duplicates from the same document, since each chunk contains different retrieved content.
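A minimal sketch of this extraction, covering only the retrievedContext and web formats, might look like the following. The field names `uri`, `title`, and `text` follow Gemini's grounding chunk shape; `SourceSketch`, `GroundingChunkSketch`, and `extractSources` are hypothetical names, the mapping of `title` to documentName is an assumption, and the legacy chunk / chunkData structures are left out because their shape is not described here.

```typescript
// Hypothetical subset of KnowledgeSource used for this illustration.
interface SourceSketch {
  documentName?: string;
  uri?: string;
  chunk?: string;
}

// Simplified view of one entry in groundingMetadata.groundingChunks.
interface GroundingChunkSketch {
  retrievedContext?: { uri?: string; title?: string; text?: string };
  web?: { uri?: string; title?: string };
}

// Maps each grounding chunk to one source, without de-duplicating documents.
function extractSources(
  metadata?: { groundingChunks?: GroundingChunkSketch[] },
): SourceSketch[] {
  const sources: SourceSketch[] = [];
  for (const chunk of metadata?.groundingChunks ?? []) {
    if (chunk.retrievedContext) {
      // File Search / RAG chunk: carries the retrieved text.
      const { uri, title, text } = chunk.retrievedContext;
      sources.push({ documentName: title, uri, chunk: text });
    } else if (chunk.web) {
      // Google Search grounding chunk: no retrieved text, only a reference.
      sources.push({ documentName: chunk.web.title, uri: chunk.web.uri });
    }
  }
  return sources;
}
```

Two chunks from the same document yield two entries, matching the note above that duplicates are preserved.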

Caching Strategy

The service implements two separate LRU caches:

Cache	Key	TTL	Size	Purpose
Search cache	`{storeName}:{query}:{metadataFilter}:{topK}`	5 min (configurable)	100 (configurable)	Avoid redundant API calls for identical queries.
Store name cache	`{storeDisplayName}`	1 hour (fixed)	10 (configurable)	Avoid repeated store listing API calls.

When a cache is full, its least recently used entry is evicted.
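To make the combination of LRU eviction and per-entry TTL concrete, here is a small sketch built on Map insertion order. It is not the service's cache: `TtlLruCache` and `searchCacheKey` are hypothetical names, and rendering an unset filter or topK as an empty string in the key is an assumption.

```typescript
// A minimal LRU cache with per-entry TTL, using Map insertion order for recency.
class TtlLruCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxSize: number;
  private defaultTTLms: number;

  constructor(maxSize: number, defaultTTLms: number) {
    this.maxSize = maxSize;
    this.defaultTTLms = defaultTTLms;
  }

  get(key: string, now = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= now) {
      this.entries.delete(key); // expired entries are dropped on read
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V, ttlMs = this.defaultTTLms, now = Date.now()): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: now + ttlMs });
  }
}

// Builds a key in the documented {storeName}:{query}:{metadataFilter}:{topK} layout.
function searchCacheKey(
  storeName: string,
  query: string,
  metadataFilter?: string,
  topK?: number,
): string {
  return `${storeName}:${query}:${metadataFilter ?? ""}:${topK ?? ""}`;
}
```

The optional `now` parameters exist only so the expiry logic can be exercised deterministically.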

Debug Logging

Enable with:

DEBUG=retrieval-service node app.js

Logs include:

Store name resolution steps and cache hits/misses
Search API call timing and response sizes
Source extraction details
Cache key generation

Knowledge Base Overview -- full RAG pipeline
Ingestion -- uploading documents into stores
Canonical Grounding -- resolving full documents from search sources
Grounding Policy -- controlling what reaches the LLM
GeminiFileSearchStores -- underlying store API