RetrievalService

RetrievalService is the primary interface for knowledge retrieval in the RAG pipeline. It wraps GeminiFileSearchStores.search() with store name resolution, LRU caching (both search results and store names), and structured source extraction from Gemini's grounding metadata.

Import

import { RetrievalService } from "@modernpath/agent-framework";

Constructor

constructor(
  fileSearchStores: GeminiFileSearchStores,
  opts?: {
    searchCacheSize?: number;  // default: 100
    storeCacheSize?: number;   // default: 10
    defaultTTLms?: number;     // default: 300000 (5 min)
  },
)

Constructor Parameters

Parameter | Type | Default | Description
fileSearchStores | GeminiFileSearchStores | required | The File Search Stores client, obtained via client.fileSearchStores().
opts.searchCacheSize | number | 100 | Maximum number of cached search results (LRU eviction).
opts.storeCacheSize | number | 10 | Maximum number of cached store name resolutions.
opts.defaultTTLms | number | 300000 | Default TTL for search result cache entries (5 minutes). Store name cache uses a fixed 1-hour TTL.

Creating a retrieval service

import { GeminiClient, RetrievalService } from "@modernpath/agent-framework";

const client = new GeminiClient({ apiKey: process.env.GEMINI_API_KEY! });
const stores = client.fileSearchStores();

const retrieval = new RetrievalService(stores, {
  searchCacheSize: 200,
  defaultTTLms: 600_000, // 10 minutes
});

Methods

search()

Execute a semantic search query against a File Search Store.

async search(
  query: string,
  storeDisplayName: string,
  options?: KnowledgeSearchOptions,
): Promise<KnowledgeAnswer>

Parameter | Type | Description
query | string | Natural language search query.
storeDisplayName | string | Store display name, short name, or fully-qualified resource name.
options | KnowledgeSearchOptions | Optional search parameters (see below).

Behavior:

  1. Resolves storeDisplayName to a fully-qualified store name (with caching).
  2. Checks the search cache for an entry with the same store, query, metadataFilter, and topK.
  3. If not cached, calls GeminiFileSearchStores.search().
  4. Extracts structured KnowledgeSource[] from the grounding metadata.
  5. Caches the result and returns the KnowledgeAnswer.
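
The five steps above can be sketched as a small cache-then-fetch wrapper. This is a minimal illustration under assumptions, not the framework's implementation: SearchFlowSketch, Backend, Resolver, and RawResult are hypothetical names, the backend signature is invented for the example, and TTL and LRU eviction are left out for brevity.

```typescript
// Simplified stand-ins for the documented types; only the fields used here.
interface KnowledgeSource { documentName?: string; chunk?: string }
interface KnowledgeAnswer { text: string; sources: KnowledgeSource[] }
interface SearchOptions { topK?: number; metadataFilter?: string }

// Hypothetical raw response and collaborator shapes (assumptions, not the real client API).
interface RawResult {
  text: string;
  groundingChunks: { retrievedContext?: { title?: string; text?: string } }[];
}
type Backend = (storeName: string, query: string, opts: SearchOptions) => Promise<RawResult>;
type Resolver = (displayName: string) => Promise<string>;

class SearchFlowSketch {
  // Plain Map: this sketch has no TTL and no size limit.
  private cache = new Map<string, KnowledgeAnswer>();
  private backend: Backend;
  private resolve: Resolver;

  constructor(backend: Backend, resolve: Resolver) {
    this.backend = backend;
    this.resolve = resolve;
  }

  async search(
    query: string,
    storeDisplayName: string,
    opts: SearchOptions = {},
  ): Promise<KnowledgeAnswer> {
    // Step 1: resolve the display name to a fully-qualified store name.
    const storeName = await this.resolve(storeDisplayName);

    // Step 2: look for a cached answer under the documented key layout.
    const key = `${storeName}:${query}:${opts.metadataFilter ?? ""}:${opts.topK ?? ""}`;
    const cached = this.cache.get(key);
    if (cached) return cached;

    // Step 3: cache miss, so call the backend.
    const raw = await this.backend(storeName, query, opts);

    // Step 4: turn grounding chunks into structured sources.
    const sources: KnowledgeSource[] = raw.groundingChunks.map((c) => ({
      documentName: c.retrievedContext?.title,
      chunk: c.retrievedContext?.text,
    }));

    // Step 5: cache and return.
    const answer: KnowledgeAnswer = { text: raw.text, sources };
    this.cache.set(key, answer);
    return answer;
  }
}

// Usage with fake collaborators standing in for the Gemini client.
const flow = new SearchFlowSketch(
  async (storeName, query) => ({
    text: `answer to "${query}" from ${storeName}`,
    groundingChunks: [{ retrievedContext: { title: "manual.pdf", text: "Hold the reset button." } }],
  }),
  async (displayName) => `fileSearchStores/${displayName}`,
);
flow.search("reset procedure", "hvac-kb").then((answer) => console.log(answer.text));
```

Because the key is built from the resolved store name, two different display forms of the same store would share cache entries in this sketch.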

Basic search

const answer = await retrieval.search(
  "What is the procedure for handling cold complaints?",
  "hvac-knowledge-base",
);

console.log(answer.text);
console.log(`Found ${answer.sources.length} sources`);

for (const source of answer.sources) {
  console.log(`  - ${source.documentName} (score: ${source.score})`);
}

Search with options

const answer = await retrieval.search(
  "escalation procedure for unresolved temperature issues",
  "hvac-knowledge-base",
  {
    topK: 10,
    metadataFilter: "source_type = 's3'",
    cacheTTLms: 60_000, // cache for 1 minute
    systemInstruction: "Answer concisely in one paragraph.",
    maxOutputTokens: 512,
  },
);

resolveStoreName()

Resolve a store display name or short name to a fully-qualified resource name.

async resolveStoreName(storeDisplayName: string): Promise<string>

Resolution order:

  1. Check the store name cache.
  2. List all stores and match by displayName.
  3. Match by exact name.
  4. Match by pattern fileSearchStores/{storeDisplayName}.
  5. Fall back to fileSearchStores/{storeDisplayName} if no match found.

Results are cached for 1 hour.
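
A literal reading of the resolution order can be sketched as follows. The StoreInfo shape, the listStores callback, and the function names are assumptions made for illustration. The sketch is synchronous and its cache has no expiry, whereas the real method is async and caches for 1 hour.

```typescript
// Hypothetical shape of one entry in the store listing.
interface StoreInfo {
  name: string;          // fully-qualified, e.g. "fileSearchStores/abc123def"
  displayName?: string;  // human-readable name, if set
}

// Steps 2-5: match against a listing, falling back to the prefixed input.
function resolveFromListing(input: string, stores: StoreInfo[]): string {
  const prefixed = `fileSearchStores/${input}`;

  // Step 2: match by displayName.
  const byDisplayName = stores.find((s) => s.displayName === input);
  if (byDisplayName) return byDisplayName.name;

  // Step 3: match by exact resource name.
  const byName = stores.find((s) => s.name === input);
  if (byName) return byName.name;

  // Step 4: match by the pattern fileSearchStores/{input}.
  const byPattern = stores.find((s) => s.name === prefixed);
  if (byPattern) return byPattern.name;

  // Step 5: nothing matched, so fall back to the prefixed form.
  // (The real service may treat already-prefixed inputs differently.)
  return prefixed;
}

// Step 1: a cache in front of the listing call (no expiry in this sketch).
const nameCache = new Map<string, string>();

function resolveStoreNameSketch(input: string, listStores: () => StoreInfo[]): string {
  const hit = nameCache.get(input);
  if (hit !== undefined) return hit;
  const resolved = resolveFromListing(input, listStores());
  nameCache.set(input, resolved);
  return resolved;
}

// Usage with an in-memory listing.
const listing = (): StoreInfo[] => [
  { name: "fileSearchStores/abc123def", displayName: "My HVAC Knowledge Base" },
  { name: "fileSearchStores/hvac-kb" },
];
console.log(resolveStoreNameSketch("My HVAC Knowledge Base", listing));
console.log(resolveStoreNameSketch("hvac-kb", listing));
```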

Flexible naming

You can pass any of these forms to search() or resolveStoreName():

  • Display name: "My HVAC Knowledge Base"
  • Short name: "hvac-kb"
  • Fully-qualified: "fileSearchStores/abc123def"

Each form is resolved to a fully-qualified resource name using the resolution order above.

Types

KnowledgeSearchOptions

Property | Type | Default | Description
topK | number | -- | Maximum number of retrieval chunks to return.
metadataFilter | string | -- | Metadata filter expression (Gemini filter syntax).
cacheTTLms | number | defaultTTLms | Override cache TTL for this specific search.
systemInstruction | string | -- | System instruction for the file-search LLM call.
maxOutputTokens | number | 256 | Max output tokens for the synthesized answer.

KnowledgeAnswer

The result of a retrieval search.

interface KnowledgeAnswer {
  text: string;                // Synthesized answer from the model
  sources: KnowledgeSource[];  // Extracted source references
  groundingMetadata?: any;     // Raw Gemini grounding metadata
}

KnowledgeSource

A single source reference extracted from grounding metadata.

interface KnowledgeSource {
  documentId?: string;    // Document resource ID
  documentName?: string;  // Document display name (title)
  uri?: string;           // Source URI
  chunk?: string;         // Retrieved text chunk content
  score?: number;         // Relevance score
  storeName?: string;     // File Search Store name (e.g. "fileSearchStores/xyz")
}

Source extraction

Sources are extracted from Gemini's groundingMetadata.groundingChunks array. The service handles multiple response formats:

  • retrievedContext (File Search / RAG)
  • web (Google Search grounding)
  • Legacy chunk / chunkData structures

All chunks are preserved, including duplicates from the same document, since each chunk contains different retrieved content.
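
As a rough illustration of the extraction described above, the sketch below maps retrievedContext and web chunks to sources and keeps every chunk, including repeats from the same document. The GroundingChunkLike type and extractSources function are hypothetical. The legacy chunk / chunkData formats are omitted because their shapes are not documented here, and how the real service fills documentId and score is not shown.

```typescript
// Subset of the documented KnowledgeSource fields used in this sketch.
interface KnowledgeSource {
  documentName?: string;
  uri?: string;
  chunk?: string;
  storeName?: string;
}

// Assumed chunk shapes for the two non-legacy formats.
interface GroundingChunkLike {
  retrievedContext?: { uri?: string; title?: string; text?: string };
  web?: { uri?: string; title?: string };
}

function extractSources(
  metadata: { groundingChunks?: GroundingChunkLike[] } | undefined,
  storeName?: string,
): KnowledgeSource[] {
  const sources: KnowledgeSource[] = [];
  for (const chunk of metadata?.groundingChunks ?? []) {
    if (chunk.retrievedContext) {
      // File Search / RAG chunk: carries the retrieved text.
      sources.push({
        documentName: chunk.retrievedContext.title,
        uri: chunk.retrievedContext.uri,
        chunk: chunk.retrievedContext.text,
        storeName,
      });
    } else if (chunk.web) {
      // Google Search grounding chunk: title and URI only.
      sources.push({ documentName: chunk.web.title, uri: chunk.web.uri });
    }
    // Unknown formats are skipped in this sketch.
  }
  return sources; // no de-duplication: each chunk holds different content
}

// Usage: two chunks from the same document both survive.
const extracted = extractSources(
  {
    groundingChunks: [
      { retrievedContext: { title: "manual.pdf", text: "Section 1" } },
      { retrievedContext: { title: "manual.pdf", text: "Section 2" } },
      { web: { uri: "https://example.com", title: "Example" } },
    ],
  },
  "fileSearchStores/xyz",
);
console.log(extracted.length);
```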

Caching Strategy

The service implements two separate LRU caches:

Cache | Key | TTL | Size | Purpose
Search cache | {storeName}:{query}:{metadataFilter}:{topK} | 5 min (configurable) | 100 (configurable) | Avoid redundant API calls for identical queries.
Store name cache | {storeDisplayName} | 1 hour (fixed) | 10 (configurable) | Avoid repeated store listing API calls.

Cache entries are evicted on an LRU basis when the cache is full.
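
For readers unfamiliar with the pattern, here is a minimal sketch of an LRU cache with per-entry TTL, built on the insertion order of a JavaScript Map. LruTtlCache is a hypothetical name and this is not the service's actual cache class. The injectable clock is an assumption added so that expiry can be exercised without waiting.

```typescript
class LruTtlCache<V> {
  // Map iterates in insertion order, so the first key is the least recently used.
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxSize: number;
  private defaultTTLms: number;
  private now: () => number;

  constructor(maxSize: number, defaultTTLms: number, now: () => number = () => Date.now()) {
    this.maxSize = maxSize;
    this.defaultTTLms = defaultTTLms;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired: drop it and report a miss.
      this.entries.delete(key);
      return undefined;
    }
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V, ttlMs: number = this.defaultTTLms): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      // Evict the least recently used entry.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }
}

// Usage: sizes and TTLs mirror the documented defaults for the two caches.
const searchCache = new LruTtlCache<string>(100, 300_000);
const storeNameCache = new LruTtlCache<string>(10, 3_600_000);
searchCache.set("fileSearchStores/xyz:cold complaints::10", "cached answer", 60_000);
storeNameCache.set("hvac-kb", "fileSearchStores/xyz");
console.log(searchCache.get("fileSearchStores/xyz:cold complaints::10"));
```

The optional third argument to set() plays the role of a per-search override such as cacheTTLms.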

Debug Logging

Enable with:

DEBUG=retrieval-service node app.js

Logs include:

  • Store name resolution steps and cache hits/misses
  • Search API call timing and response sizes
  • Source extraction details
  • Cache key generation