Gemini Integration¶
The Gemini integration module provides a serverless-friendly TypeScript wrapper around Google's @google/genai SDK. It adds production-hardening features -- automatic retries with exponential backoff, document size and MIME validation, streaming support, and structured output -- while keeping a minimal, testable API surface.
Architecture¶
graph LR
A[Your Agent] --> B[GeminiClient]
B --> C["@google/genai SDK"]
B --> D[GeminiFiles]
B --> E[GeminiFileSearchStores]
B --> F[GeminiChatSession]
C --> G[Gemini API]
D --> G
E --> G
F --> G
Module Overview¶
| Class | Purpose |
|---|---|
| GeminiClient | Main entry point. Generates content (text, JSON), streams responses, manages documents. |
| GeminiChatSession | Stateful multi-turn chat within a single invocation. |
| GeminiFileSearchStores | CRUD operations for File Search Stores, document upload, and semantic search. |
| GeminiFiles | Upload and manage files via the Gemini Files API (48-hour retention). |
Installation¶
The Gemini module is included in the framework package, @modernpath/agent-framework; installing the framework installs the module.
The underlying @google/genai SDK is a peer dependency and is installed automatically.
Quick Start¶
import { GeminiClient } from "@modernpath/agent-framework";

const client = new GeminiClient({
  apiKey: process.env.GEMINI_API_KEY!,
  model: "gemini-3-flash-preview",
  temperature: 0.7,
});

// Simple text generation
const result = await client.generateContent("Summarize this incident report.");
console.log(result.text);

// Streaming
for await (const chunk of client.generateContentStream("Explain the root cause.")) {
  process.stdout.write(chunk);
}
Key Features¶
Automatic Retries¶
All generation methods retry up to 3 times with exponential backoff on rate-limit errors (HTTP 429) and empty model responses. No configuration is needed -- this is built in.
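The retry behavior described above can be sketched as a small helper. This is an illustration of the technique, not the module's actual implementation: withRetry, RetryOptions, HttpError, and the base delay are hypothetical names and values, and "up to 3 times" is read here as three retries after the first attempt.

```typescript
// Hypothetical error shape: assumes rate-limit failures expose an HTTP status.
interface HttpError extends Error {
  status?: number;
}

// Hypothetical options; the real client needs no configuration.
interface RetryOptions {
  maxRetries: number; // retries allowed after the first attempt
  baseDelayMs: number; // delay before the first retry
  sleep?: (ms: number) => Promise<void>; // injectable so tests do not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Only HTTP 429 is treated as retryable; other errors propagate immediately.
function isRateLimit(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as HttpError).status === 429;
}

async function withRetry<T>(
  operation: () => Promise<T>,
  isEmpty: (value: T) => boolean,
  options: RetryOptions,
): Promise<T> {
  const sleep = options.sleep ?? defaultSleep;
  for (let attempt = 0; ; attempt++) {
    const lastAttempt = attempt >= options.maxRetries;
    try {
      const value = await operation();
      if (!isEmpty(value)) return value;
      if (lastAttempt) throw new Error("Model returned an empty response");
    } catch (err) {
      if (lastAttempt || !isRateLimit(err)) throw err;
    }
    // Exponential backoff: the delay doubles on every retry.
    await sleep(options.baseDelayMs * 2 ** attempt);
  }
}
```

With a base delay of 100 ms, the waits between attempts in this sketch are 100 ms, 200 ms, and 400 ms. Production implementations often add random jitter to the delay so that concurrent callers do not retry in lockstep.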
Document Handling¶
The client validates documents before sending them to the API:
- MIME type validation -- only supported types are accepted (PDF, images, Office formats, text, CSV, HTML, JSON, XML).
- Size validation -- inline documents are capped at the configured maxInlineDocumentSizeBytes (default 20 MB).
- File references -- for larger files, upload via the Files API first and pass GeminiFileReference objects.
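The two checks in the list above can be sketched as a standalone function. The names validateInlineDocument, InlineDocument, and ValidationResult are hypothetical, the MIME list is transcribed from the Supported MIME Types table on this page (with the standard OpenXML types filled in for .docx, .xlsx, and .pptx), and treating 20 MB as 20 × 1024 × 1024 bytes is an assumption.

```typescript
// Transcribed from the Supported MIME Types table; the client's real list may differ.
const SUPPORTED_MIME_TYPES = new Set<string>([
  "application/pdf",
  "text/plain",
  "text/csv",
  "text/html",
  "text/markdown",
  "application/json",
  "application/xml",
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  "application/vnd.openxmlformats-officedocument.presentationml.presentation",
  "image/png",
  "image/jpeg",
  "image/webp",
  "image/heic",
  "image/heif",
]);

// Assumption: "20 MB" is interpreted as 20 MiB.
const DEFAULT_MAX_INLINE_BYTES = 20 * 1024 * 1024;

// Hypothetical document shape used only for this sketch.
interface InlineDocument {
  mimeType: string;
  data: Uint8Array;
}

type ValidationResult = { ok: true } | { ok: false; reason: string };

function validateInlineDocument(
  doc: InlineDocument,
  maxInlineDocumentSizeBytes: number = DEFAULT_MAX_INLINE_BYTES,
): ValidationResult {
  if (!SUPPORTED_MIME_TYPES.has(doc.mimeType)) {
    return { ok: false, reason: `Unsupported MIME type: ${doc.mimeType}` };
  }
  if (doc.data.byteLength > maxInlineDocumentSizeBytes) {
    return {
      ok: false,
      reason: `Document is ${doc.data.byteLength} bytes; the inline limit is ${maxInlineDocumentSizeBytes}`,
    };
  }
  return { ok: true };
}
```

A caller that receives a size failure would typically fall back to the Files API path described in the last bullet rather than surface the error to the user.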
Structured Output¶
Request JSON responses with schema enforcement:
const result = await client.generateContent("Extract entities.", {
  responseMimeType: "application/json",
  responseSchema: {
    type: "object",
    properties: {
      entities: { type: "array", items: { type: "string" } },
    },
    required: ["entities"],
  },
});
const parsed = JSON.parse(result.text);
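JSON.parse returns an untyped value, so callers may want to check the parsed result before using it. The sketch below shows one way to do that with a type guard for the entities schema above; EntityResult, isEntityResult, and parseEntities are hypothetical helpers, not part of the module.

```typescript
// Mirrors the responseSchema above: an object with a required string array.
interface EntityResult {
  entities: string[];
}

// Type guard: narrows an unknown value to EntityResult at runtime.
function isEntityResult(value: unknown): value is EntityResult {
  if (typeof value !== "object" || value === null) return false;
  const entities = (value as { entities?: unknown }).entities;
  return Array.isArray(entities) && entities.every((item) => typeof item === "string");
}

// Parses the response text and fails loudly if the shape is not what was requested.
function parseEntities(text: string): EntityResult {
  const parsed: unknown = JSON.parse(text);
  if (!isEntityResult(parsed)) {
    throw new Error("Response does not match the expected entity schema");
  }
  return parsed;
}
```

In the example above, this would replace the final line with const parsed = parseEntities(result.text), giving parsed a static type instead of any.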
Testability¶
Every class accepts injectable dependencies for unit testing. GeminiClient takes an optional GenAIClientLike interface, so you can stub the entire Google SDK without network calls.
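The injection pattern can be illustrated without the real SDK. The shape of GenAIClientLike is not shown on this page, so the sketch below uses a hypothetical stand-in interface (TextModelLike) and a hypothetical consumer (IncidentSummarizer); only the pattern, not the names or signatures, carries over to the framework.

```typescript
// Hypothetical stand-in for an injectable client interface such as GenAIClientLike.
interface TextModelLike {
  generate(prompt: string): Promise<{ text: string }>;
}

// Hypothetical consumer: depends on the interface, never on a concrete SDK client.
class IncidentSummarizer {
  private readonly model: TextModelLike;

  constructor(model: TextModelLike) {
    this.model = model;
  }

  async summarize(report: string): Promise<string> {
    const response = await this.model.generate(`Summarize this incident report:\n${report}`);
    return response.text.trim();
  }
}

// Test double: returns canned text and records every prompt it receives.
class StubModel implements TextModelLike {
  readonly prompts: string[] = [];
  private readonly cannedText: string;

  constructor(cannedText: string) {
    this.cannedText = cannedText;
  }

  async generate(prompt: string): Promise<{ text: string }> {
    this.prompts.push(prompt);
    return { text: this.cannedText };
  }
}
```

A unit test constructs IncidentSummarizer with a StubModel, calls summarize, and asserts on both the returned text and the recorded prompts, all without a network call or an API key.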
Supported MIME Types¶
| Category | MIME Types |
|---|---|
| Documents | application/pdf |
| Text | text/plain, text/csv, text/html, text/markdown |
| Data | application/json, application/xml |
| Office | .docx, .xlsx, .pptx (OpenXML MIME types) |
| Images | image/png, image/jpeg, image/webp, image/heic, image/heif |
Debug Logging¶
Enable verbose logging for any Gemini module by setting the DEBUG environment variable:
# All Gemini modules
DEBUG=gemini-client,file-search-stores node app.js
# Everything
DEBUG=* node app.js
Related Pages¶
- GeminiClient -- full API reference
- Chat Sessions -- multi-turn conversations
- File Search Stores -- RAG via semantic search
- Files API -- large file uploads
- Knowledge Base -- higher-level RAG pipeline built on these primitives