Gemini Integration

The Gemini integration module provides a serverless-friendly TypeScript wrapper around Google's @google/genai SDK. It adds production-hardening features -- automatic retries with exponential backoff, document size and MIME validation, streaming support, and structured output -- while keeping a minimal, testable API surface.

Architecture

graph LR
    A[Your Agent] --> B[GeminiClient]
    B --> C["@google/genai SDK"]
    B --> D[GeminiFiles]
    B --> E[GeminiFileSearchStores]
    B --> F[GeminiChatSession]
    C --> G[Gemini API]
    D --> G
    E --> G
    F --> G

Module Overview

| Class | Purpose |
| --- | --- |
| GeminiClient | Main entry point. Generates content (text, JSON), streams responses, manages documents. |
| GeminiChatSession | Stateful multi-turn chat within a single invocation. |
| GeminiFileSearchStores | CRUD operations for File Search Stores, document upload, and semantic search. |
| GeminiFiles | Upload and manage files via the Gemini Files API (48-hour retention). |

Installation

The Gemini module is included in the framework package:

npm install @modernpath/agent-framework

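The underlying @google/genai SDK is a peer dependency. npm 7 and later installs peer dependencies automatically; with older npm versions or other package managers you may need to add it explicitly.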

Quick Start

import { GeminiClient } from "@modernpath/agent-framework";

const client = new GeminiClient({
  apiKey: process.env.GEMINI_API_KEY!,
  model: "gemini-3-flash-preview",
  temperature: 0.7,
});

// Simple text generation
const result = await client.generateContent("Summarize this incident report.");
console.log(result.text);

// Streaming
for await (const chunk of client.generateContentStream("Explain the root cause.")) {
  process.stdout.write(chunk);
}

Key Features

Automatic Retries

All generation methods retry up to 3 times with exponential backoff on rate-limit errors (HTTP 429) and on empty model responses. Retries are built in and require no configuration.
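
The retry behaviour described above can be pictured with a minimal sketch. This is not the framework's internal code: the names withRetry, RetryOptions, and isRateLimit, the 500 ms base delay used in the usage note, and the reading of "up to 3 times" as three retries after the first attempt are all assumptions made for illustration.

```typescript
// Hypothetical sketch of retry with exponential backoff; not the framework's implementation.
interface RetryOptions {
  maxRetries: number; // retries allowed after the first attempt (assumed meaning of "up to 3 times")
  baseDelayMs: number; // delay before the first retry; doubles on each later retry
  sleep?: (ms: number) => Promise<void>; // injectable so tests do not have to wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Treats any error object carrying status 429 as a rate-limit error (assumed error shape).
function isRateLimit(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { status?: unknown }).status === 429
  );
}

async function withRetry<T extends { text: string }>(
  call: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  const { maxRetries, baseDelayMs, sleep = defaultSleep } = options;
  let lastError: unknown = new Error("No attempts were made");

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    // Wait baseDelayMs, then 2x, then 4x, ... before each retry.
    if (attempt > 0) await sleep(baseDelayMs * 2 ** (attempt - 1));
    try {
      const result = await call();
      if (result.text.trim() !== "") return result;
      lastError = new Error("Model returned an empty response");
    } catch (err) {
      if (!isRateLimit(err)) throw err; // other errors are not retried
      lastError = err;
    }
  }
  throw lastError;
}
```

With maxRetries set to 3 and baseDelayMs set to 500, this sketch would wait 500 ms, 1000 ms, and 2000 ms between attempts before giving up. A production implementation would often add random jitter to those delays.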

Document Handling

The client validates documents before sending them to the API:

  • MIME type validation -- only supported types are accepted (PDF, images, Office formats, text, CSV, HTML, JSON, XML).
  • Size validation -- inline documents are capped at the configured maxInlineDocumentSizeBytes (default 20 MB).
  • File references -- for larger files, upload via the Files API first and pass GeminiFileReference objects.
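
As a rough illustration of the two checks listed above, the sketch below validates an inline document against an allow-list of MIME types and a size cap. The function name validateInlineDocument, the InlineDocument shape, and the interpretation of "20 MB" as 20 × 1024 × 1024 bytes are assumptions; the client's actual validation code may differ.

```typescript
// Hypothetical sketch of pre-flight document validation; not the framework's code.
const SUPPORTED_MIME_TYPES: ReadonlySet<string> = new Set<string>([
  "application/pdf",
  "text/plain",
  "text/csv",
  "text/html",
  "text/markdown",
  "application/json",
  "application/xml",
  // Standard OpenXML MIME types for .docx, .xlsx, and .pptx
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  "application/vnd.openxmlformats-officedocument.presentationml.presentation",
  "image/png",
  "image/jpeg",
  "image/webp",
  "image/heic",
  "image/heif",
]);

// Assumes the documented 20 MB default means 20 MiB.
const DEFAULT_MAX_INLINE_BYTES = 20 * 1024 * 1024;

interface InlineDocument {
  mimeType: string;
  data: Uint8Array;
}

// Returns a list of human-readable problems; an empty list means the document passed.
function validateInlineDocument(
  doc: InlineDocument,
  maxInlineDocumentSizeBytes: number = DEFAULT_MAX_INLINE_BYTES,
): string[] {
  const problems: string[] = [];
  if (!SUPPORTED_MIME_TYPES.has(doc.mimeType.toLowerCase())) {
    problems.push(`Unsupported MIME type: ${doc.mimeType}`);
  }
  if (doc.data.byteLength > maxInlineDocumentSizeBytes) {
    problems.push(
      `Document is ${doc.data.byteLength} bytes; inline limit is ${maxInlineDocumentSizeBytes} bytes`,
    );
  }
  return problems;
}
```

A document that fails only the size check is a candidate for the Files API route described in the last bullet.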

Structured Output

Request JSON responses with schema enforcement:

const result = await client.generateContent("Extract entities.", {
  responseMimeType: "application/json",
  responseSchema: {
    type: "object",
    properties: {
      entities: { type: "array", items: { type: "string" } },
    },
    required: ["entities"],
  },
});

const parsed = JSON.parse(result.text);
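
JSON.parse returns an untyped value, so it can be worth checking the parsed shape before using it, even with schema enforcement on the request. The sketch below is one way to do that for the schema above; EntityResult and parseEntities are hypothetical names, not part of the framework.

```typescript
// Hypothetical helper that narrows the parsed response to the shape requested above.
interface EntityResult {
  entities: string[];
}

function parseEntities(text: string): EntityResult {
  const value: unknown = JSON.parse(text); // throws SyntaxError on malformed JSON
  if (typeof value === "object" && value !== null) {
    const entities = (value as { entities?: unknown }).entities;
    if (Array.isArray(entities) && entities.every((e) => typeof e === "string")) {
      return { entities: entities as string[] };
    }
  }
  throw new Error("Response does not match the expected { entities: string[] } shape");
}
```

Calling parseEntities(result.text) in place of the bare JSON.parse gives the rest of the code a typed object and a single place where malformed output is rejected.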

Testability

Every class accepts injectable dependencies for unit testing. GeminiClient takes an optional GenAIClientLike interface, so you can stub the entire Google SDK without network calls.
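
The general pattern can be sketched as follows. The exact shape of GenAIClientLike and the way it is passed to GeminiClient are not shown in this section, so the example defines its own minimal GenAILike interface, loosely modelled on the SDK's models.generateContent call, and tests a small hypothetical function against a stub. Every name in the block is an assumption made for illustration.

```typescript
// Hypothetical minimal interface; the framework's real GenAIClientLike may differ.
interface GenerateRequest {
  model: string;
  contents: string;
}

interface GenAILike {
  models: {
    generateContent(request: GenerateRequest): Promise<{ text?: string }>;
  };
}

// Stub that records every request and returns a canned reply, with no network access.
function createStubGenAI(reply: string): GenAILike & { requests: GenerateRequest[] } {
  const requests: GenerateRequest[] = [];
  return {
    requests,
    models: {
      async generateContent(request: GenerateRequest): Promise<{ text?: string }> {
        requests.push(request);
        return { text: reply };
      },
    },
  };
}

// Code under test: it depends only on the interface, never on the concrete SDK.
async function summarize(ai: GenAILike, report: string): Promise<string> {
  const response = await ai.models.generateContent({
    model: "gemini-3-flash-preview",
    contents: `Summarize this incident report.\n\n${report}`,
  });
  return response.text ?? "";
}
```

A unit test can then call summarize with the stub, assert on the returned text, and inspect stub.requests to check the prompt that would have been sent.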

Supported MIME Types

| Category | MIME Types |
| --- | --- |
| Documents | application/pdf |
| Text | text/plain, text/csv, text/html, text/markdown |
| Data | application/json, application/xml |
| Office | .docx, .xlsx, .pptx (OpenXML MIME types) |
| Images | image/png, image/jpeg, image/webp, image/heic, image/heif |

Debug Logging

Enable verbose logging for any Gemini module by setting the DEBUG environment variable:

# All Gemini modules
DEBUG=gemini-client,file-search-stores node app.js

# Everything
DEBUG=* node app.js