GeminiClient

GeminiClient is the primary interface for interacting with Google's Gemini models. It wraps the @google/genai SDK with production-ready features: automatic retries with exponential backoff, document and MIME validation, streaming, structured JSON output, and injectable dependencies for testing.

Import

import { GeminiClient } from "@modernpath/agent-framework";

Constructor

constructor(
  config: GeminiClientConfig,
  deps?: {
    ai?: GenAIClientLike;
    sleep?: (ms: number) => Promise<void>;
  },
)

GeminiClientConfig

Property	Type	Default	Description
`apiKey`	`string`	required	Google AI API key.
`model`	`string`	`"gemini-3-flash-preview"`	Gemini model identifier.
`maxOutputTokens`	`number`	`8192`	Default max output tokens per request.
`temperature`	`number`	`0.7`	Default sampling temperature (0.0 to 2.0).
`maxInlineDocumentSizeBytes`	`number`	`20971520` (20 MB)	Max size for inline (base64) document attachments.
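
The defaults in the table above can be pictured as a merge over the caller's config. The following is a minimal sketch, not the library's implementation: `ConfigSketch`, `DEFAULTS`, and `withDefaults` are hypothetical names that mirror the table, and the fallback logic is an assumption.

```typescript
// Hypothetical mirror of GeminiClientConfig, built from the table above.
interface ConfigSketch {
  apiKey: string;
  model?: string;
  maxOutputTokens?: number;
  temperature?: number;
  maxInlineDocumentSizeBytes?: number;
}

// Documented defaults; 20 MB expressed in bytes.
const DEFAULTS = {
  model: "gemini-3-flash-preview",
  maxOutputTokens: 8192,
  temperature: 0.7,
  maxInlineDocumentSizeBytes: 20 * 1024 * 1024, // 20971520
};

// Assumption: any field the caller omits falls back to the documented default.
function withDefaults(config: ConfigSketch): Required<ConfigSketch> {
  if (!config.apiKey) {
    throw new Error("GeminiClient: apiKey is required");
  }
  return {
    apiKey: config.apiKey,
    // `??` keeps explicit falsy values such as temperature: 0.
    model: config.model ?? DEFAULTS.model,
    maxOutputTokens: config.maxOutputTokens ?? DEFAULTS.maxOutputTokens,
    temperature: config.temperature ?? DEFAULTS.temperature,
    maxInlineDocumentSizeBytes:
      config.maxInlineDocumentSizeBytes ?? DEFAULTS.maxInlineDocumentSizeBytes,
  };
}

const resolved = withDefaults({ apiKey: "YOUR_API_KEY", temperature: 0.2 });
console.log(resolved.model, resolved.temperature);
```

In practice only `apiKey` needs to be supplied; the other fields are worth setting when the defaults do not fit the workload.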

Dependencies (optional)

Property	Type	Description
`ai`	`GenAIClientLike`	Injectable Google GenAI client interface for testing.
`sleep`	`(ms: number) => Promise<void>`	Injectable sleep function for deterministic retry testing.
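
For deterministic retry tests, the `sleep` dependency can be replaced with a fake that records the requested delays and resolves immediately. This is a minimal sketch: `makeFakeSleep` is a hypothetical helper, not part of the package. Only the `(ms: number) => Promise<void>` signature comes from the table above.

```typescript
// Hypothetical test helper: a sleep replacement that never actually waits.
function makeFakeSleep() {
  const delays: number[] = [];
  const sleep = (ms: number): Promise<void> => {
    delays.push(ms); // record what the caller asked for
    return Promise.resolve(); // resolve immediately so tests stay fast
  };
  return { sleep, delays };
}

// In a test, the fake would be passed as the second constructor argument:
//   const fake = makeFakeSleep();
//   const client = new GeminiClient({ apiKey: "test-key" }, { sleep: fake.sleep });
// After exercising a retry path, assert on fake.delays instead of wall-clock time.
```

A fake for `ai` follows the same idea, but its shape depends on the `GenAIClientLike` interface, which is not reproduced here.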

Properties

`modelName`

get modelName(): string

Returns the configured model identifier (e.g. "gemini-3-flash-preview").

`getModelInfo()`

getModelInfo(): { maxInlineDocumentSizeBytes: number }

Returns model configuration metadata: the inline document size limit (maxInlineDocumentSizeBytes) in effect for this client.

Methods

`generateContent()`

Generates text from a prompt with optional documents, system instructions, and structured output.

async generateContent(
  prompt: string,
  options?: GeminiOptions,
): Promise<{
  text: string;
  usage?: GeminiUsage;
  finishReason?: string;
  groundingMetadata?: any;
}>

Retry behavior: Automatically retries up to 3 times with exponential backoff on rate-limit errors (HTTP 429, "resource exhausted", "quota exceeded") and empty model responses.
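
The retry policy described above can be sketched as a small loop. This is an illustration under stated assumptions, not the client's actual code: the base delay, the doubling factor, reading "3" as three total attempts, and detecting rate limits by matching the error message are all assumptions. Retrying on empty model responses is left out for brevity.

```typescript
// Matches the rate-limit signals listed above. How the real client inspects
// errors is not documented here, so matching on the message is an assumption.
function isRateLimitError(err: unknown): boolean {
  const msg = (err instanceof Error ? err.message : String(err)).toLowerCase();
  return (
    msg.includes("429") ||
    msg.includes("resource exhausted") ||
    msg.includes("quota exceeded")
  );
}

// Sketch of a retry loop with exponential backoff.
// baseDelayMs and the doubling factor are illustrative assumptions.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Errors that are not rate limits propagate immediately.
      if (!isRateLimitError(err)) throw err;
      // Wait 1x, 2x, 4x, ... the base delay before the next attempt.
      if (attempt < maxAttempts) {
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  const reason = lastError instanceof Error ? lastError.message : String(lastError);
  throw new Error(`Failed to generate content after ${maxAttempts} attempts: ${reason}`);
}
```

Passing `sleep` as a parameter mirrors the injectable dependency on the constructor and keeps the loop testable without real delays.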

JSON recovery: When responseMimeType is "application/json" and the non-streaming response is not valid JSON, the method automatically retries via streaming and concatenates chunks to recover a valid response.
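
The recovery path above amounts to "validate, then fall back to a stream". A minimal sketch follows; `generateOnce` and `generateStream` are hypothetical stand-ins for the non-streaming and streaming calls, and the real method may differ in detail.

```typescript
// Returns true when the text parses as JSON.
function isValidJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// Sketch of the fallback: if the one-shot text is not valid JSON, request the
// response again as a stream and join the chunks.
async function generateJsonWithRecovery(
  generateOnce: () => Promise<string>,
  generateStream: () => AsyncIterable<string>,
): Promise<string> {
  const first = await generateOnce();
  if (isValidJson(first)) return first;

  let joined = "";
  for await (const chunk of generateStream()) {
    joined += chunk;
  }
  return joined; // the caller should still validate this before use
}
```

Because the recovery happens inside the method, callers normally see only the final text, but wrapping `JSON.parse(result.text)` in a `try`/`catch` remains a sensible precaution.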

Basic usage

const result = await client.generateContent(
  "What are the top 3 causes of heating failures?",
);
console.log(result.text);
console.log(result.usage); // { input: 42, output: 150, total: 192 }

With system prompt and documents

import { readFileSync } from "fs";

const result = await client.generateContent(
  "Summarize the attached document.",
  {
    systemPrompt: "You are a technical document analyst.",
    documents: [{
      name: "report.pdf",
      mimeType: "application/pdf",
      content: readFileSync("./report.pdf"),
    }],
    temperature: 0.3,
    maxOutputTokens: 4096,
  },
);

Structured JSON output

const result = await client.generateContent(
  "Extract all person names and their roles.",
  {
    responseMimeType: "application/json",
    responseSchema: {
      type: "object",
      properties: {
        people: {
          type: "array",
          items: {
            type: "object",
            properties: {
              name: { type: "string" },
              role: { type: "string" },
            },
            required: ["name", "role"],
          },
        },
      },
      required: ["people"],
    },
  },
);

const data = JSON.parse(result.text);
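
`JSON.parse` returns an untyped value, so it can help to narrow the result before use. The type guard below is a sketch that matches the schema in the example above; `Person`, `PeopleResult`, and `isPeopleResult` are hypothetical names, and a literal string stands in for `result.text`.

```typescript
interface Person {
  name: string;
  role: string;
}

interface PeopleResult {
  people: Person[];
}

// Narrow an unknown parsed value to the shape requested in the schema.
function isPeopleResult(value: unknown): value is PeopleResult {
  if (typeof value !== "object" || value === null) return false;
  const people = (value as { people?: unknown }).people;
  return (
    Array.isArray(people) &&
    people.every(
      (p) =>
        typeof p === "object" &&
        p !== null &&
        typeof (p as Person).name === "string" &&
        typeof (p as Person).role === "string",
    )
  );
}

// A literal string stands in for result.text here.
const parsed: unknown = JSON.parse('{"people":[{"name":"Ada","role":"Engineer"}]}');
if (isPeopleResult(parsed)) {
  console.log(parsed.people.length);
}
```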

`generateContentStream()`

Streams text chunks as an async generator. It uses the same retry and backoff behavior as generateContent().

async *generateContentStream(
  prompt: string,
  options?: GeminiOptions,
): AsyncGenerator<string>

Streaming to stdout

for await (const chunk of client.generateContentStream("Explain step by step.")) {
  process.stdout.write(chunk);
}

Collecting streamed output

let fullText = "";
for await (const chunk of client.generateContentStream(prompt, options)) {
  fullText += chunk;
  onChunk(chunk); // e.g. send to SSE client
}

`createChat()`

Creates a stateful multi-turn chat session.

createChat(options?: GeminiChatCreateOptions): GeminiChatSession

See Chat Sessions for details.

`files()`

Returns a GeminiFiles instance for uploading and managing files via the Files API.

files(): GeminiFiles

`fileSearchStores()`

Returns a GeminiFileSearchStores instance for managing File Search Stores and executing semantic search.

fileSearchStores(): GeminiFileSearchStores

GeminiOptions

Options passed to generateContent() and generateContentStream().

Property	Type	Description
`systemPrompt`	`string`	System-level instruction prepended to the prompt.
`documents`	`GeminiDocument[]`	Inline documents (base64-encoded in the request). Max size governed by `maxInlineDocumentSizeBytes`.
`fileReferences`	`GeminiFileReference[]`	References to files uploaded via the Files API. Use for files > 20 MB or for reuse across requests.
`maxOutputTokens`	`number`	Override the default max output tokens for this request.
`temperature`	`number`	Override the default temperature for this request.
`responseMimeType`	`"text/plain" | "application/json"`	Request plain text or structured JSON output.
`responseSchema`	`Record<string, any>`	JSON Schema for structured output (requires `responseMimeType: "application/json"`).
`responseJsonSchema`	`Record<string, any>`	Alias for `responseSchema` used by some SDK tooling.
`tools`	`Array<Record<string, any>>`	Gemini tools (e.g. `google_search`, `url_context`, `fileSearch`). Passed through to the API as-is.
`mediaResolution`	`"low" | "medium" | "high"`	Controls image/PDF processing resolution.
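
The choice between `documents` and `fileReferences` in the table above comes down to file size. A minimal sketch of that decision, assuming the documented 20 MB default; `chooseAttachmentMode` is a hypothetical helper, not part of the package.

```typescript
// 20 MB in bytes (20971520), the documented default for maxInlineDocumentSizeBytes.
const DEFAULT_MAX_INLINE_BYTES = 20 * 1024 * 1024;

type AttachmentMode = "documents" | "fileReferences";

// Hypothetical helper: pick the GeminiOptions field a file should travel in.
// Files over the inline limit need the Files API; smaller ones can go inline.
function chooseAttachmentMode(
  sizeBytes: number,
  maxInlineBytes: number = DEFAULT_MAX_INLINE_BYTES,
): AttachmentMode {
  return sizeBytes > maxInlineBytes ? "fileReferences" : "documents";
}

console.log(chooseAttachmentMode(5 * 1024 * 1024));
console.log(chooseAttachmentMode(50 * 1024 * 1024));
```

Size is not the only criterion: a small file that is reused across many requests may still be better uploaded once and passed as a file reference.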

Types

GeminiDocument

Inline document attachment (base64-encoded in the request body).

interface GeminiDocument {
  content: Buffer;   // Raw file bytes
  mimeType: string;  // e.g. "application/pdf"
  name: string;      // Display name
}

GeminiFileReference

Reference to a file previously uploaded via the Files API.

interface GeminiFileReference {
  fileUri: string;       // Files API URI
  mimeType: string;      // MIME type of the file
  displayName?: string;  // Optional display name
}

GeminiUsage

Token usage metadata returned from generation calls.

interface GeminiUsage {
  input?: number;    // Prompt token count
  output?: number;   // Output (candidates) token count
  total?: number;    // Total token count
  thoughts?: number; // Internal chain-of-thought tokens (Gemini thinking models)
}

Thinking tokens and output limits

Gemini thinking models count thinking tokens (usage.thoughts) against maxOutputTokens. If thoughts + output >= maxOutputTokens, the visible response may be truncated. Monitor usage.thoughts in production to tune your token budget.
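
The budget check above is simple arithmetic and can be wrapped in a helper for monitoring. This is a sketch: `UsageSketch` is a local copy of the `GeminiUsage` shape, and `mayBeTruncated` and `thinkingShare` are hypothetical names.

```typescript
// Local copy of the GeminiUsage shape shown above.
interface UsageSketch {
  input?: number;
  output?: number;
  total?: number;
  thoughts?: number;
}

// True when thoughts + output has reached the limit, which is the condition
// under which the visible response may have been truncated.
function mayBeTruncated(usage: UsageSketch, maxOutputTokens: number): boolean {
  const spent = (usage.thoughts ?? 0) + (usage.output ?? 0);
  return spent >= maxOutputTokens;
}

// Fraction of the output budget consumed by thinking, useful as a metric.
function thinkingShare(usage: UsageSketch, maxOutputTokens: number): number {
  return (usage.thoughts ?? 0) / maxOutputTokens;
}

console.log(mayBeTruncated({ thoughts: 6000, output: 2192 }, 8192));
console.log(thinkingShare({ thoughts: 2048 }, 8192));
```

If `mayBeTruncated` fires often, raising `maxOutputTokens` for the affected requests is the obvious first adjustment.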

Error Handling

The client throws descriptive errors in these scenarios:

Scenario	Error Message Pattern
Missing API key	`"GeminiClient: apiKey is required"`
Empty prompt	`"GeminiClient: prompt must be a non-empty string"`
Unsupported MIME type	`"Unsupported MIME type: {type}"`
Document too large	`"Document too large for inline upload: {name} ({size} bytes > {max} bytes)"`
All retries exhausted	`"Failed to generate content after 3 attempts: {reason}"`
Empty response with finish reason	`"Empty response from model. Finish reason: {reason}"`
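
Callers that want to branch on these failures can match on the message patterns in the table. The classifier below is a sketch that assumes the messages appear as documented; `GeminiErrorKind` and `classifyGeminiError` are hypothetical names, and string matching is only a fallback where no typed error is available.

```typescript
type GeminiErrorKind =
  | "missing_api_key"
  | "empty_prompt"
  | "unsupported_mime"
  | "document_too_large"
  | "retries_exhausted"
  | "empty_response"
  | "unknown";

// Map an error to one of the documented scenarios by its message.
function classifyGeminiError(err: unknown): GeminiErrorKind {
  const msg = err instanceof Error ? err.message : String(err);
  if (msg.includes("apiKey is required")) return "missing_api_key";
  if (msg.includes("prompt must be a non-empty string")) return "empty_prompt";
  if (msg.includes("Unsupported MIME type:")) return "unsupported_mime";
  if (msg.includes("Document too large for inline upload:")) return "document_too_large";
  // Checked before "empty_response" because the exhaustion message embeds the
  // underlying reason, which may itself be an empty-response message.
  if (msg.includes("Failed to generate content after")) return "retries_exhausted";
  if (msg.includes("Empty response from model")) return "empty_response";
  return "unknown";
}

console.log(classifyGeminiError(new Error("Unsupported MIME type: image/tiff")));
```

The first four kinds point to caller mistakes that a retry will not fix, while `retries_exhausted` usually calls for backing off at a higher level.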

Debug Logging

Enable debug logging by setting the DEBUG environment variable to gemini-client or gemini-api:

DEBUG=gemini-client node app.js

Logs include request IDs, timing, token usage, retry attempts, and response previews.

Gemini Integration Overview
Chat Sessions
File Search Stores
Files API
Knowledge Base Retrieval: uses GeminiFileSearchStores for RAG