GeminiClient
GeminiClient is the primary interface for interacting with Google's Gemini models. It wraps the @google/genai SDK with production-ready features: automatic retries with exponential backoff, document and MIME validation, streaming, structured JSON output, and injectable dependencies for testing.
Import
Constructor
```typescript
const client = new GeminiClient(config: GeminiClientConfig, deps?: {
  ai?: GenAIClientLike;
  sleep?: (ms: number) => Promise<void>;
});
```
GeminiClientConfig
| Property | Type | Default | Description |
|---|---|---|---|
| apiKey | string | required | Google AI API key. |
| model | string | "gemini-3-flash-preview" | Gemini model identifier. |
| maxOutputTokens | number | 8192 | Default max output tokens per request. |
| temperature | number | 0.7 | Default sampling temperature (0.0 to 2.0). |
| maxInlineDocumentSizeBytes | number | 20971520 (20 MB) | Max size for inline (base64) document attachments. |
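One way to read the defaults in this table is as a merge applied over the caller's values. The sketch below illustrates that reading only; it is not the library's actual constructor logic. `resolveConfig` and `DEFAULTS` are hypothetical names, and the `GeminiClientConfig` shape is reconstructed from the table rather than copied from the source code.

```typescript
// Shape reconstructed from the table above; only apiKey is required.
interface GeminiClientConfig {
  apiKey: string;
  model?: string;
  maxOutputTokens?: number;
  temperature?: number;
  maxInlineDocumentSizeBytes?: number;
}

// Default values as documented in the table.
const DEFAULTS = {
  model: "gemini-3-flash-preview",
  maxOutputTokens: 8192,
  temperature: 0.7,
  maxInlineDocumentSizeBytes: 20 * 1024 * 1024, // 20971520 bytes
};

// Hypothetical helper: fills in documented defaults for omitted fields.
function resolveConfig(config: GeminiClientConfig): Required<GeminiClientConfig> {
  if (!config.apiKey) {
    // Message taken from the Error Handling table.
    throw new Error("GeminiClient: apiKey is required");
  }
  return {
    apiKey: config.apiKey,
    model: config.model ?? DEFAULTS.model,
    maxOutputTokens: config.maxOutputTokens ?? DEFAULTS.maxOutputTokens,
    temperature: config.temperature ?? DEFAULTS.temperature,
    maxInlineDocumentSizeBytes:
      config.maxInlineDocumentSizeBytes ?? DEFAULTS.maxInlineDocumentSizeBytes,
  };
}
```

Using `??` rather than `||` keeps an explicit `temperature: 0` from being replaced by the default.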
Dependencies (optional)
| Property | Type | Description |
|---|---|---|
| ai | GenAIClientLike | Injectable Google GenAI client interface for testing. |
| sleep | (ms: number) => Promise<void> | Injectable sleep function for deterministic retry testing. |
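Injecting `sleep` lets a test observe the backoff delays without actually waiting. A minimal sketch of such a fake follows; `createRecordingSleep` is a hypothetical helper name, and the commented constructor call assumes the signature shown in the Constructor section.

```typescript
// Hypothetical test helper: a sleep replacement that records the requested
// delays and resolves immediately, so retry tests run without real waiting.
function createRecordingSleep(): {
  sleep: (ms: number) => Promise<void>;
  delays: number[];
} {
  const delays: number[] = [];
  const sleep = (ms: number): Promise<void> => {
    delays.push(ms);
    return Promise.resolve();
  };
  return { sleep, delays };
}

// In a test, the fake would be passed as the second constructor argument:
//   const { sleep, delays } = createRecordingSleep();
//   const client = new GeminiClient({ apiKey: "test-key" }, { sleep });
//   ...trigger a rate-limited call, then assert on `delays`.
```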
Properties
modelName
Returns the configured model identifier (e.g. "gemini-3-flash-preview").
getModelInfo()
Returns model configuration metadata.
Methods
generateContent()
Generates text from a prompt with optional documents, system instructions, and structured output.
```typescript
async generateContent(
  prompt: string,
  options?: GeminiOptions,
): Promise<{
  text: string;
  usage?: GeminiUsage;
  finishReason?: string;
  groundingMetadata?: any;
}>
```
Retry behavior: Automatically retries up to 3 times with exponential backoff on rate-limit errors (HTTP 429, "resource exhausted", "quota exceeded") and empty model responses.
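The retry policy described above could be sketched roughly as follows. This is an illustration under assumptions, not the client's actual implementation: the 1000 ms base delay and the doubling schedule are made up for the example (the source does not state them), the limit is read as three attempts in total, `withRetry`, `isRetryable`, and `backoffDelayMs` are hypothetical names, and retrying on empty model responses is left out for brevity.

```typescript
const MAX_ATTEMPTS = 3; // read here as three attempts in total

// Rate-limit signals named in the documentation, matched case-insensitively.
function isRetryable(error: unknown): boolean {
  const message = String(error instanceof Error ? error.message : error).toLowerCase();
  return (
    message.includes("429") ||
    message.includes("resource exhausted") ||
    message.includes("quota exceeded")
  );
}

// Assumed schedule: 1000 ms, 2000 ms, 4000 ms, ... (base delay is illustrative).
function backoffDelayMs(attempt: number, baseMs = 1000): number {
  return baseMs * 2 ** (attempt - 1);
}

async function withRetry<T>(
  operation: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Errors that are not rate-limit related are surfaced immediately.
      if (!isRetryable(error)) throw error;
      lastError = error;
      // Wait before the next attempt, but not after the final one.
      if (attempt < MAX_ATTEMPTS) await sleep(backoffDelayMs(attempt));
    }
  }
  const reason = lastError instanceof Error ? lastError.message : String(lastError);
  throw new Error(`Failed to generate content after ${MAX_ATTEMPTS} attempts: ${reason}`);
}
```

Taking `sleep` as a parameter mirrors the injectable dependency in the constructor and keeps the loop testable without real delays.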
JSON recovery: When responseMimeType is "application/json" and the non-streaming response is not valid JSON, the method automatically retries via streaming and concatenates chunks to recover a valid response.
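The recovery path can be pictured as "validate, then fall back to a stream". The sketch below injects both calls as plain functions so it runs without the SDK. `generateJsonWithRecovery` and `isValidJson` are hypothetical names, string chunks are an assumption, and the final error for a stream that is also invalid is an assumption about behavior the text above does not specify.

```typescript
// True when the text parses as JSON.
function isValidJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// `generate` stands in for the non-streaming call and `stream` for the
// streaming call; both are injected, so no real client is needed here.
async function generateJsonWithRecovery(
  generate: () => Promise<string>,
  stream: () => AsyncIterable<string>,
): Promise<string> {
  const text = await generate();
  if (isValidJson(text)) return text;

  // Fallback: issue the request as a stream and join the chunks in order.
  let recovered = "";
  for await (const chunk of stream()) {
    recovered += chunk;
  }
  if (!isValidJson(recovered)) {
    // Assumed behavior; the documentation does not say what happens here.
    throw new Error("Streamed response is not valid JSON either");
  }
  return recovered;
}
```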
Basic usage
With system prompt and documents
```typescript
import { readFileSync } from "fs";

const result = await client.generateContent(
  "Summarize the attached document.",
  {
    systemPrompt: "You are a technical document analyst.",
    documents: [{
      name: "report.pdf",
      mimeType: "application/pdf",
      content: readFileSync("./report.pdf"), // raw file bytes as a Buffer
    }],
    temperature: 0.3,      // per-request override of the default
    maxOutputTokens: 4096, // per-request override of the default
  },
);
```
Structured JSON output
```typescript
const result = await client.generateContent(
  "Extract all person names and their roles.",
  {
    responseMimeType: "application/json",
    responseSchema: {
      type: "object",
      properties: {
        people: {
          type: "array",
          items: {
            type: "object",
            properties: {
              name: { type: "string" },
              role: { type: "string" },
            },
            required: ["name", "role"],
          },
        },
      },
      required: ["people"],
    },
  },
);

// result.text holds the JSON document as a string.
const data = JSON.parse(result.text);
```
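`JSON.parse` returns an untyped value, so it can help to check the parsed data against the expected shape before using it. The type guard below mirrors the schema in the example above; `Person`, `PeopleResult`, `isPeopleResult`, and `parsePeople` are hypothetical names introduced for this sketch.

```typescript
interface Person {
  name: string;
  role: string;
}

interface PeopleResult {
  people: Person[];
}

// Runtime check mirroring the responseSchema: an object with a `people`
// array whose items each carry string `name` and `role` fields.
function isPeopleResult(value: unknown): value is PeopleResult {
  if (typeof value !== "object" || value === null) return false;
  const people = (value as { people?: unknown }).people;
  return (
    Array.isArray(people) &&
    people.every(
      (p) =>
        typeof p === "object" &&
        p !== null &&
        typeof (p as Person).name === "string" &&
        typeof (p as Person).role === "string",
    )
  );
}

// Parses response text and rejects anything that does not match the shape.
function parsePeople(text: string): PeopleResult {
  const data: unknown = JSON.parse(text);
  if (!isPeopleResult(data)) {
    throw new Error("Response does not match the expected people schema");
  }
  return data;
}
```

With this in place, `parsePeople(result.text)` would replace the bare `JSON.parse` call and give callers a typed value.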
generateContentStream()
Streams text chunks as an async generator. Same retry and backoff behavior as generateContent().
Streaming to stdout
Collecting streamed output
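Both patterns come down to iterating the async generator with `for await`. The sketch below uses a stand-in generator, `fakeStream`, in place of a real `generateContentStream()` call, and assumes each yielded chunk is a plain string; `streamTo` and `collectStream` are hypothetical helper names.

```typescript
// Stand-in for client.generateContentStream(...); yields string chunks.
async function* fakeStream(): AsyncGenerator<string> {
  yield "Hello";
  yield ", ";
  yield "world";
}

// Forwards each chunk to a sink as soon as it arrives.
async function streamTo(
  stream: AsyncIterable<string>,
  write: (chunk: string) => void,
): Promise<void> {
  for await (const chunk of stream) {
    write(chunk);
  }
}

// Accumulates all chunks into a single string.
async function collectStream(stream: AsyncIterable<string>): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text += chunk;
  }
  return text;
}
```

In Node.js, passing `(chunk) => process.stdout.write(chunk)` as `write` prints the text incrementally without adding line breaks.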
createChat()
Creates a stateful multi-turn chat session.
See Chat Sessions for details.
files()
Returns a GeminiFiles instance for uploading and managing files via the Files API.
fileSearchStores()
Returns a GeminiFileSearchStores instance for managing File Search Stores and executing semantic search.
GeminiOptions
Options passed to generateContent() and generateContentStream().
| Property | Type | Description |
|---|---|---|
| systemPrompt | string | System-level instruction prepended to the prompt. |
| documents | GeminiDocument[] | Inline documents (base64-encoded in the request). Max size governed by maxInlineDocumentSizeBytes. |
| fileReferences | GeminiFileReference[] | References to files uploaded via the Files API. Use for files > 20 MB or for reuse across requests. |
| maxOutputTokens | number | Override the default max output tokens for this request. |
| temperature | number | Override the default temperature for this request. |
| responseMimeType | "text/plain" \| "application/json" | Request plain text or structured JSON output. |
| responseSchema | Record<string, any> | JSON Schema for structured output (requires responseMimeType: "application/json"). |
| responseJsonSchema | Record<string, any> | Alias for responseSchema used by some SDK tooling. |
| tools | Array<Record<string, any>> | Gemini tools (e.g. google_search, url_context, fileSearch). Passed through to the API as-is. |
| mediaResolution | "low" \| "medium" \| "high" | Controls image/PDF processing resolution. |
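The choice between `documents` and `fileReferences` follows from the size limit and the reuse guidance in the table. A small sketch of that decision, assuming the default 20 MB inline limit; `chooseAttachmentMode` and `AttachmentMode` are hypothetical names, not part of the client.

```typescript
type AttachmentMode = "inline" | "fileReference";

// Documented default for maxInlineDocumentSizeBytes (20971520 bytes).
const DEFAULT_MAX_INLINE_BYTES = 20 * 1024 * 1024;

// Hypothetical helper: decides whether a file should go in `documents`
// (inline) or be uploaded first and passed via `fileReferences`.
function chooseAttachmentMode(
  sizeBytes: number,
  reusedAcrossRequests = false,
  maxInlineBytes = DEFAULT_MAX_INLINE_BYTES,
): AttachmentMode {
  if (reusedAcrossRequests || sizeBytes > maxInlineBytes) {
    return "fileReference";
  }
  return "inline";
}
```

A file exactly at the limit stays inline here, which matches the strict `>` comparison in the "Document too large" error message.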
Types
GeminiDocument
Inline document attachment (base64-encoded in the request body).
```typescript
interface GeminiDocument {
  content: Buffer;  // Raw file bytes
  mimeType: string; // e.g. "application/pdf"
  name: string;     // Display name
}
```
GeminiFileReference
Reference to a file previously uploaded via the Files API.
```typescript
interface GeminiFileReference {
  fileUri: string;      // Files API URI
  mimeType: string;     // MIME type of the file
  displayName?: string; // Optional display name
}
```
GeminiUsage
Token usage metadata returned from generation calls.
```typescript
interface GeminiUsage {
  input?: number;    // Prompt token count
  output?: number;   // Output (candidates) token count
  total?: number;    // Total token count
  thoughts?: number; // Internal chain-of-thought tokens (Gemini thinking models)
}
```
Thinking tokens and output limits
Gemini thinking models count thoughts tokens against maxOutputTokens. If thoughts + output >= maxOutputTokens, the visible response may be truncated. Monitor usage.thoughts in production to tune your token budget.
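The truncation condition above is simple enough to check after every call. The sketch below applies it to a `GeminiUsage` value; `hitOutputBudget` and `thoughtShare` are hypothetical helper names, and the interface is repeated only to keep the example self-contained.

```typescript
// Repeated from the Types section so the example stands alone.
interface GeminiUsage {
  input?: number;
  output?: number;
  total?: number;
  thoughts?: number;
}

// True when thinking plus visible output tokens reached the output budget,
// i.e. thoughts + output >= maxOutputTokens.
function hitOutputBudget(usage: GeminiUsage | undefined, maxOutputTokens: number): boolean {
  if (!usage) return false;
  const spent = (usage.thoughts ?? 0) + (usage.output ?? 0);
  return spent >= maxOutputTokens;
}

// Fraction of the spent output budget that went to thinking
// (0 when nothing was spent).
function thoughtShare(usage: GeminiUsage): number {
  const thoughts = usage.thoughts ?? 0;
  const spent = thoughts + (usage.output ?? 0);
  return spent === 0 ? 0 : thoughts / spent;
}
```

A consistently high `thoughtShare` together with `hitOutputBudget` returning true suggests raising `maxOutputTokens` for that workload.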
Error Handling
The client throws descriptive errors in these scenarios:
| Scenario | Error Message Pattern |
|---|---|
| Missing API key | "GeminiClient: apiKey is required" |
| Empty prompt | "GeminiClient: prompt must be a non-empty string" |
| Unsupported MIME type | "Unsupported MIME type: {type}" |
| Document too large | "Document too large for inline upload: {name} ({size} bytes > {max} bytes)" |
| All retries exhausted | "Failed to generate content after 3 attempts: {reason}" |
| Empty response with finish reason | "Empty response from model. Finish reason: {reason}" |
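Callers that need to branch on the failure type can match these message patterns. The classifier below is a sketch built only from the table above: `classifyGeminiError` and the `GeminiErrorKind` labels are hypothetical, and it assumes each message starts with the listed prefix.

```typescript
type GeminiErrorKind =
  | "missing_api_key"
  | "empty_prompt"
  | "unsupported_mime"
  | "document_too_large"
  | "retries_exhausted"
  | "empty_response"
  | "unknown";

// Prefix patterns taken from the error table; the first match wins.
const ERROR_PATTERNS: Array<[RegExp, GeminiErrorKind]> = [
  [/^GeminiClient: apiKey is required/, "missing_api_key"],
  [/^GeminiClient: prompt must be a non-empty string/, "empty_prompt"],
  [/^Unsupported MIME type: /, "unsupported_mime"],
  [/^Document too large for inline upload: /, "document_too_large"],
  [/^Failed to generate content after \d+ attempts: /, "retries_exhausted"],
  [/^Empty response from model\. Finish reason: /, "empty_response"],
];

function classifyGeminiError(error: unknown): GeminiErrorKind {
  const message = error instanceof Error ? error.message : String(error);
  for (const [pattern, kind] of ERROR_PATTERNS) {
    if (pattern.test(message)) return kind;
  }
  return "unknown";
}
```

Matching on message text is brittle if the wording changes between releases, so treat the `"unknown"` branch as a normal outcome rather than a bug.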
Debug Logging
Enable with DEBUG=gemini-client or DEBUG=gemini-api.
Logs include request IDs, timing, token usage, retry attempts, and response previews.
Related Pages
- Gemini Integration Overview
- Chat Sessions
- File Search Stores
- Files API
- Knowledge Base Retrieval -- uses GeminiFileSearchStores for RAG