IngestionService
IngestionService uploads documents into Gemini File Search Stores and attaches the metadata needed for later canonical document resolution. It supports ingestion from local files and from in-memory buffers, with automatic store creation and MIME type normalization.
Import
Constructor
| Parameter | Type | Description |
|---|---|---|
| stores | GeminiFileSearchStores | File Search Stores client for store management and uploads. |
| files | GeminiFiles | Optional Files API client (reserved for future use). |
Creating an ingestion service
Methods
ensureStore()
Ensure a File Search Store exists, optionally creating it if missing.
```typescript
async ensureStore(
  storeDisplayNameOrName: string,
  createIfMissing?: boolean, // default: true
): Promise<FileSearchStoreSummary>
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| storeDisplayNameOrName | string | required | Store display name, short name, or fully-qualified name. |
| createIfMissing | boolean | true | Create the store if it does not exist. Throws if false and the store is not found. |
Resolution order:
- If the name starts with `"fileSearchStores/"`, try `get()` directly.
- List all stores and match by `displayName`, `name`, or the pattern `fileSearchStores/{name}`.
- If no match and `createIfMissing` is `true`, create a new store with the given display name.
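The resolution order above can be sketched roughly as follows. This is an illustration under stated assumptions, not the service's implementation: `StoreSummary`, `StoreClient`, `findStore`, and `resolveStore` are hypothetical names, and falling through to the list step when the direct `get()` finds nothing is assumed rather than documented.

```typescript
// Hypothetical shapes for illustration; the real client types may differ.
interface StoreSummary {
  name: string; // fully-qualified, e.g. "fileSearchStores/abc123"
  displayName: string;
}

interface StoreClient {
  get(name: string): Promise<StoreSummary | undefined>;
  list(): Promise<StoreSummary[]>;
  create(displayName: string): Promise<StoreSummary>;
}

const STORE_PREFIX = "fileSearchStores/";

// Step 2: match by display name, exact name, or short name.
function findStore(
  stores: StoreSummary[],
  query: string,
): StoreSummary | undefined {
  return stores.find(
    (s) =>
      s.displayName === query ||
      s.name === query ||
      s.name === `${STORE_PREFIX}${query}`,
  );
}

async function resolveStore(
  client: StoreClient,
  query: string,
  createIfMissing = true,
): Promise<StoreSummary> {
  // Step 1: a fully-qualified name is looked up directly.
  if (query.startsWith(STORE_PREFIX)) {
    const direct = await client.get(query);
    if (direct) return direct;
  }

  // Step 2: otherwise scan the full store list.
  const match = findStore(await client.list(), query);
  if (match) return match;

  // Step 3: create the store, or fail if creation is disabled.
  if (!createIfMissing) {
    throw new Error(`File Search Store not found: ${query}`);
  }
  return client.create(query);
}
```

Keeping the matching step as a pure function makes the name-resolution rules easy to check without a live client.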
Example
ingestLocalFile()
Ingest a local file from disk into a File Search Store.
```typescript
async ingestLocalFile(
  filePath: string,
  storeDisplayNameOrName: string,
  opts?: KnowledgeBaseIngestOptions,
): Promise<KnowledgeBaseIngestResult>
```
| Parameter | Type | Description |
|---|---|---|
| filePath | string | Absolute or relative path to the file on disk. |
| storeDisplayNameOrName | string | Target store (display name, short name, or fully-qualified). |
| opts | KnowledgeBaseIngestOptions | Optional ingestion configuration. |
The method reads the file synchronously, auto-detects the MIME type from the file extension, and appends source: "local-file" and filename metadata automatically.
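To make the extension-based MIME detection concrete, here is a minimal sketch. The extension table, the `application/octet-stream` fallback, and the helper names `extensionOf` and `detectMimeType` are assumptions for illustration; the service's actual mapping is not documented on this page.

```typescript
// Hypothetical extension table; the service's real mapping may differ.
const MIME_BY_EXTENSION: Record<string, string> = {
  ".pdf": "application/pdf",
  ".md": "text/markdown",
  ".txt": "text/plain",
  ".html": "text/html",
  ".json": "application/json",
};

// Returns the lowercased extension (including the dot), or "" if none.
function extensionOf(filePath: string): string {
  const fileName = filePath.split(/[\\/]/).pop() ?? "";
  const dot = fileName.lastIndexOf(".");
  // dot === 0 means a dotfile such as ".gitignore", which has no extension.
  return dot > 0 ? fileName.slice(dot).toLowerCase() : "";
}

// An explicit override wins; otherwise fall back to the extension table.
function detectMimeType(filePath: string, override?: string): string {
  if (override) return override;
  // Assumed fallback for unknown extensions.
  return MIME_BY_EXTENSION[extensionOf(filePath)] ?? "application/octet-stream";
}
```

Passing `mimeType` in the options corresponds to the `override` argument here, which skips detection entirely.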
Ingesting a local file
```typescript
const result = await ingestion.ingestLocalFile(
  "/data/sops/cold-complaint-procedure.pdf",
  "hvac-knowledge-base",
  {
    metadata: [
      { key: "source_type", string_value: "s3" },
      { key: "s3_bucket", string_value: "acme-sops" },
      { key: "s3_key", string_value: "sops/cold-complaint-procedure.pdf" },
    ],
  },
);

console.log(result.storeName);    // "fileSearchStores/abc123"
console.log(result.documentName); // document resource name
console.log(result.displayName);  // "cold-complaint-procedure.pdf"
console.log(result.mimeType);     // "application/pdf"
console.log(result.done);         // true if indexing is complete
```
ingestBuffer()
Ingest raw content (Buffer) into a File Search Store.
```typescript
async ingestBuffer(
  content: Buffer,
  displayName: string,
  storeDisplayNameOrName: string,
  opts?: KnowledgeBaseIngestOptions,
): Promise<KnowledgeBaseIngestResult>
```
| Parameter | Type | Description |
|---|---|---|
| content | Buffer | Raw file content. |
| displayName | string | Display name for the document in the store. |
| storeDisplayNameOrName | string | Target store. |
| opts | KnowledgeBaseIngestOptions | Optional ingestion configuration. |
Appends source: "buffer" and filename metadata automatically.
Ingesting from a buffer
```typescript
import { readFileSync } from "fs";

// Read the file into a Buffer, then upload it under an explicit display name.
const content = readFileSync("./escalation-policy.md");

const result = await ingestion.ingestBuffer(
  content,
  "escalation-policy.md",
  "hvac-knowledge-base",
  {
    mimeType: "text/markdown",
    metadata: [
      { key: "source_type", string_value: "s3" },
      { key: "s3_bucket", string_value: "acme-sops" },
      { key: "s3_key", string_value: "sops/escalation-policy.md" },
    ],
  },
);
```
Ingesting with store auto-creation
Types
KnowledgeBaseIngestOptions
| Property | Type | Default | Description |
|---|---|---|---|
| createStoreIfMissing | boolean | true | Create the store if it does not exist. Set to false to require the store to already exist. |
| metadata | Array<{ key, string_value?, numeric_value? }> | [] | Custom metadata key-value pairs attached to the document. Used for canonical document resolution and filtering. |
| metadataFilterHint | string | -- | A filterable metadata string for application-level filtering. Stored as a metadataFilterHint metadata key. |
| mimeType | string | Auto-detected | Override MIME type. Auto-detected from file extension when omitted. |
KnowledgeBaseIngestResult
```typescript
interface KnowledgeBaseIngestResult {
  storeName: string;    // Fully-qualified store name
  documentName: string; // Document resource name in the store
  done: boolean;        // Whether indexing is complete
  displayName: string;  // Document display name
  mimeType: string;     // Resolved MIME type
}
```
Metadata for Canonical Resolution
When ingesting documents that should be resolvable back to their ground-truth source (for Canonical Grounding), include the appropriate metadata keys:
```typescript
metadata: [
  { key: "source_type", string_value: "s3" },
  { key: "s3_bucket", string_value: "my-bucket" },
  { key: "s3_key", string_value: "path/to/document.pdf" },
  // Optional:
  { key: "s3_region", string_value: "us-east-1" },
  { key: "s3_endpoint", string_value: "http://minio:9000" },
  { key: "s3_force_path_style", string_value: "true" },
]
```
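If many documents share the same S3 layout, a small helper can build these entries consistently. The sketch below is illustrative only: `MetadataEntry`, `S3SourceRef`, and `s3CanonicalMetadata` are hypothetical names that are not part of the library, and only the metadata keys themselves come from the list above.

```typescript
// Mirrors the entry shape described for the `metadata` option.
interface MetadataEntry {
  key: string;
  string_value?: string;
  numeric_value?: number;
}

// Hypothetical description of where the ground-truth document lives.
interface S3SourceRef {
  bucket: string;
  key: string;
  region?: string;
  endpoint?: string;
  forcePathStyle?: boolean;
}

function s3CanonicalMetadata(ref: S3SourceRef): MetadataEntry[] {
  // Required keys.
  const entries: MetadataEntry[] = [
    { key: "source_type", string_value: "s3" },
    { key: "s3_bucket", string_value: ref.bucket },
    { key: "s3_key", string_value: ref.key },
  ];
  // Optional keys are only emitted when a value was supplied.
  if (ref.region) {
    entries.push({ key: "s3_region", string_value: ref.region });
  }
  if (ref.endpoint) {
    entries.push({ key: "s3_endpoint", string_value: ref.endpoint });
  }
  if (ref.forcePathStyle !== undefined) {
    // Stored as the string "true" or "false", matching the list above.
    entries.push({
      key: "s3_force_path_style",
      string_value: String(ref.forcePathStyle),
    });
  }
  return entries;
}
```

The returned array can be passed as the `metadata` option of `ingestLocalFile()` or `ingestBuffer()`.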
Automatic Metadata
The ingestion service automatically appends the following metadata to every document:
| Key | Value | Added by |
|---|---|---|
| source | "local-file" or "buffer" | ingestLocalFile() / ingestBuffer() |
| filename | The document display name | Both methods |
| metadataFilterHint | Value of opts.metadataFilterHint | When provided |
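The effect of the table above can be pictured as a merge step applied before upload. The following is a sketch under assumptions, not the service's code: `withAutomaticMetadata` and `MetadataEntry` are hypothetical names, and placing the automatic entries after the custom ones without de-duplicating keys is assumed from the word "appends".

```typescript
interface MetadataEntry {
  key: string;
  string_value?: string;
  numeric_value?: number;
}

function withAutomaticMetadata(
  custom: MetadataEntry[],
  source: "local-file" | "buffer",
  displayName: string,
  metadataFilterHint?: string,
): MetadataEntry[] {
  // Custom entries come first; automatic entries are appended after them.
  const merged: MetadataEntry[] = [
    ...custom,
    { key: "source", string_value: source },
    { key: "filename", string_value: displayName },
  ];
  // The hint is only stored when the caller provided one.
  if (metadataFilterHint !== undefined) {
    merged.push({ key: "metadataFilterHint", string_value: metadataFilterHint });
  }
  return merged;
}
```

Because the automatic keys are added for every document, custom metadata should avoid reusing `source`, `filename`, and `metadataFilterHint` as keys.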
Related Pages
- Knowledge Base Overview -- full RAG pipeline
- Retrieval -- searching ingested documents
- Canonical Grounding -- resolving full documents from metadata
- GeminiFileSearchStores -- underlying upload API