Skip to content

DocumentQuery

DocumentQuery provides natural-language querying of structured documents. It loads parsed CSV/Excel data into an in-memory SQLite database and uses Gemini to generate SQL queries, execute them, and format the results in plain language. For structured/nested documents (e.g. financial statements), it falls back to a JSON + LLM direct analysis path.

Import

import { DocumentQuery, DocumentQueryGeminiLike } from "@modernpath/agent-framework";

Constructor

new DocumentQuery(parser: DocumentParser, gemini: DocumentQueryGeminiLike)
Parameter Type Description
parser DocumentParser A DocumentParser instance for converting data to JSON when using the structured analysis path.
gemini DocumentQueryGeminiLike Any object implementing generateContent(prompt, options?): Promise<{ text: string }>. Typically a GeminiClient.

DocumentQueryGeminiLike

interface DocumentQueryGeminiLike {
  generateContent(prompt: string, options?: any): Promise<{ text: string }>;
}

This interface allows you to pass any LLM client that matches this signature, keeping the module testable without a real Gemini dependency.

Methods

queryDocumentData

async queryDocumentData(
  parsedData: ParsedDocumentData,
  query: string,
  documentId?: string,
): Promise<string>

Queries parsed document data using natural language.

Parameter Type Description
parsedData ParsedDocumentData Previously parsed document data (from DocumentParser).
query string Natural-language question about the data.
documentId string Optional identifier for caching the SQLite database across multiple queries on the same document.

Returns: Promise<string> -- a plain-language answer to the query.

Throws: Error if the document type is not CSV or Excel, or if no data is present.

Query Path Selection:

  • Tabular data (structureType === "data-rows"): Data is loaded into SQLite, Gemini generates a SQL query, the query is executed, and results are formatted by Gemini.
  • Structured data (structureType === "structured"): Data is converted to JSON and sent to Gemini for direct analysis (no SQL involved).

dispose

dispose(documentId?: string): void

Closes cached SQLite databases. Call this when done querying a document, especially in serverless environments.

Parameter Type Description
documentId string If provided, closes only the database for that document. Otherwise, closes all cached databases.

Architecture

graph TD
    Q["Natural Language Query"] --> DQ[DocumentQuery]
    DQ -->|structureType = data-rows| SQL_PATH
    DQ -->|structureType = structured| JSON_PATH

    subgraph "SQL Path"
        SQL_PATH[Load SQLite] --> GEN[Gemini generates SQL]
        GEN --> EXEC[Execute SQL]
        EXEC --> FMT[Gemini formats results]
    end

    subgraph "JSON Path"
        JSON_PATH[Convert to JSON] --> DIRECT[Gemini direct analysis]
    end

Safety

The SQL execution layer enforces several safety measures:

  • Only SELECT and PRAGMA queries are allowed
  • Dangerous keywords (DROP, DELETE, INSERT, UPDATE, ALTER, etc.) are blocked
  • SQL comments (--, /* */) are rejected
  • All column values are stored as TEXT in SQLite; numeric operations require explicit CAST

Code Example

import { DocumentParser, DocumentQuery } from "@modernpath/agent-framework";

const parser = new DocumentParser();
const query = new DocumentQuery(parser, geminiClient);

// Parse the document
const parsed = await parser.parseDocument(csvBuffer, "text/csv", "sales.csv");

// Ask questions in natural language
const answer1 = await query.queryDocumentData(
  parsed,
  "What is the total revenue for Q1 2024?",
  "sales-doc-1",
);
console.log(answer1);
// "The total revenue for Q1 2024 is $1,234,567.89 based on 342 transactions."

// Ask follow-up questions (reuses cached SQLite database)
const answer2 = await query.queryDocumentData(
  parsed,
  "Which region had the highest sales?",
  "sales-doc-1",
);

// Clean up when done
query.dispose("sales-doc-1");