Document Processing

The document processing module handles parsing, querying, and attachment preparation for documents used by agents. It supports CSV, Excel, JSON, and plain text files, and provides SQL-based querying of structured data as well as prompt construction for LLM consumption.

Import

import {
  DocumentParser,
  DocumentQuery,
  DocumentAttachment,
  ParsedDocumentData,
  PreparedDocument,
  DocumentContentType,
} from "@modernpath/agent-framework";

Module Map

graph LR
    DP[DocumentParser] -->|produces| PD[ParsedDocumentData]
    DQ[DocumentQuery] -->|uses| DP
    DQ -->|queries via| SQL[SQLite]
    DQ -->|generates SQL via| G[Gemini]
    DA[DocumentAttachment] -->|uses| DP
    DA -->|prepares for| LLM["LLM Prompt"]
    DA -->|uploads via| FA["Files API"]

Components

DocumentParser

Parses CSV, Excel (.xlsx, .xls), and plain text files into a structured ParsedDocumentData format. Includes automatic delimiter detection, structure analysis (tabular vs. structured/financial), and content summarization.
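
To make "automatic delimiter detection" concrete, here is a minimal sketch of one common heuristic: pick the candidate character that appears the same number of times on every sampled line. The function name `detectDelimiter`, the candidate list, and the sample size are assumptions for illustration; this is not the DocumentParser implementation, and it ignores quoted fields.

```typescript
// Hypothetical helper, not part of @modernpath/agent-framework.
const CANDIDATE_DELIMITERS = [",", ";", "\t", "|"] as const;

function detectDelimiter(text: string, sampleSize = 10): string {
  // Sample the first few non-empty lines.
  const lines = text
    .split(/\r?\n/)
    .filter((line) => line.trim().length > 0)
    .slice(0, sampleSize);

  let best = ","; // fall back to comma when nothing matches
  let bestScore = 0;

  for (const delimiter of CANDIDATE_DELIMITERS) {
    const counts = lines.map((line) => line.split(delimiter).length - 1);
    const first = counts[0] ?? 0;
    // A plausible delimiter occurs at least once, and equally often on every line.
    const consistent = first > 0 && counts.every((count) => count === first);
    if (consistent && first > bestScore) {
      best = delimiter;
      bestScore = first;
    }
  }
  return best;
}
```

A production parser would also need to handle delimiters inside quoted fields, which this sketch deliberately skips.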

DocumentQuery

Enables natural-language querying of structured documents by loading parsed data into an in-memory SQLite database and using Gemini to generate and explain SQL queries. Supports both tabular (SQL path) and structured (JSON + LLM path) document types.
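
The SQL path can be pictured as turning parsed rows into a table definition before any query runs. The sketch below only builds the SQL strings; it does not open a database. The `TabularData` shape, the helper names, and the type-inference rule are assumptions for illustration, and the real ParsedDocumentData structure and DocumentQuery internals may differ.

```typescript
// Hypothetical shape for illustration; the real ParsedDocumentData may differ.
type Cell = string | number | null;

interface TabularData {
  headers: string[];
  rows: Cell[][];
}

type SqliteType = "INTEGER" | "REAL" | "TEXT";

// Quote an identifier so headers with spaces or quotes stay valid SQL.
function quoteIdentifier(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Infer a column type from its non-empty values.
function inferColumnType(values: Cell[]): SqliteType {
  const present = values.filter((v) => v !== null && String(v).trim() !== "");
  if (present.length === 0) return "TEXT";
  if (present.every((v) => Number.isInteger(Number(v)))) return "INTEGER";
  if (present.every((v) => Number.isFinite(Number(v)))) return "REAL";
  return "TEXT";
}

// Build a CREATE TABLE statement and a parameterized INSERT for the data.
function buildTableSql(tableName: string, data: TabularData) {
  const columns = data.headers.map((header, i) => {
    const values = data.rows.map((row) => row[i] ?? null);
    return `${quoteIdentifier(header)} ${inferColumnType(values)}`;
  });
  const placeholders = data.headers.map(() => "?").join(", ");
  return {
    createTable: `CREATE TABLE ${quoteIdentifier(tableName)} (${columns.join(", ")})`,
    insertRow: `INSERT INTO ${quoteIdentifier(tableName)} VALUES (${placeholders})`,
  };
}
```

Using `?` placeholders for row values keeps cell contents out of the SQL text, which matters when the documents come from untrusted sources.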

DocumentAttachment

Bridges document content to LLM consumption. Determines the appropriate handling strategy for each document type (text, tabular, binary attachment, Files API) and provides the static buildPromptWithDocuments() method for constructing prompts with document context.
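
One way to picture the strategy decision is as a small routing function over MIME type and size. The four strategy names below mirror the handling types listed above, but the function `chooseStrategy`, the MIME-type set, and the size threshold are assumptions for illustration; the criteria DocumentAttachment actually applies are not documented here.

```typescript
// Hypothetical routing sketch, not the DocumentAttachment implementation.
type HandlingStrategy = "text" | "tabular" | "attachment" | "files-api";

const TABULAR_MIME_TYPES = new Set([
  "text/csv",
  "application/vnd.ms-excel",
  "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
]);

// Assumed cutoff above which a binary file is uploaded instead of inlined.
const INLINE_LIMIT_BYTES = 20 * 1024 * 1024;

function chooseStrategy(mimeType: string, sizeBytes: number): HandlingStrategy {
  // Drop parameters such as "; charset=utf-8" before matching.
  const type = (mimeType.toLowerCase().split(";")[0] ?? "").trim();

  if (TABULAR_MIME_TYPES.has(type)) return "tabular";
  if (type.startsWith("text/") || type === "application/json") return "text";
  // Everything else is treated as binary: inline when small, upload when large.
  return sizeBytes <= INLINE_LIMIT_BYTES ? "attachment" : "files-api";
}
```

Checking the tabular types before the generic `text/` prefix matters, because `text/csv` would otherwise be routed as plain text.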

How They Work Together

import { DocumentParser, DocumentQuery, DocumentAttachment } from "@modernpath/agent-framework";

// 1. Parse a document
const parser = new DocumentParser();
const parsed = await parser.parseDocument(csvBuffer, "text/csv", "sales.csv");

// 2. Query it with natural language
const query = new DocumentQuery(parser, geminiClient);
const answer = await query.queryDocumentData(parsed, "What is the total revenue?");

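// 3. Or attach it to a prompt
// Constructor arguments are assumed here; check the DocumentAttachment signature.
const attachment = new DocumentAttachment(parser);
const prepared = await attachment.prepareInlineDocument(
  csvBuffer, "text/csv", "sales.csv"
);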
const { prompt, documents } = DocumentAttachment.buildPromptWithDocuments(
  "Analyze the attached sales data.", prepared
);