Document Processing
The document processing module handles parsing, querying, and attachment preparation for documents used by agents. It supports CSV, Excel, JSON, and plain text files, and provides both SQL-based querying of structured data and intelligent prompt construction for LLM consumption.
Import
import {
  DocumentParser,
  DocumentQuery,
  DocumentAttachment,
  ParsedDocumentData,
  PreparedDocument,
  DocumentContentType,
} from "@modernpath/agent-framework";
Module Map
graph LR
DP[DocumentParser] -->|produces| PD[ParsedDocumentData]
DQ[DocumentQuery] -->|uses| DP
DQ -->|queries via| SQL[SQLite]
DQ -->|generates SQL via| G[Gemini]
DA[DocumentAttachment] -->|uses| DP
DA -->|prepares for| LLM["LLM Prompt"]
DA -->|uploads via| FA["Files API"]
Components
DocumentParser
Parses CSV, Excel (.xlsx, .xls), and plain text files into a structured ParsedDocumentData format. Includes automatic delimiter detection, structure analysis (tabular vs. structured/financial), and content summarization.
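To make the idea of automatic delimiter detection concrete, here is a minimal sketch of one common heuristic: choose the candidate delimiter that appears the same non-zero number of times on every sampled line. This is an illustration only. The names `detectDelimiter`, `countOutsideQuotes`, and `CANDIDATE_DELIMITERS` are hypothetical and are not part of `@modernpath/agent-framework`, and the heuristic DocumentParser actually uses may differ.

```typescript
// Hypothetical helper for illustration; not the framework's implementation.
const CANDIDATE_DELIMITERS = [",", ";", "\t", "|"];

// Count occurrences of a delimiter in one line, skipping text inside double quotes.
function countOutsideQuotes(line: string, delimiter: string): number {
  let count = 0;
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (ch === '"') {
      inQuotes = !inQuotes;
    } else if (ch === delimiter && !inQuotes) {
      count++;
    }
  }
  return count;
}

// Return the delimiter whose per-line count is identical across the sampled
// lines and greater than zero; prefer the one with the highest count.
// Falls back to "," when no candidate qualifies.
function detectDelimiter(text: string, sampleSize: number = 10): string {
  const lines = text
    .split(/\r?\n/)
    .filter((line) => line.trim().length > 0)
    .slice(0, sampleSize);

  let best = ",";
  let bestCount = 0;
  for (const delimiter of CANDIDATE_DELIMITERS) {
    const counts = lines.map((line) => countOutsideQuotes(line, delimiter));
    const first = counts.length > 0 ? counts[0] : 0;
    const consistent = counts.every((c) => c === first);
    if (consistent && first !== undefined && first > bestCount) {
      best = delimiter;
      bestCount = first;
    }
  }
  return best;
}
```

Sampling only the first few lines keeps detection cheap on large files, at the cost of missing irregularities further down.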
DocumentQuery
Enables natural-language querying of structured documents by loading parsed data into an in-memory SQLite database and using Gemini to generate and explain SQL queries. Supports both tabular (SQL path) and structured (JSON + LLM path) document types.
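Loading tabular data into SQLite requires a table schema first. The sketch below shows one way such a schema could be derived from headers and string cell values. It is an assumption-laden illustration: the `TabularData` shape and the helpers `toIdentifier`, `inferColumnType`, and `buildCreateTable` are hypothetical, and this page does not document how DocumentQuery actually builds its tables or what ParsedDocumentData looks like.

```typescript
// Hypothetical shape for illustration; the real ParsedDocumentData may differ.
interface TabularData {
  headers: string[];
  rows: string[][];
}

// Turn an arbitrary header into a safe, lowercase SQL identifier.
function toIdentifier(header: string, index: number): string {
  const cleaned = header
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9_]+/g, "_")
    .replace(/^_+|_+$/g, "");
  if (cleaned.length === 0) return `col_${index + 1}`;
  return /^[0-9]/.test(cleaned) ? `col_${cleaned}` : cleaned;
}

// Infer a SQLite column type from the non-empty values of one column.
function inferColumnType(values: string[]): "INTEGER" | "REAL" | "TEXT" {
  const nonEmpty = values.map((v) => v.trim()).filter((v) => v !== "");
  if (nonEmpty.length === 0) return "TEXT";
  if (nonEmpty.every((v) => /^-?\d+$/.test(v))) return "INTEGER";
  if (nonEmpty.every((v) => /^-?\d+(\.\d+)?$/.test(v))) return "REAL";
  return "TEXT";
}

// Build a CREATE TABLE statement. Duplicate header names are not handled here.
function buildCreateTable(tableName: string, data: TabularData): string {
  const columns = data.headers.map((header, i) => {
    const values = data.rows.map((row) => row[i] ?? "");
    return `"${toIdentifier(header, i)}" ${inferColumnType(values)}`;
  });
  return `CREATE TABLE "${toIdentifier(tableName, 0)}" (${columns.join(", ")});`;
}
```

Normalizing column names this way gives the model that generates SQL predictable identifiers to reference, which is one reason to sanitize headers before creating the table.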
DocumentAttachment
Bridges document content to LLM consumption. Determines the appropriate handling strategy for each document type (text, tabular, binary attachment, Files API) and provides the static buildPromptWithDocuments() method for constructing prompts with document context.
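The strategy decision described above can be pictured as a small dispatch on MIME type and size. The following is a sketch under stated assumptions: the `HandlingStrategy` labels mirror the four strategies named in the text, but `chooseStrategy`, `TABULAR_TYPES`, and the 4 MiB `INLINE_LIMIT_BYTES` threshold are hypothetical and not taken from the framework.

```typescript
// Labels mirror the strategies named in the text; the type itself is hypothetical.
type HandlingStrategy = "text" | "tabular" | "binary" | "files-api";

// Assumed size limit for inline handling; the framework's real threshold is not documented here.
const INLINE_LIMIT_BYTES = 4 * 1024 * 1024;

// Standard MIME types for CSV, .xls, and .xlsx files.
const TABULAR_TYPES = new Set<string>([
  "text/csv",
  "application/vnd.ms-excel",
  "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
]);

// Pick a handling strategy from the MIME type and size of a document.
function chooseStrategy(mimeType: string, sizeBytes: number): HandlingStrategy {
  // Drop parameters such as "; charset=utf-8" before comparing.
  const type = (mimeType.split(";")[0] ?? "").trim().toLowerCase();

  if (sizeBytes > INLINE_LIMIT_BYTES) return "files-api";
  if (TABULAR_TYPES.has(type)) return "tabular";
  if (type.startsWith("text/") || type === "application/json") return "text";
  return "binary";
}
```

Checking size before type means an oversized CSV is uploaded rather than inlined, which keeps prompt size bounded regardless of format.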
How They Work Together
import { DocumentParser, DocumentQuery, DocumentAttachment } from "@modernpath/agent-framework";
// 1. Parse a document (csvBuffer is assumed to hold the raw bytes of the file)
const parser = new DocumentParser();
const parsed = await parser.parseDocument(csvBuffer, "text/csv", "sales.csv");
// 2. Query it with natural language (geminiClient is an already-configured Gemini client)
const query = new DocumentQuery(parser, geminiClient);
const answer = await query.queryDocumentData(parsed, "What is the total revenue?");
// 3. Or attach it to a prompt (attachment is a DocumentAttachment instance created elsewhere)
const prepared = await attachment.prepareInlineDocument(
  csvBuffer, "text/csv", "sales.csv"
);
const { prompt, documents } = DocumentAttachment.buildPromptWithDocuments(
  "Analyze the attached sales data.", prepared
);
Related Pages
- SharePoint Integration -- downloading documents from SharePoint for processing
- Gemini Integration -- LLM client used for document queries and Files API
- Knowledge Base -- RAG retrieval from indexed documents