DocumentAnalysisAgent

Agent for performing document operations against SharePoint-hosted documents via the Gemini model. Supports eight actions: list, select, query, analyze, compare, extract, summarize, and general (a fallback for unrecognized intents). When no explicit action is provided, the agent classifies the intent from the prompt.

Import

import { DocumentAnalysisAgent } from "@modernpath/agent-framework";

Constructor

class DocumentAnalysisAgent extends BaseAgent {
  constructor(
    toolRegistry: ToolRegistry,
    gemini: GeminiClient,
    prompts: PromptTemplate,
  )
}
| Parameter | Type | Required | Description |
|---|---|---|---|
| toolRegistry | ToolRegistry | Yes | Must contain the document tools this agent calls (see Required Tools). |
| gemini | GeminiClient | Yes | Gemini model client for intent classification and direct document analysis. |
| prompts | PromptTemplate | Yes | Prompt template engine. Must have document-analysis.intent_classification.* templates loaded. |

Supported Actions

The agent dispatches to one of eight actions based on context.parameters.action. If no action is specified, the agent uses LLM-based intent classification to determine the appropriate action.

| Action | Description | Required Parameters |
|---|---|---|
| list | List documents in a SharePoint site/folder | siteId |
| select | List documents and select the most relevant ones for a given prompt | siteId |
| query | Retrieve a document and query its contents (tabular or free-form) | siteId, documentId |
| analyze | Deep analysis of a single document (auto-selects if documentId not provided) | siteId |
| compare | Compare two or more documents | siteId, documentIds (array, 2+) |
| extract | Extract structured data from a document | siteId, documentId |
| summarize | Generate a summary of a document | siteId, documentId |
| general | Fallback for unrecognized intents | -- |
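The required parameters in the table can be checked before calling the agent. The following is a minimal sketch; `REQUIRED` and `missingParameters` are hypothetical names for illustration and are not part of @modernpath/agent-framework, and the documentation does not say how the agent itself reports missing parameters.

```typescript
// Hypothetical pre-flight check; not part of @modernpath/agent-framework.
type Action =
  | "list" | "select" | "query" | "analyze"
  | "compare" | "extract" | "summarize" | "general";

// Required parameters per action, transcribed from the Supported Actions table.
const REQUIRED: Record<Action, readonly string[]> = {
  list: ["siteId"],
  select: ["siteId"],
  query: ["siteId", "documentId"],
  analyze: ["siteId"],
  compare: ["siteId", "documentIds"],
  extract: ["siteId", "documentId"],
  summarize: ["siteId", "documentId"],
  general: [],
};

// Returns the names of required parameters that are missing or malformed.
function missingParameters(
  action: Action,
  parameters: Record<string, unknown>,
): string[] {
  return REQUIRED[action].filter((key) => {
    const value = parameters[key];
    if (key === "documentIds") {
      // compare needs an array holding at least two document IDs.
      return !Array.isArray(value) || value.length < 2;
    }
    return typeof value !== "string" || value.length === 0;
  });
}
```

Calling `missingParameters("query", { siteId: "site-abc-123" })` returns `["documentId"]`, which a caller could surface to the user before spending a model call.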

Context Parameters

The agent reads these keys from AgentContext.parameters:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| action | string | No | Auto-classified | One of list, select, query, analyze, compare, extract, summarize, general. If omitted, the agent classifies intent from the prompt. |
| siteId | string | Yes (all actions except general) | -- | SharePoint site ID. |
| documentId | string | Depends on action | -- | Document ID for single-document operations. Required for query, extract, and summarize; optional for analyze, which auto-selects a document when it is omitted. |
| documentIds | string[] | For compare | -- | Array of document IDs to compare (minimum 2). |
| folderPath | string | No | undefined | SharePoint folder path to scope the listing. |
| fileTypes | string[] | No | undefined | File type filter (e.g. ["pdf", "xlsx"]). |
| pageSize | number | No | Varies | Number of documents to return in list operations. |
| skipToken | string | No | undefined | Pagination token for the list action. |
| maxSelection | number | No | undefined | Maximum number of documents to select in the select action. |
| analysisType | string | No | undefined | Type of analysis for the analyze action. |
| extractionSchema | object | No | undefined | JSON schema defining the extraction structure for the extract action. |
| comparisonType | string | No | undefined | Type of comparison for the compare action. |
| systemPrompt | string | No | undefined | Custom system prompt for the query action when using direct LLM analysis. |
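For callers who want compile-time checking, the parameter keys can be written down as a type. This is a sketch only: `DocumentAnalysisParameters` is a hypothetical name, the framework may export its own type, and the invoice schema is an invented illustration of what an extractionSchema could look like.

```typescript
// Hypothetical type; field names and types are taken from the Context Parameters table.
type DocumentAction =
  | "list" | "select" | "query" | "analyze"
  | "compare" | "extract" | "summarize" | "general";

interface DocumentAnalysisParameters {
  action?: DocumentAction;
  siteId?: string;
  documentId?: string;
  documentIds?: string[];
  folderPath?: string;
  fileTypes?: string[];
  pageSize?: number;
  skipToken?: string;
  maxSelection?: number;
  analysisType?: string;
  extractionSchema?: Record<string, unknown>;
  comparisonType?: string;
  systemPrompt?: string;
}

// Example parameters for an extract call. The schema contents are illustrative.
const invoiceParameters: DocumentAnalysisParameters = {
  action: "extract",
  siteId: "site-abc-123",
  documentId: "doc-xyz-789",
  extractionSchema: {
    type: "object",
    properties: {
      invoiceNumber: { type: "string" },
      total: { type: "number" },
    },
    required: ["invoiceNumber", "total"],
  },
};
```

All fields are optional in the type because which ones are required depends on the action.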

Required Tools

The agent invokes these tools via the ToolRegistry. Register them before constructing the agent:

| Tool Name | Used By Actions | Description |
|---|---|---|
| documentList | list, select, analyze, compare | Lists documents in a SharePoint site. |
| documentSelect | select, analyze | Uses the LLM to select the most relevant documents for a prompt. |
| documentRetrieve | query | Downloads and parses a document for querying. |
| documentQuery | query | Queries tabular document data (CSV, Excel). |
| documentAnalyze | analyze | Performs deep analysis of a single document. |
| documentSummarize | summarize | Generates a document summary. |
| documentExtract | extract | Extracts structured data from a document. |
| documentCompare | compare | Compares multiple documents. |

Not all tools are required

You only need to register the tools for the actions you plan to use. For example, if you only use list and query, register documentList, documentRetrieve, and documentQuery.
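The tool-to-action mapping above can be inverted to work out which tools a deployment needs. The helper below is a hypothetical sketch (`TOOLS_BY_ACTION` and `toolsForActions` are not framework exports); the mapping itself is transcribed from the Required Tools table.

```typescript
// Hypothetical helper; not part of @modernpath/agent-framework.
type Action =
  | "list" | "select" | "query" | "analyze"
  | "compare" | "extract" | "summarize" | "general";

// Tools each action may call, derived from the "Used By Actions" column.
const TOOLS_BY_ACTION: Record<Action, readonly string[]> = {
  list: ["documentList"],
  select: ["documentList", "documentSelect"],
  query: ["documentRetrieve", "documentQuery"],
  analyze: ["documentList", "documentSelect", "documentAnalyze"],
  compare: ["documentList", "documentCompare"],
  extract: ["documentExtract"],
  summarize: ["documentSummarize"],
  general: [],
};

// Returns the de-duplicated tool names needed for the given actions,
// in first-seen order.
function toolsForActions(actions: readonly Action[]): string[] {
  const names = new Set<string>();
  for (const action of actions) {
    for (const tool of TOOLS_BY_ACTION[action]) {
      names.add(tool);
    }
  }
  return Array.from(names);
}
```

For example, `toolsForActions(["list", "query"])` yields documentList, documentRetrieve, and documentQuery, matching the case described above.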

Return Value

The structure of AgentResult.data depends on the action.

For the list action:

{
  documents: Array<{
    id: string;
    name: string;
    // ... additional document metadata
  }>;
}

For the select action:

{
  selectedDocumentIds: string[];
  candidates: Array<{ id: string; name: string; ... }>;
}

For the query action on tabular documents (CSV, Excel):

{
  // Query result from documentQuery tool
}

For the query action on non-tabular documents:

string  // Direct LLM analysis text

For the analyze action:

{
  // Analysis result from documentAnalyze tool
}

For the compare action:

{
  // Comparison result from documentCompare tool
}

If compare receives fewer than two document IDs:

{
  message: "Provide parameters.documentIds (2+).";
  candidates: Array<{ id: string; name: string; ... }>;
}

For the extract and summarize actions:

{
  // Result from documentExtract or documentSummarize tool
}
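Because the shape of the data varies, callers may want to narrow it before use. The sketch below covers the two cases that are easiest to get wrong: a query result that is either a string or an object, and the compare fallback returned when too few document IDs are supplied. The interfaces and function names are hypothetical, and the check assumes that a successful comparison result does not itself carry both a message string and a candidates array.

```typescript
// Hypothetical shapes inferred from the Return Value section.
interface Candidate {
  id: string;
  name: string;
}

interface CompareFallback {
  message: string;
  candidates: Candidate[];
}

// True when a compare call returned the "too few document IDs" fallback.
function isCompareFallback(data: unknown): data is CompareFallback {
  if (typeof data !== "object" || data === null) return false;
  const record = data as Record<string, unknown>;
  return typeof record.message === "string" && Array.isArray(record.candidates);
}

// Renders query data: non-tabular documents yield plain text,
// tabular documents yield an object, shown here as indented JSON.
function describeQueryData(data: unknown): string {
  return typeof data === "string" ? data : JSON.stringify(data, null, 2);
}
```

A caller could use `isCompareFallback(result.data)` to present the candidates to the user and ask which documents to compare.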

Intent Classification

When parameters.action is not set, the agent classifies the user's intent with the Gemini model, using a low temperature (0.1) and a limit of 10 output tokens. The classification prompt is rendered from two templates:

  • document-analysis.intent_classification.system -- system prompt (optional)
  • document-analysis.intent_classification.user_template -- user prompt with prompt and history variables

The model returns one of: query, list, select, compare, extract, summarize, analyze, general.
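Model output is free text, so a caller reproducing this step would need to map the reply onto one of the eight action names. The documentation does not describe how the agent parses the reply; the following is one possible sketch, with `normalizeAction` as a hypothetical helper and the fallback to general as an assumption based on general being the action for unrecognized intents.

```typescript
// Hypothetical normalizer; the agent's actual parsing logic is not documented.
const ACTIONS = [
  "query", "list", "select", "compare",
  "extract", "summarize", "analyze", "general",
] as const;

type Action = (typeof ACTIONS)[number];

// Maps raw model text to a known action, falling back to "general".
function normalizeAction(raw: string): Action {
  // Drop whitespace, casing, and punctuation such as quotes or a trailing period.
  const cleaned = raw.trim().toLowerCase().replace(/[^a-z]/g, "");
  const match = ACTIONS.find((action) => action === cleaned);
  return match ?? "general";
}
```

With this approach a reply such as " Summarize.\n" resolves to summarize, while a full sentence that is not a bare action name resolves to general.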

Execution Flow

flowchart TD
    Start[execute] --> HasAction{action provided?}
    HasAction -->|Yes| Dispatch
    HasAction -->|No| Classify[LLM intent classification]
    Classify --> Dispatch{action}
    Dispatch -->|list| List[documentList tool]
    Dispatch -->|select| Select[documentList + documentSelect]
    Dispatch -->|query| Query[documentRetrieve + documentQuery/LLM]
    Dispatch -->|analyze| Analyze[auto-select + documentAnalyze]
    Dispatch -->|compare| Compare[documentCompare]
    Dispatch -->|extract| Extract[documentExtract]
    Dispatch -->|summarize| Summarize[documentSummarize]
    Dispatch -->|general| General[Unsupported action message]
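The flow in the diagram can be approximated in a few lines. This is an illustrative sketch, not the agent's source: `dispatch`, `Handler`, and `FlowParameters` are hypothetical names, `classify` stands in for the Gemini intent-classification call, and each handler stands in for the tool pipeline of one branch.

```typescript
// Hypothetical sketch of the execution flow; not the agent's implementation.
type Action =
  | "list" | "select" | "query" | "analyze"
  | "compare" | "extract" | "summarize" | "general";

interface FlowParameters {
  action?: Action;
}

type Handler = (parameters: FlowParameters) => Promise<unknown>;

async function dispatch(
  parameters: FlowParameters,
  classify: () => Promise<Action>,
  handlers: Record<Action, Handler>,
): Promise<{ action: Action; data: unknown }> {
  // An explicit action skips classification; otherwise ask the classifier.
  const action = parameters.action ?? (await classify());
  // Route to the handler for the resolved action.
  const data = await handlers[action](parameters);
  return { action, data };
}
```

Passing the classifier and handlers as arguments keeps the routing logic testable without a model client or a SharePoint connection.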

Code Examples

List Documents

import { DocumentAnalysisAgent, ToolRegistry } from "@modernpath/agent-framework";

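// geminiClient, prompts, and documentListTool are assumed to be created elsewhere.
const registry = new ToolRegistry();
// Register the tools for the actions you use; the list action only needs documentList.
registry.register("documentList", documentListTool);

const agent = new DocumentAnalysisAgent(registry, geminiClient, prompts);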

const result = await agent.execute({
  userId: 42,
  auditingId: 1001,
  prompt: "Show me all documents",
  parameters: {
    action: "list",
    siteId: "site-abc-123",
    folderPath: "/Shared Documents/Finance",
    fileTypes: ["pdf", "xlsx"],
    pageSize: 20,
  },
});

if (result.success) {
  for (const doc of result.data.documents) {
    console.log(`${doc.name} (${doc.id})`);
  }
}

Query a Document

const result = await agent.execute({
  userId: 42,
  auditingId: 1001,
  prompt: "What was the total revenue in Q3?",
  parameters: {
    action: "query",
    siteId: "site-abc-123",
    documentId: "doc-xyz-789",
  },
});

Compare Documents

const result = await agent.execute({
  userId: 42,
  auditingId: 1001,
  prompt: "Compare the budget proposals for Q3 and Q4",
  parameters: {
    action: "compare",
    siteId: "site-abc-123",
    documentIds: ["doc-q3-budget", "doc-q4-budget"],
    comparisonType: "financial",
  },
});

Auto-classify Intent

// No action specified -- the agent classifies intent from the prompt
const result = await agent.execute({
  userId: 42,
  auditingId: 1001,
  prompt: "Summarize the annual report",
  parameters: {
    siteId: "site-abc-123",
    documentId: "doc-annual-report",
  },
});
// Agent classifies intent as "summarize" and calls documentSummarize