DocumentAnalysisAgent
Agent that performs document operations on SharePoint-hosted documents using the Gemini model. It supports eight actions: list, select, query, analyze, compare, extract, summarize, and general (a fallback for unrecognized intents). When no explicit action is provided, the agent classifies the intent automatically.
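Import
The agent is exported from the @modernpath/agent-framework package, alongside ToolRegistry, as the import line in the List Documents example shows.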
Constructor

```typescript
class DocumentAnalysisAgent extends BaseAgent {
  constructor(
    toolRegistry: ToolRegistry,
    gemini: GeminiClient,
    prompts: PromptTemplate,
  )
}
```
| Parameter | Type | Required | Description |
|---|---|---|---|
| toolRegistry | ToolRegistry | Yes | Must contain the document tools this agent calls (see Required Tools). |
| gemini | GeminiClient | Yes | Gemini model client for intent classification and direct document analysis. |
| prompts | PromptTemplate | Yes | Prompt template engine. Must have document-analysis.intent_classification.* templates loaded. |
Supported Actions
The agent dispatches to one of eight actions based on context.parameters.action. If no action is specified, the agent uses LLM-based intent classification to determine the appropriate action.
| Action | Description | Required Parameters |
|---|---|---|
| list | List documents in a SharePoint site/folder | siteId |
| select | List documents and select the most relevant ones for a given prompt | siteId |
| query | Retrieve a document and query its contents (tabular or free-form) | siteId, documentId |
| analyze | Deep analysis of a single document (auto-selects if documentId not provided) | siteId |
| compare | Compare two or more documents | siteId, documentIds (array, 2+) |
| extract | Extract structured data from a document | siteId, documentId |
| summarize | Generate a summary of a document | siteId, documentId |
| general | Fallback for unrecognized intents | -- |
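The required parameters in the table above can be checked before calling the agent. The following is a minimal sketch, not part of the framework: `DocumentAction`, `REQUIRED_PARAMETERS`, and `missingParameters` are hypothetical names introduced here for illustration, and the agent's own validation may behave differently.

```typescript
// Hypothetical helper: mirrors the "Required Parameters" column above.
type DocumentAction =
  | "list" | "select" | "query" | "analyze"
  | "compare" | "extract" | "summarize" | "general";

const REQUIRED_PARAMETERS: Record<DocumentAction, readonly string[]> = {
  list: ["siteId"],
  select: ["siteId"],
  query: ["siteId", "documentId"],
  analyze: ["siteId"],
  compare: ["siteId", "documentIds"],
  extract: ["siteId", "documentId"],
  summarize: ["siteId", "documentId"],
  general: [],
};

// Returns the names of required parameters that are absent or invalid.
function missingParameters(
  action: DocumentAction,
  parameters: Record<string, unknown>,
): string[] {
  const missing = REQUIRED_PARAMETERS[action].filter(
    (name) => parameters[name] === undefined || parameters[name] === null,
  );
  // compare needs an array of at least two IDs, not just a present value.
  if (action === "compare" && missing.indexOf("documentIds") === -1) {
    const ids = parameters["documentIds"];
    if (!Array.isArray(ids) || ids.length < 2) {
      missing.push("documentIds");
    }
  }
  return missing;
}

// A query without documentId is reported as incomplete.
console.log(missingParameters("query", { siteId: "site-abc-123" }));
```

Running a check like this on the caller side gives an early, local error instead of a failed tool call.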
Context Parameters
The agent reads these keys from AgentContext.parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| action | string | No | Auto-classified | One of list, select, query, analyze, compare, extract, summarize, general. If omitted, the agent classifies intent from the prompt. |
| siteId | string | Yes (all actions except general) | -- | SharePoint site ID. |
| documentId | string | Depends on action | -- | Document ID for single-document operations. Required for query, extract, and summarize; optional for analyze, which auto-selects a document when it is omitted. |
| documentIds | string[] | For compare | -- | Array of document IDs to compare (minimum 2). |
| folderPath | string | No | undefined | SharePoint folder path to scope the listing. |
| fileTypes | string[] | No | undefined | File type filter (e.g. ["pdf", "xlsx"]). |
| pageSize | number | No | Varies | Number of documents to return in list operations. |
| skipToken | string | No | undefined | Pagination token for the list action. |
| maxSelection | number | No | undefined | Maximum number of documents to select in the select action. |
| analysisType | string | No | undefined | Type of analysis for the analyze action. |
| extractionSchema | object | No | undefined | JSON schema defining the extraction structure for the extract action. |
| comparisonType | string | No | undefined | Type of comparison for the compare action. |
| systemPrompt | string | No | undefined | Custom system prompt for the query action when using direct LLM analysis. |
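The Type column above can be expressed as a TypeScript interface with a matching runtime check. This is a sketch under assumptions: `DocumentAnalysisParameters`, `PARAMETER_KINDS`, and `typeErrors` are hypothetical names, and the framework's actual parameter types are not shown on this page.

```typescript
// Hypothetical shape mirroring the table above; the framework's own types may differ.
interface DocumentAnalysisParameters {
  action?: string;
  siteId?: string;
  documentId?: string;
  documentIds?: string[];
  folderPath?: string;
  fileTypes?: string[];
  pageSize?: number;
  skipToken?: string;
  maxSelection?: number;
  analysisType?: string;
  extractionSchema?: object;
  comparisonType?: string;
  systemPrompt?: string;
}

type Kind = "string" | "string[]" | "number" | "object";

const PARAMETER_KINDS: Record<keyof DocumentAnalysisParameters, Kind> = {
  action: "string",
  siteId: "string",
  documentId: "string",
  documentIds: "string[]",
  folderPath: "string",
  fileTypes: "string[]",
  pageSize: "number",
  skipToken: "string",
  maxSelection: "number",
  analysisType: "string",
  extractionSchema: "object",
  comparisonType: "string",
  systemPrompt: "string",
};

function matchesKind(value: unknown, kind: Kind): boolean {
  switch (kind) {
    case "string[]":
      return Array.isArray(value) && value.every((v) => typeof v === "string");
    case "object":
      return typeof value === "object" && value !== null && !Array.isArray(value);
    default:
      return typeof value === kind;
  }
}

// Reports parameters whose runtime type disagrees with the table.
// Omitted keys are skipped: whether a key is required depends on the action.
function typeErrors(parameters: Record<string, unknown>): string[] {
  const errors: string[] = [];
  const names = Object.keys(PARAMETER_KINDS) as (keyof DocumentAnalysisParameters)[];
  for (const name of names) {
    const value = parameters[name];
    if (value === undefined) continue;
    const kind = PARAMETER_KINDS[name];
    if (!matchesKind(value, kind)) errors.push(`${name} should be ${kind}`);
  }
  return errors;
}

console.log(typeErrors({ siteId: "site-abc-123", pageSize: "20" }));
```

A check like this is mainly useful when parameters arrive from untyped input such as a JSON request body.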
Required Tools
The agent invokes these tools via the ToolRegistry. Register them before constructing the agent:
| Tool Name | Used By Actions | Description |
|---|---|---|
| documentList | list, select, analyze, compare | Lists documents in a SharePoint site. |
| documentSelect | select, analyze | Uses the LLM to select the most relevant documents for a prompt. |
| documentRetrieve | query | Downloads and parses a document for querying. |
| documentQuery | query | Queries tabular document data (CSV, Excel). |
| documentAnalyze | analyze | Performs deep analysis of a single document. |
| documentSummarize | summarize | Generates a document summary. |
| documentExtract | extract | Extracts structured data from a document. |
| documentCompare | compare | Compares multiple documents. |
Not all tools are required
You only need to register the tools for the actions you plan to use. For example, if you only use list and query, register documentList, documentRetrieve, and documentQuery.
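To work out which tools a given set of actions needs, the table above can be inverted into an action-to-tools map. The sketch below is illustrative only: `TOOLS_BY_ACTION` and `toolsForActions` are hypothetical names, and the mapping is copied from the "Used By Actions" column rather than from the agent's source.

```typescript
type DocumentAction =
  | "list" | "select" | "query" | "analyze"
  | "compare" | "extract" | "summarize" | "general";

// Derived from the Required Tools table; general uses no tools.
const TOOLS_BY_ACTION: Record<DocumentAction, readonly string[]> = {
  list: ["documentList"],
  select: ["documentList", "documentSelect"],
  query: ["documentRetrieve", "documentQuery"],
  analyze: ["documentList", "documentSelect", "documentAnalyze"],
  compare: ["documentList", "documentCompare"],
  extract: ["documentExtract"],
  summarize: ["documentSummarize"],
  general: [],
};

// Returns the de-duplicated tool names needed for the given actions,
// in first-seen order.
function toolsForActions(actions: readonly DocumentAction[]): string[] {
  const needed: string[] = [];
  for (const action of actions) {
    for (const tool of TOOLS_BY_ACTION[action]) {
      if (needed.indexOf(tool) === -1) needed.push(tool);
    }
  }
  return needed;
}

// The list + query case from the note above.
console.log(toolsForActions(["list", "query"]));
```

The resulting names are the ones to pass to `registry.register(...)` before constructing the agent.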
Return Value
The structure of AgentResult.data depends on the action:
For tabular documents (CSV, Excel):
For non-tabular documents:
If fewer than 2 document IDs are provided:
Intent Classification
When parameters.action is not set, the agent classifies the user's intent using the Gemini model with low temperature (0.1) and a maximum of 10 output tokens. The classification prompt is rendered from:
- document-analysis.intent_classification.system -- system prompt (optional)
- document-analysis.intent_classification.user_template -- user prompt with `prompt` and `history` variables
The model returns one of: query, list, select, compare, extract, summarize, analyze, general.
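Because the reply is capped at 10 tokens, the model's output is effectively a single word that has to be mapped back to an action. The sketch below shows one way to normalize such a reply. `parseAction` is a hypothetical helper, and falling back to `general` for unrecognized output is an assumption based on the action table, not confirmed agent behavior.

```typescript
const ACTIONS = [
  "query", "list", "select", "compare",
  "extract", "summarize", "analyze", "general",
] as const;

type DocumentAction = (typeof ACTIONS)[number];

// Normalizes a raw model reply such as " Summarize.\n" to a known action.
// Assumption: anything unrecognized is treated as "general".
function parseAction(raw: string): DocumentAction {
  // Lowercase, then drop whitespace, punctuation, and any other non-letters.
  const word = raw.trim().toLowerCase().replace(/[^a-z]/g, "");
  for (const action of ACTIONS) {
    if (action === word) return action;
  }
  return "general";
}

console.log(parseAction(" Summarize.\n"));
console.log(parseAction("delete everything"));
```

Comparing against a fixed list keeps a stray or truncated reply from reaching the dispatcher as an unknown action name.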
Execution Flow

```mermaid
flowchart TD
    Start[execute] --> HasAction{action provided?}
    HasAction -->|Yes| Dispatch
    HasAction -->|No| Classify[LLM intent classification]
    Classify --> Dispatch{action}
    Dispatch -->|list| List[documentList tool]
    Dispatch -->|select| Select[documentList + documentSelect]
    Dispatch -->|query| Query[documentRetrieve + documentQuery/LLM]
    Dispatch -->|analyze| Analyze[auto-select + documentAnalyze]
    Dispatch -->|compare| Compare[documentCompare]
    Dispatch -->|extract| Extract[documentExtract]
    Dispatch -->|summarize| Summarize[documentSummarize]
    Dispatch -->|general| General[Unsupported action message]
```

Code Example
List Documents

```typescript
import { DocumentAnalysisAgent, ToolRegistry } from "@modernpath/agent-framework";

const registry = new ToolRegistry();
// Register document tools...
registry.register("documentList", documentListTool);

const agent = new DocumentAnalysisAgent(registry, geminiClient, prompts);

const result = await agent.execute({
  userId: 42,
  auditingId: 1001,
  prompt: "Show me all documents",
  parameters: {
    action: "list",
    siteId: "site-abc-123",
    folderPath: "/Shared Documents/Finance",
    fileTypes: ["pdf", "xlsx"],
    pageSize: 20,
  },
});

if (result.success) {
  for (const doc of result.data.documents) {
    console.log(`${doc.name} (${doc.id})`);
  }
}
```
Query a Document

```typescript
const result = await agent.execute({
  userId: 42,
  auditingId: 1001,
  prompt: "What was the total revenue in Q3?",
  parameters: {
    action: "query",
    siteId: "site-abc-123",
    documentId: "doc-xyz-789",
  },
});
```
Compare Documents

```typescript
const result = await agent.execute({
  userId: 42,
  auditingId: 1001,
  prompt: "Compare the budget proposals for Q3 and Q4",
  parameters: {
    action: "compare",
    siteId: "site-abc-123",
    documentIds: ["doc-q3-budget", "doc-q4-budget"],
    comparisonType: "financial",
  },
});
```
Auto-classify Intent

```typescript
// No action specified -- the agent classifies intent from the prompt
const result = await agent.execute({
  userId: 42,
  auditingId: 1001,
  prompt: "Summarize the annual report",
  parameters: {
    siteId: "site-abc-123",
    documentId: "doc-annual-report",
  },
});
// Agent classifies intent as "summarize" and calls documentSummarize
```
Related Pages
- BaseAgent -- base class and execution lifecycle
- ToolRegistry -- registering document tools
- OrchestratorAgent -- using DocumentAnalysisAgent as an atomic agent
- SharePoint Integration -- the underlying SharePoint access layer
- Document Processing -- parsing and querying documents