Document Composition

The composition system enables modular, multi-document knowledge bases by allowing markdown documents to declare dependencies via YAML frontmatter includes[]. When a canonical document is fetched for grounding, the CompositionLoader performs a bounded BFS traversal to resolve all included documents into a single CompositionBundle.

This enables patterns such as a root SOP that includes shared glossaries, escalation policies, and reference data, all of which are resolved automatically and attached to the LLM context.

How It Works

Frontmatter Syntax

Markdown documents declare their includes using standard YAML frontmatter:

---
title: "Cold Complaint Procedure"
canonical_path: sops/cold-complaint.md
includes:
  - ./shared/glossary.md
  - ./shared/escalation-policy.md
  - ../reference/temperature-thresholds.md
---

# Cold Complaint Procedure

When a tenant reports cold temperatures...

Rules:

  • includes must be a YAML array of strings.
  • Each entry is a relative path resolved against the parent document's location in storage.
  • Absolute paths and URLs are rejected (security measure).
  • Path traversal attempts (e.g. ../../.env) that escape the storage root are blocked.
  • Non-string entries and duplicates are silently skipped.
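
The rules above can be sketched as a small validation routine. This is an illustrative sketch, not the framework's implementation: `sanitizeIncludes` and `resolveIncludePath` are hypothetical names, and storage keys are assumed to be `/`-separated paths relative to the storage root.

```typescript
// Hypothetical helpers for illustration; not part of @modernpath/agent-framework.

/** Keep only string entries and drop duplicates, preserving first-seen order. */
function sanitizeIncludes(raw: unknown): string[] {
  if (!Array.isArray(raw)) return [];
  const unique = new Set<string>();
  for (const entry of raw) {
    if (typeof entry === "string") unique.add(entry);
  }
  return Array.from(unique);
}

/**
 * Resolve an include against the parent's storage key.
 * Returns null for absolute paths, URLs, and paths that escape the root.
 */
function resolveIncludePath(parentKey: string, include: string): string | null {
  const hasScheme = /^[a-zA-Z][a-zA-Z0-9+.-]*:/.test(include);
  if (include.startsWith("/") || hasScheme) return null;

  // Start in the parent's directory: "sops/cold-complaint.md" -> ["sops"].
  const segments = parentKey.split("/").slice(0, -1);
  for (const part of include.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      // Stepping above the storage root means the path escapes it.
      if (segments.length === 0) return null;
      segments.pop();
    } else {
      segments.push(part);
    }
  }
  return segments.length > 0 ? segments.join("/") : null;
}
```

Under these assumptions, `./shared/glossary.md` included from `sops/cold-complaint.md` resolves to `sops/shared/glossary.md`, while `../../.env` from the same parent is rejected because the second `..` would step above the root.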

Traversal

The CompositionLoader performs BFS (breadth-first search) expansion starting from the root document:

graph TD
    A["cold-complaint.md\n(root, depth 0)"] --> B["glossary.md\n(depth 1)"]
    A --> C["escalation-policy.md\n(depth 1)"]
    A --> D["temperature-thresholds.md\n(depth 1)"]
    C --> E["on-call-contacts.md\n(depth 2)"]

Safety guarantees:

| Guard | Description | Default |
| --- | --- | --- |
| Cycle detection | Documents already visited are skipped (by canonical URI). | Always active |
| Diamond dedup | Same document referenced from multiple parents is fetched only once. | Always active |
| Depth limit | Maximum BFS depth from the root document. | 2 |
| Document budget | Maximum total documents in the bundle (including root). | 8 |
| Byte budget | Maximum total bytes for included documents (excludes root). | 5 MB |
| Timeout | Wall-clock limit for the entire traversal phase. | 10 seconds |
| Path safety | Relative paths that escape the storage root are rejected. | Always active |

Only documents with traversable MIME types (default: text/markdown) have their frontmatter parsed for further includes. Non-markdown includes are fetched and added to the bundle but not recursed into.
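
To make the traversal concrete, here is a minimal sketch of a bounded BFS over an in-memory include graph. It models only cycle detection, the depth limit, and the document budget; the byte budget, timeout, and MIME filtering are left out. The function name `boundedBfs`, the `includesOf` map, and the order in which the guards are checked are assumptions made for illustration, not the loader's actual internals.

```typescript
// Illustrative sketch only; the real CompositionLoader also enforces the byte
// budget, timeout, and MIME filtering, and its check order may differ.
type SkipReason = "cycle" | "depth_exceeded" | "doc_budget_exceeded";

interface TraversalResult {
  order: string[]; // root first, then includes in BFS order
  skipped: Array<{ uri: string; reason: SkipReason }>;
}

function boundedBfs(
  includesOf: Record<string, string[]>, // stand-in for parsed frontmatter
  root: string,
  maxDepth = 2,
  maxDocs = 8,
): TraversalResult {
  const visited = new Set<string>([root]);
  const order = [root];
  const skipped: TraversalResult["skipped"] = [];
  const queue = [{ uri: root, depth: 0 }];

  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const child of includesOf[current.uri] ?? []) {
      let reason: SkipReason | null = null;
      if (visited.has(child)) reason = "cycle"; // cycle or diamond
      else if (current.depth + 1 > maxDepth) reason = "depth_exceeded";
      else if (order.length >= maxDocs) reason = "doc_budget_exceeded";

      if (reason !== null) {
        skipped.push({ uri: child, reason });
        continue;
      }
      visited.add(child);
      order.push(child);
      queue.push({ uri: child, depth: current.depth + 1 });
    }
  }
  return { order, skipped };
}
```

Because every document is marked visited when it is first enqueued, a second reference to it (from a sibling or from a descendant pointing back at the root) is recorded as skipped instead of being fetched again.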

Import

import { CompositionLoader, type CompositionPolicy } from "@modernpath/agent-framework";

Note

In most cases, you do not use CompositionLoader directly. Instead, configure a compositionPolicy on CanonicalGroundingService and the expansion happens automatically during prepareFromSources().

CompositionPolicy

Configuration controlling the bounded traversal.

interface CompositionPolicy {
  traversableMimeTypes?: string[];  // default: ["text/markdown"]
  maxDepth?: number;                // default: 2
  maxDocs?: number;                 // default: 8
  maxIncludeBytes?: number;         // default: 5242880 (5 MB)
  timeoutMs?: number;               // default: 10000 (10s)
  onFetchError?: "skip" | "abort";  // default: "skip"
}

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| traversableMimeTypes | string[] | ["text/markdown"] | Only documents with these MIME types have their frontmatter parsed for includes. |
| maxDepth | number | 2 | Maximum depth of include traversal. Depth 0 = root, depth 1 = root's direct includes, depth 2 = includes of includes. |
| maxDocs | number | 8 | Maximum total documents in the bundle (including root). |
| maxIncludeBytes | number | 5242880 | Maximum total bytes for all fetched included documents (excludes root, which is always fetched). |
| timeoutMs | number | 10000 | Wall-clock timeout for the entire composition traversal phase. |
| onFetchError | "skip" \| "abort" | "skip" | What to do when a single include fails to fetch. "skip" logs and continues; "abort" stops traversal and returns what was fetched so far. |
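
The difference between the two `onFetchError` modes can be illustrated with a simplified loop. This is a sketch under stated assumptions: `fetchIncludes` and `fetchOne` are hypothetical names, and the fetch is synchronous here purely to keep the example short, whereas the real loader is asynchronous.

```typescript
// Hypothetical, synchronous stand-in for the loader's fetch loop.
type OnFetchError = "skip" | "abort";

interface FetchOutcome {
  fetched: string[]; // paths fetched successfully
  failed: Array<{ path: string; detail: string }>; // would map to "fetch_failed"
}

function fetchIncludes(
  paths: string[],
  fetchOne: (path: string) => string, // assumed to throw on failure
  onFetchError: OnFetchError = "skip",
): FetchOutcome {
  const outcome: FetchOutcome = { fetched: [], failed: [] };
  for (const path of paths) {
    try {
      fetchOne(path);
      outcome.fetched.push(path);
    } catch (err) {
      const detail = err instanceof Error ? err.message : String(err);
      outcome.failed.push({ path, detail });
      // "abort" stops here but keeps everything fetched so far.
      if (onFetchError === "abort") break;
    }
  }
  return outcome;
}
```

With three includes where the second one fails, "skip" still fetches the third, while "abort" returns only the first.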

Composition policy examples

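// Direct includes only, with a small bundle and a short timeout
const directOnlyPolicy: CompositionPolicy = {
  maxDepth: 1,       // Only direct includes
  maxDocs: 4,        // Small bundle
  timeoutMs: 5_000,  // 5-second timeout
};

// Deeper traversal with larger budgets
const largeBundlePolicy: CompositionPolicy = {
  maxDepth: 3,
  maxDocs: 15,
  maxIncludeBytes: 10 * 1024 * 1024, // 10 MB
  timeoutMs: 30_000,                 // 30-second timeout
};

// Default limits, but stop traversal on the first fetch failure
const strictPolicy: CompositionPolicy = {
  maxDepth: 2,
  maxDocs: 8,
  onFetchError: "abort", // Stop on first failure
};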

CompositionLoader

The loader that performs the BFS traversal.

Constructor

constructor(
  fetcher: CanonicalDocumentFetcher,
  defaultPolicy?: CompositionPolicy,
)

// Usage
const loader = new CompositionLoader(fetcher, defaultPolicy);

loadBundle()

Expand a root document into a composition bundle.

async loadBundle(
  rootDoc: PreparedDocument,
  rootPointer: CanonicalDocumentPointer,
  ctx: { userId: number; auditingId: number },
  policyOverride?: CompositionPolicy,
  alreadyVisited?: Set<string>,
  prepareOptions?: PrepareDocumentOptions,
): Promise<CompositionBundle>

| Parameter | Type | Description |
| --- | --- | --- |
| rootDoc | PreparedDocument | Already-fetched root document. |
| rootPointer | CanonicalDocumentPointer | Canonical pointer of the root document. |
| ctx | { userId, auditingId } | Auth context for fetching included documents. |
| policyOverride | CompositionPolicy | Per-call policy override (merged with constructor default). |
| alreadyVisited | Set<string> | Canonical URIs to treat as already fetched (for cross-root deduplication). |
| prepareOptions | PrepareDocumentOptions | Options forwarded to CanonicalDocumentFetcher. |
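
One plausible reading of "merged with constructor default" is a field-by-field merge in which the per-call override wins, then the constructor default, then the documented defaults. The sketch below assumes exactly that; `resolvePolicy`, `ResolvedPolicy`, and `DOCUMENTED_DEFAULTS` are hypothetical names, and the actual shape of `ResolvedCompositionPolicy` is not shown in this page.

```typescript
// Local copy of the documented interface so the sketch is self-contained.
interface CompositionPolicy {
  traversableMimeTypes?: string[];
  maxDepth?: number;
  maxDocs?: number;
  maxIncludeBytes?: number;
  timeoutMs?: number;
  onFetchError?: "skip" | "abort";
}

// Assumption: a resolved policy is the same shape with every field present.
type ResolvedPolicy = Required<CompositionPolicy>;

const DOCUMENTED_DEFAULTS: ResolvedPolicy = {
  traversableMimeTypes: ["text/markdown"],
  maxDepth: 2,
  maxDocs: 8,
  maxIncludeBytes: 5 * 1024 * 1024,
  timeoutMs: 10000,
  onFetchError: "skip",
};

// Precedence (assumed): per-call override > constructor default > documented default.
function resolvePolicy(
  constructorDefault: CompositionPolicy = {},
  override: CompositionPolicy = {},
): ResolvedPolicy {
  const c = constructorDefault;
  const d = DOCUMENTED_DEFAULTS;
  return {
    traversableMimeTypes:
      override.traversableMimeTypes ?? c.traversableMimeTypes ?? d.traversableMimeTypes,
    maxDepth: override.maxDepth ?? c.maxDepth ?? d.maxDepth,
    maxDocs: override.maxDocs ?? c.maxDocs ?? d.maxDocs,
    maxIncludeBytes: override.maxIncludeBytes ?? c.maxIncludeBytes ?? d.maxIncludeBytes,
    timeoutMs: override.timeoutMs ?? c.timeoutMs ?? d.timeoutMs,
    onFetchError: override.onFetchError ?? c.onFetchError ?? d.onFetchError,
  };
}
```

Using `??` rather than `||` keeps an explicit `0` (for example `maxDepth: 0`) from being replaced by a fallback value.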

Types

CompositionBundle

The result of expanding a root document's include graph.

interface CompositionBundle {
  root: PreparedDocument;                // The root document (always present)
  rootPointer: CanonicalDocumentPointer; // Canonical pointer of the root

  included: PreparedDocument[];          // Included documents in BFS order
  includedPointers: CanonicalDocumentPointer[];

  allDocuments: PreparedDocument[];      // [root, ...included] convenience accessor

  edges: CompositionEdge[];              // Include graph edges (for debug/visualization)
  skipped: SkippedInclude[];             // Includes that were not fetched, with reasons

  policy: ResolvedCompositionPolicy;     // The resolved policy that was used
}

CompositionEdge

A single edge in the include graph.

interface CompositionEdge {
  from: string;         // Canonical URI of the parent document
  to: string;           // Canonical URI of the included document
  relativePath: string; // The raw include path from frontmatter
  depth: number;        // BFS depth of this edge
}
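
Since edges are meant for debugging and visualization, one way to use them is to render a Mermaid graph like the one in the Traversal section. The following is a sketch, not a framework utility: `edgesToMermaid` is a hypothetical helper, and it assumes URIs contain no double quotes (which would need escaping in Mermaid labels).

```typescript
// Local copy of the documented interface so the sketch is self-contained.
interface CompositionEdge {
  from: string;
  to: string;
  relativePath: string;
  depth: number;
}

// Hypothetical helper: turn bundle edges into Mermaid "graph TD" source.
function edgesToMermaid(edges: CompositionEdge[]): string {
  const ids = new Map<string, string>();
  // Assign a short, stable node id (n0, n1, ...) to each distinct URI.
  const idFor = (uri: string): string => {
    let id = ids.get(uri);
    if (id === undefined) {
      id = `n${ids.size}`;
      ids.set(uri, id);
    }
    return id;
  };

  const lines = ["graph TD"];
  for (const edge of edges) {
    const from = idFor(edge.from);
    const to = idFor(edge.to);
    lines.push(`    ${from}["${edge.from}"] -->|depth ${edge.depth}| ${to}["${edge.to}"]`);
  }
  return lines.join("\n");
}
```

Passing `bundle.edges` to a helper like this and pasting the result into any Mermaid renderer gives a quick picture of what was resolved and at which depth.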

SkippedInclude

Record of an include that was discovered but not fetched.

interface SkippedInclude {
  parentUri: string;     // Canonical URI of the parent document
  relativePath: string;  // The raw include path from frontmatter
  reason: SkipReason;    // Why it was skipped
  detail?: string;       // Additional detail (e.g. error message)
}

type SkipReason =
  | "cycle"                // Already visited (cycle or diamond)
  | "depth_exceeded"       // Beyond maxDepth
  | "doc_budget_exceeded"  // Beyond maxDocs
  | "byte_budget_exceeded" // Beyond maxIncludeBytes
  | "timeout"              // Wall-clock timeout reached
  | "non_traversable_mime" // MIME type not in traversableMimeTypes
  | "resolve_failed"       // Path resolution failed (e.g. path traversal)
  | "fetch_failed";        // Document fetch threw an error
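
A bundle's `skipped` list is a convenient place to check whether a limit cut the traversal short. The sketch below groups skips by reason and flags limit-related ones; `countSkipsByReason` and `hitLimit` are hypothetical helpers, and treating the four reasons listed in `LIMIT_REASONS` as "limit hits" is an assumption made for this example.

```typescript
// Local copies of the documented types so the sketch is self-contained.
type SkipReason =
  | "cycle"
  | "depth_exceeded"
  | "doc_budget_exceeded"
  | "byte_budget_exceeded"
  | "timeout"
  | "non_traversable_mime"
  | "resolve_failed"
  | "fetch_failed";

interface SkippedInclude {
  parentUri: string;
  relativePath: string;
  reason: SkipReason;
  detail?: string;
}

// Count how many includes were skipped for each reason.
function countSkipsByReason(
  skipped: SkippedInclude[],
): Partial<Record<SkipReason, number>> {
  const counts: Partial<Record<SkipReason, number>> = {};
  for (const item of skipped) {
    counts[item.reason] = (counts[item.reason] ?? 0) + 1;
  }
  return counts;
}

// Reasons that indicate a configured limit stopped the traversal (assumed grouping).
const LIMIT_REASONS: SkipReason[] = [
  "depth_exceeded",
  "doc_budget_exceeded",
  "byte_budget_exceeded",
  "timeout",
];

function hitLimit(skipped: SkippedInclude[]): boolean {
  return skipped.some((item) => LIMIT_REASONS.indexOf(item.reason) !== -1);
}
```

A caller might log a warning when `hitLimit(bundle.skipped)` is true, since the LLM context would then be missing documents the author intended to include.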

canonical_path Frontmatter

Documents can declare their canonical storage path via canonical_path frontmatter. This ensures correct key assignment during upload and proper relative path resolution during composition:

---
title: "Shared Glossary"
canonical_path: shared/glossary.md
---

When canonical_path is present:

  • Upload tools place the file at the correct storage key.
  • Relative paths in includes[] are resolved correctly regardless of the upload source.
  • Re-uploads always overwrite the same key (idempotent).

Integration with CanonicalGroundingService

The most common way to use composition is through CanonicalGroundingService, which automatically handles composition when a compositionPolicy is configured:

const grounding = new CanonicalGroundingService(stores, fetcher, {
  compositionPolicy: {
    maxDepth: 2,
    maxDocs: 8,
    maxIncludeBytes: 5 * 1024 * 1024,
  },
});

const result = await grounding.prepareFromSources(sources, ctx);

// result.compositionBundles    -> per-root composition details
// result.allDocumentsForAttachment -> deduplicated, ordered document list

The service uses a single visited set shared across all root documents: once document A's expansion has fetched glossary.md, document B's expansion skips it rather than fetching it again.
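
The effect of the shared visited set can be shown with a simplified, unbounded expansion over an in-memory graph. This is a sketch only: `expandRoot` and `includesOf` are hypothetical stand-ins, and the service presumably achieves the same effect by passing its set through the `alreadyVisited` parameter of `loadBundle()`.

```typescript
// Stand-in for one root's expansion: returns the includes it actually fetched.
function expandRoot(
  root: string,
  includesOf: Record<string, string[]>,
  visited: Set<string>, // shared across roots
): string[] {
  const fetched: string[] = [];
  const queue = [root];
  visited.add(root);
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const child of includesOf[current] ?? []) {
      if (visited.has(child)) continue; // fetched by this or an earlier root
      visited.add(child);
      fetched.push(child);
      queue.push(child);
    }
  }
  return fetched;
}

const includesOf: Record<string, string[]> = {
  "a.md": ["glossary.md", "policy.md"],
  "b.md": ["glossary.md", "contacts.md"],
};

const sharedVisited = new Set<string>();
const fetchedForA = expandRoot("a.md", includesOf, sharedVisited); // ["glossary.md", "policy.md"]
const fetchedForB = expandRoot("b.md", includesOf, sharedVisited); // ["contacts.md"]
```

If each root used its own fresh set instead, glossary.md would be fetched twice, once per root.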

Debug Logging

Enable with:

DEBUG=composition-loader node app.js

Logs include BFS queue state, fetch timing, byte budgets, cycle detection, and skip reasons.