
WebResearchAgent

Agent for gathering information from the public web using the Gemini Google Search grounding and URL context tools. It delegates tool-usage decisions to the Gemini model: the model decides on its own when to search Google and when to fetch URL content, based on the user's question and any provided URLs.

Import

import { WebResearchAgent } from "@modernpath/agent-framework";
import type { WebResearchAgentConfig } from "@modernpath/agent-framework";

Constructor

class WebResearchAgent extends BaseAgent {
  constructor(
    toolRegistry: ToolRegistry,
    gemini: GeminiClient,
    cfg?: WebResearchAgentConfig,
  )
}

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| toolRegistry | ToolRegistry | Yes | Tool registry for the agent. |
| gemini | GeminiClient | Yes | Gemini model client. Must support the tools option for Google Search grounding and URL context. |
| cfg | WebResearchAgentConfig | No | Optional configuration. |

Configuration

WebResearchAgentConfig

interface WebResearchAgentConfig {
  defaultUrls?: string[];
}

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| defaultUrls | string[] | No | [] | Default list of URLs to include as context if none are provided at runtime via context.parameters.urls. |

Context Parameters

The agent reads these keys from AgentContext.parameters:

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| urls | string[] | No | [] (or defaultUrls from config) | URLs to include as context for the research. These are appended to the prompt so the model can fetch and analyze their content. |
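
The fallback from parameters.urls to defaultUrls can be sketched as a small helper. This is an illustration only, not the agent's actual code: resolveUrls is a hypothetical name, and the sketch assumes that a missing or empty urls array counts as "none provided".

```typescript
// Local copy of the documented config shape, so the sketch is self-contained.
interface WebResearchAgentConfig {
  defaultUrls?: string[];
}

// Hypothetical helper: mirrors the documented fallback, not the framework's code.
function resolveUrls(
  parameters: Record<string, unknown>,
  cfg: WebResearchAgentConfig = {},
): string[] {
  const fromContext = parameters["urls"];
  // Assumption: only a non-empty array of strings counts as "provided at runtime".
  if (
    Array.isArray(fromContext) &&
    fromContext.length > 0 &&
    fromContext.every((u) => typeof u === "string")
  ) {
    return fromContext as string[];
  }
  // Otherwise fall back to the configured defaults, or to an empty list.
  return cfg.defaultUrls ?? [];
}
```

Under these assumptions, resolveUrls({}, { defaultUrls: [...] }) returns the configured defaults, while any non-empty parameters.urls replaces them entirely rather than being merged with them.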

Return Value

On success, AgentResult.data contains:

{
  answer: string;
  groundingMetadata?: any;
  urlContextMetadata?: any;
  urlsUsed: string[];
}

| Field | Type | Description |
| --- | --- | --- |
| answer | string | The research answer text, including citations when available. |
| groundingMetadata | any | Metadata from Google Search grounding (search queries used, result snippets, etc.). Present when the model used the Google Search tool. |
| urlContextMetadata | any | Metadata from URL context fetching. Present when the model fetched URL content. |
| urlsUsed | string[] | The URLs that were provided as context (from parameters.urls or defaultUrls). |
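
Because the two metadata fields are present only when the corresponding tool was used, a caller can inspect them to see how an answer was produced. The sketch below is a hedged illustration: WebResearchData is a local copy of the shape above (with unknown in place of any), and describeToolUsage is a hypothetical helper, not part of the framework.

```typescript
// Local copy of the documented result shape; `unknown` forces callers to narrow.
interface WebResearchData {
  answer: string;
  groundingMetadata?: unknown;
  urlContextMetadata?: unknown;
  urlsUsed: string[];
}

// Hypothetical helper: lists the Gemini tools the model appears to have used,
// based solely on which metadata fields are present in the result.
function describeToolUsage(data: WebResearchData): string[] {
  const used: string[] = [];
  // Assumption: an absent field is undefined or null, never an empty placeholder.
  if (data.groundingMetadata != null) used.push("googleSearch");
  if (data.urlContextMetadata != null) used.push("urlContext");
  return used;
}
```

Note that urlsUsed lists the URLs that were offered to the model, so a non-empty urlsUsed does not by itself show that any page was fetched; urlContextMetadata is the field to check for that.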

Gemini Tools Used

The agent enables two Gemini tools in the generation request:

| Tool | Purpose |
| --- | --- |
| googleSearch | Grounding with Google Search. The model searches Google for relevant information and uses the results to inform its answer. Provides citations and source links. |
| urlContext | URL context fetching. The model retrieves and analyzes the content of provided URLs. |

Both tools are passed to GeminiClient.generateContent() via the tools option:

tools: [{ googleSearch: {} }, { urlContext: {} }]

The model decides autonomously which tools to use based on the prompt content and provided URLs. If no URLs are provided, the model primarily uses Google Search. If URLs are provided, the model can fetch and analyze their content directly.

Gemini tool requirements

Google Search grounding and URL context are Gemini API features. Ensure your Gemini API key or service account has access to these capabilities. See the Gemini documentation for details.

Execution Flow

sequenceDiagram
    participant Caller
    participant WebResearchAgent
    participant GeminiClient
    participant GoogleSearch
    participant URLContext

    Caller->>WebResearchAgent: execute(context)
    WebResearchAgent->>WebResearchAgent: Build prompt with URLs
    WebResearchAgent->>GeminiClient: generateContent(prompt, tools)

    opt Model decides to search
        GeminiClient->>GoogleSearch: search query
        GoogleSearch-->>GeminiClient: search results
    end

    opt Model decides to fetch URLs
        GeminiClient->>URLContext: fetch URLs
        URLContext-->>GeminiClient: page content
    end

    GeminiClient-->>WebResearchAgent: { text, groundingMetadata, urlContextMetadata }
    WebResearchAgent-->>Caller: { answer, groundingMetadata, urlContextMetadata, urlsUsed }
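
The request and response steps in the diagram can be sketched in code as follows. Every type here is a local stand-in: GeminiLike is an assumed client shape (the real GeminiClient.generateContent() signature may differ), runResearch is a hypothetical function name, and the prompt is assumed to be already built.

```typescript
// Tool entries as shown in the tools option; the type name is hypothetical.
type GeminiTool =
  | { googleSearch: Record<string, never> }
  | { urlContext: Record<string, never> };

interface GenerateResponse {
  text: string;
  groundingMetadata?: unknown;
  urlContextMetadata?: unknown;
}

// Assumed client shape, standing in for the framework's GeminiClient.
interface GeminiLike {
  generateContent(
    prompt: string,
    options: { tools: GeminiTool[] },
  ): Promise<GenerateResponse>;
}

interface ResearchData {
  answer: string;
  groundingMetadata?: unknown;
  urlContextMetadata?: unknown;
  urlsUsed: string[];
}

async function runResearch(
  gemini: GeminiLike,
  prompt: string,
  urls: string[],
): Promise<ResearchData> {
  // One request with both tools enabled; the model chooses which ones to call.
  const response = await gemini.generateContent(prompt, {
    tools: [{ googleSearch: {} }, { urlContext: {} }],
  });
  // Map the model response onto the documented result fields.
  return {
    answer: response.text,
    groundingMetadata: response.groundingMetadata,
    urlContextMetadata: response.urlContextMetadata,
    urlsUsed: urls,
  };
}
```

The agent issues a single generation request in this sketch; any searching or URL fetching happens on the model side, which is why the result carries metadata rather than raw search results.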

Prompt Construction

The agent builds the prompt by combining the user's question with any provided URLs:

  • If no URLs are provided, the prompt is the user's question as-is.
  • If URLs are provided, they are appended after the question in this form:

<user question>

URLs to use as context:
- https://example.com/page1
- https://example.com/page2
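
The two cases above can be sketched as one function. This is a minimal reconstruction from the template shown, not the agent's source: buildPrompt is a hypothetical name, and the exact whitespace the agent uses may differ.

```typescript
// Hypothetical helper that reproduces the documented prompt layout.
function buildPrompt(question: string, urls: string[] = []): string {
  // No URLs: the prompt is the question, unchanged.
  if (urls.length === 0) {
    return question;
  }
  // URLs present: append them as a dash-prefixed list under a fixed heading.
  const list = urls.map((url) => `- ${url}`).join("\n");
  return `${question}\n\nURLs to use as context:\n${list}`;
}
```

For example, buildPrompt("Summarize these pages", ["https://example.com/page1"]) yields the question, a blank line, the "URLs to use as context:" heading, and one list line for the URL.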

The system prompt instructs the model to:

  • Act as a careful web research assistant
  • Use citations when available
  • Acknowledge uncertainty
  • Summarize key findings and list sources

Code Example

Basic Web Research

import { WebResearchAgent, ToolRegistry } from "@modernpath/agent-framework";

const registry = new ToolRegistry();

// geminiClient is assumed to be an already-configured GeminiClient instance.
const researcher = new WebResearchAgent(registry, geminiClient);

const result = await researcher.execute({
  userId: 42,
  auditingId: 1001,
  prompt: "What are the latest developments in quantum computing?",
  parameters: {},
});

if (result.success) {
  console.log("Answer:", result.data.answer);
  console.log("Grounding:", result.data.groundingMetadata);
}

Research with Specific URLs

const result = await researcher.execute({
  userId: 42,
  auditingId: 1001,
  prompt: "Summarize the key findings from these research papers",
  parameters: {
    urls: [
      "https://arxiv.org/abs/2301.12345",
      "https://arxiv.org/abs/2302.67890",
    ],
  },
});

if (result.success) {
  console.log("Answer:", result.data.answer);
  console.log("URLs analyzed:", result.data.urlsUsed);
}

With Default URLs

const researcher = new WebResearchAgent(registry, geminiClient, {
  defaultUrls: [
    "https://docs.example.com/api-reference",
    "https://docs.example.com/changelog",
  ],
});

// No urls are passed in parameters here, so the agent falls back to defaultUrls.
const result = await researcher.execute({
  userId: 42,
  auditingId: 1001,
  prompt: "What new features were added in the latest release?",
  parameters: {},
});

As Part of an Orchestrator Workflow

The WebResearchAgent can be registered as an atomic agent with the orchestrator. The built-in orchestrator uses documentAnalysis and qaChat as its named agents, but you can extend the pattern by calling the agent directly or by wrapping it as a dynamic tool:

// Use WebResearchAgent directly for web-based questions
const result = await researcher.execute({
  userId: 42,
  auditingId: 1001,
  prompt: "What is the current market cap of Tesla?",
  parameters: {},
});

// Or wrap it as a dynamic tool for the orchestrator
// (registerDynamicTool is assumed to be imported from "@modernpath/agent-framework")
registerDynamicTool(registry, {
  name: "web-research",
  description: "Research a topic on the web using Google Search and URL analysis",
  category: "external",
  tags: ["research", "external"],
  parameters: {
    query: { type: "string", description: "Research question" },
    urls: {
      type: "array",
      description: "Optional URLs to analyze",
      required: false,
      items: { type: "string", description: "URL" },
    },
  },
  execute: async (params, ctx) => {
    const agentResult = await researcher.execute({
      userId: ctx.userId,
      auditingId: ctx.auditingId,
      prompt: params.query,
      parameters: { urls: params.urls },
    });
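    // Surface failures instead of handing undefined data back to the orchestrator.
    if (!agentResult.success) {
      throw new Error("web-research agent failed");
    }
    return agentResult.data;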
  },
});