deep-research

Multi-agent deep research pipeline for complex questions (EIP analysis, architecture decisions, cross-client comparisons, protocol design). Use when single-shot answers are insufficient and you need decomposition, parallel investigation, adversarial critique, and a formal output document.

lodekeeper 4 1 Updated 4mo ago


Install

npx skillscat add lodekeeper/dotfiles/deep-research

Install via the SkillsCat registry.

SKILL.md

Deep Research Skill

Multi-agent deep research pipeline for complex topics. Produces formalized research documents (specs, analyses, proposals) through iterative investigation, synthesis, and adversarial critique.

When to use: Complex questions requiring genuine research — EIP analysis, implementation strategies, novel ideas, cross-client comparisons, protocol design, or any topic where a single-shot answer isn't good enough.

Expected duration: 30-90 minutes depending on complexity.

Prerequisites

Oracle CLI: oracle (GPT-5.2 Pro access for deep reasoning)
Oracle Bridge: See skills/oracle-bridge/SKILL.md — required for browser mode on this server
Sub-agents: Available via sessions_spawn (explorer, specialist, adversary roles)
Web search: For prior art, papers, existing implementations
File access: For reading specs, code, EIPs locally

Related Skills

skills/oracle-bridge/SKILL.md — starts and troubleshoots Oracle browser mode (default path for GPT-5.2-pro reasoning).
skills/web-scraping/SKILL.md — use when web_search/web_fetch are insufficient or blocked and you need robust page acquisition before synthesis.
skills/dev-workflow/SKILL.md — use when research is feeding directly into a Lodestar implementation plan/PR.

Check Oracle is available:

source ~/.nvm/nvm.sh && nvm use 22 && oracle --version

⚠️ Always Save Oracle Output to File

Oracle runs take several minutes or longer. Context can compact mid-run, and stdout is lost when it does. Every Oracle call MUST pipe its output to a file via 2>&1 | tee <path>, so the result survives compaction and can be read back later. No exceptions.
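One way to make the tee rule hard to forget is a small wrapper. This is a minimal sketch: save_run is a hypothetical helper name, not part of the Oracle CLI, and it works with any command.

```shell
# Hypothetical helper (not part of the Oracle CLI): run a command and
# mirror its combined stdout/stderr to a file, creating the directory first.
save_run() {
  save_run_out="$1"; shift
  mkdir -p "$(dirname "$save_run_out")" || return 1
  # 2>&1 captures stderr too; tee keeps the live stream visible.
  "$@" 2>&1 | tee "$save_run_out"
}

# Example (assumes the bridge is already running):
#   save_run "$HOME/research/my-topic/findings/analysis.md" \
#     env ORACLE_REUSE_TAB=1 oracle --engine browser \
#       --remote-chrome localhost:9222 -p "Your prompt" --wait
```

Note that the function's exit status is tee's, not the wrapped command's, so check the saved file rather than the return code when deciding whether a run succeeded.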

Oracle Engine Priority

Oracle has two engines. Always use browser mode first (uses ChatGPT Pro subscription, no per-query cost).

| Engine | Command | Cost | Reliability |
|--------|---------|------|-------------|
| Browser (default) | `ORACLE_REUSE_TAB=1 oracle --engine browser --remote-chrome localhost:9222` | Free (Pro sub) | Requires bridge running + valid session token |
| API (fallback) | `oracle --engine api` | ~$0.09/query | Always works if API key set |

⚠️ CRITICAL: Do NOT silently fall back to API mode. If browser mode fails (expired token, bridge down):

Stop — do not continue research
Alert user: "ChatGPT session token expired. Need fresh __Secure-next-auth.session-token from chatgpt.com, or explicit approval to use API mode."
Only switch to API if user explicitly approves
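The no-silent-fallback rule can be sketched as a small gate. The function name pick_engine and its two arguments are assumptions made for illustration; how the browser check is performed and how approval is recorded are left to the caller.

```shell
# Hypothetical gate: decide which Oracle engine may be used.
#   $1 = "ok" if the browser-mode check passed, anything else if it failed
#   $2 = "yes" only if the user explicitly approved API mode
# Prints the engine name on stdout. When neither condition holds, prints
# the alert on stderr and returns 1 so the caller stops the research.
pick_engine() {
  if [ "$1" = "ok" ]; then
    echo browser
  elif [ "$2" = "yes" ]; then
    echo api
  else
    echo "ChatGPT session token expired. Need fresh __Secure-next-auth.session-token from chatgpt.com, or explicit approval to use API mode." >&2
    return 1
  fi
}

# Example:
#   engine="$(pick_engine "$browser_status" "$api_approved")" || exit 1
```

Keeping the approval as an explicit argument means the API path cannot be reached by accident when the browser check fails.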

Starting the Oracle Bridge

Before any Oracle browser-mode call, ensure the bridge is running:

# Check if bridge is already running
curl -s http://localhost:9222/json/version && echo "Bridge running" || echo "Bridge not running"

# Start bridge (if not running)
source ~/camoufox-env/bin/activate
python3 ~/.openclaw/workspace/research/oracle-bridge-v3.py \
  --cookies ~/.oracle/chatgpt-cookies.json &
sleep 15  # wait for browser + CF bypass + login

# Verify
curl -s http://localhost:9222/json/version | grep -q Chrome && echo "Ready"

For full bridge documentation, see skills/oracle-bridge/SKILL.md.
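A fixed sleep can be too short on a slow start and wasteful on a fast one. As an alternative sketch, the readiness check can be polled; wait_for_bridge is a hypothetical helper, and the attempt count of 30 is an arbitrary default rather than a documented value.

```shell
# Hypothetical helper: poll the bridge's /json/version endpoint until it
# answers with a Chrome version string or the attempts run out.
#   $1 = URL to poll (defaults to the bridge address used in this skill)
#   $2 = maximum attempts, roughly one second apart (default 30)
wait_for_bridge() {
  wfb_url="${1:-http://localhost:9222/json/version}"
  wfb_left="${2:-30}"
  while [ "$wfb_left" -gt 0 ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each attempt.
    if curl -sf --max-time 2 "$wfb_url" 2>/dev/null | grep -q Chrome; then
      echo "Ready"
      return 0
    fi
    sleep 1
    wfb_left=$((wfb_left - 1))
  done
  echo "Bridge not reachable at $wfb_url" >&2
  return 1
}

# Example: start the bridge in the background, then
#   wait_for_bridge || exit 1
```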

Research Type Classification

Different research questions need different tools. Classify each sub-question during Phase 1 and route accordingly:

Type A: Web Literature / Ecosystem Survey

"What tools exist for X?", "Compare approaches to Y", "Find prior art on Z"

Best tool: o3-deep-research or o4-mini-deep-research (OpenAI API)

Purpose-built for multi-source web browsing + synthesis with citations
Automatically searches, reads, reconciles, and produces documented reports
Far superior to manual web_search + sub-agent for broad surveys

source ~/.nvm/nvm.sh && nvm use 22
oracle --engine api \
  -p "Research [topic]. Browse multiple sources, compare approaches, and produce a cited report covering: [specific questions]" \
  --model o4-mini-deep-research --wait \
  2>&1 | tee ~/research/<topic>/findings/web-survey.md

Cost (input/output per 1M tokens): o4-mini-deep-research ~$1.10/$4.40 (cheaper); o3-deep-research ~$10/$40 (most powerful).

⚠️ MANDATORY: Ask Nico before using ANY deep research API model. Explain why web_search + sub-agents weren't sufficient. Only proceed after explicit "yes".

Fallback (free): Sub-agent + web_search + web_fetch; if sources are blocked/partial, switch to skills/web-scraping/SKILL.md for robust acquisition before synthesis.

sessions_spawn task:"Research [sub-question]. Start with web_search/web_fetch for prior art and papers. If key pages are blocked or JS-rendered, use skills/web-scraping/SKILL.md tiered scraper. Write findings to ~/research/<topic>/findings/web-research.md"

Type B: Codebase / Spec Analysis

"How does Lodestar handle X?", "What does the spec say about Y?", "Find the bug in Z"

Best tool: Codex CLI (xhigh reasoning) or Claude CLI + sub-agents

Needs local file access (repos, specs, code)
Can run tests, grep codebases, read large files
Deep research API models can't do this

# Codex for focused code investigation
codex exec --full-auto "Analyze [question] in ~/lodestar/packages/... Write findings to ~/research/<topic>/findings/code-analysis.md"

# Or Claude CLI for broader reasoning
claude "Read [files] and analyze [question]. Write to ~/research/<topic>/findings/code-analysis.md"

Or via sub-agent:

sessions_spawn task:"Analyze [sub-question] by reading:
- Relevant consensus specs: ~/consensus-specs/specs/...
- Lodestar implementation: ~/lodestar/packages/...
- Other client implementations (search GitHub)
Write findings to ~/research/<topic>/findings/spec-analysis.md"

Type C: Deep Reasoning / Novel Analysis

"What are the tradeoffs of X?", "Design an approach for Y", "What's the best architecture for Z?"

Best tool: GPT-5.2 Pro (via Oracle browser mode)

Strongest reasoning for novel analysis and synthesis
Best when you already have the materials and need deep thinking
Also excellent for adversarial critique

ORACLE_REUSE_TAB=1 oracle --engine browser \
  --remote-chrome localhost:9222 \
  -p "[Your reasoning prompt]" \
  --file ~/research/<topic>/plan.md \
  --model gpt-5.2-pro --wait \
  2>&1 | tee ~/research/<topic>/findings/analysis.md

Type D: Cross-Client Comparison

"How do other clients implement X?"

Best tool: Sub-agent (surveyor) — can search GitHub, read code

sessions_spawn task:"Survey how Prysm, Lighthouse, Teku, and Nimbus handle [topic].
Compare approaches, identify patterns. Write to ~/research/<topic>/findings/cross-client.md"
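The routing described for Types A through D can be summarized as a lookup. This is only a sketch: route_tool is a hypothetical name, and the strings it prints restate the recommendations above rather than invoke anything.

```shell
# Hypothetical lookup: map a research type (A/B/C/D) to the tool this
# skill recommends. Returns 2 for an unknown type.
route_tool() {
  case "$1" in
    A) echo "o4-mini-deep-research (API, needs approval); fallback: sub-agent + web_search" ;;
    B) echo "Codex CLI; fallback: Claude CLI or sub-agent" ;;
    C) echo "GPT-5.2 Pro via Oracle browser mode" ;;
    D) echo "sub-agent (surveyor)" ;;
    *) echo "unknown research type: $1" >&2; return 2 ;;
  esac
}

# Example: annotate each sub-question in the plan with its tool.
#   route_tool C
```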

Workflow

Phase 0: Scoping (5-10 min) — MANDATORY

Before any research begins, return to the human with:

Problem statement — your understanding of what's being asked
Decomposition — 3-5 sub-questions, each classified by type (A/B/C/D)
Tool routing — which model/agent handles each sub-question and why
Assumptions — anything you'd need to assume if not clarified
Cost estimate — if using API models (deep research, GPT-5.2 Pro API), estimate token cost
Estimated time — rough estimate based on complexity
Clarifying questions — anything ambiguous or underspecified

Wait for approval before proceeding. Especially important when API-cost models are proposed.
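For the cost estimate, the arithmetic is tokens times the per-1M price, summed over input and output. A rough sketch follows; estimate_cost is a hypothetical helper and the token counts in the example are made up, so substitute your own expectations and the current prices.

```shell
# Hypothetical helper: estimate API cost in USD.
#   $1 = input tokens      $2 = output tokens
#   $3 = input price per 1M tokens   $4 = output price per 1M tokens
estimate_cost() {
  awk -v it="$1" -v ot="$2" -v ip="$3" -v op="$4" \
    'BEGIN { printf "%.2f\n", (it * ip + ot * op) / 1000000 }'
}

# Example with assumed token counts and the o4-mini-deep-research prices
# quoted in this skill:
#   estimate_cost 200000 50000 1.10 4.40   # prints 0.44
```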

Phase 1: Decomposition (5 min)

Once approved, finalize the research plan:

Break the topic into 3-5 independent sub-questions
Classify each sub-question by research type (A/B/C/D — see above)
Assign each to the best agent/tool based on classification

Create the research workspace:

mkdir -p ~/research/<topic-slug>/{findings,drafts}

Write the research plan to ~/research/<topic-slug>/plan.md
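The workspace setup can be wrapped so that re-running it never clobbers an existing plan. A minimal sketch, assuming a hypothetical RESEARCH_ROOT override (the skill itself always uses ~/research) and a hypothetical function name:

```shell
# Hypothetical helper: create the workspace layout for a topic slug and a
# stub plan.md if none exists. Prints the workspace path.
init_workspace() {
  [ -n "$1" ] || { echo "usage: init_workspace <topic-slug>" >&2; return 1; }
  # RESEARCH_ROOT is an assumed override for testing; default is ~/research.
  iw_dir="${RESEARCH_ROOT:-$HOME/research}/$1"
  mkdir -p "$iw_dir/findings" "$iw_dir/drafts" || return 1
  # Never overwrite an existing plan.
  [ -f "$iw_dir/plan.md" ] || printf '# Research plan: %s\n' "$1" > "$iw_dir/plan.md"
  echo "$iw_dir"
}

# Example:
#   ws="$(init_workspace my-topic)" && "${EDITOR:-vi}" "$ws/plan.md"
```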

Phase 2: Parallel Investigation (15-30 min)

Launch all sub-questions simultaneously. Use the routing from Phase 1.

Example mixed investigation:

# Type A: Web survey (sub-agent with web_search, or deep research API if approved)
sessions_spawn task:"Research [web question]. Write to ~/research/<topic>/findings/web-survey.md"

# Type B: Code analysis (Codex or sub-agent)
sessions_spawn task:"Analyze [code question] in ~/lodestar/... Write to ~/research/<topic>/findings/code-analysis.md"

# Type C: Deep reasoning (Oracle browser mode)
ORACLE_REUSE_TAB=1 oracle --engine browser --remote-chrome localhost:9222 \
  -p "[reasoning question]" --model gpt-5.2-pro --wait \
  2>&1 | tee ~/research/<topic>/findings/oracle-analysis.md

# Type D: Cross-client survey (sub-agent)
sessions_spawn task:"Survey other clients on [topic]. Write to ~/research/<topic>/findings/cross-client.md"

Wait for all agents to complete before proceeding.
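For the parts of the investigation that are plain shell commands (Oracle or Codex calls, not sessions_spawn, which is an agent tool), launching in parallel and waiting for completion can be sketched with background jobs. run_parallel is a hypothetical helper name.

```shell
# Hypothetical helper: run each argument as a shell command in the
# background, wait for all of them, and return how many failed.
run_parallel() {
  rp_pids=""
  for rp_cmd in "$@"; do
    sh -c "$rp_cmd" &
    rp_pids="$rp_pids $!"
  done
  rp_failed=0
  for rp_pid in $rp_pids; do
    # wait returns the exit status of that specific background job.
    wait "$rp_pid" || rp_failed=$((rp_failed + 1))
  done
  return "$rp_failed"
}

# Example (each command should write its own findings file):
#   run_parallel "first command | tee findings/one.md" \
#                "second command | tee findings/two.md"
```

A non-zero return tells you how many investigations need a retry before synthesis starts.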

Phase 3: Synthesis (10-15 min)

Read all findings from ~/research/<topic>/findings/
Identify:
- Common themes across sources
- Contradictions or disagreements
- Gaps in coverage
- Surprising or novel findings
Write a draft document to ~/research/<topic>/drafts/v1.md using the output template (see below)
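Reading all findings is easier when they are gathered into one stream with the source file marked. The following is an illustrative sketch only; bundle_findings and the header format are assumptions, not part of the skill.

```shell
# Hypothetical helper: print every findings/*.md file under a workspace,
# each preceded by a header naming the file it came from.
bundle_findings() {
  for bf_file in "$1"/findings/*.md; do
    [ -f "$bf_file" ] || continue   # glob matched nothing
    printf '===== %s =====\n' "$(basename "$bf_file")"
    cat "$bf_file"
    printf '\n'
  done
}

# Example:
#   bundle_findings "$HOME/research/my-topic" > /tmp/all-findings.md
```

Keeping the file name next to each block makes it simpler to trace a claim in the draft back to the investigation that produced it.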

Phase 4: Adversarial Critique (10-15 min)

Send the draft through adversarial review. Use two different perspectives:

Adversary #1 — GPT-5.2 Pro (via Oracle):

ORACLE_REUSE_TAB=1 oracle --engine browser \
  --remote-chrome localhost:9222 \
  -p "You are a rigorous adversarial reviewer. Find weaknesses, gaps, and flawed reasoning.

For each section:
1. Challenge the key claims — are they well-supported?
2. Identify missing perspectives or counterarguments
3. Point out logical gaps or unsupported leaps
4. Suggest what additional evidence would strengthen weak points
5. Rate confidence: HIGH / MEDIUM / LOW for each major conclusion

Be constructive but ruthless." \
  --file ~/research/<topic>/drafts/v1.md \
  --model gpt-5.2-pro --wait \
  2>&1 | tee ~/research/<topic>/drafts/critique.md

Adversary #2 — Claude Sonnet (different model family):

sessions_spawn task:"Review this research document as a devil's advocate.
Read ~/research/<topic>/drafts/v1.md
Challenge every assumption. Find what's missing. Identify risks.
Write critique to ~/research/<topic>/drafts/critique-2.md"
model:"anthropic/claude-sonnet-4-5" thinking:"high"
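Before moving on to revision, it can help to confirm that both critiques actually landed on disk. A small sketch, assuming the two file names used above; critiques_ready is a hypothetical helper.

```shell
# Hypothetical check: succeed only if both critique files exist and are
# non-empty in <workspace>/drafts.
critiques_ready() {
  cr_status=0
  for cr_name in critique.md critique-2.md; do
    if [ ! -s "$1/drafts/$cr_name" ]; then
      echo "missing or empty: $1/drafts/$cr_name" >&2
      cr_status=1
    fi
  done
  return "$cr_status"
}

# Example:
#   critiques_ready "$HOME/research/my-topic" || echo "re-run the missing reviewer"
```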

Phase 5: Revision (5-10 min)

Read both critiques
Address valid criticisms — strengthen weak arguments, add missing perspectives
Mark unresolvable disagreements as "Open Questions"
Write final document to ~/research/<topic>/output.md
If critiques revealed fundamental gaps, loop back to Phase 2 for targeted investigation

Phase 6: Delivery

Present the final document to the human
Highlight:
- Key findings / recommendations
- Confidence levels for major conclusions
- Open questions that need human judgment
- Suggested next steps
Save to ~/research/<topic>/output.md (and any supplementary materials)

ChatGPT Deep Research (Manual Mode)

ChatGPT's built-in Deep Research feature is the most powerful option for web-based research with citations. It's included in the Pro subscription (free to use) but cannot be automated via Oracle — it requires manual interaction in the browser.

When to suggest it: For major web literature reviews where the human is available to trigger it manually.

How it works:

Open chatgpt.com → select "Deep Research" from the model/agent dropdown
Enter the research question
Review and optionally edit the research plan
Wait 5-30 min for results (it browses, reads, synthesizes automatically)
Get a documented report with citations

When our skill is better: When research involves local code/specs, needs code execution, or requires custom agent coordination. ChatGPT Deep Research can't read our repos or run tests.

Hybrid approach: For mixed research, suggest the human triggers Deep Research for the web survey portion, then feed those results into our Phase 3 synthesis alongside our code/spec findings.

Output Template

# Research: [Topic Title]

**Date:** YYYY-MM-DD
**Requested by:** [who]
**Duration:** [time spent]
**Confidence:** HIGH / MEDIUM / LOW
**Models used:** [list models/tools used for each phase]

## Executive Summary
[2-3 paragraph summary of findings and recommendations]

## Problem Statement
[Clear definition of what was researched and why]

## Prior Art / Related Work
[What exists, who's done what, relevant papers/EIPs/implementations]

## Analysis
### [Sub-topic 1]
[Findings, evidence, reasoning]

### [Sub-topic 2]
[Findings, evidence, reasoning]

### [Sub-topic N]
[Findings, evidence, reasoning]

## Cross-Client Comparison (if applicable)
| Aspect | Lodestar | Lighthouse | Prysm | Teku |
|--------|----------|------------|-------|------|
| ...    | ...      | ...        | ...   | ...  |

## Proposed Approach
[Recommended solution/direction with justification]

### Alternatives Considered
[Other approaches and why they were rejected]

### Tradeoffs
[Explicit tradeoffs of the proposed approach]

## Implementation Sketch (if applicable)
[High-level design, key interfaces, data flow]

## Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|------------|
| ...  | ...       | ...    | ...        |

## Open Questions
[Things that couldn't be resolved and need human judgment or further research]

## Sources
[Links, references, citations]

Model Selection Guide

| Role | Best Model | Fallback | Why |
|------|------------|----------|-----|
| Scoping | Opus (me) | none | Needs judgment about what matters |
| Web survey | `o4-mini-deep-research` (API) | Sub-agent + web_search (free) | Purpose-built for web research with citations |
| Deep web research | `o3-deep-research` (API) | GPT-5.2 Pro (browser) | Most powerful web research model |
| Code/spec analysis | Codex CLI (xhigh) | Claude CLI / sub-agent | Best for long-horizon code investigation |
| Deep reasoning | GPT-5.2 Pro (Oracle browser) | GPT-5.2 Pro (API, with approval) | Strongest reasoning for novel analysis |
| Cross-client survey | Sub-agent (surveyor) | none | Needs GitHub access, code reading |
| Adversary #1 | GPT-5.2 Pro (Oracle browser) | none | Strongest adversarial reasoning |
| Adversary #2 | Claude Sonnet (thinking:high) | none | Different model family = different blind spots |
| Synthesis | Opus (me) | none | Quality control, coherent narrative |
| Manual deep research | ChatGPT Deep Research (browser) | none | Most powerful but requires human to trigger |

Cost Reference

| Model | Input | Output | Notes |
|-------|-------|--------|-------|
| GPT-5.2 Pro (browser) | Free | Free | Pro subscription, via Oracle bridge |
| GPT-5.2 Pro (API) | ~$0.03/query | ~$0.09/query | Needs user approval |
| `o4-mini-deep-research` | $1.10/1M | $4.40/1M | Cheaper deep research |
| `o3-deep-research` | $10/1M | $40/1M | Most powerful deep research |
| Sub-agents (Claude) | Session cost | Session cost | Included in OpenClaw |

Rules:

Always try free options first: web_search + web_fetch + sub-agents. Only escalate to API models if manual search is genuinely insufficient.
o3-deep-research and o4-mini-deep-research require explicit approval from Nico before every use. These are expensive — ask first, explain why manual search wasn't enough, and get a "yes" before running.
GPT-5.2 Pro browser mode is free (Pro subscription) — use it freely for reasoning tasks.
GPT-5.2 Pro API mode (~$0.09/query) — acceptable for occasional use but prefer browser mode.

Self-Healing

If something fails during research:

Oracle browser mode fails (token expired): Alert user immediately. Do NOT silently fall back to API. Only use --engine api with explicit user approval.
Oracle bridge won't start: Kill stale processes (pkill -f "chromium.*headless"), check ~/.oracle/chatgpt-cookies.json exists, reinstall browser if needed (python3 -m rebrowser_playwright install chromium). See skills/oracle-bridge/SKILL.md for full troubleshooting.
Oracle completely unavailable (no bridge, no API key): Fall back to sub-agents with thinking:high for deep reasoning.
Deep research API model fails: Fall back to sub-agent + web_search approach (free, just slower).
Web search returns nothing: Try alternative search queries, check specific repos/forums directly.
Sub-agent times out: Retry with a narrower scope or split the task.
Source contradictions: Document both perspectives, flag for human judgment.
Scope creep: If a sub-question opens up a rabbit hole, note it in "Open Questions" rather than derailing the main research.

After each research run, update this skill:

If a tool/approach consistently fails, document the failure and alternative
If a new tool or source proves valuable, add it to the workflow
If the output template needs adjustment based on feedback, update it

Iteration

Research is rarely one-shot. The skill supports iterative deepening:

"Go Deeper" Loop

When the human says "go deeper on X":

Extract the specific area from the previous output
Re-enter at Phase 1 with a narrowed scope focused on X
Use previous findings as context for the new investigation
Produce an updated document that integrates both rounds

Follow-up Research

When new information emerges after initial research:

Read the previous output from ~/research/<topic>/output.md
Identify what's changed or what new information is available
Run targeted Phase 2 investigation on the delta
Revise the document (don't start from scratch)

Research Chains

Some topics naturally lead to follow-up questions:

After delivering output, explicitly note "This research suggests the following follow-up investigations: ..."
The human can trigger any of these as new research tasks
Link related research documents together via references

Oracle Quick Reference

source ~/.nvm/nvm.sh && nvm use 22

# --- BROWSER MODE (default — uses ChatGPT Pro subscription, free) ---

# 1. Ensure bridge is running (see skills/oracle-bridge/SKILL.md)
curl -s http://localhost:9222/json/version | grep -q Chrome || {
  echo "Start bridge first!"
  echo "source ~/camoufox-env/bin/activate"
  echo "python3 ~/.openclaw/workspace/research/oracle-bridge-v3.py --cookies ~/.oracle/chatgpt-cookies.json &"
}

# 2. Run queries
ORACLE_REUSE_TAB=1 oracle --engine browser \
  --remote-chrome localhost:9222 \
  -p "Your prompt" --file path/to/context.md \
  --model gpt-5.2-pro --wait \
  2>&1 | tee ~/research/<topic>/findings/analysis.md

# --- API MODE (fallback — costs per query, needs user approval) ---

# Standard reasoning
oracle --engine api -p "Your prompt" --file context.md --model gpt-5.2-pro \
  2>&1 | tee ~/research/<topic>/findings/analysis.md

# Deep research (web survey with citations — needs approval for API cost)
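oracle --engine api -p "Research [topic] comprehensively" --model o4-mini-deep-research \
  2>&1 | tee ~/research/<topic>/findings/web-survey.md
oracle --engine api -p "Research [topic] comprehensively" --model o3-deep-research \
  2>&1 | tee ~/research/<topic>/findings/web-survey.md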

# Dry run (preview without spending tokens)
oracle --dry-run summary -p "Your prompt" --file context.md

Browser mode: Requires oracle-bridge running + valid session token at ~/.oracle/chatgpt-cookies.json.
API mode: Requires OPENAI_API_KEY (set in ~/.bashrc). Only use as explicit fallback with user approval.

See skills/oracle-bridge/SKILL.md for full bridge setup, troubleshooting, and token refresh.

Notes

Always create ~/research/<topic-slug>/ for each research task — keeps outputs organized and referenceable
Save intermediate findings — if a session crashes, you don't lose work
Time-box phases — if Phase 2 is taking >30 min, wrap up what you have and move to synthesis
Human in the loop — Phase 0 (scoping) is mandatory. Don't skip it, even for "obvious" topics
Classify before routing — the Research Type Classification section is the key improvement. Use it.
Quality > Speed — this skill is designed for depth, not quick answers. Take the time needed.
ChatGPT Deep Research — suggest it for major web surveys when the human is available. It's the most powerful option and free with Pro sub.
