Git-Fg

browsing-web

"Interactive browser automation using agent-browser. Use when navigating dynamic sites, authentication, clicking, typing, and complex state navigation. Do NOT use for simple read-only text extraction."

Git-Fg 1 Updated 4mo ago

Install

npx skillscat add git-fg/thecattoolkit/browsing-web

Install via the SkillsCat registry.

SKILL.md

Browser Interaction Protocol

Core Loop (The Ref Pattern)

You interact with the browser using References (@refs) derived from snapshots, not CSS selectors.

  1. Navigate: agent-browser open "url"
  2. Snapshot: agent-browser snapshot -i (returns the accessibility tree with @e refs)
  3. Interact: agent-browser click @e1 (uses a ref taken from that snapshot)
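
The three steps above can be sketched as one shell function. This is only an illustration under stated assumptions: `click_by_label` and the `AGENT_BROWSER` override variable are hypothetical names, and the parsing assumes snapshot lines carry a `[ref=eN]` marker like the one in this document's example comment. The real snapshot format may differ.

```shell
# Hypothetical override so the sketch can be dry-run against a stub command.
AGENT_BROWSER="${AGENT_BROWSER:-agent-browser}"

# click_by_label URL TEXT
# Opens URL, takes an interactive snapshot, finds the first snapshot line
# containing TEXT, and clicks the ref found on that line.
# Assumption: snapshot lines contain a `[ref=eN]` marker.
click_by_label() {
  "$AGENT_BROWSER" open "$1" || return 1
  snapshot=$("$AGENT_BROWSER" snapshot -i) || return 1
  # Keep only lines mentioning TEXT, then turn `[ref=e4]` into `@e4`.
  ref=$(printf '%s\n' "$snapshot" | grep -F -- "$2" |
    sed -n 's/.*\[ref=\(e[0-9][0-9]*\)\].*/@\1/p' | head -n 1)
  if [ -z "$ref" ]; then
    echo "no ref found for: $2" >&2
    return 1
  fi
  "$AGENT_BROWSER" click "$ref"
}
```

A call might look like `click_by_label "https://example.com" 'button "Search"'`. The ref is always read from a fresh snapshot and never hard-coded.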

Critical Constraints

  1. Never Guess Refs: Refs such as @e1 come only from a snapshot. You MUST run snapshot to see the current refs before interacting.
  2. Interactive Only: Always use snapshot -i, which filters out non-interactive elements (saves tokens).
  3. Stateful: The browser persists between commands, so you do not need to re-open the page before each one.
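
One way to enforce the first constraint is to check a ref against the latest snapshot text before using it. The helper below is a minimal sketch: `ref_is_current` is a hypothetical name, and it assumes the `[ref=eN]` marker format shown in this document's example comment, which may not match the tool's actual output.

```shell
# ref_is_current REF
# Reads snapshot text on stdin and succeeds only if REF (e.g. @e4) appears
# in it as a `[ref=eN]` marker. Assumption: that marker format.
ref_is_current() {
  # Strip the leading "@" and match the marker literally; the closing
  # bracket keeps e4 from matching e42.
  grep -qF -- "[ref=${1#@}]"
}
```

For example, after saving the output of `agent-browser snapshot -i` to a file, `ref_is_current @e4 < snapshot.txt` would need to succeed before running `agent-browser click @e4`.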

Common Patterns

Navigation & Extraction

agent-browser open "https://google.com"
agent-browser snapshot -i
# Output shows, for example: [ref=e2] textbox "Search" and [ref=e4] button "Search" (refs vary per page)
agent-browser fill @e2 "Claude Code"
agent-browser click @e4
agent-browser wait --load networkidle

Visual Verification

Take a screenshot only if the structure in the snapshot is confusing:

agent-browser screenshot page.png