Browser automation for web testing, form filling, screenshots, and data extraction. Use when navigating websites, interacting with web pages, filling forms, taking screenshots, testing web applications, or extracting information from web pages.
Install
npx skillscat add bentossell/skill-agent-browser Install via the SkillsCat registry.
SKILL.md
Browser Automation with agent-browser
Headless browser automation CLI for AI agents. Fast Rust CLI with Node.js fallback.
Installation
npm install -g agent-browser
agent-browser install # Download ChromiumQuick Start
agent-browser open <url> # Navigate to page
agent-browser snapshot -i # Get interactive elements with refs
agent-browser click @e1 # Click element by ref
agent-browser fill @e2 "text" # Fill input by ref
agent-browser close # Close browserCore Workflow
- Navigate:
agent-browser open <url> - Snapshot:
agent-browser snapshot -i(returns elements with refs like@e1,@e2) - Interact using refs from the snapshot
- Re-snapshot after navigation or significant DOM changes
Commands
Navigation
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browserSnapshot (Page Analysis)
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -c # Compact output
agent-browser snapshot -d 3 # Limit depth to 3Interactions (Use @refs from Snapshot)
agent-browser click @e1 # Click
agent-browser dblclick @e1 # Double-click
agent-browser fill @e1 "text" # Fill input
agent-browser type @e1 "text" # Type character by character
agent-browser select @e1 "option" # Select dropdown option
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser hover @e1 # Hover over element
agent-browser focus @e1 # Focus element
agent-browser press Enter # Press key
agent-browser scroll @e1 down 200 # Scroll elementScreenshots & Media
agent-browser screenshot page.png # Full page screenshot
agent-browser screenshot @e1 element.png # Element screenshot
agent-browser pdf output.pdf # Save page as PDFGetting Data
agent-browser get text @e1 # Get element text
agent-browser get value @e1 # Get input value
agent-browser get attr @e1 href # Get attribute
agent-browser get html @e1 # Get inner HTML
agent-browser eval "document.title" # Run JavaScriptWaiting
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --url "**/dashboard" # Wait for URL pattern
agent-browser wait --text "Success" # Wait for text to appear
agent-browser wait @e1 # Wait for element
agent-browser wait @e1 --hidden # Wait for element to hideSessions (Parallel Browsers)
agent-browser --session test1 open site-a.com
agent-browser --session test2 open site-b.com
agent-browser session listAuthentication State
agent-browser state save auth.json # Save cookies/storage
agent-browser state load auth.json # Restore stateExample: Form Submission
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i # Check resultExample: Login and Save State
# Login once
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
# Later: restore authenticated session
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboardExample: Content Review
Review a web page by taking snapshots and interacting:
# Open page to review
agent-browser open http://localhost:3456/cookbook/what-is-cli/
# Get interactive elements
agent-browser snapshot -i
# Output shows buttons, links, inputs with refs
# Test interactive elements
agent-browser click @e75 # Click a suggestion button
# Take screenshot for visual review
agent-browser screenshot review.png
# Re-snapshot to verify state changed
agent-browser snapshot -iAlternative Selectors
When refs aren't available, use CSS selectors or semantic search:
agent-browser click "#submit"
agent-browser fill "#email" "test@example.com"
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"JSON Output
Add --json for machine-readable output:
agent-browser snapshot -i --json
agent-browser get text @e1 --jsonDebugging
agent-browser open example.com --headed # Show browser window
agent-browser console # View console messages
agent-browser errors # View page errorsTips
- Always snapshot first - Get refs before interacting
- Re-snapshot after navigation - Refs change when page updates
- Use
-iflag - Interactive-only snapshots are cleaner - Save auth state - Avoid re-logging in for each session
- Use
--json- When parsing output programmatically