xcdd

openclaw-docker-linux

"Install and troubleshoot OpenClaw on Linux using Docker Engine and Docker Compose v2, including remote SSH execution, model configuration generation (`openclaw.json`), strict checklist-driven autonomous debugging, mandatory CLI command coverage (`models list`, `gateway restart`, `tui`), local CLI Control UI validation, and live incident recovery. Use when users ask for Docker-based OpenClaw setup, first-run pairing, dashboard token setup, model/provider onboarding, or fixes for pairing/auth errors such as unauthorized and disconnected (1008): pairing required."


Install

npx skillscat add xcdd/openclaw-docker-install-skill

Install via the SkillsCat registry.

SKILL.md

OpenClaw Docker Linux

Use this skill to deliver a reliable Linux Docker installation flow for OpenClaw, execute on remote Linux hosts over SSH, and close incidents with evidence-based debugging.

Workflow

  0. Initialize checklist and execution guardrails (mandatory):
  • Treat ./CHECKLIST.template.md as read-only.
  • Before any runtime command, create a run file under ./runs/ and write progress there.
  • Language gate (mandatory): checklist template and all run files must use Simplified Chinese (简体中文) for descriptive text, section headers, evidence notes, and next-action fields. Status values (PENDING, IN_PROGRESS, PASS, etc.) and command/code blocks remain as-is.
  • Do not generate unrelated files in workspace. Write only to the run checklist file unless user explicitly asks otherwise.
  • Do not stop at partial progress. Continue until all mandatory gates pass.
  • Only pause for human-required actions (for example: SSH password/MFA/host-key, external firewall console, browser-only verification).
  • Enforce strict sequence: execute mandatory steps in numeric order only (1 -> 2 -> 3 ...).
  • Step N cannot be marked IN_PROGRESS or PASS before Step N-1 is PASS.
  • Hard execution gate: Step N commands/evidence collection must not run before Step N-1 is PASS.
  • Before each command batch, read the run checklist and identify exactly one current mandatory step.
  • Do not run commands from two different mandatory steps in one batch; finish current step evidence, update checklist status, then move on.
  • While current step is not PASS, only run commands that remediate the current step (plus allowed recovery probes).
  • If a later-step command is executed early, treat as a sequence violation: discard that evidence for completion, reset first invalid step and all downstream mandatory steps to PENDING, and resume from the first invalid step.
  • Recovery probe exception: when session state is unknown, only run the minimal recovery probes from references/checklist-autonomy-and-recovery.md; do not treat probe output as later-step completion evidence.
  • If out-of-order updates are detected, reset the first invalid step and all downstream mandatory steps to PENDING, then resume from the first invalid step.
  • Docker boundary hard guard (incident hardening):
    • For Docker deployments, do not install/upgrade host-level openclaw CLI or host npm global tools automatically.
    • Do not run host package-manager mutations (npm install -g ..., pnpm add -g ..., etc.) unless user explicitly requests host-side migration and confirms they accept host-environment changes.
    • Any host-side interactive onboarding command (for example feishu-plugin-onboard install) must be executed by the user in their own terminal via Operator Relay mode; agent should provide commands and validate results only.
  • Track conditional optional gates explicitly in the run checklist:
    • O1 IM plugin onboarding (Feishu/Mattermost/MS Teams).
    • O2 ClawHub skill-pack installation.
    • O3 Memory Search enablement and validation.
  • Evidence quality gate (mandatory for every step):
    • Record exact command(s), timestamp (UTC), key output, and the decision taken from that output.
    • Distinguish fact vs interpretation explicitly when root-cause is not yet confirmed.
  • Secret hygiene gate (mandatory for checklist and chat evidence):
    • Never write raw API keys/tokens/passwords/private keys in run files.
    • Use redacted placeholders such as __OPENCLAW_REDACTED__.
    • Never copy redacted placeholders back into runtime config as actual credential values.
  • Promotion rule: if user explicitly requests O1, O2, or O3 in current run, that gate is promoted to required-for-closure and cannot remain PENDING at completion.
  • Optional-gate order rule (mandatory): O1/O2/O3 commands must run only after checklist mandatory Steps 1-10 are PASS.
  • If O1/O2/O3 commands are executed before mandatory Steps 1-10 are PASS, treat as sequence violation: discard optional evidence for completion, finish mandatory chain first, then re-run optional gates.
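
The run-file, evidence, and secret-hygiene rules above can be sketched as a few bash helpers. This is a minimal sketch and not part of the skill: the function names (new_run_file, redact, log_evidence), the run-file naming scheme, and the secret patterns that redact matches are all assumptions made for illustration.

```shell
# Hypothetical helpers (bash); names and file layout are illustrative assumptions.

# Copy the read-only template into ./runs/ and print the new run-file path.
new_run_file() {
  local stamp run
  stamp="$(date -u +%Y%m%dT%H%M%SZ)"
  run="./runs/run-${stamp}.md"
  mkdir -p ./runs
  cp ./CHECKLIST.template.md "$run"
  printf '%s\n' "$run"
}

# Replace bearer tokens and *API_KEY= / *TOKEN= / *PASSWORD= values with the
# placeholder named in the skill. The patterns are assumptions; extend as needed.
redact() {
  sed -E \
    -e 's/(Bearer )[A-Za-z0-9._-]+/\1__OPENCLAW_REDACTED__/g' \
    -e 's/((API_KEY|TOKEN|PASSWORD)=)[^[:space:]]+/\1__OPENCLAW_REDACTED__/g'
}

# Append one evidence record to the run file. Field labels are written in
# Simplified Chinese to respect the language gate; commands stay as-is.
log_evidence() {
  local run="$1" cmd="$2" output="$3" decision="$4"
  {
    printf -- '- 时间 (UTC): %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    printf '  命令: %s\n' "$(printf '%s' "$cmd" | redact)"
    printf '  关键输出: %s\n' "$(printf '%s' "$output" | redact)"
    printf '  决策: %s\n' "$decision"
  } >> "$run"
}
```

A typical call would be log_evidence "$run" "docker compose ps" "$key_output" "继续", where the decision text is supplied by the caller. Pattern-based redaction can miss secrets in unexpected formats, so review the recorded output before sharing it.
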
  1. Confirm remote access and execution mode:
  • Require user to store SSH credentials locally in ~/.ssh (config, private key, known_hosts).
  • Tell user the exact local path before asking for alias:
    • Linux/macOS: ~/.ssh/
    • Windows: %USERPROFILE%\.ssh\ (for example, C:\Users\<username>\.ssh\)
  • Clarify password handling: OpenSSH config does not support a Password field; store passwords in local password manager, not in chat.
  • Never request or accept plaintext passwords/private keys in chat.
  • Ask only for an SSH host alias (for example, openclaw-prod) defined in ~/.ssh/config.
  • Enforce connection gate: never run ssh <alias> immediately after receiving alias.
  • Resolve and show alias target first (hostname, user, port, identity file) and ask explicit user confirmation before connecting.
  • Support two auth paths:
    • Key auth (preferred).
    • Password auth with local-only handling (interactive prompt or local password file flow from reference).
  • If password login is the only available entry path, run a one-time bootstrap: login by password in user terminal, add local public key to remote ~/.ssh/authorized_keys, then switch to key auth.
  • Enforce execution mode split:
    • Non-interactive commands can run by agent.
    • Any interactive SSH flow (password prompt, host-key yes/no, sudo prompt) must run in user terminal via Operator Relay mode.
  • Prefer SSH key auth and persistent shell options.
  • Start a stable remote session before install/debug work.
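
One way to implement the connection gate is to resolve the alias with ssh -G, which evaluates the local SSH configuration and prints the effective settings without opening a connection. The sketch below is illustrative only; the function names are hypothetical, and identity-file paths that contain spaces would be truncated by the simple awk filter.

```shell
# Print only the fields the connection gate asks the user to confirm.
# Input: the output of `ssh -G <alias>` (lowercase "key value" lines).
summarize_ssh_target() {
  awk '$1 == "hostname" || $1 == "user" || $1 == "port" || $1 == "identityfile" { print $1 "=" $2 }'
}

# Resolve an alias from ~/.ssh/config without connecting to the host.
show_alias_target() {
  ssh -G "$1" | summarize_ssh_target
}
```

Show the result of show_alias_target openclaw-prod to the user and wait for explicit confirmation before any ssh openclaw-prod call. For the one-time password bootstrap, ssh-copy-id is a common way to add the local public key to the remote authorized_keys; because it prompts for the password, it belongs in the user's own terminal.
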
  2. Confirm Docker prerequisites:
  • Ask for distro only if installation commands are needed.
  • Verify docker --version and docker compose version.
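
The prerequisite check can be sketched as below. The helper names are hypothetical; docker compose version --short prints only the Compose version number, which makes the v2 check easy to script.

```shell
# Return success when a version string such as "2.24.5" or "v2.24.5"
# has major version 2 or higher.
is_compose_v2() {
  local v="${1#v}"
  local major="${v%%.*}"
  case "$major" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$major" -ge 2 ]
}

# Fail fast when Docker Engine or the Compose v2 plugin is missing.
check_docker_prereqs() {
  docker --version || return 1
  local compose_version
  compose_version="$(docker compose version --short)" || return 1
  is_compose_v2 "$compose_version"
}
```
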
  3. Install OpenClaw with Docker:
  • Prefer official repo + docker-setup.sh.
  • If needed, set OPENCLAW_IMAGE=ghcr.io/openclaw/openclaw:latest before setup.
  • Before any image download (docker pull, docker compose pull, docker compose up --pull, docker-setup.sh), ask user whether to use a mirror accelerator.
  • For users in mainland China, present both Docker Hub mirrors and GHCR mirrors from reference docs and wait for user choice before continuing.
  • Enforce image dedupe guard before download: inspect existing OpenClaw image inventory and skip pull when target image already exists and no refresh was requested.
  • For image pull/build/deploy waiting, do not set command timeout. Use blocking execution and log streaming until completion.
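
The image dedupe guard can be sketched with docker image inspect, which exits non-zero when the image is not present locally. The function name and the "yes" refresh argument are assumptions for illustration, and the mirror question from the step above still has to be asked before this function is called.

```shell
# Pull only when the image is absent locally or a refresh was requested.
# No timeout wrapper is applied, in line with the blocking-execution rule.
pull_if_missing() {
  local image="$1" refresh="${2:-no}"
  if [ "$refresh" != "yes" ] && docker image inspect "$image" >/dev/null 2>&1; then
    echo "skip: $image already present"
    return 0
  fi
  docker pull "$image"
}
```

For example, pull_if_missing ghcr.io/openclaw/openclaw:latest skips the download when the image already exists, while passing yes as the second argument forces a refresh.
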
  4. Generate and validate model configuration (openclaw.json):
  • Collect provider/model information with openclaw configure --section models (preferred) or provider-specific model listing commands.
  • For CLI-driven checks, run openclaw models list --all --provider <provider> --json and confirm the provider returns model entries.
  • Verify resulting defaults and provider entries with openclaw config get ... --json.
  • Validate with openclaw doctor and openclaw models status --json before continuing.
  • For Docker deployments, enforce in-container model endpoint reachability before agent probe:
    • Run HTTP checks from inside openclaw-gateway container, not only from host.
    • If host curl succeeds but in-container curl fails, treat as Docker network topology issue (not model slowness).
    • Prefer same-network service DNS (for example http://new-api:3000/) over host LAN IP when model service is containerized.
  • Enforce API-protocol compatibility gate before timeout tuning:
    • If upstream rejects /v1/chat/completions with a message such as "Unsupported legacy protocol" and asks for /v1/responses, treat it as a protocol mismatch, not latency.
    • Set provider API type to openai-responses, then re-run model status + agent probe.
    • Do not spend cycles tuning timeout first when protocol mismatch is confirmed.
  • Enforce model readiness pass gate before marking this stage done:
    • Confirm default primary model via openclaw config get agents.defaults.model.primary --json.
    • If openclaw models status --json includes missingProvidersInUse, it must be empty.
    • Run non-interactive probe openclaw agent --message "Reply exactly with MODEL_SETUP_OK." and require MODEL_SETUP_OK in output.
    • Never treat openclaw models list ... alone as model setup success.
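
The reachability, protocol, and readiness gates above amount to a small decision procedure. The sketch below is one possible encoding: the function names and the returned labels are hypothetical, and it assumes the Compose service is named openclaw-gateway and that curl is available inside that container.

```shell
# Classify endpoint reachability from two curl exit codes (0 = success).
classify_reachability() {
  local host_rc="$1" container_rc="$2"
  if [ "$container_rc" -eq 0 ]; then
    echo "reachable-from-container"
  elif [ "$host_rc" -eq 0 ]; then
    echo "docker-network-topology"
  else
    echo "endpoint-unreachable"
  fi
}

# Classify an upstream error message before any timeout tuning.
classify_upstream_error() {
  case "$1" in
    *"Unsupported legacy protocol"*) echo "protocol-mismatch" ;;
    *"timed out"*)                   echo "timeout" ;;
    *)                               echo "other" ;;
  esac
}

# Final readiness probe: the output must contain the marker.
model_probe_ok() {
  openclaw agent --message "Reply exactly with MODEL_SETUP_OK." 2>&1 | grep -q "MODEL_SETUP_OK"
}

# Collecting the two exit codes (url is a placeholder for the model endpoint):
#   curl -fsS -o /dev/null "$url"; host_rc=$?
#   docker compose exec -T openclaw-gateway curl -fsS -o /dev/null "$url"; container_rc=$?
```

A docker-network-topology result points at the Compose network layout rather than at model latency, which matches the rule that a host-side success alone proves nothing about the container.
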
  5. Run mandatory CLI command validation and TUI acceptance gate:
  • This skill is for remote Linux execution. Do not run OpenClaw runtime commands on the local authoring machine.
  • Checklist hard gate: do not execute checklist Step 6 commands until checklist Step 5 is PASS.
  • Run openclaw models list --all --provider <provider> --json as a real provider reachability check.
  • Run openclaw gateway restart, then verify runtime status with openclaw gateway status --json.
  • In Docker deployments, openclaw gateway restart may print a systemd-context notice (for example, "Gateway service disabled."); treat it as a non-failure only when openclaw gateway status --json confirms rpc.ok=true.
  • Checklist hard gate: do not execute checklist Step 7 (tui) until checklist Step 6 is PASS.
  • Run TUI smoke test with CLI-first execution order:
    • First try the CLI-driven TUI probe with a hang guard. Run timeout 10s openclaw --no-color tui --message "Reply exactly with TUI_SMOKE_OK." --timeout-ms <ms> > /tmp/openclaw-tui-smoke.txt 2>&1 || true, then confirm the marker with grep -q "TUI_SMOKE_OK" /tmp/openclaw-tui-smoke.txt.
    • Default outer timeout for this skill is fixed at 10s to avoid post-success TUI input-box hang.
    • If CLI-driven probe cannot provide usable evidence in current environment, fallback to interactive openclaw tui.
    • Any interactive openclaw tui session must run in user terminal via Operator Relay mode.
  • Use openclaw agent --message ... only as a non-interactive diagnostic fallback.
  • Never declare installation complete until the TUI smoke test passes, and record evidence (commands + output).
  • Do not mark Step 7 as BLOCKED_HUMAN before attempting CLI-driven TUI probe.
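
The gateway check and the TUI probe from this step can be wrapped as follows. This is a sketch under stated assumptions: the function names are hypothetical, jq is assumed to be installed, and rpc.ok is assumed to sit at the top level of the status JSON.

```shell
# Run the CLI-driven TUI probe and report whether the marker was captured.
# The outer 10s timeout is the skill's fixed hang guard; the inner
# --timeout-ms value is left to the caller.
tui_smoke() {
  local inner_ms="$1" capture="${2:-/tmp/openclaw-tui-smoke.txt}"
  timeout 10s openclaw --no-color tui \
    --message "Reply exactly with TUI_SMOKE_OK." \
    --timeout-ms "$inner_ms" > "$capture" 2>&1 || true
  grep -q "TUI_SMOKE_OK" "$capture"
}

# Treat a restart notice as non-fatal only when status reports rpc.ok=true.
# Assumes jq is installed and that rpc.ok is a top-level path in the JSON.
gateway_rpc_ok() {
  openclaw gateway status --json | jq -e '.rpc.ok == true' >/dev/null
}
```

The capture file doubles as evidence for the run checklist. If the TUI echoes the prompt text into the capture, the grep can match the prompt rather than the reply, so inspect the file before recording PASS.
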
  6. Validate Control UI from local CLI and invite human verification:
  • For remote install, start SSH local forwarding and test from local CLI (curl against http://127.0.0.1:18789/) instead of relying only on remote shell checks.
  • Confirm local CLI can load Control UI endpoint successfully before asking user to open browser.
  • If the localhost/tunnel path passes but the LAN URL fails with "device identity required", classify it as a secure-context/origin issue and continue with Step 7 remediation (do not misclassify it as a generic gateway-down failure).
  • Invite user to test Control UI manually and confirm result in chat.
  • Record local CLI gate + human confirmation evidence in checklist before advancing.
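
The local CLI gate can be sketched as below. The function names are hypothetical, and treating any 2xx or 3xx status as "loads successfully" is an assumption. The tunnel command is only printed, because host-key and password prompts are interactive and belong in the user's terminal.

```shell
# Print the forwarding command for the operator to run in their own terminal.
# Port 18789 is the Control UI port used throughout this skill.
tunnel_command() {
  local host_alias="$1" port="${2:-18789}"
  printf 'ssh -N -L %s:127.0.0.1:%s %s\n' "$port" "$port" "$host_alias"
}

# Probe the forwarded endpoint from the local CLI; prints the HTTP status code.
control_ui_status() {
  curl -sS -o /dev/null -w '%{http_code}' --max-time 10 "http://127.0.0.1:${1:-18789}/"
}

# Pass on any 2xx or 3xx response (an assumption about what counts as loaded).
control_ui_gate() {
  case "$(control_ui_status "$@")" in
    2??|3??) return 0 ;;
    *)       return 1 ;;
  esac
}
```

Run control_ui_gate only after the operator confirms the tunnel is up, and ask the user to open the browser once it passes.
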
  7. Auto-remediate post-install Control UI secure-context issues (mandatory):
  • Always run detection checks (logs + UI response text) from the reference before declaring success.
  • Remediation priority:
    • Prefer HTTPS for non-localhost browser access (Tailscale Serve or HTTPS reverse proxy).
    • Prefer localhost access (http://127.0.0.1:18789) via SSH tunnel when remote.
    • If user must keep plain HTTP and accepts risk, set gateway.controlUi.allowInsecureAuth: true (token/password auth only), then re-verify.
  • If user insists on LAN HTTP and accepts break-glass risk, require explicit user approval and use token mode with tokenized URL (http://<lan-host>:18789/#token=<OPENCLAW_GATEWAY_TOKEN>); rotate token and migrate to HTTPS afterward.
  • If allowInsecureAuth still leaves "device identity required" in the current build, require the HTTPS/localhost path unless the user explicitly approves the break-glass token bypass and its risk.
  • Re-validate by opening Control UI and checking logs/UI text again before marking done.
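
The remediation priority follows from how browsers define a secure context: HTTPS origins and localhost origins qualify, while plain HTTP to a LAN address does not. A minimal sketch of that check follows; the function name is hypothetical, and the matcher is deliberately conservative in recognizing only 127.0.0.1 and localhost as loopback hosts.

```shell
# Return success when the URL would be a secure context in a browser:
# any https:// origin, or plain http:// to 127.0.0.1 or localhost.
is_secure_context_url() {
  case "$1" in
    https://*) return 0 ;;
    http://127.0.0.1|http://127.0.0.1[:/]*) return 0 ;;
    http://localhost|http://localhost[:/]*) return 0 ;;
    *) return 1 ;;
  esac
}
```

When the URL the user intends to open fails this check, move to the HTTPS or SSH-tunnel options first, and treat allowInsecureAuth and the tokenized LAN URL as last resorts that need explicit approval.
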
  8. Run live CLI interaction for incidents (autonomous closure required):
  • Keep one terminal for execution and one for logs (docker compose logs -f).
  • Collect snapshots (docker compose ps, docker compose config, service logs) before applying fixes.
  • After every fix, re-check health and pairing status.
  • For model timeouts in Docker mode, always collect dual-side evidence in the same window:
    • openclaw-gateway logs.
    • Upstream model service logs (for example new-api) to confirm whether request reached upstream.
  • If the agent probe changes from "LLM request timed out." to immediate auth/provider errors, treat it as forward progress and continue along the auth/provider remediation path (do not revert to timeout tuning).
  • Do not apply timeout to long-running deploy operations (docker pull, docker build, docker compose up, docker-setup.sh).
  • Keep working until gates pass. Do not stop after reporting an error unless blocked by human-only action.
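
Snapshot and dual-side log collection can be sketched as below. The function names, output file names, and the 200-line tail are assumptions. The sketch also assumes the gateway is the openclaw-gateway Compose service and that the upstream model service runs as a separately named container (new-api in the example above).

```shell
# Capture the pre-fix state of the Compose project into a directory.
collect_snapshot() {
  local dir="$1"
  mkdir -p "$dir"
  date -u +%Y-%m-%dT%H:%M:%SZ > "$dir/collected-at.txt"
  docker compose ps > "$dir/ps.txt" 2>&1
  # `docker compose config` can print resolved environment values; redact
  # before quoting any of it in a run file or in chat.
  docker compose config > "$dir/config.txt" 2>&1
  docker compose logs --no-color --timestamps --tail 200 > "$dir/logs.txt" 2>&1
}

# Collect gateway and upstream logs over the same time window.
dual_side_logs() {
  local dir="$1" since="$2" upstream="$3"
  docker compose logs --no-color --timestamps --since "$since" openclaw-gateway > "$dir/gateway.log" 2>&1
  docker logs --timestamps --since "$since" "$upstream" > "$dir/upstream.log" 2>&1
}
```

For example, dual_side_logs ./snap 10m new-api gathers both sides of the last ten minutes, which is enough to tell whether a timed-out request ever reached the upstream service.
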
  9. Fix common errors:
  • For "unauthorized" or "disconnected (1008): pairing required" errors, run the approval commands from the reference file.
  • For broader Linux Docker install/runtime blockers, use the internet-sourced playbook in references/linux-docker-common-pitfalls-and-best-practices.md and apply the matching fix path.
  • For No API key found for provider "<provider>" with custom providers, populate ~/.openclaw/agents/<agentId>/agent/auth-profiles.json for that provider (or use provider auth commands), then re-check openclaw models status --json.
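
The three fix paths above can be expressed as a simple dispatcher on the error text. This is an illustrative sketch: the function name and the returned labels are hypothetical, and it assumes the pairing approval commands live in references/openclaw-docker-linux.md, as the Commands Reference section suggests.

```shell
# Map an error message to the fix path described in this step.
fix_path_for() {
  case "$1" in
    *"pairing required"*|*"unauthorized"*)
      echo "pairing-approval: see references/openclaw-docker-linux.md" ;;
    *"No API key found for provider"*)
      echo "auth-profile: populate auth-profiles.json, then re-run openclaw models status --json" ;;
    *)
      echo "playbook: see references/linux-docker-common-pitfalls-and-best-practices.md" ;;
  esac
}
```
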
  10. Mandatory closure gate before optional extensions:
  • Confirm checklist mandatory Steps 1-10 are all PASS with evidence.
  • Run final openclaw doctor --json (fallback: openclaw doctor) and ensure no runtime-impacting issue remains.
  • If warnings remain, record explicit non-blocking rationale; any runtime-impacting doctor issue must be fixed before continuing.
  • If O1/O2/O3 were requested in this run, do not declare run complete yet; continue to optional gates.
  11. Optional extension gate O1: install and validate IM channel plugins (Feishu, Mattermost, MS Teams):
  • Hard gate: Step 11 is forbidden until Step 10 is PASS.
  • If user requests Feishu integration, prefer the newer official Feishu onboarding plugin flow first (do not default to stock feishu plugin path):
    • Precheck: target host must be Linux, OpenClaw version should be >= 2026.2.26.
    • Execution boundary: in Docker mode, default path is container-only onboarding (temporary openclaw-cli shell) executed by user in Operator Relay mode.
    • Host-side package install/mutation is fallback-only and requires explicit user confirmation before execution.
    • Install onboarding CLI (feishu-plugin-onboard) and run feishu-plugin-onboard install in the selected path (Docker-first).
    • If the command is missing, provide manual install commands from reference docs and wait for user execution output.
    • If npm registry DNS errors occur (for example EAI_AGAIN), switch registry mirror and retry.
    • When install succeeds, verify openclaw plugins list shows feishu-openclaw-plugin as loaded and stock feishu as disabled.
    • If plugin scanner reports risky patterns during install, require explicit user acknowledgment before keeping plugin enabled.
    • Verify Feishu platform settings include long-connection subscription and event im.message.receive_v1; if card actions are used, also enable card callback on long connection.
    • Complete pairing with openclaw pairing approve feishu <PAIRING_CODE> --notify (or without --notify when unavailable in current build).
  • For Mattermost/MS Teams (or when user explicitly refuses the new Feishu plugin), continue with regular openclaw plugins ... path.
  • For channel configuration/testing, do not assume channels add or legacy message send --channel ... flags exist in all builds; check current CLI help/docs before execution and use plugin-specific onboarding flow when provided.
  • If channel credentials are unavailable, record explicit BLOCKED_HUMAN evidence and exact required credentials.
  • If requested by user in this run, this gate is required for closure (record PASS or explicit user waiver).
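
The Feishu precheck (Linux host, OpenClaw version at least 2026.2.26) can be sketched as below. The function names are hypothetical, and the sketch takes the version string as an argument because the command that reports the installed OpenClaw version is not named in this skill. It relies on sort -V, which GNU coreutils provides.

```shell
# True when dotted version $1 is greater than or equal to $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Precheck for the Feishu onboarding plugin flow.
feishu_precheck() {
  [ "$(uname -s)" = "Linux" ] && version_ge "$1" "2026.2.26"
}
```
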
  12. Optional extension gate O2: install advanced-practice skills via ClawHub:
  • Hard gate: Step 12 is forbidden until Step 10 is PASS.
  • Clarify command split:
    • openclaw skills ... for listing/checking loaded skills.
    • clawhub install ... (or npx clawhub ... when global install is not permitted) for installing additional skill packs.
  • For Docker deployments, prefer non-root npx + managed workdir (--workdir /home/node/.openclaw) instead of global install.
  • If npm install -g clawhub fails with EACCES, do not escalate to root by default; switch to npx -y clawhub ....
  • Use non-interactive search flags when available (for example --no-input) to avoid relay stalls.
  • Ask user to choose from clawhub search results, then install at least 3 selected skills and verify via openclaw skills list + openclaw skills check.
  • If requested by user in this run, this gate is required for closure (record PASS or explicit user waiver).
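
The O2 install flow can be sketched as below. The function name is hypothetical, and the position of --workdir after the install subcommand is an assumption; check the current clawhub help output before relying on it.

```shell
# Install user-selected skill packs without a global npm install, then verify.
# Requires at least three skill names, per the gate above.
install_skill_packs() {
  if [ "$#" -lt 3 ]; then
    echo "need at least 3 skills, got $#" >&2
    return 1
  fi
  local skill
  for skill in "$@"; do
    # Flag placement is an assumption; confirm with the clawhub help output.
    npx -y clawhub install "$skill" --workdir /home/node/.openclaw || return 1
  done
  openclaw skills list && openclaw skills check
}
```

Pass only the skill names the user picked from the clawhub search results; the function stops at the first failed install so that the failure can be recorded as evidence.
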
  13. Post-extension regression gate (run only if Step 11 or 12 executed):
  • Re-verify runtime baseline after optional changes:
    • openclaw gateway restart
    • openclaw gateway status --json
    • openclaw doctor --json (fallback: openclaw doctor)
  • If regression appears, reopen incident flow from Step 8 and do not declare completion.
  14. Recover automatically when commands stop unexpectedly:
  • If session disconnects, command aborts, or state is unclear, run recovery probes from reference docs.
  • Infer the earliest unmet checklist step and resume from that step.
  • This recovery requirement applies even when checklist updates are missing or stale.
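
Once the recovery probes have produced a fresh status per step, finding the resume point is a linear scan. The sketch below assumes the probe results are passed as ordered step:status pairs; that input format and the function name are hypothetical, and the probes themselves are defined in references/checklist-autonomy-and-recovery.md.

```shell
# Given ordered "step:status" pairs, print the earliest step that is not PASS,
# or "none" when every step passed.
first_unmet_step() {
  local pair
  for pair in "$@"; do
    if [ "${pair#*:}" != "PASS" ]; then
      echo "${pair%%:*}"
      return 0
    fi
  done
  echo "none"
}
```

Because the statuses come from probes rather than from the run file, the same scan works when checklist updates are missing or stale.
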
  15. Keep instructions current:
  • Treat Docker/OpenClaw steps as time-sensitive.
  • Re-check official docs when user asks for latest behavior or flags.
  • When command syntax/flags are uncertain, query live docs with openclaw docs <keywords> and open the linked https://docs.openclaw.ai page before proceeding.
  16. Optional extension gate O3 (last): enable and validate Memory Search:
  • Hard gate: Step 16 is forbidden until mandatory Steps 1-10 are PASS.
  • Activate memory plugin slot and agent defaults in openclaw.json:
    • plugins.slots.memory must be memory-core (not none).
    • agents.defaults.memorySearch.enabled must be true.
    • Configure embeddings provider/model under agents.defaults.memorySearch (or remote endpoint when using OpenAI-compatible upstream).
  • Recommended Docker-safe baseline for compatibility:
    • Disable batch embeddings unless upstream confirms Batch API support: agents.defaults.memorySearch.remote.batch.enabled: false.
    • Ensure embeddings endpoint is reachable from inside openclaw-gateway container (prefer same-network DNS like http://new-api:3000/v1/ over host LAN IP when upstream is containerized).
  • File indexing scope and source readiness:
    • Place memory content in MEMORY.md and/or memory/**/*.md.
    • If needed, add extra include roots with agents.defaults.memorySearch.extraPaths.
  • Required acceptance commands/evidence:
    • openclaw memory status --deep --index --verbose (index builds and provider/auth are healthy).
    • openclaw memory search --query "<known phrase from MEMORY.md>" (returns expected hit).
    • openclaw gateway restart after config changes, then re-run memory status/search checks.
  • Credential integrity gate (mandatory):
    • Ensure agents.defaults.memorySearch.remote.apiKey is a real key and not a redacted placeholder string (for example __OPENCLAW_REDACTED__).
    • Keep real key hidden in evidence output.
  • Failure triage priorities:
    • Provider auth/key missing for embeddings.
    • Upstream endpoint supports chat but not /v1/embeddings.
    • Config written at wrong path (must be agents.defaults.memorySearch, not top-level memorySearch).
    • Memory plugin slot misconfigured (plugins.slots.memory not memory-core).
  • Warning-source separation:
    • Errors from web_search toolchain (for example missing Brave Search API key) are not Memory Search failures by themselves.
    • Judge O3 pass/fail only from openclaw memory status/search acceptance evidence.
  • If requested by user in this run, this gate is required for closure (record PASS or explicit user waiver).
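
The credential integrity gate and the acceptance commands can be sketched as below. The function names are hypothetical. The key check takes the value as an argument so that the real key never has to appear in evidence output.

```shell
# Reject an empty value or a redacted placeholder copied back into config.
is_unusable_key() {
  case "$1" in
    ""|*__OPENCLAW_REDACTED__*) return 0 ;;
    *) return 1 ;;
  esac
}

# Acceptance: the index builds, and a known phrase from MEMORY.md is found.
memory_acceptance() {
  local phrase="$1"
  openclaw memory status --deep --index --verbose || return 1
  openclaw memory search --query "$phrase" | grep -qF -- "$phrase"
}
```

Run memory_acceptance once before and once after openclaw gateway restart. If the search command echoes the query in its output, the grep can match that echo, so confirm the hit against the printed results before recording PASS.
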

Commands Reference

Load references/openclaw-docker-linux.md for install and pairing commands.
Load CHECKLIST.template.md as read-only template. Always copy to a run file before writing progress.
Load references/remote-cli-debug.md for SSH, live interaction, and debug workflow.
Load references/openclaw-model-config-and-image-hygiene.md for model/provider data collection, openclaw.json generation, and image duplication checks.
Load references/linux-docker-common-pitfalls-and-best-practices.md for official-doc-sourced Linux Docker deployment pitfalls, fixes, and best practices.
Load references/hello-world-plugins-and-skills.md for mandatory command coverage (models list, gateway restart, tui) and mandatory TUI acceptance gates, plus optional IM plugin onboarding (Feishu/Mattermost/MS Teams) and ClawHub skill-pack installation.
Load references/checklist-autonomy-and-recovery.md for strict non-stop execution policy, checklist run-file rules, and recovery-point logic.
Load references/control-ui-secure-context-and-firewall.md for automatic post-install remediation of secure-context warnings and firewall allow rules.