Control Pirate Dock — a bespoke Docker container for VPN-protected ebook and media downloads. Searches Anna's Archive and torrent indexers behind NordVPN. Auto-downloads via torrent when possible, sends manual download links when automation can't bypass CAPTCHAs.
Install
npx skillscat add djmcnay/araminta-skills-pirate-dock
Install via the SkillsCat registry.
Pirate Dock Skill
Overview
A custom-built Docker container running NordVPN (South Africa, P2P), aria2 for downloads, and Jackett for multi-site torrent search. Two parallel pipelines: Anna's Archive for ebooks, Jackett for torrents/video.
Architecture: Headless first (HTTP scraping), browser fallback via Playwright + Chromium running inside the container when CAPTCHAs, DDoS-Guard, login, or visual confirmation block the normal path. All traffic stays behind NordVPN inside the container. Bridge networking isolates the container's VPN from the host Pi. The human-in-the-loop display is not optional: Minty must be able to send David a URL that shows the container browser.
Project home: ~/Documents/GitHub/pirate-dock/
Container name: pirate-dock
API: http://localhost:9876 (published port — no docker exec needed)
Jackett UI: http://localhost:9118 (published port)
Human browser display: https://araminta.taild3f7b9.ts.net/pirate/vnc_lite.html?path=pirate%2F (public HTTPS via Tailscale Funnel)
Browser fallback: Container-local Playwright/Chromium — launched inside pirate-dock; zero host CDP/browser dependency. Headed Chromium uses container display :1, x11vnc exports it on localhost:5900, websockify bridges VNC→WebSocket on 0.0.0.0:6081 and serves the noVNC HTML5 client from /usr/share/novnc. The path URL parameter ensures WebSocket traffic routes through Tailscale Funnel's /pirate/ prefix.
Tailscale Funnel invariant: https://araminta.taild3f7b9.ts.net/pirate/ must proxy to http://127.0.0.1:6081. Check with sudo tailscale funnel status. Repair with sudo tailscale funnel --bg --https=443 --set-path=/pirate 6081. Use Funnel for browser access, and do not overwrite unrelated root routes on https://araminta.taild3f7b9.ts.net/.
Container Management
Build & start
cd ~/Documents/GitHub/pirate-dock
# ALWAYS use the safe build script (checks disk space)
bash scripts/build.sh
After container start: run.sh auto-whitelists the Docker bridge subnet (172.16.0.0/12) and published ports (9876, 9118, 6081) inside NordVPN's killswitch. This is required for the host Pi and Tailscale Funnel to reach the FastAPI/Jackett/websockify endpoints while NordVPN is active. If you manually change ports or networking, the whitelist must match.
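As a rough illustration of the whitelist described above, the sketch below generates the nordvpn whitelist commands for the subnet and ports named in this document and checks whether a given bridge address falls inside the subnet. The function names are hypothetical and are not taken from run.sh; the real script may be structured quite differently.

```python
import ipaddress

# Values copied from this document; change them if ports or networking change.
BRIDGE_SUBNET = "172.16.0.0/12"
PUBLISHED_PORTS = (9876, 9118, 6081)


def whitelist_commands(subnet: str = BRIDGE_SUBNET, ports=PUBLISHED_PORTS) -> list:
    """Return the nordvpn CLI commands that whitelist the subnet and ports."""
    commands = [f"nordvpn whitelist add subnet {subnet}"]
    commands += [f"nordvpn whitelist add port {port}" for port in ports]
    return commands


def covered_by_whitelist(host_ip: str, subnet: str = BRIDGE_SUBNET) -> bool:
    """True if host_ip lies inside the whitelisted subnet."""
    return ipaddress.ip_address(host_ip) in ipaddress.ip_network(subnet)


if __name__ == "__main__":
    for command in whitelist_commands():
        print(command)
    print(covered_by_whitelist("172.19.0.1"))
```

The subnet check is a quick way to confirm that a Docker bridge gateway address is still covered after a networking change.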
Browser display URL
This is the URL Minty should send by WhatsApp when human intervention is needed:
https://araminta.taild3f7b9.ts.net/pirate/vnc_lite.html?path=pirate%2F
The path=pirate%2F parameter is critical: it tells noVNC to route its WebSocket through /pirate/ so Tailscale Funnel can proxy it correctly. Without it, noVNC connects to the root WebSocket path and Funnel drops it.
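To make the encoding of the path parameter concrete, here is a minimal sketch that assembles the display URL from its parts. The function name and its arguments are hypothetical; only the resulting URL comes from this document.

```python
from urllib.parse import urlencode


def display_url(host: str, prefix: str = "pirate", page: str = "vnc_lite.html") -> str:
    """Build the noVNC URL with the WebSocket path routed through the Funnel prefix."""
    # urlencode percent-encodes "/" as %2F by default, which gives path=pirate%2F.
    query = urlencode({"path": f"{prefix}/"})
    return f"https://{host}/{prefix}/{page}?{query}"


if __name__ == "__main__":
    print(display_url("araminta.taild3f7b9.ts.net"))
```

Building the URL this way keeps the Funnel prefix and the path parameter in sync if the prefix is ever renamed.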
noVNC display stack (inside container)
When automation hits a visual challenge, browser_fallback.py launches Chromium in headed mode on container display :1. run.sh starts the full display stack:
Xvfb :1 → virtual framebuffer (1280x800x24)
x11vnc -display :1 → exports display as VNC on localhost:5900
websockify :6081 :5900 --web=/usr/share/novnc → bridges VNC→WebSocket, serves noVNC HTML
The user connects through https://araminta.taild3f7b9.ts.net/pirate/vnc_lite.html?path=pirate%2F → Tailscale Funnel strips the /pirate/ prefix → reaches websockify on :6081 → bridges to x11vnc on :5900 → displays Xvfb :1 with Chromium visible.
Check status
curl -sf http://localhost:9876/status | python3 -m json.tool
View logs
docker logs pirate-dock --tail 50
Stop
cd ~/Documents/GitHub/pirate-dock
docker compose down
Cleanup (if disk space low)
bash scripts/prune-docker.sh # Standard cleanup
bash scripts/prune-docker.sh --aggressive # Nuclear option
Skill Tests
# Functional tests (run from host — container must be running)
python3 scripts/test.py
# Host isolation safety tests (MUST pass before any docker-compose.yml changes)
python3 scripts/test_isolation.py
Functional tests (test.py) cover:
- Anna's Archive search — finds a book and generates download links
- Jackett torrent search — lists top 10 UFC video results
- Download lifecycle — starts a torrent download, cancels it, deletes partial files
Isolation tests (test_isolation.py) cover:
- Host iptables are never modified by the container (baseline, while running, after stop)
- Host Pi can always reach Discord, GitHub, Google while container runs
- Container is confirmed on bridge networking (not host)
- API and Jackett are accessible via published ports
API Reference (http://localhost:9876)
VPN
| Method | Endpoint | Body | Description |
|---|---|---|---|
| GET | /status | - | VPN + Jackett status |
| POST | /vpn/connect | {"country": "South_Africa"} | Connect VPN |
| POST | /vpn/disconnect | - | Disconnect VPN |
Anna's Archive (eBooks)
| Method | Endpoint | Body | Description |
|---|---|---|---|
| GET | /search/annas-archive?q=... | - | Search books by title/author/ISBN |
| POST | /search/annas-archive | {"query": "..."} | Search (POST version) |
| GET | /download/annas-archive/{md5} | - | Get download info by MD5 |
| POST | /download/annas-archive | {"md5": "..."} | Download by MD5 hash |
| POST | /download/annas-archive/browser | {"md5": "..."} | Browser fallback: navigate with container Playwright, wait for human CAPTCHA solve if needed |
| GET | /download/annas-archive/{md5}/browser | - | Browser fallback (GET convenience) |
| GET | /browser/status | - | Check if container browser stack is available and return the display URL |
Torrent Search (via Jackett)
| Method | Endpoint | Body | Description |
|---|---|---|---|
| GET | /search/torrents?q=... | - | Search all configured indexers |
| GET | /search/piratebay?q=... | - | Search PirateBay only |
| GET | /search/1337x?q=... | - | Search 1337x only |
| GET | /search/ext?q=... | - | Search ext.to only |
| GET | /jackett/indexers | - | List available indexers |
Downloads
| Method | Endpoint | Body | Description |
|---|---|---|---|
| POST | /download/magnet | {"magnet": "..."} | Download via aria2 |
| GET | /downloads/active | - | Running aria2 processes |
| GET | /downloads/list | - | Files in /downloads |
UFC Watch (background poller)
| Method | Endpoint | Body | Description |
|---|---|---|---|
| POST | /watch/ufc | {"event": "UFC 327"} | Start watching for event |
| GET | /watch/ufc | - | Status of all watches |
| DELETE | /watch/ufc/{key} | - | Stop watching |
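The tables above map directly onto plain HTTP calls. The following sketch builds requests with the standard library only; it assumes the API accepts JSON bodies with a Content-Type of application/json, which this document does not state explicitly. The helper name api_request is hypothetical.

```python
import json
from typing import Optional
from urllib.request import Request

API_BASE = "http://localhost:9876"  # published API port from this document


def api_request(method: str, path: str, body: Optional[dict] = None) -> Request:
    """Build (but do not send) a request for one of the endpoints listed above."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    # Assumption: JSON bodies are sent with an application/json content type.
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(f"{API_BASE}{path}", data=data, headers=headers, method=method)


if __name__ == "__main__":
    req = api_request("POST", "/watch/ufc", {"event": "UFC 327"})
    print(req.get_method(), req.full_url, req.data)
```

With the container running, a request built this way can be sent with urllib.request.urlopen(req, timeout=10) and the response decoded with json.load.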
Core Workflow: Book Request
When a user asks for a book (by title, Amazon link, Goodreads link, ISBN, or MD5), follow this sequence:
Step 1: Identify the book
- If given a URL (Amazon, Goodreads): resolve it to get the ISBN/title
- If given a title: use it directly
Step 2: Search both pipelines in parallel (behind NordVPN)
Anna's Archive: GET /search/annas-archive?q=<title+author+isbn>
Torrent: GET /search/torrents?q=<title+author+isbn>
Step 3: Evaluate results
A) Torrent found with seeders ≥ 2 → AUTO-DOWNLOAD
POST /download/magnet {"magnet": "<best_magnet_link>"}
Report: title, size, seeders. The file will land in /downloads inside the container.
B) No torrent with seeders → SEND ANNA'S ARCHIVE LINK
- Report what was found: title, MD5, file size, available formats (EPUB/PDF/MOBI)
- Provide the Anna's Archive page link: https://annas-archive.gl/md5/{md5}
- List the mirror links (.gl, .pk, .gd)
- User clicks the link and completes the download manually
C) Nothing found on either pipeline → report that clearly
Step 4: Report results
Always provide:
- What the book is (title, author, format, size)
- Auto-download status (downloaded / link to click)
- Anna's Archive link (always include, as fallback)
- Torrent details (if applicable: seeders, source indexer)
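The Step 3 branching can be sketched as a small pure function. The field names, the returned dictionary shape, and the choice of "most seeders" as the best match are assumptions made for illustration; the document only fixes the seeders ≥ 2 threshold and the Anna's Archive link format.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_SEEDERS = 2  # threshold from case A above


@dataclass
class Torrent:
    title: str
    seeders: int
    magnet: str


def decide(torrents: List[Torrent], md5: Optional[str]) -> dict:
    """Pick case A (auto-download), B (send link), or C (nothing found)."""
    viable = [t for t in torrents if t.seeders >= MIN_SEEDERS]
    if viable:
        # Assumption: "best" means the result with the most seeders.
        best = max(viable, key=lambda t: t.seeders)
        return {"action": "auto_download", "magnet": best.magnet}
    if md5:
        return {"action": "send_link", "url": f"https://annas-archive.gl/md5/{md5}"}
    return {"action": "not_found"}


if __name__ == "__main__":
    print(decide([Torrent("Example Book", 5, "magnet:?xt=example")], None))
```

Keeping the decision separate from the HTTP calls makes the rule easy to test without a running container.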
Download Reality
What works automatically:
- ✅ Anna's Archive search — reliable scraping of search results
- ✅ Jackett torrent search — TPB, 1337x, LimeTorrents, YTS, EZTV configured
- ✅ aria2 downloads via magnet — fast when seeders are healthy
- ✅ Container infrastructure — VPN, Playwright/Chromium, API, Jackett all operational
What does NOT work automatically:
- ❌ Anna's Archive free book download automation (2026-04-26). The container-local browser stack works, but AA's book page DOM has changed. The "Slow Partner Server" button no longer exists. Z-Library mirrors (.gd, .se, .li) return 503 or redirect to parking pages.
The browser fallback flow (book page navigation) — currently STALLED:
- Headless scraping tries first (fast, no browser needed)
- If blocked by CAPTCHA/DDoS-Guard or no direct links, fall back to browser mode automatically
- Playwright launches Chromium inside pirate-dock; every request stays inside the container's NordVPN tunnel
- Use the working mirror https://annas-archive.gl (the .li mirror was redirecting to parking/spam)
- Match the browser fingerprint to South Africa: timezone Africa/Johannesburg, locale en-GB, user-agent Mozilla/5.0 (X11; Linux aarch64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36
- BLOCKED HERE: AA's Downloads page now shows Z-Library mirrors, but Z-Library itself is down (503). The old "Slow Partner Server" path is gone.
- Human-in-the-loop fallback. When automation fails, Chromium can launch in headed mode on display :1, visible through https://araminta.taild3f7b9.ts.net/pirate/vnc_lite.html?path=pirate%2F. A human can interact with the browser inside the VPN tunnel to solve CAPTCHAs or click download links manually. The script then captures the resulting download URL or file.
- File downloads via curl --insecure --location to /downloads: only works if a valid download URL is found
What does NOT work automatically (known gap):
- ❌ AA free book downloads — need a new download path or redesign to torrent/tor mirrors
- ❌ Title parsing is still flaky in some AA layouts — MD5 extraction works reliably, but titles can still show as "Unknown" when the nested DOM shifts.
Architecture note: The container keeps the browser work inside the same network namespace as NordVPN; no host CDP browser stack is required for the AA flow. Bridge networking is still the right choice. Host networking remains forbidden — see safety note below.
Seeder count caveat: Torznab seeder counts (especially from TPB) may show 0 even when torrents are alive and downloadable. Always try downloading before reporting "no seeders" to the user.
Workflow: Torrent Search (Jackett)
- Jackett runs inside the container with 619 indexer definitions loaded
- Configured public indexers: The Pirate Bay, 1337x, LimeTorrents, YTS, EZTV
- Search: GET /search/torrents?q=UFC+327
- Returns results with title, size, seeders, magnet link
- Download: POST /download/magnet {"magnet": "magnet:..."}
To add more indexers: access Jackett web UI at http://localhost:9118 from a machine that can reach the Pi.
Workflow: UFC Event Watch
POST /watch/ufc {"event": "UFC 327", "quality": "1080"}
- Background poller searches all indexers every 5 min
- Filters: event name match + quality (1080p) + seeders >= 2
- Check status: GET /watch/ufc
- When found: POST /download/magnet with best match
- Stop watching: DELETE /watch/ufc/ufc_327
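The watch filters can be illustrated with a short sketch. The key derivation is inferred from the single example above ("UFC 327" becoming ufc_327) and is an assumption, as are the function names and the normalisation of release titles; the real poller may match differently.

```python
import re


def watch_key(event: str) -> str:
    """Lowercase the event and collapse non-alphanumerics into underscores."""
    return re.sub(r"[^a-z0-9]+", "_", event.lower()).strip("_")


def matches_watch(title: str, seeders: int, event: str,
                  quality: str = "1080", min_seeders: int = 2) -> bool:
    """Apply the three filters: event name, quality, and minimum seeders."""
    # Normalising the title lets "UFC.327" and "UFC 327" compare equal.
    normalised = watch_key(title)
    return (
        watch_key(event) in normalised
        and quality.lower() in normalised
        and seeders >= min_seeders
    )


if __name__ == "__main__":
    print(watch_key("UFC 327"))
    print(matches_watch("UFC.327.Main.Card.1080p.WEB", 5, "UFC 327"))
```

Substring matching is deliberately loose here; a stricter matcher would need token boundaries to avoid confusing neighbouring event numbers.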
Credentials & Config
- NORDVPN_TOKEN: In ~/Documents/GitHub/pirate-dock/.env + bind-mounted as scripts/token.txt
- Jackett API key: Auto-detected on startup from /data/jackett/ServerConfig.json
- Jackett config: Via web UI at http://localhost:9118 (inside container)
- NordVPN default region: South Africa (NordLynx P2P)
- Anna's Archive secret key: In memory — free account, used for authenticated browsing (metadata only, downloads still CAPTCHA-gated)
Known Issues — RESOLVED
- VPN login bug (2026-04-14). Token read from bind-mounted file (/run/pirate-dock/token) instead of env vars (s6-overlay strips env vars).
- Jackett indexers (2026-04-14). TPB, 1337x, LimeTorrents, YTS, EZTV all enabled and returning results.
- Anna's Archive search URL (2026-04-14). AA changed /s?q= to /search?q=. Fixed in server.py.
- Anna's Archive HTML parser (2026-04-14). New UI uses .js-aarecord-list-outer container with flex/border-b child divs. Updated _parse_annas_search().
- Jackett startup deadlock (2026-04-14). _start_jackett() now checks for already-running Jackett before starting a new process; accepts HTTP 302 in addition to 200 (Jackett returns 302 for the indexers endpoint).
- NordVPN killswitch leaked to host Pi (2026-04-17). RESOLVED. Root cause: network_mode: host + CAP_NET_ADMIN caused NordVPN's iptables killswitch to apply to the Pi's own network namespace, blocking Discord, GitHub, and all non-local Pi connectivity for ~12 hours. Fix: switched to bridge networking, so NordVPN's killswitch now operates inside the container's own namespace and physically cannot affect the host. The test_isolation.py suite is a regression guard.
- Docker bridge + killswitch blocked host-to-container API (2026-04-26). RESOLVED. When NordVPN connects inside the container with killswitch enabled, Docker bridge traffic (from host 172.19.0.1) was dropped. Fix: run.sh now auto-whitelists the Docker bridge subnet and published ports (9876, 9118, 6081) via nordvpn whitelist add subnet 172.16.0.0/12, nordvpn whitelist add port 9876, nordvpn whitelist add port 9118, nordvpn whitelist add port 6081. Container must restart to apply. The host can now reach the FastAPI, Jackett, and noVNC endpoints while NordVPN is active.
- Playwright runtime installation failure (2026-04-26). RESOLVED. playwright install chromium was failing inside the container due to missing shared libraries. Fix: Dockerfile now installs libnss3, xvfb, and other Chromium system deps at image build time. Chromium is baked into the image at /root/.cache/ms-playwright/.
- Anna's Archive downloads CAPTCHA, old host-CDP approach (2026-04-16). OBSOLETE. Originally used host CDP on port 9222 with xpra. Replaced by container-local Playwright (see 2026-04-26). Host CDP dependency removed.
- Old VNC/noVNC approach (2026-04-27). OBSOLETE. The old manual x11vnc+websockify hack (ports 5998/5999, host display :99) was a desperate workaround that never worked. Now superseded by the clean Dockerfile-baked stack below.
- xpra 3.1 HTML5 display stack (2026-04-30). OBSOLETE. Entire xpra approach replaced with x11vnc+websockify+noVNC. See Red Herring Graveyard below for why.
Known Issues — CURRENT
- Anna's Archive title extraction: The parser extracts MD5 hashes correctly but titles show as "Unknown" — the title lives in a complex nested DOM structure that needs further parsing work.
- Jackett seeder counts via Torznab: Consistently report 0 seeders even when torrents are alive. TPB's Torznab adapter doesn't report accurate seeder data.
SOP for Anna's Archive downloads (updated 2026-04-30):
- Automation first: browser_download() navigates book page, identifies and clicks the best download candidate
- If DDoS-Guard JS challenge: wait up to 30s for auto-redirect
- If visual puzzle or hCAPTCHA: container returns screenshot_b64 + display_url
- Human-in-the-loop URL: https://araminta.taild3f7b9.ts.net/pirate/vnc_lite.html?path=pirate%2F
- After challenge resolves → countdown page → _handle_countdown_and_extract() polls up to 180s
- Token URL pattern: https://wbsg8v.xyz/d3/y/{unix_ts}/3000/g4/{category}/...
- File curl'd to /downloads with proper cookies and headers
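The bounded waits in this SOP (30s for a redirect, 180s for the countdown) follow the usual poll-with-deadline pattern. The sketch below is a generic helper, not the body of _handle_countdown_and_extract(), which this document does not show; the 2-second interval and the injectable clock and sleep arguments are assumptions added so the helper can be tested without real delays.

```python
import time
from typing import Callable, Optional


def poll_until(check: Callable[[], Optional[str]],
               timeout_s: float = 180.0,
               interval_s: float = 2.0,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Call check() until it returns a value or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        result = check()
        if result is not None:
            return result
        if clock() >= deadline:
            return None  # caller decides what to do on timeout
        sleep(interval_s)


if __name__ == "__main__":
    # A check that succeeds immediately, so the demo returns at once.
    print(poll_until(lambda: "ready", timeout_s=1.0))
```

Returning None on timeout rather than raising lets the caller fall through to the human-in-the-loop step.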
🔴 Red Herring Graveyard
These approaches were explored and FAILED. Do NOT attempt again.
XPRA (all approaches) — DO NOT RETRY
- What was tried: xpra 3.1 (Ubuntu 22.04 apt package) in both shadow and start modes. xpra pip upgrade attempted and failed (needs full Cython build chain).
- How it failed: jQuery was a symlink (→ /usr/share/javascript/jquery/jquery.js) which xpra's built-in HTTP server doesn't follow, so it returned 404. Even after resolving the symlink by installing libjs-jquery and copying the real file inline, xpra's application-layer WebSocket handshake threw "server error error accepting new connection" on every HTML5 client attempt. The raw WebSocket upgrade (101) worked at the TCP level, but xpra's own protocol handshake after upgrade was broken. Both shadow and start modes failed identically.
- Why we thought it would work: Previous sessions had used xpra's HTML5 client successfully with CDP-based flows. The xpra documentation claims HTML5 support.
- Signs it was a dead end: Same error across multiple restarts, both display modes, even after jQuery fix. No amount of configuration flags changed the outcome.
- What actually works: x11vnc + websockify + noVNC (see below).
Manual x11vnc + websockify (April 27 hack) — DO NOT RETRY
- What was tried: Manually launching x11vnc and websockify from inside the container without Dockerfile integration, using port 5901 and noVNC's vnc.html.
- How it failed: vnc.html doesn't handle path-based WebSocket routing (needs vnc_lite.html). Port 5901 wasn't whitelisted in NordVPN killswitch. No persistence on rebuild.
- What actually works: vnc_lite.html?path=pirate%2F with everything Dockerfile-baked.
Host CDP / browser on host — DO NOT RETRY
- What was tried: Running Chromium on the host Pi with CDP on port 9222, routing through container VPN.
- How it failed: Traffic leaked outside VPN. Host ISP DNS filtering blocks Anna's Archive. Violates the "all naughty traffic inside container" principle.
Architecture Principles (never violate)
Lessons learned (2026-04-30)
Display stack: The canonical browser display is Xvfb :1 → x11vnc :5900 → websockify :6081 → noVNC. xpra 3.1 is broken for HTML5 WebSocket — see Red Herring Graveyard. The path=pirate%2F URL parameter is MANDATORY for Funnel routing.
Self-contained images: Docker images must contain real files, not symlinks to files outside their document root. xpra's jquery.js symlink was the original red herring that wasted hours.
Do not put LLM/vision providers or API keys inside container tools. The container cannot import agent capabilities, and it should not call OpenRouter/OpenAI/Gemini/etc. directly. Correct boundary: browser_fallback.py returns screenshots and the live display URL; Minty/the calling agent decides whether to use its own vision capability or send David the noVNC link.
- All VPN traffic originates from INSIDE the container. Never install/run NordVPN on the host Pi.
- Container is disposable: docker compose down && docker compose up should restore everything.
- Host-to-container ports: API and Jackett stay localhost-only (127.0.0.1:9876, 127.0.0.1:9118). noVNC/websockify display is published as host port 6081 because Tailscale Funnel proxies it to the Browser URL.
- Auto-whitelist Docker bridge subnet in NordVPN on startup so killswitch doesn't block host access.
- If you need to interact with the browser from within the VPN tunnel, use the noVNC URL (https://araminta.taild3f7b9.ts.net/pirate/vnc_lite.html?path=pirate%2F). Never run a browser on the host and route through the container.
Notes
- Downloads land in /downloads inside the container, mapped to ./downloads on the host (bind mount)
- Jackett state persisted in pirate-dock-data Docker volume at /data/jackett/
- Build scripts include disk space guardrails (refuses to build above 85% usage)
- .dockerignore prevents build context bloat (no .git, downloads, docs inside image)
- Image is ~1.5 GB with Playwright + Chromium baked in (was ~400 MB before browser fallback)
- VPN kill switch blocks non-VPN traffic INSIDE the container. This is correct and desired. Host networking is unaffected.
- Token is 64 chars, stored in .env and scripts/token.txt. Never commit token.txt to git
- Browser stack runs INSIDE the container via Playwright; no host CDP or noVNC server required
- Network mode is bridge (NOT host): ports 9876 and 9118 published to 127.0.0.1 only, port 6081 published to 0.0.0.0 for Tailscale Funnel
- DO NOT change to network_mode: host. This would re-introduce the 2026-04-17 incident where NordVPN's killswitch broke all Pi connectivity
- Playwright requirements: playwright>=1.50.0 in requirements.txt; Dockerfile installs Chromium libs + runs playwright install chromium
- noVNC display URL: https://araminta.taild3f7b9.ts.net/pirate/vnc_lite.html?path=pirate%2F (the path parameter is mandatory)
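The 85% disk guardrail mentioned in the notes above could look roughly like the following. This is a sketch under assumptions: the real check lives in the build scripts, which are shell and are not reproduced in this document, and the function names here are hypothetical. Note that shutil.disk_usage may report a slightly different percentage than df because of reserved blocks.

```python
import shutil

MAX_USAGE_PERCENT = 85.0  # threshold from the notes above


def exceeds_limit(total: int, used: int, limit: float = MAX_USAGE_PERCENT) -> bool:
    """True if used/total, as a percentage, is above the limit."""
    return 100.0 * used / total > limit


def build_allowed(path: str = ".", limit: float = MAX_USAGE_PERCENT) -> bool:
    """Check the filesystem holding path and refuse when usage is above the limit."""
    usage = shutil.disk_usage(path)
    return not exceeds_limit(usage.total, usage.used, limit)


if __name__ == "__main__":
    print("build allowed" if build_allowed() else "refusing to build: disk too full")
```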