Generate an academic paper project webpage from PDF using parse_paper, match_template, render_html, review_html_visual, and extract_table_html.
Install
`npx skillscat add zhihaoairobotic/clawphd/page-gen`
Install via the SkillsCat registry.
SKILL.md
Academic Page Generation
Turn a paper PDF into a polished HTML project page with iterative visual refinement.
Available Tools
| Tool | Purpose |
|---|---|
| `parse_paper` | Parse PDF into markdown (pymupdf4llm) + extract complete figures with captions as screenshots |
| `match_template` | Rank page templates by style preferences (reads tags.json) |
| `render_html` | Render local HTML into PNG screenshot (Playwright) |
| `review_html_visual` | Vision-based review of rendered screenshot |
| `extract_table_html` | Convert table image to semantic HTML table |
Workflow (follow in order)
Step 1 - Parse
Call parse_paper with the user PDF path.
- The tool returns `markdown_path` and a `figures` array.
- Each figure has: `num` (figure number), `caption` (full caption text from the paper), and `path` (high-res screenshot of the complete figure region).
- Read the markdown file to understand the paper content.
- The figures are page-rendered crops that exactly match what you see in the PDF, not raw embedded bitmaps.
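As a minimal sketch, the return value described above could be validated before it is used. The JSON shape is an assumption inferred from the field names in this step (`markdown_path`, `figures`, `num`, `caption`, `path`); `Figure` and `parse_result` are hypothetical helper names, and the real tool output may differ.

```python
import json
from dataclasses import dataclass


@dataclass
class Figure:
    num: int        # figure number as printed in the paper
    caption: str    # full caption text
    path: str       # screenshot of the complete figure region


def parse_result(raw: str) -> tuple[str, list[Figure]]:
    """Validate an assumed parse_paper JSON payload; return (markdown_path, figures)."""
    data = json.loads(raw)
    # A missing key raises KeyError here, which surfaces a malformed payload early.
    figures = [Figure(int(f["num"]), f["caption"], f["path"]) for f in data["figures"]]
    # Sort by figure number so the page can present figures in paper order.
    return data["markdown_path"], sorted(figures, key=lambda f: f.num)
```

Failing fast on a missing field is a deliberate choice in this sketch: it is easier to debug than a page that silently drops a figure.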
Step 2 - Plan content
From parsed markdown:
- Remove references section.
- Keep only figures/tables that support understanding of method/results.
- Produce section plan:
  - `title`
  - `authors`
  - `affiliation`
  - dynamic sections based on paper content
- Generate concise, web-first content for each section.
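Removing the references section can be sketched as a small markdown filter. This assumes the parsed markdown marks the section with a `#`-style heading titled "References" or "Bibliography"; the parser may emit a different heading style, so treat the patterns and the helper name `strip_references` as illustrative only.

```python
import re

# Headings that commonly open a bibliography; this list is an assumption.
_REF_HEADING = re.compile(r"^(#{1,6})\s*(references|bibliography)\s*$", re.IGNORECASE)
_ANY_HEADING = re.compile(r"^(#{1,6})\s")


def strip_references(markdown: str) -> str:
    """Drop the references section, from its heading to the next heading of the same or higher level."""
    kept, skip_level = [], None
    for line in markdown.splitlines():
        heading = _ANY_HEADING.match(line)
        if skip_level is not None and heading and len(heading.group(1)) <= skip_level:
            skip_level = None  # a sibling or parent heading ends the references section
        ref = _REF_HEADING.match(line)
        if ref:
            skip_level = len(ref.group(1))
        if skip_level is None:
            kept.append(line)
    return "\n".join(kept)
```

Stopping at the next sibling heading, rather than cutting to the end of the file, keeps any appendix that follows the bibliography.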
Step 3 - Select template
Call match_template with the user's style preferences (or defaults).
The tool returns ranked template candidates with paths. You MUST use these templates.
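How `match_template` ranks candidates internally is not documented here. One plausible approach, shown purely as an illustration, is to score each template by how many preferred style tags it carries. The `tags.json` schema (template path mapped to a list of tags) and the name `rank_templates` are assumptions.

```python
import json


def rank_templates(tags_json: str, preferences: set[str]) -> list[tuple[str, int]]:
    """Rank templates by how many preferred style tags they carry (highest first)."""
    # Assumed schema: {"<template path>": ["tag", ...], ...}
    tags = json.loads(tags_json)
    scored = [(path, len(preferences & set(t))) for path, t in tags.items()]
    # Sort by score descending, then by path for a stable, reproducible order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))
```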
Step 4 - Read template and generate HTML
CRITICAL: Do NOT write HTML/CSS from scratch. You MUST base your page on the selected template.
- Read the top-ranked template's `index.html` using `read_file` to understand its full structure.
- Read the template's CSS file(s) (usually in the `assets/` subfolder) to understand its styling.
- Copy the template directory structure into the output folder:
  - Copy CSS/JS/font files from the template to the output folder.
  - Copy figure images from `figures_dir` (returned by parse_paper) into the output folder.
- Generate a new `index.html` that:
  - Reuses the template's HTML structure (header, nav, sections, footer layout).
  - References the same CSS file(s); do NOT inline CSS.
  - Fills in the paper content (title, authors, abstract, method, results, figures).
  - For each figure, use the `caption` from parse_paper output and set `<img src>` to the figure file.
  - Updates image `src` paths to use relative paths.
  - Keeps all media paths relative.
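The copy-then-fill step might look roughly like the sketch below. The `figures/` subfolder and both helper names are hypothetical. The `<figure>` markup is a placeholder: the real page should reuse whatever figure markup the selected template already contains.

```python
import html
import shutil
from pathlib import Path


def prepare_output(template_dir: Path, figure_paths: list[Path], out_dir: Path) -> list[str]:
    """Copy the template into a new output folder and add figures; return relative img paths."""
    # copytree raises FileExistsError if out_dir exists, so nothing is overwritten.
    shutil.copytree(template_dir, out_dir)
    fig_dir = out_dir / "figures"  # hypothetical subfolder name
    fig_dir.mkdir(exist_ok=True)
    relative = []
    for src in figure_paths:
        shutil.copy2(src, fig_dir / src.name)
        relative.append(f"figures/{src.name}")  # relative to index.html
    return relative


def figure_html(rel_src: str, caption: str) -> str:
    """Build a <figure> block with a relative src and an escaped caption."""
    safe = html.escape(caption)
    return f'<figure><img src="{html.escape(rel_src)}" alt="{safe}"><figcaption>{safe}</figcaption></figure>'
```

Escaping the caption matters because captions extracted from a paper often contain `<`, `>`, or `&` (for example in inequalities), which would otherwise break the markup.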
Step 5 - Visual review loop (max 2 rounds)
- Call `render_html` on generated page.
- Call `review_html_visual` on screenshot.
- Apply targeted revisions from review.
- Repeat if necessary, max 2 rounds.
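The bounded loop above can be sketched as follows. The three callables are stand-ins for `render_html`, `review_html_visual`, and the revision step. Their signatures are assumptions made for illustration, not the tools' actual interfaces.

```python
from typing import Callable


def review_loop(
    render: Callable[[str], str],
    review: Callable[[str], list[str]],
    revise: Callable[[str, list[str]], None],
    html_path: str,
    max_rounds: int = 2,
) -> int:
    """Render, review, and revise until the review is clean or max_rounds is reached."""
    rounds = 0
    while rounds < max_rounds:
        screenshot = render(html_path)   # stands in for render_html
        issues = review(screenshot)      # stands in for review_html_visual
        if not issues:
            break                        # nothing to fix, stop early
        revise(html_path, issues)        # apply targeted revisions only
        rounds += 1
    return rounds                        # number of revision rounds applied
```

The hard cap keeps the loop from oscillating between two layouts when the reviewer keeps finding minor issues.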
Step 6 - Table replacement
If table images are present:
- Call `extract_table_html` per table image.
- Replace corresponding `<img ...>` with generated `<table>...</table>`.
- Re-render once to confirm final visual quality.
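A minimal sketch of the replacement step, assuming each table image can be identified by its exact `src` value. The helper name `replace_table_image` is hypothetical. Regex-based HTML editing is fragile, so a real HTML parser would be the more robust choice for complex pages.

```python
import re


def replace_table_image(page_html: str, image_src: str, table_html: str) -> str:
    """Swap the first <img> whose src equals image_src for the given <table> markup."""
    pattern = re.compile(
        r"<img\b[^>]*\bsrc=[\"']" + re.escape(image_src) + r"[\"'][^>]*>",
        re.IGNORECASE,
    )
    # A function replacement keeps backslashes in table_html from being read as escapes.
    return pattern.sub(lambda _m: table_html, page_html, count=1)
```

If no matching `<img>` is found, the page is returned unchanged, which makes a missed replacement easy to detect by comparing input and output.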
Step 7 - Human feedback
Ask user for final adjustments. If feedback is provided:
- Apply requested edits.
- Re-render and confirm.
Important Rules
- NEVER generate HTML/CSS from scratch. Always base the page on a real template.
- Keep all outputs under workspace paths.
- Never destroy existing template files; write to a new project output folder.
- Use simple, robust HTML/CSS changes over risky rewrites.
- Prefer readability and information hierarchy over flashy layout.
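The rule about keeping outputs under workspace paths could be enforced with a small guard before any write. This is only a sketch: the name `is_under_workspace` is hypothetical, and how the workspace root is discovered is left open.

```python
from pathlib import Path


def is_under_workspace(candidate: str, workspace: str) -> bool:
    """Return True if candidate resolves to a location inside workspace."""
    root = Path(workspace).resolve()
    # Relative paths are taken from the workspace root; absolute paths stay as given.
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

Resolving both paths first means `..` segments and symlinks cannot be used to step outside the workspace.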