page-gen

Generate a project webpage for an academic paper from its PDF using parse_paper, match_template, render_html, review_html_visual, and extract_table_html.

ZhihaoAIRobotic 155 12 Updated 5mo ago

Install

npx skillscat add zhihaoairobotic/clawphd/page-gen

Install via the SkillsCat registry.

SKILL.md

Academic Page Generation

Turn a paper PDF into a polished HTML project page with iterative visual refinement.

Available Tools

| Tool | Purpose |
| --- | --- |
| `parse_paper` | Parse PDF into markdown (pymupdf4llm) and extract complete figures with captions as screenshots |
| `match_template` | Rank page templates by style preferences (reads tags.json) |
| `render_html` | Render local HTML into a PNG screenshot (Playwright) |
| `review_html_visual` | Vision-based review of the rendered screenshot |
| `extract_table_html` | Convert a table image to a semantic HTML table |

Workflow (follow in order)

Step 1 - Parse

Call parse_paper with the path to the user's PDF.

The tool returns markdown_path and a figures array.
Each figure has: num (Figure number), caption (full caption text from the paper), and path (high-res screenshot of the complete figure region).
Read the markdown file to understand the paper content.
The figures are crops of the rendered page, so they match what you see in the PDF; they are not the raw embedded bitmaps.
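
As a rough illustration of the return shape described above, the sketch below models the result as plain Python dicts. The field names `markdown_path`, `figures`, `num`, `caption`, and `path` come from this step. The `summarize_figures` helper, the integer type of `num`, and the sample values are assumptions made for illustration, and the real tool output may carry additional fields.

```python
from typing import TypedDict


class Figure(TypedDict):
    num: int       # figure number in the paper (type assumed)
    caption: str   # full caption text from the paper
    path: str      # high-res screenshot of the complete figure region


class ParseResult(TypedDict):
    markdown_path: str
    figures: list[Figure]


def summarize_figures(result: ParseResult, width: int = 60) -> list[str]:
    """Return one short line per figure, ordered by figure number."""
    lines = []
    for fig in sorted(result["figures"], key=lambda f: f["num"]):
        caption = fig["caption"].strip()
        if len(caption) > width:
            caption = caption[: width - 3] + "..."
        lines.append(f"Figure {fig['num']}: {caption} ({fig['path']})")
    return lines


# Hypothetical sample, not real tool output.
sample: ParseResult = {
    "markdown_path": "out/paper.md",
    "figures": [
        {"num": 2, "caption": "Results.", "path": "out/figures/fig2.png"},
        {"num": 1, "caption": "Overview of the method.", "path": "out/figures/fig1.png"},
    ],
}
```

A summary like this can make it easier to decide in Step 2 which figures are worth keeping.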

Step 2 - Plan content

From parsed markdown:

Remove the references section.
Keep only the figures and tables that help explain the method or results.
Produce a section plan:
- title
- authors
- affiliation
- dynamic sections based on paper content
Generate concise, web-first content for each section.
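
Removing the references section can be sketched as below. This is a minimal example, assuming the parsed markdown marks the section with a plain heading such as `## References` or `## 7. References`. Headings wrapped in bold markers or other formatting would need an adjusted pattern. The `strip_references` name is hypothetical.

```python
import re

# Assumption: the references section starts at a markdown heading such as
# "## References" or "# Bibliography" and runs until the next heading of the
# same or higher level, or to the end of the file.
_REF_HEADING = re.compile(
    r"^(#{1,6})\s*(?:\d+\.?\s*)?(references|bibliography)\s*$",
    re.IGNORECASE,
)


def strip_references(markdown: str) -> str:
    """Drop the references section from parsed markdown, keeping the rest."""
    kept = []
    skip_level = None  # heading level of the section being skipped
    for line in markdown.splitlines():
        heading = re.match(r"^(#{1,6})\s", line)
        if skip_level is not None and heading and len(heading.group(1)) <= skip_level:
            skip_level = None  # a sibling or parent heading ends the skip
        if skip_level is None:
            ref = _REF_HEADING.match(line.strip())
            if ref:
                skip_level = len(ref.group(1))
                continue
            kept.append(line)
    return "\n".join(kept)
```

Sections that follow the references, such as an appendix, are kept because skipping stops at the next heading of the same or a higher level.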

Step 3 - Select template

Call match_template with the user's style preferences (or defaults).

The tool returns ranked template candidates with paths. You MUST use these templates.

Step 4 - Read template and generate HTML

CRITICAL: Do NOT write HTML/CSS from scratch. You MUST base your page on the selected template.

Read the top-ranked template's index.html using read_file to understand its full structure.
Read the template's CSS file(s) (usually in assets/ subfolder) to understand its styling.
Copy the template directory structure into the output folder:
- Copy CSS/JS/font files from the template to the output folder.
- Copy figure images from figures_dir (returned by parse_paper) into the output folder.
Generate a new index.html that:
- Reuses the template's HTML structure (header, nav, sections, footer layout).
- References the same CSS file(s) — do NOT inline CSS.
- Fills in the paper content (title, authors, abstract, method, results, figures).
- Uses, for each figure, the caption from the parse_paper output and sets <img src> to the copied figure file.
- Keeps all image and media paths relative to the output folder.
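
The copy and figure steps above can be sketched as follows. This is an illustration under assumptions: the helper names, the `figures` subfolder, and the `<figure>` markup are hypothetical, and real figure markup should follow whatever structure the selected template uses. The sketch copies the whole template tree for simplicity, so the generated index.html would then overwrite the copied one.

```python
import html
import shutil
from pathlib import Path


def prepare_output(template_dir: Path, figure_paths: list[Path], out_dir: Path) -> list[str]:
    """Copy template assets and figures into out_dir; return relative image paths."""
    # Copy the template tree into the output folder; the template is only read.
    shutil.copytree(template_dir, out_dir, dirs_exist_ok=True)
    fig_dir = out_dir / "figures"  # hypothetical subfolder name
    fig_dir.mkdir(parents=True, exist_ok=True)
    rel_paths = []
    for src in figure_paths:
        dst = fig_dir / src.name
        shutil.copy2(src, dst)
        # Paths are expressed relative to out_dir so the page stays portable.
        rel_paths.append(dst.relative_to(out_dir).as_posix())
    return rel_paths


def figure_html(rel_path: str, caption: str) -> str:
    """Build a <figure> block with an escaped caption and a relative src."""
    safe_caption = html.escape(caption)
    safe_src = html.escape(rel_path, quote=True)
    return (
        "<figure>\n"
        f'  <img src="{safe_src}" alt="{safe_caption}">\n'
        f"  <figcaption>{safe_caption}</figcaption>\n"
        "</figure>"
    )
```

Escaping the caption matters because captions taken from a paper can contain characters such as `<` or `&` that would otherwise break the page.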

Step 5 - Visual review loop (max 2 rounds)

Call render_html on the generated page.
Call review_html_visual on the resulting screenshot.
Apply the targeted revisions suggested by the review.
Repeat only if issues remain, for at most 2 rounds in total.
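
The bounded loop above can be sketched with the tools passed in as callables. This is a hedged illustration: `render`, `review`, and `revise` are stand-ins for render_html, review_html_visual, and the agent's own editing step, and their signatures here are assumptions rather than the tools' real interfaces.

```python
from typing import Callable

MAX_ROUNDS = 2  # the workflow caps visual review at two rounds


def review_loop(
    html_path: str,
    render: Callable[[str], str],              # stand-in: HTML path -> screenshot path
    review: Callable[[str], list[str]],        # stand-in: screenshot path -> list of issues
    revise: Callable[[str, list[str]], None],  # stand-in: apply targeted edits to the HTML
) -> int:
    """Run render -> review -> revise until clean or MAX_ROUNDS; return rounds used."""
    rounds = 0
    while rounds < MAX_ROUNDS:
        rounds += 1
        screenshot = render(html_path)
        issues = review(screenshot)
        if not issues:
            break  # nothing left to fix, stop early
        revise(html_path, issues)
    return rounds
```

Capping the loop keeps the review from oscillating between equally acceptable layouts.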

Step 6 - Table replacement

If table images are present:

Call extract_table_html once for each table image.
Replace the corresponding <img ...> with the generated <table>...</table>.
Re-render once to confirm final visual quality.
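
The replacement step can be sketched as a targeted substitution. The `replace_table_image` helper below is hypothetical and assumes the `<img>` tag can be identified by an exact match on its `src` value. A full HTML parser would be more robust for unusual markup.

```python
import re


def replace_table_image(page_html: str, image_src: str, table_html: str) -> str:
    """Swap the <img> whose src equals image_src for table_html."""
    pattern = re.compile(
        r"<img\b[^>]*\bsrc\s*=\s*([\"'])" + re.escape(image_src) + r"\1[^>]*>",
        re.IGNORECASE,
    )
    # Use a function replacement so backslashes in table_html are kept literally.
    new_html, count = pattern.subn(lambda _m: table_html, page_html)
    if count == 0:
        raise ValueError(f"no <img> with src={image_src!r} found")
    return new_html
```

Raising an error when nothing matches makes a silently skipped table visible before the final re-render.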

Step 7 - Human feedback

Ask the user for final adjustments. If feedback is provided:

Apply requested edits.
Re-render and confirm.

Important Rules

NEVER generate HTML/CSS from scratch. Always base the page on a real template.
Keep all outputs under workspace paths.
Never destroy existing template files; write to a new project output folder.
Use simple, robust HTML/CSS changes over risky rewrites.
Prefer readability and information hierarchy over flashy layout.
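
One way to enforce the rule about keeping outputs under workspace paths is a small path check before each write. This is a sketch only: `resolve_in_workspace` is a hypothetical helper, and how the workspace root is determined is left as an assumption.

```python
from pathlib import Path


def resolve_in_workspace(workspace: Path, candidate: str) -> Path:
    """Resolve candidate against workspace and reject paths that escape it."""
    root = workspace.resolve()
    target = (root / candidate).resolve()
    # Accept the root itself or anything nested below it.
    if target != root and root not in target.parents:
        raise ValueError(f"{candidate!r} resolves outside the workspace")
    return target
```

Resolving before comparing means that `..` segments and absolute paths are both caught.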