Academic paper workflow — find, read/annotate (HTML), daily digest, cite. Use when user invokes /paper, needs paper annotation, literature search, citation generation, or daily paper discovery.
Install
npx skillscat add diskixix0-collab/paperwise
Install via the SkillsCat registry.
/paper — Academic Paper Workflow
Default Config
Edit this section to customize for your project.
output_dir: "./papers"
# Obsidian vault users: set paper_output_dir in your project's CLAUDE.md
# Drive/OneDrive: use the appropriate MCP, then set path (e.g. "gdrive://My Drive/Papers")
venues: [CHI, ACL, EMNLP, NAACL, NeurIPS, ICML, ICLR, EDM, LAK, AIED, ITS, CSCW, SIGIR, CIKM]
keywords:
- cognitive load
- adaptive learning
- intelligent tutoring
- conversational learning
- LLM tutoring
- learning analytics
- behavioral signals
- educational dialogue
min_citations: 5 # /paper find: filter out papers with fewer citations
daily_max: 10 # /paper digest: max papers per run
annotation_lang: zh # zh = Chinese annotations | en = English; override with --lang
openalex_email: "" # Optional. Add email to join OpenAlex Polite Pool (higher rate limits).
semantic_scholar_api_key: "" # Optional. Free key: semanticscholar.org/product/api
Override any config value in your project's CLAUDE.md using keys: paper_output_dir, paper_venues, paper_keywords, paper_annotation_lang.
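The override rule can be sketched as a small merge step. This is a minimal illustration, assuming the CLAUDE.md values have already been parsed into a dict. The names `apply_overrides`, `DEFAULTS`, and `OVERRIDE_KEYS` are hypothetical, and only the four `paper_*` keys listed above are mapped.

```python
# Hypothetical sketch: merge CLAUDE.md overrides into the default config.
# Only the four override keys named in this document are recognized.
DEFAULTS = {
    "output_dir": "./papers",
    "venues": ["CHI", "ACL", "EMNLP"],  # shortened for the example
    "keywords": ["cognitive load", "adaptive learning"],
    "annotation_lang": "zh",
    "min_citations": 5,
    "daily_max": 10,
}

# CLAUDE.md key -> config key it replaces
OVERRIDE_KEYS = {
    "paper_output_dir": "output_dir",
    "paper_venues": "venues",
    "paper_keywords": "keywords",
    "paper_annotation_lang": "annotation_lang",
}


def apply_overrides(defaults: dict, claude_md: dict) -> dict:
    """Return a new config where recognized CLAUDE.md keys replace defaults."""
    config = dict(defaults)  # copy so the defaults stay untouched
    for source_key, target_key in OVERRIDE_KEYS.items():
        if source_key in claude_md:
            config[target_key] = claude_md[source_key]
    return config
```

Unrecognized keys in CLAUDE.md are ignored in this sketch, which keeps unrelated project settings out of the paper config.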
Subcommands
| Command | Usage | Description |
|---|---|---|
| find | /paper find "cognitive load LLM" | Search papers → Digest HTML with top N cards |
| read | /paper read /path/to/file.pdf | Annotate a single PDF → full dual-column HTML |
| digest | /paper digest | Daily new papers from arxiv (used by cron) |
| cite | /paper cite [[note-name]] | Generate APA + BibTeX citation from existing annotation |
Flags:
- --context "ML课第3周": override project context
- --questions "Q1:... Q2:...": switch to Question mode (Mode A)
- --questions: interactive question mode (Claude asks you)
- --output /path/: override output directory for this run
- --top N: return N results (default: 5 for find, daily_max for digest); also accepts a bare number after the query
- --lang en: override annotation language for this run
- --source arxiv: search only arxiv (latest preprints, no citation filter)
- --source venues: search only papers from the config venues list (citation-weighted ranking)
- --local /path/: read local PDFs from folder (no API calls); or --local a.pdf b.pdf for specific files

Flags can be written with or without --. Claude accepts natural language equivalents (e.g. top 5, source venues, local /path/, questions "...").
Context Resolution (in priority order)
- --context flag in command
- Current directory's CLAUDE.md (auto-loaded by Claude Code)
- No context → use Mode B directly (zero-config, logic analysis mode)
Data Sources
Primary: OpenAlex
Free, no API key required, 10 req/s rate limit.
https://api.openalex.org/works?search={query}
&select=title,authorships,publication_year,cited_by_count,primary_location,doi
&per-page=20
[&mailto={openalex_email} ← add if configured]
After fetching: immediately extract and keep only title, first author, year, cited_by_count, venue name (primary_location.source.display_name), and DOI. Discard all other fields from the response before further processing.
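The request and the trimming step can be sketched as below. This is an illustration only: the function names `build_openalex_url` and `trim_work` are hypothetical, the output key names are chosen for the example, and the network call itself is left out.

```python
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"
SELECT_FIELDS = "title,authorships,publication_year,cited_by_count,primary_location,doi"


def build_openalex_url(query: str, email: str = "") -> str:
    """Build the works search URL; mailto is added only when configured."""
    params = {"search": query, "select": SELECT_FIELDS, "per-page": 20}
    if email:
        params["mailto"] = email
    # safe="," keeps the select list readable instead of %2C-encoded
    return f"{OPENALEX_WORKS}?{urlencode(params, safe=',')}"


def trim_work(work: dict) -> dict:
    """Keep only the six fields the workflow needs from one result."""
    authorships = work.get("authorships") or []
    first = (authorships[0].get("author") or {}) if authorships else {}
    source = (work.get("primary_location") or {}).get("source") or {}
    return {
        "title": work.get("title"),
        "first_author": first.get("display_name"),
        "year": work.get("publication_year"),
        "cited_by_count": work.get("cited_by_count"),
        "venue": source.get("display_name"),
        "doi": work.get("doi"),
    }
```

Missing authorships or a null primary_location yield `None` values rather than an exception, so one incomplete record does not abort the run.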
Secondary: Semantic Scholar (only if semantic_scholar_api_key is set)
https://api.semanticscholar.org/graph/v1/paper/search?query={topic}
&fields=title,authors,year,citationCount,venue,externalIds&limit=20
Header: x-api-key: {semantic_scholar_api_key}
Tertiary: arxiv (fallback + digest source)
https://export.arxiv.org/api/query?search_query=(cat:cs.CL+OR+cat:cs.HC+OR+cat:cs.AI+OR+cat:cs.LG)
+AND+({keywords})&sortBy=submittedDate&sortOrder=descending&max_results=30
Note: arxiv papers have no citation count. Label as preprint.
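A possible way to fill the {keywords} placeholder is sketched below. The document does not say how keywords are combined, so the OR-joined, quoted all-fields phrases here are an assumption, and `build_arxiv_url` is a hypothetical helper name.

```python
from urllib.parse import quote_plus

ARXIV_API = "https://export.arxiv.org/api/query"
CATEGORIES = ["cs.CL", "cs.HC", "cs.AI", "cs.LG"]


def build_arxiv_url(keywords: list[str], max_results: int = 30) -> str:
    """Build the arxiv query URL from the fixed categories and config keywords."""
    cats = "+OR+".join(f"cat:{c}" for c in CATEGORIES)
    # Assumption: each keyword becomes a quoted all-fields phrase, OR-joined.
    terms = "+OR+".join(f"all:%22{quote_plus(kw)}%22" for kw in keywords)
    return (
        f"{ARXIV_API}?search_query=({cats})+AND+({terms})"
        f"&sortBy=submittedDate&sortOrder=descending&max_results={max_results}"
    )
```

For example, `build_arxiv_url(["cognitive load", "LLM tutoring"])` puts both phrases inside the second parenthesized group of the query.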
Rate Limit Handling
If a source returns 429: check Retry-After header → wait that many seconds. If no header: wait 2s → 5s → 10s (3 retries). After 3 failures → skip source, move to next in priority. Note which source was skipped in output.
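The wait schedule can be expressed as a small pure function. This is a sketch under stated assumptions: `next_wait` is a hypothetical name, the three-retry cap is applied whether or not a Retry-After header is present, and a header that is not a plain number of seconds (for example an HTTP date) falls back to the fixed schedule.

```python
from typing import Optional

BACKOFF_SECONDS = [2, 5, 10]  # waits used when no usable Retry-After header is present


def next_wait(attempt: int, retry_after: Optional[str] = None) -> Optional[float]:
    """Return seconds to wait before retry number `attempt` (0-based),
    or None when the source should be skipped."""
    if attempt >= len(BACKOFF_SECONDS):
        return None  # three retries used up: skip this source
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a number of seconds: fall through to the schedule
    return float(BACKOFF_SECONDS[attempt])
```

A caller would sleep for the returned value, retry the request, and move to the next source (noting the skip in the output) once `None` comes back.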
Subcommand: find
1. Parse flags: topic query, --top N (default: 5), --lang, --source (default: mixed), --local (path or file list)
   - --top N also accepts a bare number immediately after the query string (e.g. /paper find "query" 10 → top 10)
2. Resolve source strategy:

| --source / flag | API calls | Ranking formula |
|---|---|---|
| (default, mixed) | Single OpenAlex call sort=cited_by_count:desc; arxiv fallback if <3 results | 0.5 × relevance_rank + 0.3 × log(cited_by_count+1) + 0.2 × recency_score |
| arxiv | arxiv API only (sortBy=submittedDate); no min_citations filter | 0.5 × relevance_rank + 0.5 × recency_score |
| venues | Single OpenAlex call; keep only papers where venue name matches any entry in config venues list (case-insensitive partial match on primary_location.source.display_name) | 0.5 × relevance_rank + 0.5 × log(cited_by_count+1) |
| --local /path/ or --local a.pdf b.pdf | No API calls; read local PDFs only | Take first --top N files (alphabetical order) |

--local and --source are mutually exclusive. If both are given, return an error.

3a. If --local flag present: execute local PDF steps instead of steps 3b–8:
   - Scan the given folder for all .pdf files, or use the explicitly listed files directly
   - If file count > --top N, take first N (alphabetical order)
   - For each PDF, extract: title, authors, year, abstract (first 300 words), conclusion paragraph (last 300 words)
   - Generate a Digest Card per PDF (same 4-sentence format: 研究问题 / 核心方法 / 关键发现 / 相关性, i.e. research question / core method / key finding / relevance)
   - Citation badge: fixed as 📄 Local PDF (no OpenAlex lookup)
   - Card bottom link: /paper read /absolute/path/to/file.pdf 精读 (use absolute path)
   - Save to: {output_dir}/FindResults/find-local-YYYY-MM-DD-HHmm-{folder-slug}.html
   - Print summary and exit; skip steps 4–8 below
3b. Fetch papers (online mode):
   - Add &filter=cited_by_count:>{min_citations} for default/venues mode (skip for arxiv)
   - After fetch: immediately discard all fields except title, first author, year, cited_by_count, venue name, DOI
   - On any source failure → follow Rate Limit Handling
4. Deduplicate against existing files:
   - Check {output_dir}/FindResults/ and {output_dir}/ root for [FirstAuthor-YYYY-slug].html
   - Mark existing papers as [已有] (already saved), include in summary with their path, skip card generation
5. Rank top N using the formula for the active source strategy
6. Generate Digest HTML: a single self-contained HTML file containing N paper cards.

Digest Card format (one card per paper):
[Title] — [First Author] et al., [Year] — [Venue or arXiv] [Citation badge] [Source tag]
研究问题: [1 sentence: what problem does this paper solve?]
核心方法: [1 sentence: what approach do they use?]
关键发现: [1 sentence: what is the main result?]
相关性: [1 sentence: why this matters for your research, based on CLAUDE.md context]
[DOI link if available] · → /paper read <doi_or_path> 精读

   - Citation badge uses tier logic from Paper Quality Bar section
   - Annotation language for cards follows annotation_lang config (or --lang flag)
   - HTML structure: simple cards layout, embed references/template.css in <style>
7. Save Digest HTML to: {output_dir}/FindResults/find-YYYY-MM-DD-HHmm-{query-slug}.html
   - Create subdirectory automatically if it doesn't exist
8. Print summary:

Found N papers via {source}. Digest saved:
→ 30_Research/FindResults/find-2026-03-14-1430-cognitive-load-llm.html
Papers included:
- Jin-2025-llm-teachable-agent [⭐ 47 citations · CHI]
- Cai-2025-intrinsic-load [📄 preprint · arXiv]
- Klepsch-2017-two-types-icl [已有 → 30_Research/FindResults/...]
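The ranking formulas listed under the source strategies can be sketched as follows. The document does not define relevance_rank, recency_score, or the log base, so this example assumes both scores are already normalized to [0, 1] with higher meaning better, and uses the natural logarithm. The names `score` and `rank_top_n` and the dict keys are hypothetical.

```python
import math


def score(paper: dict, strategy: str = "mixed") -> float:
    """Weighted score for one paper under the given source strategy."""
    relevance = paper["relevance"]  # assumed in [0, 1], higher is better
    recency = paper["recency"]      # assumed in [0, 1], higher is newer
    # natural log assumed; missing counts are treated as 0
    citations = math.log((paper.get("cited_by_count") or 0) + 1)
    if strategy == "arxiv":
        return 0.5 * relevance + 0.5 * recency
    if strategy == "venues":
        return 0.5 * relevance + 0.5 * citations
    if strategy == "mixed":
        return 0.5 * relevance + 0.3 * citations + 0.2 * recency
    raise ValueError(f"unknown source strategy: {strategy}")


def rank_top_n(papers: list[dict], n: int = 5, strategy: str = "mixed") -> list[dict]:
    """Return the n highest-scoring papers, best first."""
    return sorted(papers, key=lambda p: score(p, strategy), reverse=True)[:n]
```

Because the citation term is unbounded while the other two are capped at 1, a heavily cited paper can outrank a more relevant one in mixed and venues mode; whether that is intended is not stated in the source.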
Subcommand: read
- Read the PDF at provided path
- Extract: title, authors, year, venue/journal, abstract, full text
- Resolve mode:
  - No --questions → Mode B (logic analysis)
  - --questions "..." → Mode A (inline). Parse questions using these rules:
    - Strip optional Q1: / Q2: etc. prefixes (they are decorative, not required)
    - Split on: comma , / semicolon ; / newline / Q\d+ pattern boundaries
    - Extract up to 6 questions; if more are given, take first 6 and warn the user
    - Accept any of these formats:
      "Q1: 方法? Q2: 与 baseline 比较?" ← Q-label + colon
      "方法?, 与 baseline 比较?" ← comma-separated
      "方法?; 与 baseline 比较?" ← semicolon-separated
  - --questions (no argument) → Mode A (interactive):
    Output exactly: 请输入你的阅读问题(每行一个,或逗号分隔,最多6个):
    Wait for user input, then parse using the same rules above, then proceed with Mode A
- Resolve annotation language: --lang flag > annotation_lang config
- Generate full dual-column HTML annotation (see HTML Template section)
- Paper Quality Bar: omit (no citation data from local PDF unless found in text)
- Save to {output_dir}/[FirstAuthorLastName-YYYY-keyword].html
  - Write directly to path; do not ls the directory beforehand
- Output: file path + brief summary of key arguments found
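The --questions parsing rules in the mode-resolution step can be sketched with a single regular expression split. This is one possible reading of the rules, not a reference implementation: `parse_questions` is a hypothetical name, only ASCII separators are handled, and a Q-label is recognized only when followed by a colon.

```python
import re

MAX_QUESTIONS = 6
# Split on comma, semicolon, newline, or just before a "Q<number>:" label.
_SPLIT = re.compile(r"[,;\n]|(?=\bQ\d+\s*:)")
# Optional decorative prefix such as "Q1:" at the start of a piece.
_PREFIX = re.compile(r"^Q\d+\s*:\s*")


def parse_questions(raw: str) -> tuple[list[str], bool]:
    """Return (questions, truncated); truncated is True when extras were dropped."""
    parts = (_PREFIX.sub("", p.strip()).strip() for p in _SPLIT.split(raw))
    questions = [p for p in parts if p]  # drop empty pieces
    return questions[:MAX_QUESTIONS], len(questions) > MAX_QUESTIONS
```

The second return value lets the caller print the "more than 6 questions" warning without re-counting the input.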
Subcommand: digest
- Check cache: if {output_dir}/PaperDigests/YYYY-MM-DD-digest.html exists today → skip fetch, output cached path
- Fetch arxiv: same URL as Data Sources section above
- Filter: keep top daily_max most relevant to config keywords (semantic match on title+abstract)
- Deduplicate: skip arxiv IDs seen in previous 7 days' digests
- Annotate: for each paper, run Mode B annotation (brief version); annotation language follows config
- Combine: all annotations → single HTML digest file; each paper has a compact Paper Quality Bar
- Save: {output_dir}/PaperDigests/YYYY-MM-DD-digest.html
- Update daily note: append to 10_Daily/YYYY-MM-DD.md:
  ## Paper Digest
  → [[30_Research/PaperDigests/YYYY-MM-DD-digest|Today's Paper Digest]] (N papers)
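The cache check and the seven-day lookback used for deduplication come down to path arithmetic on the digest filename. A minimal sketch, assuming digests are stored exactly at the path pattern given above; the helper names are hypothetical, and reading arxiv IDs out of the older files is left out.

```python
from datetime import date, timedelta
from pathlib import Path
from typing import Optional


def digest_path(output_dir: str, day: date) -> Path:
    """Path of the digest file for a given day."""
    return Path(output_dir) / "PaperDigests" / f"{day.isoformat()}-digest.html"


def cached_digest(output_dir: str, day: date) -> Optional[Path]:
    """Return today's digest path if it already exists, else None."""
    path = digest_path(output_dir, day)
    return path if path.is_file() else None


def recent_digest_paths(output_dir: str, day: date, days: int = 7) -> list[Path]:
    """Digest paths for the previous `days` days, newest first (files may not exist)."""
    return [digest_path(output_dir, day - timedelta(days=i)) for i in range(1, days + 1)]
```

A digest run would return early when `cached_digest` finds a file, and otherwise scan whichever of the `recent_digest_paths` exist for already-seen arxiv IDs.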
Subcommand: cite
- Parse note name from user input (e.g., [[Klepsch-2017-annotation]] or file path)
- Find HTML or MD file in {output_dir}/ or vault root
- Extract metadata: title, authors, year, venue, DOI/URL
- Output directly (no file saved):
  - APA 7th: Author, A., & Author, B. (Year). Title. *Venue*, *vol*(issue), pages. https://doi.org/...
  - BibTeX: @article{key, author = {...}, title = {...}, journal = {...}, year = {...}, doi = {...} }
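The two output formats can be sketched from the extracted metadata. This is a simplified illustration: it assumes author names are already in "Last, F." form and that the DOI is stored without the https://doi.org/ prefix, omits volume, issue, and pages, and does not handle the APA rule for more than 20 authors. The function names and the BibTeX key scheme (first author's last name plus year) are hypothetical.

```python
def format_authors(authors: list[str]) -> str:
    """Join names already in 'Last, F.' form with APA's ', & ' before the last."""
    if len(authors) <= 1:
        return "".join(authors)
    return ", ".join(authors[:-1]) + ", & " + authors[-1]


def apa(meta: dict) -> str:
    """Simplified APA 7th string without volume, issue, or pages."""
    return (
        f"{format_authors(meta['authors'])} ({meta['year']}). {meta['title']}. "
        f"*{meta['venue']}*. https://doi.org/{meta['doi']}"
    )


def bibtex(meta: dict) -> str:
    """BibTeX @article entry built from the same metadata."""
    # Hypothetical key scheme: first author's last name + year.
    key = meta["authors"][0].split(",")[0].replace(" ", "") + str(meta["year"])
    fields = {
        "author": " and ".join(meta["authors"]),
        "title": meta["title"],
        "journal": meta["venue"],
        "year": meta["year"],
        "doi": meta["doi"],
    }
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items())
    return f"@article{{{key},\n{body}\n}}"
```

Conference papers would more properly use @inproceedings with a booktitle field; the @article form is kept here because it is the template the cite step specifies.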
HTML Annotation Template
Paper Quality Bar (find/digest only — between navbar and section-nav)
Shows: {source_icon} {source} | {venue} | {year} | {citation_badge} | {venue_type_tag}
Citation tiers:
| Condition | Badge | CSS class |
|---|---|---|
| cited_by_count ≥ 100 | 🔥 高引 N citations | badge-high |
| 20–99 | ⭐ N citations | badge-important |
| 5–19 | ✓ N citations | badge-valid |
| < 5 or no data | 📄 Preprint | badge-preprint |
Venue type tag: A* 会议 (top venues: CHI/NeurIPS/ACL/EMNLP/ICML/ICLR/SIGIR/CSCW) | 会议/期刊 (other config venues) | Workshop | Preprint
Source icon: 🔍 OpenAlex | 📚 Semantic Scholar | 📄 arXiv
CSS: see references/template.css (.quality-bar, .badge-*, .qb-* classes).
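The citation tier table maps directly onto a small lookup function. A minimal sketch; `citation_badge` is a hypothetical name, and the badge strings and CSS classes are copied from the table above.

```python
from typing import Optional


def citation_badge(cited_by_count: Optional[int]) -> tuple[str, str]:
    """Return (badge text, CSS class) for a citation count, per the tier table."""
    if cited_by_count is None or cited_by_count < 5:
        return "📄 Preprint", "badge-preprint"
    if cited_by_count >= 100:
        return f"🔥 高引 {cited_by_count} citations", "badge-high"
    if cited_by_count >= 20:
        return f"⭐ {cited_by_count} citations", "badge-important"
    return f"✓ {cited_by_count} citations", "badge-valid"
```

Passing `None` covers the "no data" case, which is what arxiv results and local PDFs produce.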
Mode B: Logic Analysis (Default)
Generate a complete, self-contained HTML file. CSS: embed references/template.css as <style> in <head>. Google Fonts: import Lora + IBM Plex Sans.
5-Color Highlight System:
| Color | Class | Represents |
|---|---|---|
| 🟡 Yellow #fef08a | thesis | Core thesis / main claim |
| 🔴 Red #fecaca | concept | Key concepts / terminology |
| 🔵 Blue #bfdbfe | evidence | Empirical evidence / data |
| 🟢 Green #bbf7d0 | concession | Concessions / counterargument handling |
| 🟣 Purple #e9d5ff | methodology | Methodology description |
Document structure:
1. TOP NAVBAR
- Paper title
- Context label (or "General Reading" if none)
- Color legend: 5 colored chips with dimension labels
2. PAPER QUALITY BAR (find/digest only — omit for /paper read)
3. STICKY SECTION NAV
- Links: Abstract | Introduction | Related Work | Methods | Results | Discussion | Conclusion
- Highlight active section on scroll
4. DUAL-COLUMN BODY — one <div class="paragraph-group"> per paragraph
Left column — original text (ALWAYS IN ORIGINAL LANGUAGE — never translate):
- Copy verbatim from PDF; annotation_lang ONLY controls the right column
- Highlight key phrases: <mark class="thesis">...</mark> etc.
Right column — annotation cards (language follows annotation_lang):
[Colored left border matching paragraph's dominant highlight]
① 段落功能 / Paragraph Function:...
② 逻辑角色 / Logical Role:...
③ 论证技巧或潜在漏洞 / Rhetorical Technique or Logical Gap:...
5. BACK-TO-TOP BUTTON
<button id="back-to-top" title="返回顶部">↑</button>
<script>
const btn = document.getElementById('back-to-top');
window.addEventListener('scroll', () => { btn.style.display = window.scrollY > 300 ? 'flex' : 'none'; });
btn.addEventListener('click', () => window.scrollTo({ top: 0, behavior: 'smooth' }));
</script>
6. BOTTOM — Argument Structure Overview (language follows annotation_lang)
zh labels: 问题 / 论点 / 证据 / 反驳处理 / 结论
en labels: Problem / Argument / Evidence / Concession / Conclusion
- Author's core claim (1 sentence)
- 最强论证 / Strongest argument
- 最弱论证 / Weakest argument
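- APA Citation (auto-formatted) + copy button (pass the string through an HTML-escaped data-text attribute so quotes or newlines in the citation cannot break the inline handler):
<button class="copy-btn" data-label="复制 APA" data-text="{apa_string}" onclick="copyText(this, this.dataset.text)">复制 APA</button>
- BibTeX block + copy button (same data-text approach):
<button class="copy-btn" data-label="复制 BibTeX" data-text="{bibtex_string}" onclick="copyText(this, this.dataset.text)">复制 BibTeX</button>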
Add this script alongside the back-to-top script:
<script>
function copyText(btn, text) {
navigator.clipboard.writeText(text);
btn.textContent = '✓ 已复制'; btn.classList.add('copied');
setTimeout(() => { btn.textContent = btn.dataset.label; btn.classList.remove('copied'); }, 1500);
}
</script>
Mode A: Question-Driven (triggered by --questions)
Same structure as Mode B, with these differences:
- Color legend: per-question colors Q1–Q6 (cycling through a distinct palette)
- Left highlights: <mark class="q1">, <mark class="q2">, etc.
- Right annotation cards: labeled 【Q2 核心论点】 / [Q2 Core Argument]:
  ① Paragraph Function ② Argument Logic ③ Which question/layer this paragraph answers
- Bottom: Q&A Worksheet (replaces Argument Structure Overview):
  One card per question: Core argument + Key evidence + Potential counter/limitation
CSS
All CSS is in references/template.css (adjacent to this SKILL.md). When generating any HTML output, embed the file contents as a <style> block in <head>. Only load when generating HTML — /paper cite does not need it.
Error Handling
- PDF unreadable: output error with path, suggest checking permissions
- No results from any source: output "No papers found — try broader keywords"
- OpenAlex 429: retry with backoff (2s→5s→10s); after 3 failures fall back to arxiv
- Semantic Scholar 429: retry with backoff; fall back to OpenAlex/arxiv
- Missing output_dir: create directory automatically before saving
- Malformed --questions: re-prompt user for correct format
- Source fallback notice: always state which source(s) were used, e.g. "⚠️ OpenAlex unavailable — results from arXiv only (no citation counts)"
Notes for Open-Source Use
Set output_dir in your project's CLAUDE.md:
paper_output_dir: 30_Research
Override keywords/venues/language:
paper_venues: [CHI, AIED, EDM, LAK]
paper_keywords: [your, topic, keywords]
paper_annotation_lang: en
Optional: OpenAlex Polite Pool (higher rate limits, no account needed)
openalex_email: yourname@example.com
Optional: Semantic Scholar secondary source (free key)
semantic_scholar_api_key: your_key_here