lingzhi227

literature-search

Search academic literature using Semantic Scholar, arXiv, and OpenAlex APIs. Returns structured JSONL with title, authors, year, venue, abstract, citations, and BibTeX. Use when the user needs to find papers, check related work, or build a bibliography.

lingzhi227 96 17 Updated 3mo ago

Resources

1
GitHub

Install

npx skillscat add lingzhi227/agent-research-skills/literature-search

Install via the SkillsCat registry.

SKILL.md

Literature Search

Search multiple academic databases to find relevant papers.

Input

  • $ARGUMENTS — The search query (natural language)

Scripts

Semantic Scholar (primary — best for ML/AI, has BibTeX)

python ~/.claude/skills/deep-research/scripts/search_semantic_scholar.py \
  --query "QUERY" --max-results 20 --year-range 2022-2026 \
  --api-key "$(grep S2_API_Key /Users/lingzhi/Code/keys.md 2>/dev/null | cut -d: -f2 | tr -d ' ')" \
  -o results_s2.jsonl

Key flags: --peer-reviewed-only, --top-conferences, --min-citations N, --venue NeurIPS ICML

arXiv (latest preprints)

python ~/.claude/skills/deep-research/scripts/search_arxiv.py \
  --query "QUERY" --max-results 10 -o results_arxiv.jsonl

OpenAlex (broadest coverage, free, no API key)

python ~/.claude/skills/literature-search/scripts/search_openalex.py \
  --query "QUERY" --max-results 20 --year-range 2022-2026 \
  --min-citations 5 -o results_openalex.jsonl

Merge & Deduplicate

python ~/.claude/skills/deep-research/scripts/paper_db.py merge \
  --inputs results_s2.jsonl results_arxiv.jsonl results_openalex.jsonl \
  --output merged.jsonl

CrossRef (DOI-based lookup, broadest type coverage)

python ~/.claude/skills/literature-search/scripts/search_crossref.py \
  --query "QUERY" --rows 10 --output results_crossref.jsonl

Key flags: --bibtex (output .bib format), --rows N

Download arXiv Source (get .tex files)

python ~/.claude/skills/literature-search/scripts/download_arxiv_source.py \
  --title "Paper Title" --output-dir arxiv_papers/

Key flags: --arxiv-id 1706.03762, --metadata, --max-results N

Generate BibTeX from results

python ~/.claude/skills/deep-research/scripts/bibtex_manager.py \
  --jsonl merged.jsonl --output references.bib

Workflow

  1. Expand the user's query into 2-4 complementary search queries
  2. Run Semantic Scholar search (primary) with expanded queries
  3. Run arXiv for very recent preprints (< 3 months)
  4. Optionally run OpenAlex for broader coverage
  5. Merge and deduplicate results
  6. Rank by: citations (0.3) + recency (0.3) + venue quality (0.2) + relevance (0.2)
  7. Present structured results table

Venue Quality Tiers

Tier 1: NeurIPS, ICML, ICLR, ACL, EMNLP, NAACL, CVPR, ICCV, ECCV, KDD, AAAI, IJCAI, SIGIR, WWW
Tier 2: AISTATS, UAI, COLT, COLING, EACL, WACV, JMLR, TACL
Tier 3: Workshops, arXiv preprints — mark with (preprint)

Output Format

Present results as a table + detailed entries with BibTeX keys. Always note preprint status.

Related Skills