code-review-for-quant

Use when reviewing or writing Python/Go/SQL code for quant research, backtests, market-data pipelines, or trading systems. Provides a structured checklist of failure modes specific to time-series financial code (lookahead, splits, snapshots, currency, NaN propagation, joint-filer dedup) that generic code review skips.

jefrnc 0 Updated 2mo ago

Install

npx skillscat add jefrnc/quant-llm-skills/code-review-for-quant

Install via the SkillsCat registry.

SKILL.md

Code review for quant

Generic code review catches off-by-one errors and missing context
managers (with blocks). Quant code has its own failure modes, and they
are the ones that silently corrupt research output without raising an
exception. This skill enforces a domain-specific checklist before
approving any quant-touching code.

Core principle

Quant bugs hide as plausible numbers. A backtest that runs cleanly
and produces a nice equity curve can still be using future data. The
test "did it crash?" is meaningless. The real test is "does each
datapoint trace to a publication date on or before the query date?"

The checklist

Run this against any function that touches historical financial data.

A. Time semantics

- Every read of historical state takes a query_date argument
  (or equivalent) and filters on filing_date <= query_date
  (or accepted <= query_date).
- No use of period_end, report_date, or as_of_date as the
  known-date for filing data.
- No use of "current" snapshots (ticker.info, latest API value)
  for historical queries.
- Splits / reverse splits applied with split-date as the cutoff
  (not retroactively to all prior dates).
- Adjusted prices not used for absolute price thresholds, since
  adjusted values change as new splits happen.
- Earnings revisions / amendments treated as known only from the
  amendment's own filing date.
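
A minimal sketch of the filing_date <= query_date rule, including the amendment case. The Filing record and the known_as_of helper are hypothetical names used for illustration, not part of any real library.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Filing:
    # Hypothetical record shape; real pipelines will carry more fields.
    period_end: date    # fiscal period the numbers describe
    filing_date: date   # date the filing became public
    revenue: float
    is_amendment: bool = False


def known_as_of(filings: list[Filing], period_end: date,
                query_date: date) -> Optional[Filing]:
    """Return the latest version of a period's filing that was public on query_date."""
    visible = [
        f for f in filings
        if f.period_end == period_end and f.filing_date <= query_date
    ]
    # Among visible versions, the most recently filed one wins
    # (original or amendment). No visible version means "unknown".
    return max(visible, key=lambda f: f.filing_date, default=None)


filings = [
    Filing(date(2023, 12, 31), date(2024, 2, 15), revenue=100.0),
    Filing(date(2023, 12, 31), date(2024, 5, 1), revenue=90.0, is_amendment=True),
]

fy2023 = date(2023, 12, 31)
before_filing = known_as_of(filings, fy2023, date(2024, 2, 1))    # nothing public yet
after_original = known_as_of(filings, fy2023, date(2024, 3, 1))   # original only
after_amendment = known_as_of(filings, fy2023, date(2024, 6, 1))  # amendment visible
```

Filtering on period_end alone would have returned the amended figure for every one of these query dates.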

B. Data shape

- Fall-through on missing fields (no KeyError crashes when XBRL
  has alternate tags or FPI structure differs).
- Fall-through to text-extraction when XBRL returns 404 (FPIs,
  SPACs, recent IPOs).
- Multi-class share structures handled (Class A + Class B, ADSs +
  ordinary shares with ratio conversion).
- Currency conversion uses point-in-time FX rate, not current.
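
One way to sketch the point-in-time FX rule, assuming a sorted in-memory rate history. The EURUSD values and the fx_as_of / to_usd helpers are illustrative assumptions, not real market data or a real API.

```python
from bisect import bisect_right
from datetime import date
from decimal import Decimal
from typing import Optional

# Hypothetical EUR->USD history as (date, rate), sorted by date.
EURUSD = [
    (date(2022, 1, 3), Decimal("1.13")),
    (date(2022, 9, 27), Decimal("0.96")),
    (date(2024, 1, 2), Decimal("1.10")),
]


def fx_as_of(history, query_date: date) -> Optional[Decimal]:
    """Latest rate dated on or before query_date, or None if none exists yet."""
    dates = [d for d, _ in history]
    i = bisect_right(dates, query_date)
    return history[i - 1][1] if i > 0 else None


def to_usd(amount_eur: Decimal, query_date: date) -> Optional[Decimal]:
    rate = fx_as_of(EURUSD, query_date)
    # Propagate the gap explicitly instead of falling back to today's rate.
    return None if rate is None else amount_eur * rate


usd_2022 = to_usd(Decimal("1000"), date(2022, 10, 1))       # uses the 0.96 rate
usd_too_early = to_usd(Decimal("1000"), date(2021, 6, 1))   # no rate known yet
```

Converting the 2022 amount with the latest rate in the table (1.10) would overstate it by roughly 15%, and nothing would crash.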

C. Aggregation hygiene

- Joint-filer / Section 13(d) group dedup applied before summing
  insider holdings.
- Form 144 (intent) NOT counted as Form 4 (executed transaction).
- 13F filings treated separately from 13D/G (different lag, different
  threshold, different dedup rules).
- CUSIP changes (mergers, reverse splits) reconciled by ticker
  history, not by CUSIP.
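
A rough sketch of the joint-filer dedup, assuming each row already carries a group identifier. The group_id field and the rule "count the largest reported block once per group" are simplifying assumptions for illustration; real group resolution is harder.

```python
from collections import defaultdict

# Hypothetical rows: every reporting person in a joint filing
# reports the same block of shares.
holdings = [
    {"holder": "Fund LP",         "group_id": "G1", "shares": 1_000_000},
    {"holder": "Fund GP LLC",     "group_id": "G1", "shares": 1_000_000},
    {"holder": "Managing Member", "group_id": "G1", "shares": 1_000_000},
    {"holder": "Independent CEO", "group_id": None, "shares": 250_000},
]


def total_insider_shares(rows):
    """Sum holdings, counting each joint-filer group once."""
    per_group = defaultdict(int)
    total = 0
    for row in rows:
        if row["group_id"] is None:
            total += row["shares"]  # standalone filer: counted as-is
        else:
            # Assumption: group members report the same beneficially
            # owned block, so keep the largest figure once per group.
            g = row["group_id"]
            per_group[g] = max(per_group[g], row["shares"])
    return total + sum(per_group.values())


naive_total = sum(r["shares"] for r in holdings)   # triple-counts group G1
deduped_total = total_insider_shares(holdings)
```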

D. Numerical hygiene

- Division-by-zero guarded (volume / float, returns / price).
- None / NaN propagation explicit: no silent coverage gaps.
- Outlier handling explicit: bid/ask crosses, halt periods,
  suspicious prints (penny stocks: trades flagged with condition_codes
  indicating odd lot / late / out-of-sequence).
- Currency precision (Decimal vs float) consistent, because float
  drift compounds over millions of trades.
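
A small sketch of a guarded ratio with explicit gap reporting. The safe_ratio helper and the sample rows are hypothetical; the point is that undefined values become None and the coverage loss is measured rather than hidden.

```python
import math
from typing import Optional


def safe_ratio(numerator: Optional[float],
               denominator: Optional[float]) -> Optional[float]:
    """Return numerator / denominator, or None when the ratio is undefined."""
    if numerator is None or denominator is None:
        return None
    if math.isnan(numerator) or math.isnan(denominator) or denominator == 0:
        return None
    return numerator / denominator


# Hypothetical (ticker, volume, float_shares) rows with typical gaps.
rows = [
    ("AAA", 500_000.0, 10_000_000.0),
    ("BBB", 120_000.0, 0.0),              # float reported as zero
    ("CCC", float("nan"), 5_000_000.0),   # missing volume
    ("DDD", 80_000.0, None),              # float never reported
]

turnover = {t: safe_ratio(v, f) for t, v, f in rows}
missing = sorted(t for t, x in turnover.items() if x is None)
coverage = 1 - len(missing) / len(rows)   # share of tickers with a usable value
```

Reporting coverage next to the result makes a 25%-coverage signal visibly different from a 100%-coverage one.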

E. Friction realism

- Slippage modeled as % of price or absolute spread, NOT zero.
- Borrow / hard-to-borrow APR included for short-side simulations.
- Locate-failure probability modeled for very-low-float tickers.
- Bid-ask spread modeled for microcaps (often >5% on actual trades).
- Halts and circuit breakers handled, since not all volume is
  tradeable.
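
To make the cost of a friction-free assumption concrete, here is a sketch of short-side PnL with slippage and borrow cost. The function name, the 2% slippage, the 90% borrow APR, and the 360-day accrual convention are all assumptions chosen for illustration.

```python
from decimal import Decimal


def short_trade_pnl(entry: Decimal, exit_: Decimal, shares: int,
                    slippage_pct: Decimal, borrow_apr: Decimal,
                    days_held: int) -> Decimal:
    """Net PnL of a short after slippage on both legs and borrow cost."""
    # Slippage works against the trader: sell lower on entry,
    # buy back higher on exit.
    fill_entry = entry * (1 - slippage_pct)
    fill_exit = exit_ * (1 + slippage_pct)
    gross = (fill_entry - fill_exit) * shares
    # Borrow fee accrued on entry notional; 360-day year is an assumption.
    borrow = fill_entry * shares * borrow_apr * Decimal(days_held) / Decimal(360)
    return gross - borrow


frictionless = short_trade_pnl(Decimal("10"), Decimal("8"), 1000,
                               Decimal("0"), Decimal("0"), 5)
realistic = short_trade_pnl(Decimal("10"), Decimal("8"), 1000,
                            Decimal("0.02"), Decimal("0.90"), 5)
```

Under these assumed inputs the same trade drops from 2000 to 1517.5, about a quarter of the paper profit.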

F. Reproducibility

- Random seeds set explicitly for any stochastic component.
- Data freshness recorded (the day the underlying CSV was pulled).
- Environment locked (pinned requirements.txt / go.sum /
  package-lock.json).
- Output stamped with both run_date and data_as_of_date.
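
A minimal sketch of seeding and stamping, assuming results are serialized as JSON. The run_simulation and stamped_output helpers are hypothetical; only the run_date and data_as_of_date field names come from the checklist.

```python
import json
import random
from datetime import date


def run_simulation(seed: int, n: int = 5) -> list[float]:
    """Toy stochastic component; a local Random keeps the seed explicit."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.01) for _ in range(n)]


def stamped_output(returns, seed: int, run_date: date,
                   data_as_of_date: date) -> str:
    """Serialize results together with the metadata needed to reproduce them."""
    return json.dumps({
        "run_date": run_date.isoformat(),
        "data_as_of_date": data_as_of_date.isoformat(),
        "seed": seed,
        "returns": returns,
    }, sort_keys=True)


first = stamped_output(run_simulation(42), 42, date(2024, 6, 3), date(2024, 5, 31))
second = stamped_output(run_simulation(42), 42, date(2024, 6, 3), date(2024, 5, 31))
```

Using a local random.Random instance rather than the module-level functions keeps another library's random calls from shifting the sequence.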

G. Performance traps

- No O(N) re-reads of the same JSON/CSV inside a .apply() loop.
- No per-bar HTTP / DB calls in tight backtest loops; pre-fetch.
- No accidental quadratic time on date filtering (use indexed
  lookups, not list comprehensions over the full universe).
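
One possible fix for the re-read trap is to memoize the loader so the file is parsed once however many rows ask for it. The load_facts and float_for names and the file contents are hypothetical; read_count exists only to show the effect.

```python
import json
import os
import tempfile
from functools import lru_cache

read_count = 0  # instrumentation for this sketch only


@lru_cache(maxsize=None)
def load_facts(path: str) -> dict:
    """Parse the file once per path; later calls hit the in-memory cache."""
    global read_count
    read_count += 1
    with open(path) as fh:
        return json.load(fh)


def float_for(ticker: str, path: str):
    # .get() returns None for unknown tickers instead of raising.
    return load_facts(path).get(ticker)


# Hypothetical facts file, written to a temp location so the
# example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump({"AAA": 10_000_000, "BBB": 2_500_000}, tmp)
    facts_path = tmp.name

# 1000 lookups, as a per-row .apply() would issue them.
floats = [float_for(t, facts_path) for t in ["AAA", "BBB", "AAA", "ZZZ"] * 250]
os.remove(facts_path)
```

The cached dict is shared between callers, so treat it as read-only.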

Priority order when reviewing

When listing bugs found in a code review, ALWAYS rank by silent-corruption
potential, not by severity-of-symptom:

1. Look-ahead bias: silently wrong, looks fine
2. Snapshot used for history: silently wrong, looks fine
3. Joint-filer over-count: silently wrong, looks fine
4. Survivorship bias: silently wrong, looks fine
5. Split / adjusted-price misuse: silently wrong, looks fine
6. Friction-free assumption: silently optimistic
7. Performance bug: visible, will be fixed when run
8. Crash bug: visible, will be fixed at runtime
9. Style / hygiene: least urgent
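
The ordering above can be applied mechanically once each finding is tagged with a category. The category keys and the (category, line_no, note) tuple shape below are assumptions for illustration.

```python
# Rank mirrors the priority list; lower number = report first.
PRIORITY = {
    "lookahead": 1,
    "snapshot_for_history": 2,
    "joint_filer_overcount": 3,
    "survivorship": 4,
    "split_adjusted_misuse": 5,
    "friction_free": 6,
    "performance": 7,
    "crash": 8,
    "style": 9,
}


def rank_findings(findings):
    """Sort (category, line_no, note) findings by silent-corruption rank, then line."""
    return sorted(findings, key=lambda f: (PRIORITY[f[0]], f[1]))


# Hypothetical findings, listed in the order they appear in the code.
findings = [
    ("style", 3, "open() without a context manager"),
    ("crash", 10, "KeyError on missing XBRL tag"),
    ("lookahead", 22, "period_end used as publication date"),
    ("friction_free", 40, "zero slippage assumed"),
]

ordered = [category for category, _, _ in rank_findings(findings)]
```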

Anti-pattern: leading the review with "you should use a context
manager for open()" while the function silently uses period_end
as a publication date. The first one is cosmetic; the second corrupts
every backtest result.

Workflow when handed a snippet

1. Identify what the function does (state lookup, aggregation,
   calculation, simulation).
2. Walk the relevant section of the checklist above.
3. List bugs in silent-corruption order, not in code order.
4. For each lookahead-class bug: cite the specific datapoint that
   would leak (e.g., "row with period_end: 2023-12-31 is unknowable
   on 2024-02-01 if the 10-K was filed 2024-02-15").
5. Propose fixes that align with lookahead-safety and other relevant
   skills; don't reinvent the rule.
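
Citing the leaking datapoint can be automated when the rows carry both dates. This is a sketch only: cite_leaks and the row layout are hypothetical, and the message wording follows the example in the workflow.

```python
from datetime import date


def cite_leaks(rows, query_date: date):
    """Describe every row that a query on query_date could not have known."""
    messages = []
    for row in rows:
        if row["filing_date"] > query_date:
            messages.append(
                f"row with period_end: {row['period_end']} is unknowable on "
                f"{query_date} because it was filed {row['filing_date']}"
            )
    return messages


# Hypothetical rows that a period_end-based filter would wrongly return.
rows = [
    {"period_end": date(2023, 9, 30), "filing_date": date(2023, 11, 2)},
    {"period_end": date(2023, 12, 31), "filing_date": date(2024, 2, 15)},
]

leaks = cite_leaks(rows, date(2024, 2, 1))
```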

Phrases that should trigger this skill

"review this code"
"is this backtest correct"
"audit my pipeline"
"find bugs in this script"
"code review"
any code block containing pd.DataFrame, yfinance, requests,
polygon, sec, companyfacts, period_end, filing_date,
apply(lambda

What this skill is NOT

This is not a generic linter. It does not catch missing semicolons,
unused imports, or PEP8 violations — those are the pre-existing
linter's job. It catches the ~20 quant-specific failure modes that
generic code review consistently misses. Combine with lookahead-safety,
xbrl-fallbacks, insider-dedup and other domain skills for
specific-rule fixes.
