keaven

gsDesignNB

Design, simulate, and adapt clinical trials with negative binomial recurrent event endpoints using the gsDesignNB R package. Use this skill when the task involves: NB sample size or power, event gaps and Jensen correction, calendar-time group sequential design, blinded or unblinded information estimation, sample size re-estimation (SSR), recurrent-event simulation (constant or seasonal rates), interim data cuts, completer-based analyses, or non-inferiority/super-superiority designs for recurrent events.

keaven 1 1 Updated 4w ago

Resources

18
GitHub

Install

npx skillscat add keaven/gsdesignnb

Install via the SkillsCat registry.

SKILL.md

gsDesignNB — Comprehensive AI Skill

Last updated: 2026-04-30 — matches package version 0.3.2

Package purpose

gsDesignNB is an R package for planning, monitoring, and adapting clinical trials whose primary
endpoint is a recurrent event count analyzed with a negative binomial (NB) model. It extends the
gsDesign ecosystem to handle the special features of NB designs: information that depends on
exposure and dispersion (not just event counts), protocol event gaps, calendar-time interim looks,
and blinded/unblinded sample size re-estimation.

Statistical model

Conditional on a subject-level gamma frailty, events follow a Poisson process with constant rate.
Marginally, the event count $Y_i$ for subject $i$ with exposure $t_i$ is:

  • Mean: $\mu_i = \lambda_g t_i$
  • Variance: $\mu_i + k \mu_i^2$

where $\lambda_g$ is the group rate and $k$ is the dispersion parameter. The treatment effect is
the log rate ratio $\theta = \log(\lambda_2 / \lambda_1)$, estimated via NB GLM with log link and
exposure offset.

Fisher information per subject: $\mathcal{I}_i = \mu_i / (1 + k \mu_i)$.

Wald vs score test

mutze_test() supports test_type = "wald" (default) and test_type = "score":

  • Wald test: Fits arm-specific NB GLMs; $z = \hat\theta / \text{SE}(\hat\theta)$.
  • Score test: Fits a pooled null NB GLM via MASS::glm.nb(); score statistic
    $U = \sum (y_i - \hat\mu_i) / (1 + \hat k_0 \hat\mu_i)$; Fisher information
    $I_0 = W_1 W_2 / (W_1 + W_2)$; $z = U / \sqrt{I_0}$.

The score test is slightly more conservative under $H_0$ and better controls Type I error
in small-sample / low-rate scenarios.

Two-variance sample size formula

sample_size_nbinom() supports test_type = "wald" (default) and test_type = "score":

  • Wald (single variance): $n_1 = (z_\alpha + z_\beta)^2 V_1 / \theta^2$
  • Score (two variance): $n_1 = (z_\alpha \sqrt{V_0} + z_\beta \sqrt{V_1})^2 / \theta^2$

where $V_1 = (1/\mu_1 + k_1)(1 + 1/r) + (1/\mu_2 + k_2)/r$ is the alternative-hypothesis
variance, and $V_0 = (1/\mu_0 + k_0)(1 + 1/r)$ uses a pooled rate
$\lambda_0 = (\lambda_1 + r \lambda_2) / (1 + r)$ under $H_0$.

The score formula is matched to the score test's null-variance reference distribution, but it is
not automatically more powerful. In the package superiority grid, Wald/Zhu--Lakkis sizing paired
with the score test preserved Type I error and gave slightly higher score-test power than score
sizing because it retained a small sample-size margin. Compare both sizing rules and verify Type I
error and power by simulation for the design setting.

Core workflow

sample_size_nbinom() → gsNBCalendar() → sim_gs_nbinom() or sim_ssr_nbinom() → summarize
  1. Fixed design: sample_size_nbinom() computes sample size/power under planned rates,
    dispersion, piecewise accrual, exponential dropout, max follow-up, and event gaps.
  2. GS design: gsNBCalendar() converts the fixed design to a gsNB object (inherits
    gsDesign) with calendar-time analysis looks and NB-specific information.
  3. Simulation: sim_gs_nbinom() (fixed GS timing) or sim_ssr_nbinom() (adaptive SSR)
    generates trial-level operating characteristics.
  4. Summary: check_gs_bound() + summarize_gs_sim(), or summarize_ssr_sim().

Function reference

Sample size and design

Function Purpose
sample_size_nbinom() Fixed-design sample size/power; test_type = "wald" (V₁ only) or "score" (V₀+V₁)
compute_info_at_time() Information at a given calendar time (internal to sample_size_nbinom)
gsNBCalendar() Calendar-time group sequential design from fixed design result
toInteger() Round design to integer sample sizes (S3 method for gsNB)

Data simulation

Function Purpose
nb_sim() Simulate recurrent events — piecewise accrual, exponential dropout, gamma-Poisson mixture
nb_sim_seasonal() Same as nb_sim() but with season-varying event rates

Interim data handling

Function Purpose
cut_data_by_date() Censor simulated data at a calendar date; aggregate events/exposure per subject
get_cut_date() Find analysis date from calendar, event, completer, or information targets
get_analysis_date() Find calendar time when target event count is reached
cut_date_for_completers() Find calendar time when target completer count is reached
cut_completers() Convenience wrapper: find completer date + cut data

Statistical analysis and information

Function Purpose
mutze_test() Wald or score test for treatment effect (test_type = "wald" / "score")
calculate_blinded_info() Blinded dispersion/rate estimation and statistical information
estimate_nb_mom() Method-of-moments NB parameter estimation (blinded or unblinded)

Group sequential simulation

Function Purpose
sim_gs_nbinom() Monte Carlo GS trial simulation with calendar-time cuts
check_gs_bound() Apply GS boundaries to simulation results on chosen info scale
summarize_gs_sim() Operating characteristic summaries (power, futility, per-analysis)

Sample size re-estimation

Function Purpose
blinded_ssr() Blinded SSR using pooled NB fit (Friede & Schmidli 2010)
unblinded_ssr() Unblinded SSR using arm-specific rate/dispersion estimates
sim_ssr_nbinom() Full adaptive simulation engine (No adaptation / Blinded / Unblinded)
summarize_ssr_sim() Summary of SSR simulation results

Utilities

Function Purpose
run_ssr_shiny() Interactive Shiny interface for SSR exploration
preview_pkgdown_site() Serve docs locally with correct CSS (not pkgdown::preview_site())

Re-exported from gsDesign

gsDesign(), gsBoundSummary(), and all sf*() spending functions are re-exported so users
can work in one namespace.

Jensen's inequality correction for event gaps

When a protocol defines a minimum gap $\gamma$ between counted events, the naive effective rate
$\lambda_{\text{eff}} = \lambda / (1 + \lambda\gamma)$ is biased upward due to Jensen's inequality
applied over the gamma-distributed subject-level rates. The second-order correction is:

$$
\lambda_{\text{eff}} \approx \frac{\lambda}{1 + \lambda\gamma}
\left(1 - \frac{k\lambda\gamma}{(1 + \lambda\gamma)^2}\right)
$$

When it matters: $k \geq 0.5$ and gap $\geq 20$ days. A 64-scenario simulation sweep showed
the naive formula underpowers by 1–2.5 pp in these regimes; the correction reduces effective-rate
approximation error from ~8% to ~1%.

When it is negligible: $k \leq 0.1$ or gap $\leq 10$ days.

The same gap rule flows through sample_size_nbinom(), nb_sim(), and cut_data_by_date() so
that planning, simulation, and interim cutting are consistent.

Three kinds of "gaps"

Concept Scale Package interface
Analysis gap $\Delta_i$ Trial calendar analysis_times in gsNBCalendar(), search criteria in get_cut_date()
Timing heterogeneity Accrual, dropout, follow-up accrual_rate/duration, dropout_rate, max_followup, trial_duration
Event gap $g$ Patient time event_gap argument, carried through planning → simulation → interim cut

SSR guidance

  • sim_ssr_nbinom() supports strategies: "No adaptation", "Blinded SSR", "Unblinded SSR".
  • bound_info controls which information scale drives boundaries: "unblinded_ml" (default),
    "blinded_ml", "unblinded_mom", "blinded_mom".
  • For SSR studies, treat the final test statistic as central. The current production SSR study uses
    the score test for power simulations and compares Wald vs score Type I error at the same nominal
    one-sided $\alpha = 0.025$.
  • In that non-binding Type I comparison, Wald is mildly anti-conservative (about 2.8%--3.1%) and
    score is conservative (about 2.1%--2.3%). Prefer the score final test when Type I calibration is
    the priority, then check whether Wald sizing, a higher target power, or a modest information
    margin is needed for power.
  • SSR adaptation is driven mainly by interim nuisance estimates and the adaptation cap; the score
    test reduces anti-conservative rejection after adaptation rather than eliminating the need to
    simulate adapted sample-size behavior.
  • Blinded SSR preserves masking. Unblinded SSR may be more sample-efficient, but the relative
    behavior is design-specific and should be compared under the trial's nuisance range, adaptation
    cap, and final test statistic.
  • Spending is not accelerated when observed information exceeds planned: uses min(planned IF, actual IF).
  • Futility assessment is deferred until ≥ 30% of planned information.

Time-scale checklist

Before writing or reviewing code, confirm all of these use the same time unit:

  • enroll_rate$duration and enroll_rate$rate
  • fail_rate$rate
  • dropout_rate$rate and dropout_rate$duration
  • trial_duration
  • max_followup
  • Calendar cut times / analysis_times
  • event_gap

If units are mixed, fix that first.

Key arguments shared across functions

Argument Used in Meaning
lambda1, lambda2 sample_size_nbinom() Control and experimental event rates
dispersion / k Multiple NB dispersion (scalar or arm-specific vector)
accrual_rate, accrual_duration sample_size_nbinom() Piecewise enrollment
enroll_rate nb_sim(), sim_gs_nbinom(), sim_ssr_nbinom() data.frame(rate, duration)
fail_rate Simulation functions data.frame(treatment, rate, dispersion)
dropout_rate Multiple data.frame(treatment, rate, duration) for exponential dropout
max_followup Multiple Maximum per-subject follow-up
trial_duration sample_size_nbinom() Total calendar time of the trial
event_gap sample_size_nbinom(), nb_sim(), cut_data_by_date() Minimum gap between counted events
rr0 sample_size_nbinom() Null rate ratio (default 1; set < 1 for non-inferiority)
test_type mutze_test(), sample_size_nbinom(), sim_gs_nbinom(), sim_ssr_nbinom() "wald" (default) or "score"

S3 class hierarchy

  • sample_size_nbinom_result: output of sample_size_nbinom(); has print() and summary().
  • gsNB: output of gsNBCalendar(); inherits from gsDesign; has summary(), toInteger().
  • nb_sim_data / nb_sim_seasonal: output of nb_sim() / nb_sim_seasonal(); dispatches cut_data_by_date().
  • mutze_test: output of mutze_test(); has print().

Vignettes

Vignette Topic
sample-size-nbinom Fixed-design NB sample size planning
simulation-example Basic recurrent-event data generation
seasonal-simulation Seasonal rate simulation
group-sequential-simulation Calendar-time GS design workflow
ssr-example Short SSR workflow
ssr-simulation-study Large SSR simulation case study
completers-interim-example Completer-based interim looks
non-inferiority-example Non-inferiority design
verification-simulation Simulation verification of sample size and test
score-vs-wald-simulation 2×2 comparison: Wald/score sizing × Wald/score test (interactive DT/plotly)
ai-skills Demonstrates how to use SKILL.md and llms.txt to guide package-native workflows
blinded-info-diagnostics Blinded information estimation diagnostics
blinded-info-diagnostics Edge cases for blinded information
verification-simulation Design-formula verification against simulation

Common pitfalls

  • Do not re-implement SSR logic if sim_ssr_nbinom() covers the workflow.
  • mutze_test() may fall back to Poisson based on poisson_threshold; this is defensive, not a model choice.
  • event_gap changes both counted events and at-risk exposure; it is not cosmetic.
  • check_gs_bound() accepts info_col to select the information column; use that rather than renaming.
  • For pkgdown preview, use gsDesignNB::preview_pkgdown_site(), not pkgdown::preview_site().

Disease-area calibration

Published NB trial parameters for reference:

Therapeutic area ~λ (events/month) ~k Gap
Heart failure hospitalizations 0.3 0.5 Extended hospitalization-free window
COPD exacerbations 0.5–0.8 0.5–1.0 ≥ 28 symptom-free days
MS relapses 1–2/year moderate–high Variable
Asthma exacerbations Variable ~0.8 Protocol-specific

Dependencies

  • Runtime: gsDesign (≥ 3.9.0), data.table, simtrial, MASS, stats
  • Suggested: ggplot2, dplyr, gt, DT, shiny, foreach, doFuture, future, testthat, knitr