openclaw-r-stats

Advanced statistical analysis in R. Use when the user asks for regression, hypothesis testing, ANOVA, time-series forecasting, Bayesian modeling, survival analysis, descriptive statistics, EDA, correlation analysis, diagnostics, or reproducible analytical reports. Also use when the user mentions R packages like ggplot2, tidyverse, forecast, brms, broom, lme4, survival, or any statistical method. Chinese is supported: use this skill when the user mentions, in Chinese, statistical methods such as regression analysis, hypothesis testing, time series, Bayesian methods, survival analysis, descriptive statistics, or correlation analysis.

CuiweiG 0 Updated 3mo ago

Install

npx skillscat add cuiweig/openclaw-r-stats

Install via the SkillsCat registry.

SKILL.md

OpenClaw R Stats

When to use

User asks for statistical analysis, regression, hypothesis testing
User asks to compare groups, test significance, find associations
User mentions ANOVA, t-test, chi-square, correlation
User asks for time series forecasting or trend analysis
User uploads CSV and wants statistical insights
User asks "is this significant?" or "what predicts X?"
User mentions, in Chinese: regression, testing, forecasting, significance, descriptive statistics

What this skill does NOT do

Do not claim causality from observational data. Use "associated with".
Do not run broad exploratory fishing expeditions without clear user intent.
Do not silently ignore assumption violations.
Do not execute arbitrary inline R code. Always use the wrapper script.
Do not install packages during analysis. Installation is a separate step.
Do not report only p-values. Always include effect sizes and CIs.

Pre-flight checks (mandatory before any analysis)

Confirm the dataset file exists and is readable.
Run schema inspection:
bash {baseDir}/scripts/run-rstats.sh schema --data
Report to the user: row/column count, types, missing values, unique counts.
If missing data exceeds 5%, warn the user and ask how to handle it.
If sample size < 30, warn the user about small-sample limitations.
Only then proceed to build the analysis spec.
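
The two thresholds above (more than 5% missing, fewer than 30 rows) can be sketched as a small check. This is only an illustration: the function name, its arguments, and the per-column reading of the 5% rule are assumptions, and the actual layout of the schema output is not documented here.

```python
# Illustrative sketch only. The inputs are a hypothetical summary of the
# schema inspection; the real schema output may be structured differently.
def preflight_warnings(n_rows, missing_by_column,
                       missing_threshold=0.05, min_n=30):
    """Return warnings for the sample-size and missing-data thresholds."""
    warnings = []
    if n_rows < min_n:
        warnings.append(f"Small sample: n = {n_rows} (< {min_n}).")
    for column, n_missing in missing_by_column.items():
        # Assumption: the 5% rule is applied to each column separately.
        share = n_missing / n_rows if n_rows else 0.0
        if share > missing_threshold:
            warnings.append(
                f"{column}: {share:.1%} missing "
                f"(> {missing_threshold:.0%}); ask the user how to handle it."
            )
    return warnings


example = preflight_warnings(n_rows=20, missing_by_column={"age": 3, "income": 0})
for line in example:
    print(line)
```

An empty list means neither threshold was hit and the analysis spec can be built.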

Environment check

On first use, or if errors occur:
bash {baseDir}/scripts/run-rstats.sh doctor

If packages are missing:
Rscript {baseDir}/scripts/install-core.R

Standard workflow

Determine the correct analysis type.
Inspect dataset schema and missingness.
Build a JSON analysis spec:
{
  "dataset_path": "",
  "analysis_type": "",
  "outcome": "",
  "predictors": ["", ""],
  "formula": "",
  "group_var": "",
  "hypothesis": "",
  "missing_strategy": "complete_case",
  "alpha": 0.05,
  "seed": 42,
  "output_dir": ""
}
Save the spec as a .json file.
Run: bash {baseDir}/scripts/run-rstats.sh analyze --spec
Read summary.json and report.md from the output directory.
Present results: Summary → Statistics → Interpretation → Plots → Assumptions → Caveats.
Offer follow-up: diagnostics, alternative methods, export.
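
As a rough illustration of the "build and save the spec" steps, the sketch below fills the template fields and writes them to a .json file. The helper names, the example paths, and the column names (`score`, `arm`) are hypothetical, and whether the wrapper accepts empty strings for unused fields is an assumption.

```python
import json
import tempfile
from pathlib import Path


def build_spec(dataset_path, analysis_type, outcome, output_dir, **overrides):
    """Fill the spec template; defaults mirror the template in this document."""
    spec = {
        "dataset_path": dataset_path,
        "analysis_type": analysis_type,
        "outcome": outcome,
        "predictors": [],
        "formula": "",
        "group_var": "",
        "hypothesis": "",
        "missing_strategy": "complete_case",
        "alpha": 0.05,
        "seed": 42,
        "output_dir": output_dir,
    }
    # Reject fields that are not part of the template to catch typos early.
    unknown = set(overrides) - set(spec)
    if unknown:
        raise KeyError(f"Unknown spec fields: {sorted(unknown)}")
    spec.update(overrides)
    return spec


def save_spec(spec, path):
    """Write the spec as pretty-printed JSON and return the path."""
    path = Path(path)
    path.write_text(json.dumps(spec, indent=2), encoding="utf-8")
    return path


# Hypothetical example values; a temporary directory keeps the sketch tidy.
with tempfile.TemporaryDirectory() as tmp:
    spec = build_spec("data/trial.csv", "ttest", "score", "out/ttest_run",
                      group_var="arm")
    spec_path = save_spec(spec, Path(tmp) / "spec.json")
    reloaded = json.loads(spec_path.read_text(encoding="utf-8"))

print(reloaded["analysis_type"], reloaded["seed"])
```

Keeping the seed and alpha in the spec rather than in the calling code means the saved file is enough to repeat the run.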

Analysis selection

User intent	analysis_type
Describe data / EDA	summary
Compare 2 groups (continuous)	ttest
Compare 2 groups (non-normal/small n)	wilcoxon
Compare 3+ groups (continuous, normal)	anova
Compare 3+ groups (non-normal/ordinal)	kruskal
Compare categorical variables	chisq
Categorical (small expected counts)	fisher
Paired categorical (before/after)	mcnemar
Repeated measures non-parametric	friedman
Association between 2 continuous vars	correlation
Predict continuous outcome	linear_regression
Predict binary outcome	logistic_regression
Predict count outcome	poisson_regression
Forecast time series	forecast_arima
Assess missing data patterns	missing_diagnostics
Impute missing values	multiple_imputation
Correct for multiple comparisons	p_adjust
Survival curves + median survival	kaplan_meier
Survival regression (HR)	cox_regression
Competing risks (Fine-Gray)	competing_risks
Time-dependent Cox model	cox_time_dependent
Restricted mean survival time	rmst
Odds ratio (case-control)	odds_ratio
Risk ratio + NNT (cohort/RCT)	risk_ratio
Incidence rate ratio (person-time)	incidence_rate
Stratified analysis (confounding)	mantel_haenszel
Number needed to treat/harm	nnt
Linear mixed model (random effects)	lmm
Generalized linear mixed model	glmm
GEE marginal model	gee
Intraclass correlation	icc
Propensity score matching	propensity_match
Propensity score weighting (IPW)	propensity_weight
Causal mediation analysis	mediation_analysis
Instrumental variable regression	iv_regression
Difference-in-differences	did
Regression discontinuity	rdd

Automatic method switching guardrails

Normality doubtful AND n < 30 → prefer wilcoxon over ttest
Variance equality doubtful → use Welch t-test (equal_var: false)
Expected cell counts < 5 → prefer fisher over chisq
Overdispersion in Poisson → warn, suggest negative binomial
Residuals heteroscedastic → warn about robust SE
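
One way to read the guardrails above is as a small decision function applied to the initially chosen analysis_type. The sketch below is an interpretation, not the wrapper's actual logic: the function name, the flag names, the precedence of the Wilcoxon switch over the Welch option, and returning `equal_var` as a separate options dict are all assumptions.

```python
def apply_guardrails(analysis_type, n=None, normality_doubtful=False,
                     variance_equality_doubtful=False,
                     min_expected_count=None, overdispersed=False,
                     heteroscedastic=False):
    """Return (analysis_type, options, notes) after applying the guardrails."""
    options, notes = {}, []

    if analysis_type == "ttest":
        if normality_doubtful and n is not None and n < 30:
            # Doubtful normality with a small sample: switch methods.
            analysis_type = "wilcoxon"
            notes.append("Switched ttest to wilcoxon: normality doubtful, n < 30.")
        elif variance_equality_doubtful:
            # Keep the t-test but request the Welch variant.
            options["equal_var"] = False
            notes.append("Using Welch t-test: variance equality doubtful.")
    elif analysis_type == "chisq":
        if min_expected_count is not None and min_expected_count < 5:
            analysis_type = "fisher"
            notes.append("Switched chisq to fisher: expected cell count < 5.")
    elif analysis_type == "poisson_regression" and overdispersed:
        # No automatic switch here: warn and suggest an alternative.
        notes.append("Warning: overdispersion; consider negative binomial.")

    if heteroscedastic:
        notes.append("Warning: heteroscedastic residuals; consider robust SE.")

    return analysis_type, options, notes


print(apply_guardrails("ttest", n=12, normality_doubtful=True))
print(apply_guardrails("chisq", min_expected_count=3.2))
```

Every switch or warning is recorded in the notes, so the method selection rationale can be reported and assumption violations are not ignored silently.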

Reporting rules (non-negotiable)

Every analysis MUST include:

Sample size (n) and missing data handling
Method name and selection rationale
Point estimates with confidence intervals
Effect sizes (Cohen's d, η², R², OR, etc.)
Assumption check results
Warnings or limitations

Language rules:

✓ "associated with" / "evidence suggests" / "estimated effect"
✗ NEVER "causes" / "proves" / "definitively shows"
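
To make "effect sizes and CIs" concrete, here is a minimal sketch of Cohen's d for two independent groups with an approximate 95% interval. It is not the wrapper's implementation: the function name is hypothetical, the data are made up for illustration, and the interval uses a common large-sample normal approximation that is rough for small groups.

```python
import math
import statistics


def cohens_d(group_a, group_b, z=1.96):
    """Cohen's d (pooled SD) with an approximate normal-theory interval."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    d = (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd
    # Large-sample approximation to the standard error of d.
    se = math.sqrt((n_a + n_b) / (n_a * n_b) + d ** 2 / (2 * (n_a + n_b)))
    return d, (d - z * se, d + z * se)


# Made-up illustrative data, not results from any real analysis.
d, (lower, upper) = cohens_d([2, 4, 6], [1, 2, 3])
print(f"d = {d:.2f}, approx. 95% CI [{lower:.2f}, {upper:.2f}]")
```

With groups this small the interval is very wide, which is the kind of limitation the reporting rules ask to be stated alongside the estimate.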

Output artifacts (every run produces all of these)

File	Contents
summary.json	Status, method, findings, warnings, artifact paths
schema.json	Column types, missingness, unique counts
report.md	Human-readable analysis report
session_info.txt	R version, packages, platform, timestamp
executed_spec.json	Copy of input spec for reproducibility
tables/*.csv	Coefficients, group stats, forecast values
figures/*.png	Diagnostic and result plots

Rules

Never run ad-hoc inline R code when the wrapper script can be used.
Never install packages during an analysis run.
Never access the internet during analysis execution.
Always set a random seed for reproducibility.
Always save session info for every analysis.
Support both English and Chinese (中文) queries.
