Abel-ai-causality

@Abel-ai-causality Organization

1 Skills

0 Total Stars

May 2026 Joined

Public Skills

abelian

by Abel-ai-causality

Adversarial collaboration framework (Kahneman-style, applied to LLM dispatch) for deep, innovative, long-horizon iteration with tractable doc and testable metric. Two LLM peers each propose AND challenge each other; mutual inspiration between rounds; mechanism-converge termination. 15 INVARIANTS rules provide long-horizon scaffolding (file-gate, drift, nonce, anti-compaction, forbidden termination rationales, mission-thread goal-anchor, evidence-class enum). These rules are a shared substrate with unilateral review frameworks, not abelian-specific.

Two iteration modes:

- Co-research mode (default since v2.10, "auto-research-loop"): two peer agents both propose AND challenge each other, goal-driven; mutual inspiration prevents the hidden collapse of "attack-only adversary + propose-only generator." Best for: discovery, novel design, "where do I start", non-trivial work where any mutation has multiple defensible directions. Cost 2× per round but ~1.5× fewer rounds for non-trivial work (~33% net overhead). Diversity via DIFFERENT CONTEXT FRAMING per peer at SAME max-effort tier (not via downgrading one peer). Cross-model pair preferred for highest diversity; same-model pair with different context-framing is acceptable and beats opus+haiku per empirical 2026-04-26.
- Unilateral mode (--mode=unilateral, "auto-verify-loop"): generator + adversary, running mutate → evaluate → attack → keep/revert. Opt-in for known-target verification, ship-prep, audit, regression hardening, single-axis micro-optimization. Cost 1×. Cross-model adversary (Codex) opt-in for high-stakes.

Default = co-research per v2.10 first-principles audit (collaborative framing > adversarial framing on Codex; "unilateral attack-only is itself a collapse vector for non-trivial work", in SKILL.md's own prior wording). Switch to unilateral with --mode=unilateral when the task is genuinely single-axis verification.

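The "~33% net overhead" figure for co-research mode follows from the two ratios quoted above. A minimal sketch of that arithmetic, assuming total cost is simply per-round cost times number of rounds (the function and parameter names are illustrative, not part of the skill):

```python
def net_overhead(cost_per_round_ratio: float, round_reduction_factor: float) -> float:
    """Relative total-cost overhead of co-research versus unilateral mode.

    cost_per_round_ratio: per-round cost relative to unilateral (2.0 in the description).
    round_reduction_factor: how many times fewer rounds are needed (~1.5 in the description).
    Assumption: total cost = per-round cost * number of rounds.
    """
    return cost_per_round_ratio / round_reduction_factor - 1.0


overhead = net_overhead(2.0, 1.5)
print(f"{overhead:.0%}")  # prints "33%"
```

Under this model, co-research is cheaper overall only if it cuts the round count by more than 2×; the description's claim is a modest net overhead in exchange for diversity, not a cost saving.
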
Skill activation rule (v2.12, INVARIANTS rule #13): any conversation-level reference to this skill (campaign or meta-audit) that involves ≥3 mutation proposals, protocol-level changes, or "verdict / done / keep / revert / accept / pareto / trade-off" vocabulary applied to mutation evaluation triggers a hard requirement: spawn dispatched adversary (Agent + Skill('dissect') OR codex exec subprocess) BEFORE reaching verdict. Self-attack in conversation context is unilateral self-judge (rule #8 degraded mode), not co-research. RLHF prior overlap means mutator and self-attacker share the same prior over BOTH "what to mutate" and "how to attack mutations"; an empirical 17× catch-rate ratio (peer-B vs self-attack, 2026-04-29 self-audit) confirms severity.

Target should include executable artifacts whenever possible; spec-only is the degraded mode for both modes.

v2.14 (non-code task readiness): attack-class libraries by domain shipped (research-class / audit-class / decision-class / doc-class); doc-task cross-attack template with falsification-form requirement; fuzzy-ground protocol (INVARIANTS rule #8 extension) for tasks where ground is prose / decision / research output rather than code / schema. Non-code campaigns declare task: field and ≥1 library; tasks with no testable metric remain out of scope (positioning preserved: tractable doc + testable metric).

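The three triggers in the activation rule can be read as a simple OR-predicate. The sketch below is one hypothetical encoding, not the skill's actual implementation: the function name, its parameters, and the keyword-matching heuristic are all assumptions, and the caller is assumed to pass only text that is evaluating a mutation.

```python
import re

# Verdict vocabulary quoted in the activation rule.
VERDICT_TERMS = ("verdict", "done", "keep", "revert", "accept", "pareto", "trade-off")
_VERDICT_RE = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in VERDICT_TERMS) + r")\b",
    re.IGNORECASE,
)


def requires_dispatched_adversary(
    mutation_proposals: int,
    protocol_level_change: bool,
    evaluation_text: str,
) -> bool:
    """Return True if any of the three triggers from rule #13 applies.

    Hypothetical helper: evaluation_text is assumed to be text that evaluates
    a mutation, so plain keyword matching stands in for "vocabulary applied
    to mutation evaluation".
    """
    return (
        mutation_proposals >= 3
        or protocol_level_change
        or bool(_VERDICT_RE.search(evaluation_text))
    )


print(requires_dispatched_adversary(1, False, "I'd revert mutation 2"))  # True
print(requires_dispatched_adversary(1, False, "looks fine so far"))      # False
```

Keyword matching is deliberately crude here (a word like "done" will over-trigger); given the rule's framing as a hard requirement, erring toward spawning the adversary seems the safer failure mode.
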
v2.15 (telos shift to goal-driven co-research):

- Every round must populate mission_thread (rule #14, 7 fields including ≥2 candidate_routes the LLM generated this round + selection_reason citing trade-offs).
- Adversary header gains evidence_class: enum (rule #15, 6-class ladder from theoretical to live) so attack scope is per-round explicit.
- Commit-gate gains 3 always-on checks (mission_thread completeness, evidence_class enum, goal-progress required) so attack-survival is necessary but not sufficient for commit.
- Convergence schema rewritten: adversary-exhausted and metric-only plateau REMOVED as standalone termination conditions, replaced by Frame-break Protocol (5-step mandatory sequence: reject-pool mining, attack-class library escalation, peer framing swap, goal re-paraphrase from current state, cross-peer alternative_routes mining), which fires when adversary-exhausted OR metric stalled OR candidate_routes weak. Only no-proposal-after-K-frame-breaks, after all 5 steps yield no positive-EV route, can terminate the loop on exhaustion.

Adversary mechanism (rules #1, #7, #11, #13) 100% preserved: every round still spawns isolated adversary with nonce header and attack-class checklist; co-research adversary may additionally write informational alternative_routes: (line 273 partial relaxation, co-research only). Cashes v2.13's "Adversarial Collaboration Framework" rename out structurally: collaboration is now in commit-gate / convergence / per-round mission anchor, not just marketing copy.

Anchor: codex 56-round trading-internal PM dogfood (2026-05-02) where attacks closed clean rounds 30-56 with zero mission metric movement. v2.14 had no mechanism to revert those rounds; v2.15 does.

Use when user says "abelian", "autoloop", "auto-optimize", "run experiments", "optimize this", or "Karpathy loop".
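
The Frame-break Protocol described under v2.15 can be sketched as a trigger check, a fixed five-step sequence, and an exhaustion test. This is an illustrative reading only: the step names and trigger conditions come from the description, while the function names, the `try_step` callback, and the interpretation of K as a count of consecutive frame-breaks that produced no route are assumptions.

```python
from typing import Callable, List, Optional

# The five mandatory steps, in the order the description lists them.
FRAME_BREAK_STEPS = (
    "reject-pool mining",
    "attack-class library escalation",
    "peer framing swap",
    "goal re-paraphrase from current state",
    "cross-peer alternative_routes mining",
)


def should_frame_break(
    adversary_exhausted: bool, metric_stalled: bool, candidate_routes_weak: bool
) -> bool:
    """The protocol fires when any one of the three conditions holds."""
    return adversary_exhausted or metric_stalled or candidate_routes_weak


def run_frame_break(try_step: Callable[[str], Optional[str]]) -> List[str]:
    """Run all five steps in order and collect any positive-EV routes found.

    try_step is a hypothetical callback: given a step name, it returns a
    route description, or None if that step yielded nothing.
    """
    routes = []
    for step in FRAME_BREAK_STEPS:
        route = try_step(step)
        if route is not None:
            routes.append(route)
    return routes


def may_terminate(empty_frame_breaks: int, k: int) -> bool:
    """Exhaustion may end the loop only after k frame-breaks with no route.

    Assumption: empty_frame_breaks counts consecutive frame-breaks whose
    five steps all returned None; the description does not give k's value.
    """
    return empty_frame_breaks >= k


if should_frame_break(adversary_exhausted=True, metric_stalled=False, candidate_routes_weak=False):
    found = run_frame_break(lambda step: None)  # stub: no step finds a route
    print(found)                                # []
    print(may_terminate(empty_frame_breaks=1, k=2))  # False
```

The point of the structure is that an exhausted adversary or a flat metric no longer ends the loop by itself; each only triggers another frame-break, and termination requires repeated empty results.
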
The skill name is historical (covers unilateral verification too despite "research" framing); future v3.0 may flip default to co-research once empirical track record validates cost model.

0 1mo ago