Yeachan-Heo

visual-verdict

Structured visual QA verdict for screenshot-to-reference comparisons

Yeachan-Heo 30,307 2,397 Updated 3mo ago
GitHub

Install

npx skillscat add yeachan-heo/oh-my-codex/visual-verdict

Install via the SkillsCat registry.

SKILL.md
Use this skill to compare generated UI screenshots against one or more reference images and return a strict JSON verdict that can drive the next edit iteration. - The task includes visual fidelity requirements (layout, spacing, typography, component styling) - You have a generated screenshot and at least one reference image - You need deterministic pass/fail guidance before continuing edits </Use_When> - `reference_images[]` (one or more image paths) - `generated_screenshot` (current output image) - Optional: `category_hint` (e.g., `hackernews`, `sns-feed`, `dashboard`) Return **JSON only** with this exact shape:
{
  "score": 0,
  "verdict": "revise",
  "category_match": false,
  "differences": ["..."],
  "suggestions": ["..."],
  "reasoning": "short explanation"
}

Rules:

  • score: integer 0-100
  • verdict: short status (pass, revise, or fail)
  • category_match: true when the generated screenshot matches the intended UI category/style
  • differences[]: concrete visual mismatches (layout, spacing, typography, colors, hierarchy)
  • suggestions[]: actionable next edits tied to the differences
  • reasoning: 1-2 sentence summary
- Target pass threshold is **90+**. - If `score < 90`, continue editing and rerun `$visual-verdict` before any further code edits in the next iteration. - Persist the verdict in `.omx/state/{scope}/ralph-progress.json` with both: - numeric signal (`score`, threshold pass/fail) - qualitative signal (`reasoning`, `suggestions`, `next_actions`) </Threshold_And_Loop> When mismatch diagnosis is hard: 1. Keep `$visual-verdict` as the authoritative decision. 2. Use pixel-level diff tooling (pixel diff / pixelmatch overlay) as a **secondary debug aid** to localize hotspots. 3. Convert pixel diff hotspots into concrete `differences[]` and `suggestions[]` updates. </Debug_Visualization> ```json { "score": 87, "verdict": "revise", "category_match": true, "differences": [ "Top nav spacing is tighter than reference", "Primary button uses smaller font weight" ], "suggestions": [ "Increase nav item horizontal padding by 4px", "Set primary button font-weight to 600" ], "reasoning": "Core layout matches, but style details still diverge." } ```