Score assistant responses for conversational flow on a strict 1-5 scale, then return strict JSON only with dimension, score, rationale, and improvement suggestions. Use when the user asks to evaluate flow, coherence across turns, responsiveness, or how well the assistant carries context forward.
Install
npx skillscat add whitespectre/ai-assistant-evals/eval-conversation-flow
Install via the SkillsCat registry.
Eval Conversation Flow
Use this skill to evaluate how well an assistant response fits into the conversation: continuity, coherence, turn-taking, and whether it advances the interaction appropriately.
Inputs
Require:
- The assistant response text to evaluate.
- (Optional) The user’s prior message(s) for context.
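One way to picture these inputs is as a small container that insists on the response text and treats prior user messages as optional. This is a minimal sketch only: `FlowEvalInput`, its field names, and the `render` layout are hypothetical and not part of the skill itself.

```python
from dataclasses import dataclass, field


@dataclass
class FlowEvalInput:
    # Hypothetical container; the names are illustrative, not defined by the skill.
    response_text: str
    prior_user_messages: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The response text is the one required input.
        if not self.response_text.strip():
            raise ValueError("response_text is required and must not be empty")

    def render(self) -> str:
        """Lay out the optional context first, then the response to evaluate."""
        lines = [
            f"User message {i}: {msg}"
            for i, msg in enumerate(self.prior_user_messages, start=1)
        ]
        lines.append(f"Assistant response: {self.response_text}")
        return "\n".join(lines)
```

With no prior messages, `render()` returns only the assistant response line, which matches the case where context is not supplied.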
Internal Rubric (1–5)
5 = Seamlessly continues the thread; correctly uses context; answers the user’s current ask; transitions naturally; asks clarifying questions only when truly needed
4 = Generally coherent and responsive; minor awkwardness (slight repetition, small context miss) but flow remains smooth
3 = Some coherence, but noticeable issues (repeats prior content, weak transitions, minor context loss, or slightly mismatched pacing)
2 = Poor flow: ignores or misuses context; abrupt topic shifts; repetitive or stilted; does not move the conversation forward
1 = Broken flow: contradicts prior turns, derails the conversation, or responds as if to a different thread entirely
Workflow
- Check context continuity (does it reflect the user’s latest message and prior constraints?).
- Check coherence and pacing (logical order, no abrupt shifts, minimal unnecessary repetition).
- Check interaction quality (does it advance the conversation appropriately?).
- Score on a 1-5 integer scale using the rubric only.
- Write concise rationale tied directly to rubric criteria.
- Produce actionable suggestions that improve flow.
Output Contract
Return JSON only. Do not include markdown, backticks, prose, or extra keys.
Use exactly this schema:
{
  "dimension": "conversation_flow",
  "score": 1,
  "rationale": "...",
  "improvement_suggestions": [
    "..."
  ]
}
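As an illustration of what a conforming result looks like once the placeholders are filled in, the sketch below builds the object and serializes it as bare JSON. The helper name `make_flow_eval` and the example score, rationale, and suggestion are made up for demonstration; only the four keys and the fixed dimension value come from the schema above.

```python
import json


def make_flow_eval(score: int, rationale: str, suggestions: list[str]) -> str:
    """Serialize one evaluation as a JSON object with exactly the four schema keys."""
    payload = {
        "dimension": "conversation_flow",  # fixed value from the schema
        "score": score,
        "rationale": rationale,
        "improvement_suggestions": list(suggestions),
    }
    # json.dumps emits plain JSON text: no markdown fence, no surrounding prose.
    return json.dumps(payload, ensure_ascii=False)


# Illustrative values only.
example = make_flow_eval(
    3,
    "Answers the question but repeats the earlier summary.",
    ["Remove the restated summary and lead with the answer."],
)
print(example)
```

Building the object in code and serializing it once keeps the output free of extra keys and stray text, which is what the contract asks for.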
Hard Rules
- dimension must always equal "conversation_flow".
- score must be an integer from 1 to 5.
- rationale must be concise (max 3 sentences).
- Do not include step-by-step reasoning.
- improvement_suggestions must be a non-empty array of concrete edits.
- Never output text outside the JSON object.
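Most of these rules can be checked mechanically before a result is accepted. The following is a minimal sketch of such a check, assuming the raw model output arrives as a string; `validate_flow_eval` and `count_sentences` are hypothetical helpers, and the sentence count is a rough punctuation-based heuristic. Whether a suggestion is "concrete" and whether the rationale avoids step-by-step reasoning are judgment calls that this sketch does not attempt.

```python
import json
import re

EXPECTED_KEYS = {"dimension", "score", "rationale", "improvement_suggestions"}


def count_sentences(text: str) -> int:
    # Rough heuristic: split after ., !, or ? when followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])


def validate_flow_eval(raw: str) -> list[str]:
    """Return a list of rule violations; an empty list means the output passes."""
    try:
        # Fails on markdown fences, leading prose, or trailing text.
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["output is not a single valid JSON value"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    errors = []
    if set(data) != EXPECTED_KEYS:
        errors.append("keys do not match the schema exactly")
    if data.get("dimension") != "conversation_flow":
        errors.append('dimension must equal "conversation_flow"')
    score = data.get("score")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 5:
        errors.append("score must be an integer from 1 to 5")
    rationale = data.get("rationale")
    if not isinstance(rationale, str):
        errors.append("rationale must be a string")
    elif count_sentences(rationale) > 3:
        errors.append("rationale exceeds 3 sentences")
    suggestions = data.get("improvement_suggestions")
    if (
        not isinstance(suggestions, list)
        or not suggestions
        or not all(isinstance(s, str) for s in suggestions)
    ):
        errors.append("improvement_suggestions must be a non-empty array of strings")
    return errors
```

Returning every violation rather than stopping at the first makes it easier to see at a glance how far an output drifted from the contract.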