retrospective

Post-engagement lessons-learned retrospective. Reads the engagement directory, analyzes skill routing decisions, identifies knowledge gaps and missing skills, and produces an actionable improvement report.

blacklanternsecurity 234 32 Updated 4mo ago

Install

npx skillscat add blacklanternsecurity/red-run/retrospective

Install via the SkillsCat registry.

SKILL.md

Engagement Retrospective

You are conducting a post-engagement retrospective for a penetration tester.
Your job is to analyze what happened during the engagement, evaluate how the
skill library performed, identify gaps, and produce actionable improvement
items. All analysis is local — you never touch the target.

Prerequisites

engagement/ directory must exist with at least activity.md and state.md
The engagement should be complete or paused — this is a post-mortem, not a
mid-engagement review
MCP skill-router available (search_skills, list_skills)

If engagement/activity.md or engagement/state.md is missing, tell the user:

Cannot run retrospective — engagement/activity.md and engagement/state.md are
required. These files are created by the orchestrator or discovery skills
during an engagement.
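
The prerequisite check above can be sketched as follows. This is a minimal illustration, not part of the skill itself: the helper name `missing_prerequisites` is hypothetical, and the required file names are taken from the prerequisites list.

```python
from pathlib import Path

# Required files per the prerequisites above.
REQUIRED_FILES = ("activity.md", "state.md")


def missing_prerequisites(engagement_dir: str = "engagement") -> list[str]:
    """Return the required files that are absent from the engagement directory.

    Hypothetical helper: an empty list means the retrospective can proceed.
    """
    root = Path(engagement_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

If the returned list is non-empty, stop and show the user the message above instead of continuing to Step 1.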

Engagement Logging

Check for ./engagement/ directory. If absent, proceed without logging.

When an engagement directory exists:

Print [retrospective] Activated → <target> to the screen on activation.
Evidence → save significant output to engagement/evidence/ with
descriptive filenames (e.g., sqli-users-dump.txt, ssrf-aws-creds.json).

Do NOT write to engagement/findings.md or engagement state, and do not write to
engagement/activity.md other than the completion entry appended in Step 7. The
orchestrator maintains these files. Report all findings in your return summary.

Step 1: Gather Context

Read all engagement files:

engagement/scope.md — targets, objectives, rules of engagement
engagement/state.db — final engagement state (call get_state_summary())
engagement/activity.md — chronological activity log
engagement/findings.md — confirmed vulnerabilities

If any file is missing (other than the required two), note it but continue with
what's available.

Subagent Execution Logs

Check for engagement/evidence/logs/*.jsonl. These are raw JSONL transcripts
captured from subagent executions by the SubagentStop hook. They contain every
tool call, tool result, assistant reasoning step, MCP call, and error — ground
truth that activity.md summaries cannot provide.

If JSONL logs are found, spawn a general-purpose Task subagent to parse them:

Task(
    subagent_type="general-purpose",
    prompt="Parse the subagent JSONL logs in engagement/evidence/logs/. For each
    .jsonl file, read it and extract a structured timeline:
    - Tool calls: tool name + input (truncate large inputs to 200 chars)
    - Tool results: status + truncated output (first 200 chars)
    - Assistant reasoning: key decision points and rationale
    - MCP calls: server + method + key params
    - Errors: any failures, retries, or exceptions
    - Target artifacts: flag commands that may have created artifacts on the
      target (file writes, user creation, registry changes, scheduled tasks,
      services installed, firewall rules modified)

    Return a markdown summary with one section per log file. Section header
    format: '## {filename} ({agent-type})'. Include a 'Target Artifacts' subsection
    listing any commands that may need cleanup.",
    description="Parse subagent JSONL logs"
)

Incorporate the parsed log data into subsequent analysis steps. The logs provide:

Routing analysis (Step 2): exact skills loaded, whether get_skill() was
called, inline vs routed execution
Knowledge gap analysis (Step 3): failed payloads, retries, manual
workarounds visible in the command sequence
Operational review (Step 5): exact commands run, timing, error recovery
decisions, artifact creation on target

If no JSONL logs are found, continue with engagement files only and note that
subagent execution traces are unavailable.
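
For orientation, the mechanical part of the log parsing (reading JSONL, tolerating bad lines, truncating to 200 characters) might look like the sketch below. The record schema is not described in this document, so the `type` key used for tallying is an assumption, and all function names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path

MAX_CHARS = 200  # truncation limit used in the parsing prompt above


def truncate(value, limit: int = MAX_CHARS) -> str:
    """Render a value as text, keep the first `limit` characters, and mark any cut."""
    text = value if isinstance(value, str) else json.dumps(value, ensure_ascii=False)
    return text if len(text) <= limit else text[:limit] + "..."


def load_jsonl(path):
    """Return (records, bad_line_count) for one JSONL file, skipping blank lines."""
    records, bad = [], 0
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            bad += 1  # count malformed lines instead of aborting the whole file
    return records, bad


def count_by_type(records) -> Counter:
    """Tally records by their 'type' key (assumed field name, not confirmed here)."""
    return Counter(
        r.get("type", "unknown") if isinstance(r, dict) else "unknown"
        for r in records
    )
```

Counting malformed lines rather than failing keeps a partially written log usable, which matters when a subagent was interrupted mid-run.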

Engagement Summary

Summarize the engagement for the user:

Target(s) and objective(s)
Outcome: Were objectives met? Partially? Not at all?
Timeline: How many skill invocations, roughly how long
Final state: What access/credentials/vulns existed at completion

Ask the user if this summary is accurate and whether there's context not
captured in the engagement files (e.g., decisions made verbally, time pressure,
scope changes mid-engagement).

Step 2: Skill Routing Analysis

First, load the full skill inventory: call list_skills() to get every
available skill with its category and description. This is your reference for
what the library covers.

Read engagement/activity.md and compare each activity against the inventory.

For each activity entry, determine:

Was a skill loaded? Check for skill name references in activity entries.
Was it the right skill? Read the skill's SKILL.md at
skills/<category>/<skill-name>/SKILL.md to check its actual scope and
compare against what was done.
Were any skills skipped? Look for technique execution that should have
been routed through a skill (e.g., running sqlmap directly instead of
loading sql-injection-union).
Was anything done inline that a skill covers? Identify commands or
techniques executed without loading the corresponding skill.

Build a routing ledger:

| Activity | Skill Used | Correct? | Notes |
| --- | --- | --- | --- |
| Web recon | web-discovery | Yes | — |
| SQL injection | (inline) | No | Should have routed to sql-injection-union |

Present this ledger to the user and discuss any routing decisions that seem
wrong or suboptimal.
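
One way to keep the ledger consistent while working through the activity log is to hold it as records and render it once at the end. The sketch below is illustrative only; `LedgerRow`, `render_ledger`, and `misrouted` are hypothetical names, and the four columns mirror the ledger above.

```python
from dataclasses import dataclass


@dataclass
class LedgerRow:
    activity: str
    skill_used: str  # skill name, or "(inline)" when no skill was loaded
    correct: bool
    notes: str = ""


def render_ledger(rows) -> str:
    """Render ledger rows as a Markdown table with the four ledger columns."""
    lines = [
        "| Activity | Skill Used | Correct? | Notes |",
        "| --- | --- | --- | --- |",
    ]
    for r in rows:
        verdict = "Yes" if r.correct else "No"
        lines.append(f"| {r.activity} | {r.skill_used} | {verdict} | {r.notes or '-'} |")
    return "\n".join(lines)


def misrouted(rows):
    """Return the rows worth discussing with the user: anything not marked correct."""
    return [r for r in rows if not r.correct]
```

The `misrouted` subset is also a natural input for the "Skills Skipped" and "Inline Execution" sections of the final report.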

Step 3: Knowledge Gap Analysis

For each skill that was invoked during the engagement, read its SKILL.md at
skills/<category>/<skill-name>/SKILL.md, then evaluate:

Did the skill have adequate payloads? Were hand-crafted payloads needed
that should be embedded in the skill?
Were edge cases hit? Did the target present conditions the skill didn't
cover (e.g., unusual encodings, non-standard ports, WAF bypass needed)?
Was troubleshooting adequate? Did the skill's troubleshooting section
cover the problems encountered?
Was the methodology complete? Were steps missing or out of order?
Were tool commands correct? Did embedded commands work or need
modification?

For each gap found, note the specific skill and what's missing.

Step 4: Missing Skill Identification

Identify techniques used during the engagement that don't have a corresponding
skill. Consider:

Techniques used manually — anything done by hand that was non-trivial
and repeatable
Tool workflows — complex tool chains that could be standardized
Edge-case techniques — bypass methods, unusual attack paths, or niche
protocols encountered

Before proposing a new skill, verify it doesn't already exist: call
search_skills("description of the technique") and check results. A skill may
exist but have been missed during the engagement (a routing gap, not a coverage gap).

For each confirmed missing skill, propose:

Skill name (kebab-case)
Category (web, ad, privesc, network, etc.)
What it would cover
Why it's needed (one-off or likely to recur?)

For techniques where a skill exists but wasn't used, add these to the routing
ledger in Step 2 instead.

Step 5: Operational Review

Evaluate four operational dimensions:

Manual Interventions

What was done by hand that a skill should automate?
Were payloads crafted manually that should be embedded?
Was tool setup or configuration needed that should be in prerequisites?

OPSEC

Were OPSEC ratings respected? Did noisy skills get used when quiet
alternatives existed?
Were detection-prone techniques used unnecessarily?
Was Kerberos-first authentication followed in AD environments?
Were any OPSEC incidents noted (alerts triggered, blocks encountered)?

Routing Efficiency

Were there unnecessary detours? (e.g., broad scanning when targeted testing
would have found the same issue faster)
Were redundant scans run? (e.g., re-scanning ports already in the engagement state)
Were there missed shortcuts? (e.g., credentials found early but not tested
against other services until late)
Did the orchestrator chain vulnerabilities effectively?

State Management

Call get_state_summary() from the state-reader MCP server to read current
engagement state. Use it to:

Skip re-testing targets, parameters, or vulns already confirmed
Leverage existing credentials or access for this technique
Understand what's been tried and failed (check Blocked section)

Do NOT write engagement state. When your work is complete, report all
findings clearly in your return summary. The orchestrator parses your summary
and records state changes. Your return summary must include:

New targets/hosts discovered (with ports and services)
New credentials or tokens found
Access gained or changed (user, privilege level, method)
Vulnerabilities confirmed (with status and severity)
Pivot paths identified (what leads where)
Blocked items (what failed and why, whether retryable)

Step 6: Critical Path Review

Map the actual kill chain from recon to objective (or as far as the engagement
got):

[recon] → [discovery] → [initial access] → [pivot/escalation] → [objective]

For each step, note:

What skill handled it
Whether it was the fastest path
What blocked progress and how it was resolved
Whether steps could have been parallelized or reordered

Identify bottlenecks — where did the engagement stall, and why?
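
A rough way to surface candidate bottlenecks is to look for the longest gaps between consecutive activity entries. The sketch below assumes that entries in engagement/activity.md use the same header shape as the retrospective's own completion entry (`### [YYYY-MM-DD HH:MM:SS] name → status`); that is an assumption about other skills' logging, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Assumed header shape: "### [YYYY-MM-DD HH:MM:SS] <title>"
HEADER_RE = re.compile(r"^### \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$")


def parse_entries(activity_text: str):
    """Return (timestamp, title) pairs for every line matching the header shape."""
    entries = []
    for line in activity_text.splitlines():
        m = HEADER_RE.match(line)
        if m:
            stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            entries.append((stamp, m.group(2)))
    return entries


def longest_gaps(entries, top: int = 3):
    """Return the `top` largest gaps as (seconds, from_title, to_title), largest first."""
    gaps = [
        ((b[0] - a[0]).total_seconds(), a[1], b[1])
        for a, b in zip(entries, entries[1:])
    ]
    return sorted(gaps, key=lambda g: g[0], reverse=True)[:top]
```

A long gap is only a hint: it may reflect a break rather than a stall, so confirm each candidate with the user before recording it as a bottleneck.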

Step 7: Write Report

Produce engagement/retrospective.md with all findings:

# Engagement Retrospective

## Summary
<One paragraph: target, objective, outcome>

## Kill Chain
<Ordered attack path from recon to objective>

## Skill Routing Review
### Skills Invoked
- <skill-name> — <what it did, whether it performed well>
### Skills Skipped (Should Have Been Invoked)
- <skill-name> — <why it should have been invoked, what was done instead>
### Inline Execution (Should Have Been Routed)
- <description of what was done inline instead of via a skill>

## Knowledge Gaps
### <skill-name>
- <missing payload, edge case, or methodology>

## Missing Skills
- **<proposed-skill-name>** (<category>) — <what it would cover, why needed>

## Operational Review
### Manual Interventions
- <what was done manually that should be automated>
### OPSEC
- <assessment of noise level, detection surface>
### Routing Efficiency
- <unnecessary detours, missed shortcuts>
### State Management
- <quality of state.md flow, stale reads, missing updates>

## Actionable Items
Priority-ordered list:
1. [skill-update] <skill-name>: <specific change needed>
2. [new-skill] <proposed-name>: <brief description>
3. [routing-fix] <skill-name>: <routing table update needed>
4. [template-fix] <change to _template or conventions>

After writing the report, append a summary to engagement/activity.md:

### [YYYY-MM-DD HH:MM:SS] retrospective → complete
- Report written to engagement/retrospective.md
- Actionable items: N skill-update, N new-skill, N routing-fix, N template-fix
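
The counts in the entry above can be derived from the report's Actionable Items list rather than tallied by hand. A minimal sketch, assuming each item is a numbered line that starts with one of the four tags; the function names are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

TAGS = ("skill-update", "new-skill", "routing-fix", "template-fix")
ITEM_RE = re.compile(r"^\s*\d+\.\s*\[([a-z-]+)\]")


def count_items(report_text: str) -> Counter:
    """Count numbered actionable items by tag, ignoring tags outside TAGS."""
    counts = Counter()
    for line in report_text.splitlines():
        m = ITEM_RE.match(line)
        if m and m.group(1) in TAGS:
            counts[m.group(1)] += 1
    return counts


def activity_entry(counts: Counter, now: datetime) -> str:
    """Format the completion entry in the layout shown above."""
    stamp = now.strftime("%Y-%m-%d %H:%M:%S")
    tally = ", ".join(f"{counts[t]} {t}" for t in TAGS)
    return (
        f"### [{stamp}] retrospective → complete\n"
        "- Report written to engagement/retrospective.md\n"
        f"- Actionable items: {tally}\n"
    )
```

Passing the timestamp in as an argument keeps the formatter deterministic; a caller would typically supply `datetime.now()`.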

Present the actionable items to the user and ask which ones to prioritize.

Step 8: Implement Improvements

After the user selects which items to prioritize, make the edits. Skills are
plain Markdown files at skills/<category>/<skill-name>/SKILL.md — edit them
directly.

For each prioritized item:

[skill-update] — Edit an existing skill

Read the SKILL.md file at skills/<category>/<skill-name>/SKILL.md
Make the change — add payloads, fix methodology, update troubleshooting,
etc.
Preserve the existing structure and conventions (frontmatter, sections,
embedded payloads format)

[new-skill] — Create a new skill

Read skills/_template/SKILL.md for the canonical structure
Write the new skill to skills/<category>/<skill-name>/SKILL.md
Update the corresponding discovery skill's routing table to include it

[routing-fix] — Fix skill routing

Read the skill that needs the routing update
Add or fix the routing reference: "STOP. Return to orchestrator
recommending skill-name. Pass: ."

[template-fix] — Update conventions

Read skills/_template/SKILL.md
Make the change and note which existing skills may need the same update

After all edits are complete, re-index so the MCP skill-router picks up
changes:

uv run --directory tools/skill-router python indexer.py

Show the user what was changed and suggest committing.

Troubleshooting

Engagement directory exists but files are empty

The engagement may have been run without logging enabled. Do the retrospective
from conversation context instead — ask the user to describe what happened, then
analyze the current session transcript.

Activity log has no skill references

Techniques may have been executed inline (without loading the corresponding
skill) or the engagement predates the current skill library. Flag this
as a routing gap and reconstruct the timeline from the engagement state and findings.md
instead.

Multiple engagement directories

If the user has run multiple engagements, ask which one to review. Look for
date-stamped directories or scope.md contents to differentiate them.
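
Candidate directories can be listed before asking the user to choose. The sketch below assumes engagement directories sit one level below a common root and each contains a scope.md; both the layout and the function name `find_engagements` are assumptions made for illustration.

```python
from pathlib import Path


def find_engagements(root: str = "."):
    """Return directories directly under `root` that contain a scope.md, sorted by path."""
    return sorted(p.parent for p in Path(root).glob("*/scope.md") if p.is_file())
```

With date-stamped directory names, the sorted order doubles as a chronological listing to present to the user.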
