Subtitle generation and burn-in. Volcengine transcription → dictionary correction → review → burn-in. Triggers: add subtitles, generate subtitles, 加字幕, 生成字幕, 字幕
Install
Install via the SkillsCat registry:
npx skillscat add quriosity-agent/qcut/videocut-subtitles
Subtitles
Transcription → Agent proofreading → Manual review → Burn-in
Core Flow (~8-15 minutes total, including manual review)
1. Extract audio + upload                      ~1min
   ↓
2. Volcengine transcription (with hot words)   ~2min
   ↓
3. Agent auto-proofread                        ~3-5min
   ↓
4. Manual review & confirm                     Depends on user
   ↓
5. Burn-in subtitles                           ~1-2min
Step 1: Extract Audio and Upload
# Extract audio
ffmpeg -i "video.mp4" -vn -acodec libmp3lame -y audio.mp3
# Upload to uguu.se (temporary file hosting)
curl -s -F "files[]=@audio.mp3" https://uguu.se/upload
# Returns URL like: https://o.uguu.se/xxxxx.mp3
Step 2: Volcengine Transcription (with Hot Words)
The transcription script automatically reads the dictionary as hot words to improve accuracy:
# Dictionary location: <project>/.claude/skills/qcut-toolkit/videocut/subtitles/dictionary.txt
# Script loads it automatically
bash ../talk-edit/scripts/volcengine_transcribe.sh "https://o.uguu.se/xxxxx.mp3"
Dictionary format (one word per line):
skills
Claude
Agent
Step 3: Agent Auto-Proofread
3.1 Generate Timestamped Subtitles
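const fs = require('fs');
// Load the raw Volcengine result and keep only the text and timing of each utterance
const result = JSON.parse(fs.readFileSync('volcengine_result.json', 'utf8'));
const subtitles = result.utterances.map((u, i) => ({
  id: i + 1,
  text: u.text,
  start: u.start_time / 1000, // presumably milliseconds, converted to seconds
  end: u.end_time / 1000
}));
fs.writeFileSync('subtitles_with_time.json', JSON.stringify(subtitles, null, 2));
3.2 Agent Manual Proofread (No Scripts)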
After transcription, Agent must read all subtitles line by line and proofread them manually.
Common misrecognition patterns vary by language. Agent should:
- Check for homophones and similar-sounding word errors
- Check for missing words in sentences
- Verify proper nouns and technical terms against the dictionary
- Fix grammar errors
3.3 Cross-Reference with Script (If Available)
If an original script exists, use it as reference but do not auto-match (text differences cause timestamp drift).
Agent should:
- Read the script as reference
- Manually compare line by line
- Mark uncertain items for manual review
Step 4: Start Review Server
cd subtitles-dir/
node <project>/.claude/skills/qcut-toolkit/videocut/subtitles/scripts/subtitle_server.js 8898 "video.mp4"
Visit http://localhost:8898
Features:
- Left: video playback, Right: subtitle list
- Auto-highlight current subtitle during playback
- Double-click subtitle text to edit (timestamps preserved)
- Variable playback speed (1x/1.5x/2x/3x)
- Save subtitles / Export SRT / Burn-in subtitles
- Dictionary quick-insert at bottom
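The "Export SRT" feature presumably writes standard SubRip output. As a rough illustration of that conversion (not the review server's actual code), the sketch below turns entries shaped like those in subtitles_with_time.json into SRT text. It assumes each entry has `text`, `start`, and `end`, with times in seconds as produced in step 3.1. The helper names `toSrtTime` and `toSrt` are hypothetical.

```javascript
// Sketch only: hypothetical helpers, not taken from subtitle_server.js.

// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width) => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Build the SRT document: sequence number, time range, text, blank separator.
function toSrt(subtitles) {
  return subtitles
    .map((sub, i) =>
      `${i + 1}\n${toSrtTime(sub.start)} --> ${toSrtTime(sub.end)}\n${sub.text}\n`)
    .join('\n');
}

// Made-up sample entries in the assumed subtitles_with_time.json shape.
const sample = [
  { id: 1, text: 'Hello everyone', start: 0.5, end: 2.04 },
  { id: 2, text: 'Welcome back', start: 3661.2, end: 3663 },
];
console.log(toSrt(sample));
```

Rounding to whole milliseconds before splitting into fields avoids timestamps such as `00:00:01,1000` that can appear when the fractional part is rounded separately.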
Step 5: Burn-in Subtitles
Default style: Size 22, golden bold, black outline 2px, bottom center
ffmpeg -i "video.mp4" \
-vf "subtitles='video.srt':force_style='FontSize=22,FontName=PingFang SC,Bold=1,PrimaryColour=&H0000deff,OutlineColour=&H00000000,Outline=2,Alignment=2,MarginV=30'" \
-c:a copy \
-y "video_subtitled.mp4"
| Parameter | Value | Description |
|---|---|---|
| FontSize | 22 | Font size |
| FontName | PingFang SC | PingFang font |
| Bold | 1 | Bold |
| PrimaryColour | &H0000deff | Golden #ffde00 (ASS colours are written &HAABBGGRR, so the RGB bytes appear reversed) |
| OutlineColour | &H00000000 | Black outline |
| Outline | 2 | Outline width |
| Alignment | 2 | Bottom center |
| MarginV | 30 | Bottom margin |
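The colour values in the table use the ASS byte order (`&HAABBGGRR`), which is easy to get wrong when starting from a web-style `#RRGGBB` value. The sketch below converts an RGB hex colour and assembles a `force_style` string from the table's parameters. The helper names `rgbToAssColour` and `buildForceStyle` are hypothetical and are not part of this skill's scripts.

```javascript
// Sketch only: hypothetical helpers for assembling the force_style value.

// Convert '#RRGGBB' to ASS '&HAABBGGRR' (alpha '00' means fully opaque).
function rgbToAssColour(hex, alpha = '00') {
  const match = /^#?([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (!match) throw new Error(`Invalid RGB colour: ${hex}`);
  const [, rr, gg, bb] = match;
  return `&H${(alpha + bb + gg + rr).toLowerCase()}`;
}

// Join style parameters as Key=Value pairs separated by commas.
function buildForceStyle(style) {
  return Object.entries(style)
    .map(([key, value]) => `${key}=${value}`)
    .join(',');
}

// The default style from the table above.
const style = {
  FontSize: 22,
  FontName: 'PingFang SC',
  Bold: 1,
  PrimaryColour: rgbToAssColour('#ffde00'),
  OutlineColour: rgbToAssColour('#000000'),
  Outline: 2,
  Alignment: 2,
  MarginV: 30,
};
console.log(buildForceStyle(style));
```

With the default values, the result matches the `force_style` string used in the ffmpeg command in this step.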
Directory Structure
output/YYYY-MM-DD_video-name/subtitles/
├── 1_transcription/
│   ├── audio.mp3
│   └── volcengine_result.json
├── subtitles_with_time.json    # Core file
└── 3_output/
    ├── video.srt
    └── video_subtitled.mp4
Subtitle Rules
| Rule | Description |
|---|---|
| One line per screen | No line breaks, no stacking |
| No end punctuation | "Hello", not "Hello." |
| Keep mid-sentence punctuation | Click here, then there |
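
The rules above can be expressed as a small normalisation step. This is a sketch under stated assumptions, not part of the skill. `normalizeSubtitleText` is a hypothetical helper, the set of trailing punctuation marks it strips is a guess, and joining line breaks with a space only suits space-delimited languages.

```javascript
// Sketch only: hypothetical helper applying the subtitle rules to one entry.
function normalizeSubtitleText(text) {
  return text
    // One line per screen: join any line breaks into a single line.
    .replace(/\s*[\r\n]+\s*/g, ' ')
    .trim()
    // No end punctuation: strip trailing marks only (assumed set),
    // leaving mid-sentence punctuation untouched.
    .replace(/[.,!?;:。，！？；：、]+$/u, '');
}

console.log(normalizeSubtitleText('Hello.'));
console.log(normalizeSubtitleText('Click here, then there.'));
console.log(normalizeSubtitleText('Line one\nline two'));
```

Because the last pattern is anchored to the end of the string, the comma in "Click here, then there" is kept while the final full stop is removed.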