Setup & Installation
Install Video Podcast Maker using the ClawHub CLI or OpenClaw CLI:
clawhub install video-podcast-makerIf the CLI is not installed:
npx clawhub@latest install video-podcast-makerOr install with OpenClaw CLI:
openclaw skills install video-podcast-makerWhat This Skill Does
Video Podcast Maker is a Creative & Media skill for OpenClaw by Agents365-ai.
REQUIRED: Load Remotion Best Practices First
This skill depends on
remotion-best-practices. You MUST invoke it before proceeding:Skill tool: skill="remotion-best-practices"
Video Podcast Maker
Quick Start
Open Claude Code and say: "Make a video podcast about $ARGUMENTS"
Or invoke directly: /video-podcast-maker AI Agent tutorial
Design Learning
Extract visual design patterns from reference videos or images, store them in a searchable library, and apply them to new video compositions.
Commands
# Learn from images (Claude Vision analyzes design patterns)
python3 learn_design.py ./screenshot1.png ./screenshot2.png
# Learn from a local video (ffmpeg extracts frames automatically)
python3 learn_design.py ./reference.mp4
# Learn from a URL (Playwright captures screenshots — experimental)
python3 learn_design.py https://www.bilibili.com/video/BV1xx411c7mD
# Save with a named profile and tags
python3 learn_design.py ./reference.mp4 --profile "tech-minimal" --tags "tech,minimal,dark"
Reference Library Commands
references list # List all stored references (auto-cleans orphaned entries)
references show <id> # Show full design report for a reference
references delete <id> # Delete a reference and its files
Style Profile Commands
profiles list # List all saved style profiles
profiles show <name> # Show profile props_override
profiles delete <name> # Delete a style profile
profiles create <name> # Create a new style profile interactively
Pre-Workflow Usage
When the user provides a reference video or image alongside a video creation request, extract design patterns before Step 1 and apply them as session overrides. See references/workflow-steps.md → Pre-workflow section for the full extraction flow.
Step 9 Integration
Before choosing visual design in Step 9, check for matching style profiles or reference library entries. Apply the best match as a starting point for Remotion composition props. See references/workflow-steps.md → Step 9 Style Profile Integration for the priority chain.
Auto Update Check
Agent behavior: Check for updates at most once per day (throttled by timestamp file):
STAMP="${CLAUDE_SKILL_DIR}/.last_update_check"
NOW=$(date +%s)
LAST=$(cat "$STAMP" 2>/dev/null || echo 0)
if [ $((NOW - LAST)) -gt 86400 ]; then
timeout 5 git -C ${CLAUDE_SKILL_DIR} fetch --quiet 2>/dev/null || true
LOCAL=$(git -C ${CLAUDE_SKILL_DIR} rev-parse HEAD 2>/dev/null)
REMOTE=$(git -C ${CLAUDE_SKILL_DIR} rev-parse origin/main 2>/dev/null)
echo "$NOW" > "$STAMP"
if [ -n "$LOCAL" ] && [ -n "$REMOTE" ] && [ "$LOCAL" != "$REMOTE" ]; then
echo "UPDATE_AVAILABLE"
else
echo "UP_TO_DATE"
fi
else
echo "SKIPPED_RECENT_CHECK"
fi
- Update available: Ask user via AskUserQuestion. Yes →
git -C ${CLAUDE_SKILL_DIR} pull. No → continue. - Up to date / Skipped: Continue silently.
Prerequisites Check
!( missing=""; node -v >/dev/null 2>&1 || missing="$missing node"; python3 --version >/dev/null 2>&1 || missing="$missing python3"; ffmpeg -version >/dev/null 2>&1 || missing="$missing ffmpeg"; [ -n "$AZURE_SPEECH_KEY" ] || missing="$missing AZURE_SPEECH_KEY"; if [ -n "$missing" ]; then echo "MISSING:$missing"; else echo "ALL_OK"; fi )
If MISSING reported above, see README.md for full setup instructions (install commands, API key setup, Remotion project init).
Overview
Automated pipeline for professional Bilibili horizontal knowledge videos from a topic.
Target: Bilibili horizontal video (16:9)
- Resolution: 3840×2160 (4K) or 1920×1080 (1080p)
- Style: Clean white (default)
Tech stack: Claude + Azure TTS + Remotion + FFmpeg
Output Specs
| Parameter | Horizontal (16:9) | Vertical (9:16) |
|---|---|---|
| Resolution | 3840×2160 (4K) | 2160×3840 (4K) |
| Frame rate | 30 fps | 30 fps |
| Encoding | H.264, 16Mbps | H.264, 16Mbps |
| Audio | AAC, 192kbps | AAC, 192kbps |
| Duration | 1-15 min | 60-90s (highlight) |
Execution Modes
Agent behavior: Detect user intent at workflow start:
- "Make a video about..." / no special instructions → Auto Mode
- "I want to control each step" / mentions interactive → Interactive Mode
- Default: Auto Mode
Auto Mode (Default)
Full pipeline with sensible defaults. Mandatory stop at Step 9:
- Step 9: Launch Remotion Studio — user reviews in real-time, requests changes until satisfied
- Step 10: Only triggered when user explicitly says "render 4K" / "render final version"
| Step | Decision | Auto Default |
|---|---|---|
| 3 | Title position | top-center |
| 5 | Media assets | Skip (text-only animations) |
| 7 | Thumbnail method | Remotion-generated (16:9 + 4:3) |
| 9 | Outro animation | Pre-made MP4 (white/black by theme) |
| 9 | Preview method | Remotion Studio (mandatory) |
| 12 | Subtitles | Skip |
| 14 | Cleanup | Auto-clean temp files |
Users can override any default in their initial request:
- "make a video about AI, burn subtitles" → auto + subtitles on
- "use dark theme, AI thumbnails" → auto + dark + imagen
- "need screenshots" → auto + media collection enabled
Interactive Mode
Prompts at each decision point. Activated by:
- "interactive mode" / "I want to choose each option"
- User explicitly requests control
Workflow State & Resume
Planned feature (not yet implemented). Currently, workflow progress is tracked via Claude's conversation context. If a session is interrupted, re-invoke the skill and Claude will check existing files in
videos/{name}/to determine where to resume.
Technical Rules
Hard constraints for video production — visual design is Claude's creative freedom:
| Rule | Requirement |
|---|---|
| Single Project | All videos under videos/{name}/ in user's Remotion project. NEVER create a new project per video. |
| 4K Output | 3840×2160, use scale(2) wrapper over 1920×1080 design space |
| Content Width | ≥85% of screen width |
| Bottom Safe Zone | Bottom 100px reserved for subtitles |
| Audio Sync | All animations driven by timing.json timestamps |
| Thumbnail | MUST generate 16:9 (1920×1080) AND 4:3 (1200×900). Centered layout, title ≥120px, icons ≥120px, fill most of canvas. See design-guide.md. |
| Font | PingFang SC / Noto Sans SC for Chinese text |
| Studio Before Render | MUST launch remotion studio for user review. NEVER render 4K until user explicitly confirms ("render 4K", "render final"). |
Additional Resources
Claude loads these files on demand — do NOT load all at once:
- references/workflow-steps.md: Detailed step-by-step instructions (Steps 1-14). Load at workflow start.
- references/design-guide.md: Visual minimums, typography, layout patterns, checklists. MUST load before Step 9.
- references/troubleshooting.md: Error fixes, BGM options, preference commands, preference learning. Load on error or user request.
- examples/: Real production video projects. Claude may reference these for composition structure and timing.json format.
Directory Structure
project-root/ # Remotion project root
├── src/remotion/ # Remotion source
│ ├── compositions/ # Video composition definitions
│ ├── Root.tsx # Remotion entry
│ └── index.ts # Exports
│
├── public/ # Remotion default (unused — use --public-dir videos/{name}/)
│
├── videos/{video-name}/ # Video project assets
│ ├── workflow_state.json # Workflow progress
│ ├── topic_definition.md # Step 1
│ ├── topic_research.md # Step 2
│ ├── podcast.txt # Step 4: narration script
│ ├── podcast_audio.wav # Step 8: TTS audio
│ ├── podcast_audio.srt # Step 8: subtitles
│ ├── timing.json # Step 8: timeline
│ ├── thumbnail_*.png # Step 7
│ ├── output.mp4 # Step 10
│ ├── video_with_bgm.mp4 # Step 11
│ ├── final_video.mp4 # Step 12: final output
│ └── bgm.mp3 # Background music
│
└── remotion.config.ts
Important: Always use
--public-dirand full output path for Remotion render:npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/
Naming Rules
Video name {video-name}: lowercase English, hyphen-separated (e.g., reference-manager-comparison)
Section name {section}: lowercase English, underscore-separated, matches [SECTION:xxx]
Thumbnail naming (16:9 AND 4:3 both required):
| Type | 16:9 | 4:3 |
|---|---|---|
| Remotion | thumbnail_remotion_16x9.png |
thumbnail_remotion_4x3.png |
| AI | thumbnail_ai_16x9.png |
thumbnail_ai_4x3.png |
Public Directory
Use --public-dir videos/{name}/ for all Remotion commands. Each video's assets (timing.json, podcast_audio.wav, bgm.mp3) stay in its own directory — no copying to public/ needed. This enables parallel renders of different videos.
# All render/studio/still commands use --public-dir
npx remotion studio src/remotion/index.ts --public-dir videos/{name}/
npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/ --video-bitrate 16M
npx remotion still src/remotion/index.ts Thumbnail16x9 videos/{name}/thumbnail.png --public-dir videos/{name}/
Workflow
Progress Tracking
At Step 1 start:
- Create
videos/{name}/workflow_state.json - Use
TaskCreateto create tasks per step. Markin_progresson start,completedon finish. - Each step updates BOTH
workflow_state.jsonAND TaskUpdate.
1. Define topic direction → topic_definition.md
2. Research topic → topic_research.md
3. Design video sections (5-7 chapters)
4. Write narration script → podcast.txt
5. Collect media assets → media_manifest.json
6. Generate publish info (Part 1) → publish_info.md
7. Generate thumbnails (16:9 + 4:3) → thumbnail_*.png
8. Generate TTS audio → podcast_audio.wav, timing.json
9. Create Remotion composition + Studio preview (mandatory stop)
10. Render 4K video (only on user request) → output.mp4
11. Mix background music → video_with_bgm.mp4
12. Add subtitles (optional) → final_video.mp4
13. Complete publish info (Part 2) → chapter timestamps
14. Verify output & cleanup
15. Generate vertical shorts (optional) → shorts/
Validation Checkpoints
After Step 8 (TTS):
-
podcast_audio.wavexists and plays correctly -
timing.jsonhas all sections with correct timestamps -
podcast_audio.srtencoding is UTF-8
After Step 10 (Render):
-
output.mp4resolution is 3840x2160 - Audio-video sync verified
- No black frames
Key Commands Reference
See CLAUDE.md for the full command reference (TTS, Remotion, FFmpeg, shorts generation).
User Preference System
Skill learns and applies preferences automatically. See references/troubleshooting.md for commands and learning details.
Storage Files
| File | Purpose |
|---|---|
user_prefs.json |
Learned preferences (auto-created from template) |
user_prefs.template.json |
Default values |
prefs_schema.json |
JSON schema definition |
Priority
Final = merge(Root.tsx defaults < global < topic_patterns[type] < current instructions)
User Commands
| Command | Effect |
|---|---|
| "show preferences" | Show current preferences |
| "reset preferences" | Reset to defaults |
| "save as X default" | Save to topic_patterns |
Troubleshooting & Preferences
Full reference: Read references/troubleshooting.md on errors, preference questions, or BGM options.
Version History
Latest version: 2.0.0
First published: Apr 6, 2026. Last updated: Apr 6, 2026.
1 version released.
Frequently Asked Questions
Is Video Podcast Maker free to use?
What platforms does Video Podcast Maker support?
How do I update Video Podcast Maker?
openclaw skills update video-podcast-maker to get the latest version. OpenClaw will download and apply the update automatically, preserving your existing configuration.