Setup & Installation

Install Video Podcast Maker using the ClawHub CLI or OpenClaw CLI:

clawhub install video-podcast-maker

If the CLI is not installed:

npx clawhub@latest install video-podcast-maker

Or install with OpenClaw CLI:

openclaw skills install video-podcast-maker

View on ClawHub · View on GitHub

What This Skill Does

Video Podcast Maker is a Creative & Media skill for OpenClaw by Agents365-ai.

REQUIRED: Load Remotion Best Practices First

This skill depends on remotion-best-practices. You MUST invoke it before proceeding:

Skill tool: skill="remotion-best-practices"

Video Podcast Maker

Quick Start

Open Claude Code and say: "Make a video podcast about $ARGUMENTS"

Or invoke directly: /video-podcast-maker AI Agent tutorial


Design Learning

Extract visual design patterns from reference videos or images, store them in a searchable library, and apply them to new video compositions.

Commands

# Learn from images (Claude Vision analyzes design patterns)
python3 learn_design.py ./screenshot1.png ./screenshot2.png

# Learn from a local video (ffmpeg extracts frames automatically)
python3 learn_design.py ./reference.mp4

# Learn from a URL (Playwright captures screenshots — experimental)
python3 learn_design.py https://www.bilibili.com/video/BV1xx411c7mD

# Save with a named profile and tags
python3 learn_design.py ./reference.mp4 --profile "tech-minimal" --tags "tech,minimal,dark"

Reference Library Commands

references list          # List all stored references (auto-cleans orphaned entries)
references show <id>     # Show full design report for a reference
references delete <id>   # Delete a reference and its files

Style Profile Commands

profiles list            # List all saved style profiles
profiles show <name>     # Show profile props_override
profiles delete <name>   # Delete a style profile
profiles create <name>   # Create a new style profile interactively

Pre-Workflow Usage

When the user provides a reference video or image alongside a video creation request, extract design patterns before Step 1 and apply them as session overrides. See references/workflow-steps.md → Pre-workflow section for the full extraction flow.

Step 9 Integration

Before choosing visual design in Step 9, check for matching style profiles or reference library entries. Apply the best match as a starting point for Remotion composition props. See references/workflow-steps.md → Step 9 Style Profile Integration for the priority chain.


Auto Update Check

Agent behavior: Check for updates at most once per day (throttled by timestamp file):

STAMP="${CLAUDE_SKILL_DIR}/.last_update_check"
NOW=$(date +%s)
LAST=$(cat "$STAMP" 2>/dev/null || echo 0)
if [ $((NOW - LAST)) -gt 86400 ]; then
  timeout 5 git -C ${CLAUDE_SKILL_DIR} fetch --quiet 2>/dev/null || true
  LOCAL=$(git -C ${CLAUDE_SKILL_DIR} rev-parse HEAD 2>/dev/null)
  REMOTE=$(git -C ${CLAUDE_SKILL_DIR} rev-parse origin/main 2>/dev/null)
  echo "$NOW" > "$STAMP"
  if [ -n "$LOCAL" ] && [ -n "$REMOTE" ] && [ "$LOCAL" != "$REMOTE" ]; then
    echo "UPDATE_AVAILABLE"
  else
    echo "UP_TO_DATE"
  fi
else
  echo "SKIPPED_RECENT_CHECK"
fi
  • Update available: Ask user via AskUserQuestion. Yes → git -C ${CLAUDE_SKILL_DIR} pull. No → continue.
  • Up to date / Skipped: Continue silently.

Prerequisites Check

!( missing=""; node -v >/dev/null 2>&1 || missing="$missing node"; python3 --version >/dev/null 2>&1 || missing="$missing python3"; ffmpeg -version >/dev/null 2>&1 || missing="$missing ffmpeg"; [ -n "$AZURE_SPEECH_KEY" ] || missing="$missing AZURE_SPEECH_KEY"; if [ -n "$missing" ]; then echo "MISSING:$missing"; else echo "ALL_OK"; fi )

If MISSING reported above, see README.md for full setup instructions (install commands, API key setup, Remotion project init).


Overview

Automated pipeline for professional Bilibili horizontal knowledge videos from a topic.

Target: Bilibili horizontal video (16:9)

  • Resolution: 3840×2160 (4K) or 1920×1080 (1080p)
  • Style: Clean white (default)

Tech stack: Claude + Azure TTS + Remotion + FFmpeg

Output Specs

Parameter Horizontal (16:9) Vertical (9:16)
Resolution 3840×2160 (4K) 2160×3840 (4K)
Frame rate 30 fps 30 fps
Encoding H.264, 16Mbps H.264, 16Mbps
Audio AAC, 192kbps AAC, 192kbps
Duration 1-15 min 60-90s (highlight)

Execution Modes

Agent behavior: Detect user intent at workflow start:

  • "Make a video about..." / no special instructions → Auto Mode
  • "I want to control each step" / mentions interactive → Interactive Mode
  • Default: Auto Mode

Auto Mode (Default)

Full pipeline with sensible defaults. Mandatory stop at Step 9:

  1. Step 9: Launch Remotion Studio — user reviews in real-time, requests changes until satisfied
  2. Step 10: Only triggered when user explicitly says "render 4K" / "render final version"
Step Decision Auto Default
3 Title position top-center
5 Media assets Skip (text-only animations)
7 Thumbnail method Remotion-generated (16:9 + 4:3)
9 Outro animation Pre-made MP4 (white/black by theme)
9 Preview method Remotion Studio (mandatory)
12 Subtitles Skip
14 Cleanup Auto-clean temp files

Users can override any default in their initial request:

  • "make a video about AI, burn subtitles" → auto + subtitles on
  • "use dark theme, AI thumbnails" → auto + dark + imagen
  • "need screenshots" → auto + media collection enabled

Interactive Mode

Prompts at each decision point. Activated by:

  • "interactive mode" / "I want to choose each option"
  • User explicitly requests control

Workflow State & Resume

Planned feature (not yet implemented). Currently, workflow progress is tracked via Claude's conversation context. If a session is interrupted, re-invoke the skill and Claude will check existing files in videos/{name}/ to determine where to resume.


Technical Rules

Hard constraints for video production — visual design is Claude's creative freedom:

Rule Requirement
Single Project All videos under videos/{name}/ in user's Remotion project. NEVER create a new project per video.
4K Output 3840×2160, use scale(2) wrapper over 1920×1080 design space
Content Width ≥85% of screen width
Bottom Safe Zone Bottom 100px reserved for subtitles
Audio Sync All animations driven by timing.json timestamps
Thumbnail MUST generate 16:9 (1920×1080) AND 4:3 (1200×900). Centered layout, title ≥120px, icons ≥120px, fill most of canvas. See design-guide.md.
Font PingFang SC / Noto Sans SC for Chinese text
Studio Before Render MUST launch remotion studio for user review. NEVER render 4K until user explicitly confirms ("render 4K", "render final").

Additional Resources

Claude loads these files on demand — do NOT load all at once:

  • references/workflow-steps.md: Detailed step-by-step instructions (Steps 1-14). Load at workflow start.
  • references/design-guide.md: Visual minimums, typography, layout patterns, checklists. MUST load before Step 9.
  • references/troubleshooting.md: Error fixes, BGM options, preference commands, preference learning. Load on error or user request.
  • examples/: Real production video projects. Claude may reference these for composition structure and timing.json format.

Directory Structure

project-root/                           # Remotion project root
├── src/remotion/                       # Remotion source
│   ├── compositions/                   # Video composition definitions
│   ├── Root.tsx                        # Remotion entry
│   └── index.ts                        # Exports
│
├── public/                             # Remotion default (unused — use --public-dir videos/{name}/)
│
├── videos/{video-name}/                # Video project assets
│   ├── workflow_state.json             # Workflow progress
│   ├── topic_definition.md             # Step 1
│   ├── topic_research.md               # Step 2
│   ├── podcast.txt                     # Step 4: narration script
│   ├── podcast_audio.wav               # Step 8: TTS audio
│   ├── podcast_audio.srt               # Step 8: subtitles
│   ├── timing.json                     # Step 8: timeline
│   ├── thumbnail_*.png                 # Step 7
│   ├── output.mp4                      # Step 10
│   ├── video_with_bgm.mp4             # Step 11
│   ├── final_video.mp4                 # Step 12: final output
│   └── bgm.mp3                         # Background music
│
└── remotion.config.ts

Important: Always use --public-dir and full output path for Remotion render:

npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/

Naming Rules

Video name {video-name}: lowercase English, hyphen-separated (e.g., reference-manager-comparison)

Section name {section}: lowercase English, underscore-separated, matches [SECTION:xxx]

Thumbnail naming (16:9 AND 4:3 both required):

Type 16:9 4:3
Remotion thumbnail_remotion_16x9.png thumbnail_remotion_4x3.png
AI thumbnail_ai_16x9.png thumbnail_ai_4x3.png

Public Directory

Use --public-dir videos/{name}/ for all Remotion commands. Each video's assets (timing.json, podcast_audio.wav, bgm.mp3) stay in its own directory — no copying to public/ needed. This enables parallel renders of different videos.

# All render/studio/still commands use --public-dir
npx remotion studio src/remotion/index.ts --public-dir videos/{name}/
npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/ --video-bitrate 16M
npx remotion still src/remotion/index.ts Thumbnail16x9 videos/{name}/thumbnail.png --public-dir videos/{name}/

Workflow

Progress Tracking

At Step 1 start:

  1. Create videos/{name}/workflow_state.json
  2. Use TaskCreate to create tasks per step. Mark in_progress on start, completed on finish.
  3. Each step updates BOTH workflow_state.json AND TaskUpdate.
 1. Define topic direction → topic_definition.md
 2. Research topic → topic_research.md
 3. Design video sections (5-7 chapters)
 4. Write narration script → podcast.txt
 5. Collect media assets → media_manifest.json
 6. Generate publish info (Part 1) → publish_info.md
 7. Generate thumbnails (16:9 + 4:3) → thumbnail_*.png
 8. Generate TTS audio → podcast_audio.wav, timing.json
 9. Create Remotion composition + Studio preview (mandatory stop)
10. Render 4K video (only on user request) → output.mp4
11. Mix background music → video_with_bgm.mp4
12. Add subtitles (optional) → final_video.mp4
13. Complete publish info (Part 2) → chapter timestamps
14. Verify output & cleanup
15. Generate vertical shorts (optional) → shorts/

Validation Checkpoints

After Step 8 (TTS):

  • podcast_audio.wav exists and plays correctly
  • timing.json has all sections with correct timestamps
  • podcast_audio.srt encoding is UTF-8

After Step 10 (Render):

  • output.mp4 resolution is 3840x2160
  • Audio-video sync verified
  • No black frames

Key Commands Reference

See CLAUDE.md for the full command reference (TTS, Remotion, FFmpeg, shorts generation).


User Preference System

Skill learns and applies preferences automatically. See references/troubleshooting.md for commands and learning details.

Storage Files

File Purpose
user_prefs.json Learned preferences (auto-created from template)
user_prefs.template.json Default values
prefs_schema.json JSON schema definition

Priority

Final = merge(Root.tsx defaults < global < topic_patterns[type] < current instructions)

User Commands

Command Effect
"show preferences" Show current preferences
"reset preferences" Reset to defaults
"save as X default" Save to topic_patterns

Troubleshooting & Preferences

Full reference: Read references/troubleshooting.md on errors, preference questions, or BGM options.

Version History

Latest version: 2.0.0

First published: Apr 6, 2026. Last updated: Apr 6, 2026.

1 version released.

Frequently Asked Questions

Is Video Podcast Maker free to use?
Yes. Video Podcast Maker is a free, open-source skill available on the OpenClaw Skills Registry. You can install and use it at no cost, and the source code is publicly available for review and contribution.
What platforms does Video Podcast Maker support?
It runs on any platform that supports OpenClaw, including macOS, Linux, and Windows. As long as you have the OpenClaw runtime installed, Video Podcast Maker will work seamlessly across operating systems.
How do I update Video Podcast Maker?
Run openclaw skills update video-podcast-maker to get the latest version. OpenClaw will download and apply the update automatically, preserving your existing configuration.
Can I use Video Podcast Maker with other skills?
Yes. OpenClaw skills are composable — you can combine Video Podcast Maker with any other installed skill in your workflows. This allows you to build powerful multi-step automations by chaining skills together.