# claude2video

**Repository Path**: likefallwind/claude2video

An AI agent skill for generating educational videos from a topic or Markdown outline. Built on [Manim Community Edition](https://www.manim.community/) with TTS narration via edge-tts. Adapted from the [Code2Video](https://github.com/showlab/Code2Video) paradigm (arXiv:2510.01174) into a Claude Code skill format.

---

## Installation

**Step 1 — Install the skill into your project:**

```bash
npm install claude2video
```

This copies the skill files into `.agents/skills/claude2video/` in your project.

**Step 2 — Install Python and system dependencies:**

```bash
bash node_modules/claude2video/setup.sh
```

Supports Ubuntu/Debian (including WSL2), Fedora, and macOS.

**Step 3 — Use the skill in Claude Code:**

```
@claude2video Generate an educational video explaining photosynthesis
```

---

## Overview

The skill takes a **learning topic string** or a **Markdown course outline file** as input and produces a fully narrated educational video (MP4) through an 8-stage pipeline:

```
Planner → TTS → Coder → Critic → Audio-Video Merge → Final Assembly
```

Each section of the course becomes a separate Manim animation scene. Narration audio is synthesized first, and animation timing is then paced to match the audio exactly.

---

## Prerequisites

| Dependency | Install |
|---|---|
| Python 3.9+ | — |
| Manim Community v0.19+ | `pip install manim` |
| edge-tts | `pip install edge-tts` |
| ffmpeg | system package |

---

## Repository Structure

```
.
├── .agents/skills/claude2video/   # The skill (copied to user project on npm install)
│   ├── SKILL.md               # Entry point — 8-stage pipeline execution guide
│   ├── planner.md             # P_parse / P_outline / P_storyboard / P_asset prompts
│   ├── coder.md               # P_coder + Duration Control + ScopeRefine prompts
│   ├── critic.md              # P_refine (layout) + P_aesth (scoring) prompts
│   ├── tts.py                 # CLI: storyboard.json → audio/ + durations.json
│   ├── teaching_scene.py      # TeachingScene base class (6×6 grid system)
│   └── example_section.py     # Working example — manim render -ql example_section.py
├── examples/
│   └── biology_cells.md       # Sample Markdown course outline (7th grade biology)
├── scripts/
│   └── install.js             # npm postinstall — copies skill to .agents/skills/
├── package.json
├── requirements.txt           # Python dependencies
├── setup.sh                   # System dependency installer
├── CLAUDE.md                  # Project instructions for Claude Code
├── .gitignore
└── README.md
```

Output is written to `output/{topic}/` (gitignored):

```
output/{topic}/
├── outline.json               # Stage 1 output
├── storyboard.json            # Stage 2 output (includes narrations)
├── teaching_scene.py          # Copied from skill dir
├── sections/
│   ├── section_1.py           # Generated Manim scene
│   └── ...
├── audio/
│   ├── section_1/
│   │   ├── line_1.mp3         # Per-line TTS audio
│   │   └── section_1.mp3     # Merged section audio (with 0.5s gaps)
│   └── durations.json         # Per-line durations for animation sync
├── sections_with_audio/
│   ├── section_1.mp4          # Video + audio merged
│   └── ...
├── media/                     # Manim render output
└── final_video.mp4            # Final output
```

---

## Usage

Invoke the skill from Claude Code:

```
/claude2video
```

Or reference it directly in a prompt:

> Use the claude2video skill to generate an educational video explaining the Pythagorean theorem

The skill will ask whether to generate sections in **parallel** (faster) or **sequentially** (step-by-step review) before starting.
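The `audio/` layout above (per-line MP3s merged into a section track with 0.5 s gaps, plus `durations.json`) implies a simple timing model for syncing animations to narration. A minimal sketch of that model, assuming `durations.json` yields one duration per narration line; the function name and schema here are hypothetical, not the skill's actual API:

```python
GAP = 0.5  # seconds of silence inserted between merged narration lines


def line_offsets(durations, gap=GAP):
    """Compute where each narration line falls inside the merged
    section audio, given per-line durations in seconds (as one might
    read them from durations.json).

    Returns a list of (start, end) tuples, one per line.
    """
    spans, t = [], 0.0
    for d in durations:
        spans.append((t, t + d))
        t += d + gap  # next line starts after this line plus the gap
    return spans


# Example: three narration lines of 3.2 s, 4.0 s and 2.5 s.
# The second line starts at 3.2 + 0.5 = 3.7 s into the merged audio.
spans = line_offsets([3.2, 4.0, 2.5])
```

With spans like these, generated scene code can pad each animation block with `self.wait()` so the block ends exactly when its narration line does.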
### Input options

**Option A — Topic string:**

> Generate an educational video explaining free fall

**Option B — Markdown outline file:**

> Generate a video from this outline file: `my_course.md`

The Markdown outline format:

```markdown
# Course Title

Target audience: high school students

## Section 1: Concept Introduction

Content...

## Section 2: Deriving the Formula

Content...
```

The section count in the outline is preserved exactly.

---

## Pipeline Stages

| Stage | Role | Description |
|---|---|---|
| 1 | Planner | Parse Markdown outline or generate outline from topic |
| 2 | Planner | Build storyboard (lecture_lines + narrations + animations) |
| 3 | Planner | Select & download reference image assets |
| 4 | TTS | Synthesize narration audio, output durations.json |
| 5 | Coder | Generate + render Manim section code (parallel or sequential) |
| 6 | Critic | Visual layout refinement via frame extraction (up to 3 rounds) |
| 7 | — | Merge section video + audio via ffmpeg |
| 8 | Critic | Optional aesthetic scoring + final concatenation |

---

## Core Design

### 6×6 Visual Anchor Grid

All animations are placed using a named grid (`A1`–`F6`). The left column is reserved for lecture notes; the right area is the animation canvas.

```
          A1 A2 A3 A4 A5 A6
          B1 B2 B3 B4 B5 B6
lecture | C1 C2 C3 C4 C5 C6
          D1 D2 D3 D4 D5 D6
          E1 E2 E3 E4 E5 E6
          F1 F2 F3 F4 F5 F6
```

Two positioning methods (no manual coordinates):

- `self.place_at_grid(obj, 'B2')` — snap to grid point
- `self.place_in_area(obj, 'A1', 'C3')` — fit within grid region

### Audio-First TTS Sync

TTS durations are generated **before** code generation. The Coder receives `line_durations` and pads each animation block with `self.wait()` so that the video duration matches the audio exactly.

### ScopeRefine Debugging

Render failures are resolved through escalating scope:

1. **Line scope** — fix the offending line (≤3 attempts)
2. **Block scope** — rewrite the animation block (≤2 attempts)
3. **Global scope** — regenerate the entire scene from scratch

---

## References

- Paper: [Code2Video: Automated Educational Video Generation Leveraging AI Coding Agents](https://arxiv.org/abs/2510.01174)
- Reference implementation: https://github.com/showlab/Code2Video
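As an illustration of the 6×6 anchor grid described under Core Design, a label-to-coordinate lookup might look like the sketch below. This is not the actual `teaching_scene.py` implementation: the frame size is Manim CE's default (about 14.22 × 8 scene units), and the width reserved for the lecture-notes column is an assumption.

```python
FRAME_W, FRAME_H = 14.22, 8.0   # Manim CE default frame dimensions
NOTES_W = 4.0                   # assumed width of the lecture-notes column


def grid_point(label):
    """Map a label like 'B2' to (x, y) scene coordinates.

    Rows A-F run top to bottom and columns 1-6 left to right, spread
    evenly over the canvas to the right of the notes column.
    """
    row = ord(label[0]) - ord('A')   # 0..5, A = top row
    col = int(label[1]) - 1          # 0..5, 1 = leftmost column
    canvas_w = FRAME_W - NOTES_W
    x0 = -FRAME_W / 2 + NOTES_W      # left edge of the animation canvas
    x = x0 + (col + 0.5) * canvas_w / 6
    y = FRAME_H / 2 - (row + 0.5) * FRAME_H / 6
    return (x, y)
```

A `place_at_grid` helper would then amount to `obj.move_to([x, y, 0])`, while `place_in_area` could scale the object to fit the box spanned by two such points.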