Ultimate Video Framework
Enterprise-grade skill for downloads, videos, youtube, other. Includes structured workflows, validation checks, and reusable patterns for media.
Ultimate Video Framework
A comprehensive skill for programmatic video creation and processing — covering video generation from scripts, clip assembly with FFmpeg, subtitle overlay, transition effects, thumbnail generation, and automated video pipeline workflows.
When to Use This Skill
Choose Ultimate Video Framework when you need to:
- Generate videos programmatically from scripts and assets
- Assemble clips with transitions, overlays, and subtitles
- Convert, compress, or reformat video files
- Create video thumbnails and preview GIFs
- Build automated video processing pipelines
Consider alternatives when:
- You need AI-generated video content (use an AI video generation tool)
- You need real-time video streaming (use a streaming platform skill)
- You need video editing with a GUI (use a desktop video editor)
Quick Start
# Basic video operations with FFmpeg # Convert video format ffmpeg -i input.mov -c:v libx264 -crf 23 -c:a aac output.mp4 # Trim video (start at 30s, duration 60s) ffmpeg -i input.mp4 -ss 00:00:30 -t 00:01:00 -c copy trimmed.mp4 # Extract audio from video ffmpeg -i input.mp4 -vn -c:a libmp3lame -q:a 2 audio.mp3 # Generate thumbnail at 5 seconds ffmpeg -i input.mp4 -ss 00:00:05 -vframes 1 thumbnail.jpg
# Video pipeline with MoviePy from moviepy.editor import ( VideoFileClip, TextClip, CompositeVideoClip, concatenate_videoclips, AudioFileClip ) def create_video_from_clips(clips_config, output_path): """Assemble a video from multiple clips with transitions.""" clips = [] for config in clips_config: clip = VideoFileClip(config["path"]) if "start" in config: clip = clip.subclip(config["start"], config.get("end")) if "resize" in config: clip = clip.resize(config["resize"]) clips.append(clip) # Concatenate with crossfade final = concatenate_videoclips(clips, method="compose") # Add background audio if provided if "audio" in clips_config[0]: audio = AudioFileClip(clips_config[0]["audio"]) final = final.set_audio(audio.subclip(0, final.duration)) final.write_videofile(output_path, codec="libx264", audio_codec="aac") clips = [ {"path": "intro.mp4", "end": 5}, {"path": "demo.mp4", "start": 10, "end": 60}, {"path": "outro.mp4"}, ] create_video_from_clips(clips, "final_video.mp4")
Core Concepts
FFmpeg Common Operations
| Operation | Command | Purpose |
|---|---|---|
| Convert | ffmpeg -i in.mov -c:v libx264 out.mp4 | Format conversion |
| Trim | ffmpeg -i in.mp4 -ss 0:30 -t 1:00 out.mp4 | Cut segment |
| Resize | ffmpeg -i in.mp4 -vf scale=1280:720 out.mp4 | Resolution change |
| Compress | ffmpeg -i in.mp4 -crf 28 out.mp4 | Reduce file size |
| Audio extract | ffmpeg -i in.mp4 -vn audio.mp3 | Extract audio track |
| Thumbnail | ffmpeg -i in.mp4 -ss 5 -vframes 1 thumb.jpg | Generate thumbnail |
| GIF | ffmpeg -i in.mp4 -vf "fps=10,scale=480:-1" out.gif | Preview GIF |
| Subtitle burn | ffmpeg -i in.mp4 -vf subtitles=sub.srt out.mp4 | Burn subtitles |
Subtitle Overlay
# Burn subtitles into video ffmpeg -i video.mp4 -vf "subtitles=captions.srt:force_style='\ FontName=Arial,FontSize=24,PrimaryColour=&H00FFFFFF,\ OutlineColour=&H00000000,Outline=2,Shadow=1'" \ -c:a copy output_with_subs.mp4 # Add text overlay (watermark/title) ffmpeg -i video.mp4 -vf "drawtext=text='Demo Video':\ fontsize=36:fontcolor=white:x=(w-text_w)/2:y=50:\ fontfile=/path/to/font.ttf:borderw=2:bordercolor=black" \ -c:a copy output_with_title.mp4
Video Assembly Pipeline
# Automated video assembly from a script import subprocess import json from pathlib import Path def assemble_video(script_path, output_path): """Build a video from a JSON script definition.""" with open(script_path) as f: script = json.load(f) temp_files = [] for i, segment in enumerate(script["segments"]): temp_out = f"/tmp/segment_{i:03d}.mp4" if segment["type"] == "clip": # Trim and resize clip cmd = [ "ffmpeg", "-y", "-i", segment["source"], "-ss", str(segment.get("start", 0)), "-t", str(segment.get("duration", 5)), "-vf", f"scale={script['width']}:{script['height']}", "-c:v", "libx264", "-crf", "23", temp_out ] subprocess.run(cmd, capture_output=True) elif segment["type"] == "title": # Generate title card cmd = [ "ffmpeg", "-y", "-f", "lavfi", "-i", f"color=c={segment.get('bg', 'black')}:s={script['width']}x{script['height']}:d={segment.get('duration', 3)}", "-vf", f"drawtext=text='{segment['text']}':fontsize=48:fontcolor=white:x=(w-text_w)/2:y=(h-text_h)/2", temp_out ] subprocess.run(cmd, capture_output=True) temp_files.append(temp_out) # Create concat list concat_file = "/tmp/concat_list.txt" with open(concat_file, "w") as f: for tf in temp_files: f.write(f"file '{tf}'\n") # Concatenate all segments subprocess.run([ "ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", concat_file, "-c", "copy", output_path ], capture_output=True) print(f"Assembled: {output_path}")
Configuration
| Parameter | Description | Example |
|---|---|---|
width | Output video width | 1920 |
height | Output video height | 1080 |
fps | Frames per second | 30 |
codec | Video codec | "libx264" / "libx265" |
crf | Quality (lower = better, 18-28) | 23 |
audio_codec | Audio codec | "aac" / "libmp3lame" |
Best Practices
-
Use CRF for quality control, not bitrate — CRF (Constant Rate Factor) produces consistent quality regardless of scene complexity. CRF 18 is visually lossless, 23 is good quality, 28 is acceptable. Bitrate targeting wastes bits on simple scenes and starves complex ones.
-
Process video segments in parallel — When assembling from multiple clips, encode each segment independently in parallel, then concatenate. This is much faster than sequential processing and takes advantage of multi-core CPUs.
-
Generate preview thumbnails before processing full videos — Extract a frame at the midpoint of each clip and review the previews before running a full assembly. This catches wrong clips, bad frames, and sizing issues without waiting for full encoding.
-
Use
-c copywhen no re-encoding is needed — When trimming, concatenating, or remuxing without changing quality, codec, or resolution, use stream copy mode (-c copy) for instant processing. Re-encoding a 2-hour video just to trim 30 seconds wastes significant time. -
Always set explicit output resolution — When combining clips from different sources, force all clips to the same resolution with
scale=W:H. Mismatched resolutions cause FFmpeg errors or distorted output.
Common Issues
Concatenating clips produces audio sync drift — When clips have different audio sample rates or codec parameters, concatenation can cause gradual audio drift. Re-encode all clips to the same audio parameters before concatenation: -c:a aac -ar 44100 -ac 2.
Video plays in some players but not others — Use the faststart flag for web delivery: ffmpeg -i in.mp4 -movflags +faststart out.mp4. This moves the metadata to the beginning of the file so streaming players can start playback without downloading the entire file.
Subtitle text is too small or unreadable — When burning subtitles, the default font size scales with the video resolution. A subtitle that looks fine at 1080p is tiny at 4K. Set explicit font size relative to the video height: FontSize=max(24, video_height/30) as a starting formula.
Reviews
No reviews yet. Be the first to review this template!
Similar Templates
Full-Stack Code Reviewer
Comprehensive code review skill that checks for security vulnerabilities, performance issues, accessibility, and best practices across frontend and backend code.
Test Suite Generator
Generates comprehensive test suites with unit tests, integration tests, and edge cases. Supports Jest, Vitest, Pytest, and Go testing.
Pro Architecture Workspace
Battle-tested skill for architectural, decision, making, framework. Includes structured workflows, validation checks, and reusable patterns for development.