Transcribe Command
The transcribe command extracts the audio track from a local video file or a YouTube video and transcribes it with Whisper. Use it to generate text content from videos or to produce transcripts and subtitles for accessibility.
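The audio extraction step can be pictured with a short sketch. It assumes the audio is pulled out with ffmpeg, which this page does not confirm; `build_extract_command` and `default_audio_path` are hypothetical helper names. The 16 kHz mono target matches the audio format that Whisper models operate on.

```python
from pathlib import Path
from typing import List


def build_extract_command(video_path: str, audio_path: str) -> List[str]:
    """Build an ffmpeg argument list that writes a mono 16 kHz audio file.

    Assumption: ffmpeg is the extraction tool; the real command may differ.
    """
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it already exists
        "-i", video_path,
        "-vn",           # drop the video stream
        "-ac", "1",      # downmix to a single channel
        "-ar", "16000",  # resample to 16 kHz
        audio_path,
    ]


def default_audio_path(video_path: str) -> str:
    """Put the extracted audio next to the video, with a .wav suffix."""
    return str(Path(video_path).with_suffix(".wav"))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.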
Usage
Arguments
| Argument | Description |
|---|---|
| PATH | Path to video file or YouTube URL |
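Because PATH accepts either a local file or a YouTube URL, the command has to tell the two apart. The following is a minimal sketch of one way to do that; the helper name `is_youtube_url` and the list of hostnames are assumptions, not the tool's actual logic.

```python
from urllib.parse import urlparse

# Hostnames treated as YouTube in this sketch; the list is an assumption.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube_url(path: str) -> bool:
    """Return True when PATH looks like an http(s) YouTube URL."""
    parsed = urlparse(path)
    host = parsed.hostname or ""
    return parsed.scheme in ("http", "https") and host in YOUTUBE_HOSTS
```

Anything that fails this check would be treated as a local file path.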
Options
| Option | Description | Default |
|---|---|---|
| --output-dir TEXT | Output directory for transcripts | transcripts/ |
| --model TEXT | Whisper model to use (tiny, base, small, medium, large) | base |
| --device TEXT | Device to use for transcription (cpu, cuda) | cpu |
| --language TEXT | Language code (auto for auto-detection) | auto |
| --format TEXT | Output format (txt, srt, vtt, json) | txt |
| --help | Show help message and exit | - |
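To make the defaults in the table concrete, here is a sketch that mirrors the options with Python's standard argparse module. It is an illustration only: the real command may be built with a different CLI framework, `build_parser` is a hypothetical name, and restricting values with `choices` is an assumption about how invalid input is handled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented arguments and options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="transcribe")
    parser.add_argument("path", metavar="PATH",
                        help="Path to video file or YouTube URL")
    parser.add_argument("--output-dir", default="transcripts/",
                        help="Output directory for transcripts")
    parser.add_argument("--model", default="base",
                        choices=["tiny", "base", "small", "medium", "large"],
                        help="Whisper model to use")
    parser.add_argument("--device", default="cpu", choices=["cpu", "cuda"],
                        help="Device to use for transcription")
    parser.add_argument("--language", default="auto",
                        help="Language code (auto for auto-detection)")
    parser.add_argument("--format", default="txt",
                        choices=["txt", "srt", "vtt", "json"],
                        help="Output format")
    return parser
```

Parsing only a PATH yields every default from the table; any option given on the command line overrides its default.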
Examples
Transcribe a local video file
Transcribe a YouTube video
Transcribe with a specific model
Transcribe with GPU acceleration
Transcribe to a specific format
Output
The command generates a transcript file in the specified format in the output directory.
Whisper Models
The command supports the following Whisper models:
| Model | Parameters | Memory Required | Relative Speed |
|---|---|---|---|
| tiny | 39M | ~1GB | ~32x |
| base | 74M | ~1GB | ~16x |
| small | 244M | ~2GB | ~6x |
| medium | 769M | ~5GB | ~2x |
| large | 1550M | ~10GB | 1x |
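The Memory Required column can guide the choice of model. The sketch below picks the largest model whose approximate memory requirement fits a given budget; `largest_model_that_fits` is a hypothetical helper, and the figures are copied from the table above, not measured.

```python
from typing import Optional

# Approximate memory per model in GB, copied from the table above.
# Entries are ordered from smallest to largest model.
MODEL_MEMORY_GB = {"tiny": 1, "base": 1, "small": 2, "medium": 5, "large": 10}


def largest_model_that_fits(available_gb: float) -> Optional[str]:
    """Return the largest model that fits in available_gb, or None."""
    best = None
    for name, required_gb in MODEL_MEMORY_GB.items():
        if required_gb <= available_gb:
            best = name  # later entries are larger models
    return best
```

For example, a 4 GB budget selects small, since medium needs roughly 5 GB. Larger models are slower, so the biggest model that fits is not always the best choice when throughput matters.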
Output Formats
The command supports the following output formats:
- txt: Plain text transcript
- srt: SubRip subtitle format
- vtt: WebVTT subtitle format
- json: JSON format with timestamps and confidence scores
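The srt and vtt formats differ in small details, one of which is the timestamp separator: SubRip writes `HH:MM:SS,mmm` while WebVTT writes `HH:MM:SS.mmm`. The sketch below illustrates that difference; `format_timestamp` and `srt_cue` are hypothetical helpers, not functions exposed by the command.

```python
def format_timestamp(seconds: float, fmt: str) -> str:
    """Format seconds as HH:MM:SS,mmm (srt) or HH:MM:SS.mmm (vtt)."""
    if fmt not in ("srt", "vtt"):
        raise ValueError(f"unsupported subtitle format: {fmt}")
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    separator = "," if fmt == "srt" else "."
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Render one SubRip cue: index line, time range line, then the text."""
    time_range = f"{format_timestamp(start, 'srt')} --> {format_timestamp(end, 'srt')}"
    return f"{index}\n{time_range}\n{text}\n"
```

In a complete .srt file, cues are separated by a blank line and numbered from 1.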
JSON Format Example
When using the json format, the output will look like this: