Advanced Configuration

Transfer Learning provides extensive configuration options to customize its behavior for your specific needs. This guide covers all available configuration options and how to use them.

Configuration Methods

There are several ways to configure Transfer Learning:

Environment Variables

Set environment variables directly in your shell or in a .env file

CLI Options

Pass options directly to CLI commands

Config Command

Use the config command to view and modify settings

Configuration Files

Create custom configuration files for different projects

Environment Variables

The most common way to configure Transfer Learning is through environment variables. These can be set in your shell or in a .env file in the project root directory.
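As an illustration of how a .env file and shell variables can coexist, here is a minimal sketch in Python. It is not Transfer Learning's actual loader, which this guide does not show; `parse_dotenv` and `apply_dotenv` are hypothetical helper names, and the sketch assumes simple `KEY=VALUE` lines with `#` comments.

```python
def parse_dotenv(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def apply_dotenv(values: dict[str, str], environ: dict[str, str]) -> dict[str, str]:
    """Copy parsed values into environ without replacing variables already set."""
    for key, value in values.items():
        environ.setdefault(key, value)
    return environ


sample = """
# .env file
OPENAI_API_KEY=your-api-key-here
BATCH_SIZE=20
"""

# Pretend the shell already exported BATCH_SIZE=10 (hypothetical value).
env = apply_dotenv(parse_dotenv(sample), {"BATCH_SIZE": "10"})
print(env["BATCH_SIZE"])      # 10 (shell value kept)
print(env["OPENAI_API_KEY"])  # your-api-key-here (filled in from the file)
```

Using `setdefault` means a variable exported in the shell is never replaced by the file, which matches the precedence order described later in this guide.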

Core Settings

OPENAI_API_KEY
string
required
Your OpenAI API key, used to access the OpenAI models configured under Model Settings.
ANTHROPIC_API_KEY
string
Your Anthropic API key for Claude models (optional).
HUGGINGFACE_API_KEY
string
Your Hugging Face API key for accessing models (optional).

Processing Settings

FRAME_EXTRACTION_INTERVAL
integer
default:"30"
Interval between frame extractions (in frames).
MAX_FRAMES_PER_VIDEO
integer
default:"100"
Maximum number of frames to extract per video.
BATCH_SIZE
integer
default:"30"
Number of frames to process in each batch.
MAX_CONCURRENT_BATCHES
integer
default:"100"
Maximum number of concurrent batches to process.
SUPPORTED_VIDEO_FORMATS
string
default:".mp4,.avi,.mov,.mkv"
Comma-separated list of supported video formats.
MAX_VIDEO_SIZE_MB
integer
default:"500"
Maximum video size in megabytes.
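To see how the frame and batch settings interact, the following sketch works through the arithmetic. It assumes that one frame is taken every FRAME_EXTRACTION_INTERVAL frames starting at the first frame, that MAX_FRAMES_PER_VIDEO caps the result, and that the last batch may be partially filled; the tool's exact sampling rule is not documented here, and `plan_frames` is a hypothetical helper.

```python
import math


def plan_frames(total_frames, interval=30, max_frames=100, batch_size=30):
    """Return (frames extracted, batches needed) under the assumptions above."""
    sampled = math.ceil(total_frames / interval)   # one frame per interval
    extracted = min(sampled, max_frames)           # apply MAX_FRAMES_PER_VIDEO
    batches = math.ceil(extracted / batch_size)    # last batch may be partial
    return extracted, batches


# A 10-minute clip at 30 fps has 18,000 frames.
print(plan_frames(18_000))                                            # (100, 4)
print(plan_frames(18_000, interval=60, max_frames=50, batch_size=20))  # (50, 3)
```

With the defaults, the cap of 100 frames is reached long before the interval matters for a clip of this length, so raising MAX_FRAMES_PER_VIDEO has more effect than lowering the interval.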

Model Settings

OPENAI_MODEL
string
default:"gpt-4o-mini"
OpenAI model to use for guide generation.
VISION_MODEL
string
default:"o3-mini"
Vision model to use for image analysis.
FAST_MODEL
string
default:"gpt-4o-mini"
Faster model to use for simpler tasks.
WHISPER_MODEL
string
default:"base"
Whisper model to use for transcription (tiny, base, small, medium, large).
WHISPER_DEVICE
string
default:"cpu"
Device to use for Whisper transcription (cpu, cuda).
WHISPER_COMPUTE_TYPE
string
default:"int8"
Compute type for Whisper models (float16, int8).

Cache Settings

ENABLE_CACHE
boolean
default:"true"
Enable or disable caching.
CACHE_TTL_HOURS
integer
default:"24"
Time-to-live for cache entries in hours.
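A time-to-live check can be sketched as follows. This only illustrates what CACHE_TTL_HOURS means; how Transfer Learning stores timestamps and evicts entries is not described in this guide, and `is_fresh` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(created_at, now, ttl_hours=24):
    """True while a cache entry is younger than the TTL."""
    return now - created_at < timedelta(hours=ttl_hours)


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
print(is_fresh(now - timedelta(hours=23), now))  # True
print(is_fresh(now - timedelta(hours=25), now))  # False
```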

Monitoring Settings

ENABLE_MONITORING
boolean
default:"true"
Enable or disable monitoring and metrics collection.
LOG_LEVEL
string
default:"INFO"
Logging level (DEBUG, INFO, WARNING, ERROR).
METRICS_ENABLED
boolean
default:"true"
Enable or disable metrics collection.
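Because environment variables are always strings, boolean and log-level settings have to be interpreted somewhere. The sketch below shows one way to do that with Python's standard `logging` module. The accepted spellings for true values (`1`, `true`, `yes`, `on`) are an assumption, and `env_bool` and `env_log_level` are hypothetical helpers rather than part of the tool.

```python
import logging

TRUE_VALUES = {"1", "true", "yes", "on"}  # assumed spellings, not documented
LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}


def env_bool(environ, name, default=True):
    """Interpret a string setting as a boolean, falling back to a default."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


def env_log_level(environ, default="INFO"):
    """Map LOG_LEVEL to the numeric constant used by the logging module."""
    name = environ.get("LOG_LEVEL", default).upper()
    if name not in LOG_LEVELS:
        raise ValueError(f"Unsupported LOG_LEVEL: {name}")
    return getattr(logging, name)


env = {"LOG_LEVEL": "debug", "METRICS_ENABLED": "false"}
print(env_log_level(env) == logging.DEBUG)  # True
print(env_bool(env, "METRICS_ENABLED"))     # False
print(env_bool(env, "ENABLE_MONITORING"))   # True (unset, so the default applies)
```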

Using the Config Command

The config command allows you to view and modify configuration settings:
# Show current configuration
transfer-learning config --show

# Reset configuration to defaults
transfer-learning config --reset

Creating a Custom Configuration File

You can keep a separate .env file for each project and select it with DOTENV_PATH:
# Create a custom configuration file
cat > my-project.env << EOF
OPENAI_API_KEY=your-api-key-here
FRAME_EXTRACTION_INTERVAL=60
MAX_FRAMES_PER_VIDEO=50
BATCH_SIZE=20
WHISPER_MODEL=small
EOF

# Use the custom configuration file
export DOTENV_PATH=my-project.env
transfer-learning process-video path/to/video.mp4

Directory Configuration

Transfer Learning uses several directories for storing data, logs, and metrics. You can customize these directories with the following environment variables:
DATA_DIR
string
default:"./data"
Base directory for all data storage.
CACHE_DIR
string
default:"./.cache"
Directory for cache storage.
LOGS_DIR
string
default:"./logs"
Directory for log files.
VIDEOS_DIR
string
default:"./data/videos"
Directory for downloaded videos.
FRAMES_DIR
string
default:"./data/frames"
Directory for extracted frames.
TRANSCRIPTS_DIR
string
default:"./data/transcripts"
Directory for transcripts.
GUIDES_DIR
string
default:"./data/guides"
Directory for generated guides.
ANALYSIS_DIR
string
default:"./data/analysis"
Directory for analysis results.
TEMP_DIR
string
default:"./data/temp"
Directory for temporary files.
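The directory settings can be resolved in one pass, as in the sketch below. It assumes each variable is read independently and falls back to the default listed above; whether the tool derives the subdirectories from DATA_DIR when only DATA_DIR is changed is not stated in this guide. `resolve_dirs` is a hypothetical helper.

```python
from pathlib import Path

# Defaults copied from the table above.
DEFAULT_DIRS = {
    "DATA_DIR": "./data",
    "CACHE_DIR": "./.cache",
    "LOGS_DIR": "./logs",
    "VIDEOS_DIR": "./data/videos",
    "FRAMES_DIR": "./data/frames",
    "TRANSCRIPTS_DIR": "./data/transcripts",
    "GUIDES_DIR": "./data/guides",
    "ANALYSIS_DIR": "./data/analysis",
    "TEMP_DIR": "./data/temp",
}


def resolve_dirs(environ):
    """Return a Path for every directory setting, using defaults when unset."""
    return {
        name: Path(environ.get(name, default))
        for name, default in DEFAULT_DIRS.items()
    }


# Hypothetical override: move only the logs somewhere else.
dirs = resolve_dirs({"LOGS_DIR": "/var/log/transfer-learning"})
print(dirs["LOGS_DIR"])    # /var/log/transfer-learning
print(dirs["FRAMES_DIR"])  # data/frames
```

Calling `path.mkdir(parents=True, exist_ok=True)` on each resolved path is a simple way to confirm up front that the locations are writable.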

Configuration Examples

For high-quality processing with detailed guides:
# .env file
OPENAI_API_KEY=your-api-key-here
OPENAI_MODEL=gpt-4o
VISION_MODEL=gpt-4-vision-preview
FRAME_EXTRACTION_INTERVAL=15
MAX_FRAMES_PER_VIDEO=200
WHISPER_MODEL=medium

For faster processing with lower resource usage:
# .env file
OPENAI_API_KEY=your-api-key-here
OPENAI_MODEL=gpt-4o-mini
VISION_MODEL=o3-mini
FRAME_EXTRACTION_INTERVAL=60
MAX_FRAMES_PER_VIDEO=50
BATCH_SIZE=10
MAX_CONCURRENT_BATCHES=5
WHISPER_MODEL=tiny

For GPU-accelerated processing:
# .env file
OPENAI_API_KEY=your-api-key-here
WHISPER_MODEL=medium
WHISPER_DEVICE=cuda
WHISPER_COMPUTE_TYPE=float16

For detailed logging and debugging:
# .env file
LOG_LEVEL=DEBUG
ENABLE_MONITORING=true
METRICS_ENABLED=true

Command-Line Options

Most configuration options can also be passed directly to CLI commands:
# Override batch size and concurrency for a specific command
transfer-learning process-video path/to/video.mp4 --batch-size 15 --max-concurrent 10

# Use a specific model for guide generation
transfer-learning generate-guide data/videos/video_name --model gpt-4o --temperature 0.3

Configuration Precedence

Configuration options are applied in the following order of precedence (highest to lowest):
  1. Command-line options
  2. Environment variables
  3. .env file
  4. Default values
This means that command-line options will override environment variables, which will override values in the .env file, which will override default values.
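The precedence order above can be modelled with Python's `collections.ChainMap`, which searches its mappings from left to right. This is a sketch of the rule, not the tool's implementation; `build_config` and the sample values are hypothetical.

```python
from collections import ChainMap


def build_config(cli_options, environ, dotenv, defaults):
    """Layer the four sources so the first one that defines a key wins."""
    # CLI flags the user did not pass are assumed to arrive as None; drop them
    # so they do not shadow lower-priority sources.
    cli_set = {k: v for k, v in cli_options.items() if v is not None}
    return ChainMap(cli_set, environ, dotenv, defaults)


config = build_config(
    cli_options={"BATCH_SIZE": "15", "WHISPER_MODEL": None},
    environ={"BATCH_SIZE": "10", "WHISPER_MODEL": "small"},
    dotenv={"BATCH_SIZE": "20", "WHISPER_MODEL": "tiny", "LOG_LEVEL": "DEBUG"},
    defaults={"BATCH_SIZE": "30", "WHISPER_MODEL": "base",
              "LOG_LEVEL": "INFO", "CACHE_TTL_HOURS": "24"},
)

print(config["BATCH_SIZE"])       # 15    (command-line option)
print(config["WHISPER_MODEL"])    # small (environment variable)
print(config["LOG_LEVEL"])        # DEBUG (.env file)
print(config["CACHE_TTL_HOURS"])  # 24    (default value)
```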

Validating Configuration

You can validate your configuration using the config command:
transfer-learning config --validate
This will check that all required configuration options are set and that values are within acceptable ranges.
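For a rough idea of what such a check involves, here is a sketch that validates a few of the settings from this guide. The allowed values come from the option descriptions above; treating every integer setting as "must be at least 1" is an assumption, since the actual acceptable ranges are not listed. `validate` is a hypothetical function, not the code behind `config --validate`.

```python
INT_SETTINGS = (
    "FRAME_EXTRACTION_INTERVAL", "MAX_FRAMES_PER_VIDEO", "BATCH_SIZE",
    "MAX_CONCURRENT_BATCHES", "MAX_VIDEO_SIZE_MB", "CACHE_TTL_HOURS",
)
CHOICES = {
    "WHISPER_MODEL": ("tiny", "base", "small", "medium", "large"),
    "WHISPER_DEVICE": ("cpu", "cuda"),
    "WHISPER_COMPUTE_TYPE": ("float16", "int8"),
    "LOG_LEVEL": ("DEBUG", "INFO", "WARNING", "ERROR"),
}


def _is_positive_int(value):
    try:
        return int(value) >= 1
    except (TypeError, ValueError):
        return False


def validate(config):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if not config.get("OPENAI_API_KEY"):
        errors.append("OPENAI_API_KEY is required")
    for name in INT_SETTINGS:
        # Unset values are skipped because the defaults apply.
        if name in config and not _is_positive_int(config[name]):
            errors.append(f"{name} must be a positive integer")
    for name, allowed in CHOICES.items():
        if name in config and config[name] not in allowed:
            errors.append(f"{name} must be one of: {', '.join(allowed)}")
    return errors


problems = validate({"BATCH_SIZE": "0", "WHISPER_DEVICE": "gpu"})
for problem in problems:
    print(problem)
# OPENAI_API_KEY is required
# BATCH_SIZE must be a positive integer
# WHISPER_DEVICE must be one of: cpu, cuda
```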

Troubleshooting

If you encounter errors about a missing API key:
  1. Check that your API key is correctly set in the .env file
  2. Verify that the .env file is in the correct location
  3. Try setting the API key as an environment variable:
export OPENAI_API_KEY=your-api-key-here
If you encounter errors about directory permissions:
  1. Check that the user running Transfer Learning has write permissions to the data directories
  2. Try setting custom directory paths in locations where you have write permissions:
export DATA_DIR=/path/with/write/permissions
If you encounter memory issues during processing:
  1. Reduce the batch size with --batch-size
  2. Reduce the maximum concurrent batches with --max-concurrent
  3. Increase the frame extraction interval to process fewer frames
The same limits can also be set as environment variables:
export FRAME_EXTRACTION_INTERVAL=60
export BATCH_SIZE=10
export MAX_CONCURRENT_BATCHES=5
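To see why these two settings matter for memory, the sketch below estimates an upper bound on frames held at once as BATCH_SIZE multiplied by MAX_CONCURRENT_BATCHES. It assumes every concurrent batch is full and that frames are kept in memory as uncompressed 1920x1080 RGB images (3 bytes per pixel); the tool's real memory behaviour may differ, and `peak_frame_bytes` is a hypothetical helper.

```python
def peak_frame_bytes(batch_size, max_concurrent_batches,
                     width=1920, height=1080, bytes_per_pixel=3):
    """Upper bound on memory for decoded frames, under the assumptions above."""
    frames_in_flight = batch_size * max_concurrent_batches
    return frames_in_flight * width * height * bytes_per_pixel


default_peak = peak_frame_bytes(30, 100)  # documented defaults
reduced_peak = peak_frame_bytes(10, 5)    # values from the exports above

print(default_peak)  # 18662400000 bytes, roughly 18.7 GB
print(reduced_peak)  # 311040000 bytes, roughly 0.3 GB
```

In practice MAX_FRAMES_PER_VIDEO also limits how many frames exist for a single video, so the real peak for one video is usually well below the first figure.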