TwitchDownloader/IMPROVEMENTS.md
2026-02-09 21:29:47 +01:00

Twitch Archive - Code Improvements

Overview

This document outlines the changes made to the twitch-archive.py script so that programmers who aren't familiar with Python can read, run, and modify it more easily.

Key Improvements

1. Better Code Organization

  • Module-level docstring: Added comprehensive documentation at the top explaining what the script does
  • Constants section: All magic values and API endpoints moved to well-named constants at the top
  • Logical grouping: Code is now organized into clear sections with visual separators

2. Enhanced Documentation

  • Type hints: Added type annotations to all methods (e.g., def run(self) -> None:)
  • Docstrings: Every method now has detailed documentation explaining:
    • What it does
    • What parameters it accepts
    • What it returns
    • Potential errors/exceptions
  • Inline comments: Added explanatory comments throughout the code
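
As an illustration of the documentation style described above, here is a minimal sketch of a type-hinted function with a structured docstring. The function name `build_output_filename`, its parameters, and the filename layout are hypothetical and are not taken from twitch-archive.py.

```python
from datetime import datetime


def build_output_filename(streamer: str, started_at: datetime, title: str) -> str:
    """
    Build a filesystem-safe base filename for a recording.

    Args:
        streamer: Twitch channel name, e.g. "vinesauce".
        started_at: When the stream went live.
        title: Stream title (may contain characters unsafe for filenames).

    Returns:
        A base filename without extension, e.g.
        "20260209_21h29m47s_vinesauce_My_Title".

    Raises:
        ValueError: If streamer is empty.
    """
    if not streamer:
        raise ValueError("streamer must not be empty")
    # Keep letters, digits, dashes and underscores; replace everything else.
    safe_title = "".join(c if c.isalnum() or c in "-_" else "_" for c in title)
    return f"{started_at:%Y%m%d_%Hh%Mm%Ss}_{streamer}_{safe_title}"
```

The signature alone tells a reader what goes in and what comes out; the docstring covers the four points listed above without requiring them to read the body.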

3. Improved Method Names

  • Old: sendNotif() → New: send_notification() (more descriptive)
  • Old: get_OS() → New: _detect_operating_system() (clearer purpose)
  • Old: correct_user() → New: _validate_username() (describes action)
  • Added underscore prefix to private methods (Python convention)

4. Modular Design

The massive loopcheck() method was broken down into smaller, focused methods:

  • _is_stream_already_processed() - Check if stream was already recorded
  • _mark_stream_as_processed() - Log stream to prevent duplicates
  • _record_livestream() - Handle stream recording with streamlink
  • _process_raw_stream() - Convert .ts files with ffmpeg
  • _download_vod() - Download VOD with TwitchDownloaderCLI
  • _download_and_render_chat() - Download and render chat logs
  • _save_metadata() - Save stream metadata
  • _upload_to_cloud() - Upload to cloud storage with rclone
  • _delete_local_files() - Clean up local files after upload
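
To show how small these focused methods can be, the following sketch implements the duplicate check and the matching "mark as processed" step. It assumes processed stream IDs are stored one per line in a plain text file; the class name `ProcessedStreamLog` and that storage format are assumptions, not the script's actual implementation.

```python
from pathlib import Path


class ProcessedStreamLog:
    """Tracks processed stream IDs in a text file, one ID per line (assumed format)."""

    def __init__(self, log_file: Path) -> None:
        self.log_file = log_file

    def is_processed(self, stream_id: str) -> bool:
        """Return True if stream_id is already recorded in the log file."""
        if not self.log_file.exists():
            return False
        with self.log_file.open("r", encoding="utf-8") as handle:
            # Compare whole lines so "123" does not match "1234".
            return any(line.strip() == stream_id for line in handle)

    def mark_processed(self, stream_id: str) -> None:
        """Append stream_id to the log file, creating the file if needed."""
        self.log_file.parent.mkdir(parents=True, exist_ok=True)
        with self.log_file.open("a", encoding="utf-8") as handle:
            handle.write(f"{stream_id}\n")
```

Because each method does one thing, either can be exercised against a temporary file without starting a recording.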

5. Better Error Handling

  • Specific exceptions: Catch and handle specific exception types
  • User-friendly messages: Error messages now explain what went wrong and how to fix it
  • Graceful degradation: Script continues running even when non-critical operations fail
  • Visual feedback: Uses colored output and status symbols (✓, ✗, ⚠) to indicate success, failure, and warnings
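
The combination of specific exceptions and graceful degradation could look roughly like the sketch below. The helper `run_optional_step` is hypothetical; it only illustrates wrapping a non-critical operation so that a failure is reported and the script keeps running.

```python
import subprocess
from typing import Callable


def run_optional_step(name: str, step: Callable[[], None]) -> bool:
    """Run a non-critical step; report a failure instead of crashing."""
    try:
        step()
    except FileNotFoundError as error:
        # Usually an external tool that is not installed or not on PATH.
        print(f"✗ {name} failed: file or program not found ({error})")
        print("  → Check that the required tool is installed and on your PATH")
        return False
    except subprocess.CalledProcessError as error:
        print(f"✗ {name} failed: command exited with code {error.returncode}")
        return False
    except OSError as error:
        print(f"⚠ {name} skipped: {error}")
        return False
    print(f"✓ {name} completed")
    return True
```

Only expected failure types are caught here. Anything else, such as a programming error, still propagates so that it is not silently hidden.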

6. Improved User Feedback

Before:

Configuration:
Root path: E:\dev\Twitch-Archive-2\archive
Refresh rate: 60.0
Email notifications: Enabled

After:

============================================================
TWITCH ARCHIVE - Configuration Summary
============================================================

Streamer: vinesauce
Quality: best
Storage: E:\dev\Twitch-Archive-2\archive
Refresh rate: 60s

Email notifications: Enabled ✓
Metadata download: Enabled ✓
VOD download: Enabled ✓
Chat download & render: Enabled ✓
Cloud upload: Enabled ✓

✓ Files will be preserved locally
============================================================

7. Constants for Magic Values

# Before: Scattered throughout code
response = requests.post('https://gql.twitch.tv/gql', ...)

# After: Named constants at top
TWITCH_GQL_URL = "https://gql.twitch.tv/gql"
response = requests.post(TWITCH_GQL_URL, ...)

8. Cleaner Configuration Loading

  • Uses DEFAULT_CONFIG dictionary instead of inline defaults
  • Better error messages when config.json is missing or invalid
  • Filters comment fields more elegantly
  • Validates JSON syntax and provides helpful error messages
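
A minimal sketch of this loading approach follows. The keys in `DEFAULT_CONFIG` are placeholders, and treating keys that start with an underscore as comment fields is an assumption; the script's real defaults and comment convention may differ.

```python
import json
from pathlib import Path
from typing import Any, Dict

# Hypothetical defaults; the real DEFAULT_CONFIG keys may differ.
DEFAULT_CONFIG: Dict[str, Any] = {
    "username": "",
    "quality": "best",
    "refresh": 60,
}


def load_config(config_path: Path) -> Dict[str, Any]:
    """Merge the JSON config file over DEFAULT_CONFIG, ignoring comment fields."""
    try:
        raw = json.loads(config_path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        raise SystemExit(
            f"✗ ERROR: {config_path} not found\n"
            "  → Create the file before running the script"
        ) from None
    except json.JSONDecodeError as error:
        raise SystemExit(
            f"✗ ERROR: {config_path} is not valid JSON\n"
            f"  → Line {error.lineno}, column {error.colno}: {error.msg}"
        ) from None
    if not isinstance(raw, dict):
        raise SystemExit(f"✗ ERROR: {config_path} must contain a JSON object")
    # Assumption: comment fields are keys that start with an underscore.
    settings = {key: value for key, value in raw.items() if not key.startswith("_")}
    # User settings override defaults; missing keys fall back to defaults.
    return {**DEFAULT_CONFIG, **settings}
```

Keeping the defaults in one dictionary means a reader can see every supported setting in a single place.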

9. Path Handling Improvements

  • Uses pathlib.Path consistently for cross-platform compatibility
  • Centralized path initialization in _initialize_paths() method
  • All paths are absolute to avoid confusion
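
Centralized path setup with pathlib might be sketched as follows. The function name `initialize_paths` and the folder names ("raw", "video", "chat") are hypothetical and chosen only for illustration.

```python
from pathlib import Path
from typing import Dict


def initialize_paths(root: str, streamer: str) -> Dict[str, Path]:
    """Return absolute paths for each output folder, creating them if missing."""
    # resolve() turns relative input into an absolute path on any platform.
    base = Path(root).expanduser().resolve() / streamer
    paths = {
        "raw": base / "raw",
        "video": base / "video",
        "chat": base / "chat",
    }
    for folder in paths.values():
        folder.mkdir(parents=True, exist_ok=True)
    return paths
```

Using the `/` operator instead of string concatenation avoids hard-coding path separators, which is what makes the same code work on Windows and Linux.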

10. Better Help Text

The command-line help is now formatted with colors and clear sections:

============================================================
TWITCH ARCHIVE - Automated Stream Recording & Archiving
============================================================

USAGE:
    python twitch-archive.py [OPTIONS]

OPTIONS:
    -h, --help              Display this help information
    -u, --username <name>   Twitch channel username to monitor
    ...

TIPS:
    • Configure settings in config.json
    • Set up API credentials in .env file
    ...

11. Safer File Operations

  • Checks if files exist before attempting to delete them
  • Groups all file deletion in one method for easier tracking
  • Prevents deletion if upload fails (data safety)
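
The safety rules above can be sketched in one small function. The name `delete_local_files` matches the method listed earlier, but this signature and the `upload_succeeded` flag are assumptions made for the example.

```python
from pathlib import Path
from typing import Iterable, List


def delete_local_files(files: Iterable[Path], upload_succeeded: bool) -> List[Path]:
    """Delete files only after a successful upload; return the paths removed."""
    if not upload_succeeded:
        # Data safety: never delete the only remaining copy.
        print("⚠ Upload failed, keeping local files")
        return []
    deleted: List[Path] = []
    for path in files:
        # Skip paths that are already gone instead of raising an error.
        if path.is_file():
            path.unlink()
            deleted.append(path)
    return deleted
```

Returning the list of removed paths makes it easy to log exactly what was deleted.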

12. Code Readability Enhancements

  • Consistent indentation and spacing
  • Logical variable names (e.g., filename_base instead of live_raw_filename)
  • Removed redundant code (e.g., duplicate API calls)
  • Consistent string formatting (f-strings everywhere)

How These Improvements Help Non-Python Programmers

  1. Clearer Intent: Type hints and docstrings make it obvious what each function does without needing to read the implementation

  2. Easier Debugging: Modular functions mean you can test individual pieces separately

  3. Better Error Messages: Instead of cryptic Python errors, users get helpful messages like:

    ✗ ERROR: Twitch user "invaliduser" not found
      → Check the username in your config.json file
    
  4. Self-Documenting: The code reads more like pseudocode with clear method names and constants

  5. Standard Conventions: Follows PEP 8 (the Python style guide), making the code easier to understand for anyone familiar with coding standards

  6. Visual Organization: Section headers and consistent formatting make it easy to navigate

  7. Extensibility: Each feature is isolated, making it easy to add new features or modify existing ones

Maintenance Benefits

  • Easier Updates: Modular design means changing one feature is much less likely to affect others
  • Testable: Each method can be unit tested independently
  • Understandable: Future developers can quickly understand what each part does
  • Documented Decisions: Docstrings explain why things are done a certain way

Example: Before vs After

Before (Complex loopcheck):

def loopcheck(self):
    while True:
        try:
            is_live = self.check_user()['data']['user']['stream']
            if is_live is not None:
                is_live_ready = self.check_user()['data']['user']['stream']['title']
                if is_live_ready is not None:
                    bin_path = str(pathlib.Path(__file__).parent.resolve())+"/bin"
                    live_date = datetime.strptime(is_live["createdAt"],'%Y-%m-%dT%H:%M:%SZ').replace(...)
                    # ... 200+ more lines of mixed logic ...

After (Clean and modular):

def loopcheck(self) -> None:
    """
    Main monitoring loop.
    
    Continuously checks if the streamer is live, and when they are:
    1. Records the live stream
    2. Downloads the VOD
    3. Downloads and renders chat
    4. Uploads everything to cloud storage (if enabled)
    5. Optionally deletes local files after upload
    """
    while True:
        try:
            response = self._check_stream_status()
            is_live = response['data']['user']['stream']
            
            if is_live is None:
                time.sleep(self.refresh)
                continue
            
            # Each step is a well-named method
            stream_id = self._create_stream_id(is_live)
            if self._is_stream_already_processed(stream_id):
                # Wait before polling again to avoid a tight loop
                time.sleep(self.refresh)
                continue
            
            self._record_livestream(...)
            self._process_raw_stream(...)
            self._download_vod(...)
            # ... clear logical flow ...

Backward Compatibility

All improvements maintain backward compatibility:

  • Same configuration file format
  • Same command-line arguments
  • Same file output structure
  • Same external tool dependencies

Testing Recommendations

After these improvements, test:

  1. ✓ Configuration loading (valid and invalid config.json)
  2. ✓ Username validation
  3. ✓ Live stream recording
  4. ✓ VOD downloading
  5. ✓ Chat downloading
  6. ✓ Cloud upload
  7. ✓ File deletion safety
  8. ✓ Error recovery

Future Enhancement Opportunities

Now that the code is more maintainable, future enhancements could include:

  • Configuration validation with schemas
  • Logging to file instead of print statements
  • Progress bars for downloads
  • Multi-channel monitoring
  • Web interface for configuration
  • Database for tracking streams
  • Automated testing suite

Deprecated Options Removed

The script was updated to work with streamlink 5.0+ by removing or replacing deprecated command-line options:

  1. --twitch-disable-reruns - This option has been completely removed from streamlink

    • The script now works without this option
    • Twitch's rerun detection is handled differently in newer versions
  2. --twitch-proxy-playlist - The ttvlol ad-blocking proxy option has been removed

    • If streamlink_ttvlol is enabled in config, the script will show a warning
    • Users should disable this option or use alternative ad-blocking methods
    • Consider using a Twitch Turbo subscription for ad-free viewing
  3. --hls-segment-threads - Renamed to --stream-segment-threads

    • The script already uses the correct new option name
    • This applies to all segmented stream types (HLS, DASH, etc.)
    • Valid values: 1-10 threads (default: 1)
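
The option changes above could be reflected in command construction roughly as follows. This is a sketch under assumptions: the function name `build_streamlink_command` and its parameters are hypothetical, and the command is only built here, not executed.

```python
from typing import List


def build_streamlink_command(
    username: str,
    quality: str,
    output_file: str,
    segment_threads: int = 1,
    ttvlol_enabled: bool = False,
) -> List[str]:
    """Build a streamlink command line without the removed options."""
    if ttvlol_enabled:
        # --twitch-proxy-playlist is no longer passed, so only warn.
        print("⚠ streamlink_ttvlol is enabled but no longer supported; ignoring it")
    # Clamp to the valid range of 1-10 threads.
    threads = max(1, min(10, segment_threads))
    return [
        "streamlink",
        "--stream-segment-threads", str(threads),
        "-o", output_file,
        f"https://www.twitch.tv/{username}",
        quality,
    ]
```

Building the command as a list (rather than one string) lets it be passed directly to `subprocess.run` without shell quoting issues.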

What You Need to Do

  1. Update streamlink to the latest version:

    pip install --upgrade streamlink
    
  2. Update your config.json:

    • Set streamlink_ttvlol to 0 if you were using ad-blocking
    • The other settings remain the same
  3. Run the script normally - it will work with the latest streamlink version