Model Registry
==============

The model registry is a central component that manages OpenAI model capabilities, version requirements, and parameter validation.

Configuration
-------------

The model registry uses two main configuration files:

1. ``models.yml``: Defines model capabilities and version requirements
2. ``parameter_constraints.yml``: Defines parameter validation rules

Model Capabilities
------------------

Each model in the registry has the following capabilities:

- ``context_window``: Maximum context window size in tokens
- ``max_output_tokens``: Maximum number of output tokens
- ``supports_structured``: Whether the model supports structured output
- ``supports_streaming``: Whether the model supports streaming responses
- ``supported_parameters``: List of supported parameters with constraints
- ``min_version``: Minimum supported version for the model
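
As a rough mental model, the fields above can be pictured as a small immutable record. The class below is a hypothetical stand-in written for illustration, not the library's own capabilities class; only the field names are taken from the list above, and the defaults are assumptions.

.. code-block:: python

    from dataclasses import dataclass
    from typing import Optional, Tuple


    @dataclass(frozen=True)
    class CapabilitiesSketch:
        """Hypothetical stand-in mirroring the capability fields listed above."""

        context_window: int
        max_output_tokens: int
        supports_structured: bool = True
        supports_streaming: bool = True
        supported_parameters: Tuple[str, ...] = ()
        min_version: Optional[str] = None  # e.g. "2024-08-06"


    caps_sketch = CapabilitiesSketch(
        context_window=128_000,
        max_output_tokens=16_384,
        supported_parameters=("temperature", "top_p"),
        min_version="2024-08-06",
    )
    print(caps_sketch.context_window)
    print("temperature" in caps_sketch.supported_parameters)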

Version Validation
------------------

The registry supports both dated models and aliases:

- Dated models (e.g., ``gpt-4o-2024-08-06``): Specific versions with fixed capabilities
- Aliases (e.g., ``gpt-4o``): Point to the latest stable version

Version validation then ensures compatibility:

- Validates the date format (YYYY-MM-DD)
- Checks the date against the minimum supported version
- Handles version comparison and fallbacks
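
The date checks can be sketched with the standard library alone. The helper names ``split_model_name`` and ``is_supported`` are hypothetical and the library's real parsing may differ; the sketch only shows the idea of separating the date suffix from the base name and comparing it with a minimum date.

.. code-block:: python

    import re
    from datetime import date
    from typing import Optional, Tuple

    # A dated model name ends in -YYYY-MM-DD; anything else is treated as an alias.
    _DATED = re.compile(r"^(?P<base>.+)-(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})$")


    def split_model_name(name: str) -> Tuple[str, Optional[date]]:
        """Return (base, version_date); version_date is None for an alias."""
        match = _DATED.match(name)
        if match is None:
            return name, None
        # date() raises ValueError for impossible dates such as month 13.
        version = date(int(match["y"]), int(match["m"]), int(match["d"]))
        return match["base"], version


    def is_supported(name: str, min_version: date) -> bool:
        """True for aliases and for dated models at or after min_version."""
        _, version = split_model_name(name)
        return version is None or version >= min_version


    minimum = date(2024, 8, 6)
    print(is_supported("gpt-4o-2024-08-06", minimum))  # True
    print(is_supported("gpt-4o-2024-07-01", minimum))  # False
    print(is_supported("gpt-4o", minimum))             # True (alias)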

Parameter Validation
--------------------

The registry validates each parameter against the constraints declared for the target model.

Supported Parameters
~~~~~~~~~~~~~~~~~~~~

Different models support different parameters:

- GPT-4o models (``gpt-4o`` and ``gpt-4o-mini``):

  - ``temperature``
  - ``top_p``
  - ``frequency_penalty``
  - ``presence_penalty``
  - ``max_completion_tokens``

- o1 and o3 models:

  - ``max_completion_tokens``
  - ``reasoning_effort``

.. note::
    o1 and o3 models do not support the ``temperature``, ``top_p``, ``frequency_penalty``, or ``presence_penalty`` parameters.
    Attempting to use these parameters with o1 or o3 models raises an ``OpenAIClientError``.
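
To spot unsupported parameters before making a request, the lists above can be encoded as plain sets. This is a minimal sketch: ``FAMILY_PARAMETERS`` and ``unsupported_parameters`` are hypothetical names, and the real registry derives this information from ``models.yml`` rather than from a hard-coded table.

.. code-block:: python

    # Parameter names copied from the lists above; the table itself is illustrative.
    _GPT_4O = {
        "temperature",
        "top_p",
        "frequency_penalty",
        "presence_penalty",
        "max_completion_tokens",
    }
    _REASONING = {"max_completion_tokens", "reasoning_effort"}

    FAMILY_PARAMETERS = {
        "gpt-4o": _GPT_4O,
        "gpt-4o-mini": _GPT_4O,
        "o1": _REASONING,
        "o3": _REASONING,
    }


    def unsupported_parameters(family, params):
        """Return the sorted names in params that the model family does not accept."""
        return sorted(set(params) - FAMILY_PARAMETERS[family])


    print(unsupported_parameters("o1", {"temperature": 0.7, "reasoning_effort": "low"}))
    # ['temperature']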

Numeric Parameters
~~~~~~~~~~~~~~~~~~

.. code-block:: json

    {
        "temperature": {
            "type": "numeric",
            "min_value": 0.0,
            "max_value": 2.0,
            "allow_float": true,
            "allow_int": false
        }
    }

Enum Parameters
~~~~~~~~~~~~~~~

.. code-block:: json

    {
        "reasoning_effort": {
            "type": "enum",
            "allowed_values": ["low", "medium", "high"]
        }
    }
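
How such constraint entries might be enforced is sketched below. The functions ``check_numeric`` and ``check_enum`` are hypothetical, and they raise ``ValueError`` rather than the library's own error types; treat this as an illustration of the rules, not as the registry's implementation.

.. code-block:: python

    TEMPERATURE = {
        "type": "numeric",
        "min_value": 0.0,
        "max_value": 2.0,
        "allow_float": True,
        "allow_int": False,
    }
    REASONING_EFFORT = {"type": "enum", "allowed_values": ["low", "medium", "high"]}


    def check_numeric(name, value, constraint):
        """Raise ValueError unless value satisfies a numeric constraint."""
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"Parameter '{name}' must be a number")
        if isinstance(value, int) and not constraint.get("allow_int", True):
            raise ValueError(f"Parameter '{name}' must be a float")
        if isinstance(value, float) and not constraint.get("allow_float", True):
            raise ValueError(f"Parameter '{name}' must be an integer")
        low, high = constraint["min_value"], constraint["max_value"]
        if not low <= value <= high:
            raise ValueError(f"Parameter '{name}' must be between {low} and {high}")


    def check_enum(name, value, constraint):
        """Raise ValueError unless value is one of the allowed values."""
        allowed = constraint["allowed_values"]
        if value not in allowed:
            raise ValueError(
                f"Invalid value {value!r} for parameter '{name}'. "
                f"Allowed values: {', '.join(allowed)}"
            )


    check_numeric("temperature", 0.7, TEMPERATURE)           # passes silently
    check_enum("reasoning_effort", "medium", REASONING_EFFORT)  # passes silently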


Error Handling
--------------

The registry provides specific error types for different validation scenarios:

- ``ModelNotSupportedError``: Model not found in registry (includes available models and aliases)
- ``InvalidDateError``: Invalid date format in model version (includes format guidance)
- ``VersionTooOldError``: Model version older than minimum supported (includes latest alias suggestion)
- ``TokenParameterError``: Invalid token-related parameter (includes parameter guidance)
- ``OpenAIClientError``: Base class for all registry errors

Example Usage
-------------

Basic Capability Check
~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    from openai_structured import ModelRegistry

    registry = ModelRegistry.get_instance()

    # Check model capabilities
    caps = registry.get_capabilities("gpt-4o-2024-08-06")
    print(f"Context window: {caps.context_window}")
    print(f"Supports streaming: {caps.supports_streaming}")

Parameter Validation
~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

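    # Import path assumed; adjust if your installed version exports it elsewhere
    from openai_structured.errors import OpenAIClientError

    # ``caps`` is the object returned by registry.get_capabilities() above
    try:
        # Validate parameters; each call raises if the model rejects the
        # parameter or the value violates its constraint
        caps.validate_parameter("temperature", 0.7)
        caps.validate_parameter("reasoning_effort", "medium")
    except OpenAIClientError as e:
        # Error message examples:
        # "Invalid value 'extreme' for parameter 'reasoning_effort'. Description: Controls the model's reasoning depth. Allowed values: low, medium, high"
        # "Parameter 'temperature' must be between 0.0 and 2.0. Description: Controls randomness in the output"
        print(f"Parameter validation failed: {e}")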

Version Validation
~~~~~~~~~~~~~~~~~~

.. code-block:: python

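    # Import path assumed; adjust if your installed version exports these elsewhere
    from openai_structured.errors import InvalidDateError, VersionTooOldError

    try: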
        # Check version compatibility
        caps = registry.get_capabilities("gpt-4o-2024-07-01")
    except VersionTooOldError as e:
        # Error message example:
        # "Model 'gpt-4o-2024-07-01' version 2024-07-01 is too old.
        # Minimum supported version: 2024-08-06
        # Note: Use the alias 'gpt-4o' to always get the latest version"
        print(f"Version too old: {e}")
    except InvalidDateError as e:
        # Error message example:
        # "Invalid date format in model version: Month must be between 1 and 12
        # Use format: YYYY-MM-DD (e.g. 2024-08-06)"
        print(f"Invalid date format: {e}")

Configuration
-------------

Custom Registry Path
~~~~~~~~~~~~~~~~~~~~

You can specify custom paths for the registry configuration:

.. code-block:: bash

    export MODEL_REGISTRY_PATH=/path/to/models.yml
    export PARAMETER_CONSTRAINTS_PATH=/path/to/constraints.yml

Fallback Behavior
~~~~~~~~~~~~~~~~~

The registry includes built-in fallback configurations when the main configuration files are unavailable:

1. Attempts to load from specified paths
2. Falls back to built-in configuration if files are missing
3. Maintains core functionality even without external configuration
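
The lookup order above can be sketched as follows. ``resolve_config_source`` is a hypothetical helper; only the ``MODEL_REGISTRY_PATH`` variable name comes from this page, and the real loader may apply additional search locations.

.. code-block:: python

    import os
    from pathlib import Path


    def resolve_config_source(env=None):
        """Return ("file", path) when the configured file exists, else ("builtin", None)."""
        env = os.environ if env is None else env
        raw_path = env.get("MODEL_REGISTRY_PATH")
        if raw_path and Path(raw_path).is_file():
            return "file", Path(raw_path)
        # Missing variable or missing file: use the built-in configuration.
        return "builtin", None


    source, path = resolve_config_source({"MODEL_REGISTRY_PATH": "/nonexistent/models.yml"})
    print(source)  # builtin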

Updating Registry
~~~~~~~~~~~~~~~~~

The registry can be updated from the official repository using the command line tool:

.. code-block:: bash

    # Basic update with confirmation prompt
    openai-structured-refresh

    # Update with verbose output showing available models
    openai-structured-refresh -v

    # Update from custom URL without confirmation
    openai-structured-refresh -f --url https://example.com/models.yml

    # Validate current configuration without updating
    openai-structured-refresh --validate

    # Check for updates without downloading
    openai-structured-refresh --check

The refresh command will:

1. Download the latest model configurations from the official repository (or custom URL)
2. Validate the configuration format and values
3. Update your local ``models.yml`` file
4. Reload the registry with the new configurations

When using ``-v`` (``--verbose``), you'll see detailed information about each model:

.. code-block:: text

    Available models:
    - gpt-4o-2024-08-06
      Context window: 128000
      Max output tokens: 16384
      Supports streaming: True

    - o1-2024-12-17
      Context window: 200000
      Max output tokens: 100000
      Supports streaming: False

You can also update the registry programmatically:

.. code-block:: python

    from openai_structured import ModelRegistry, RegistryUpdateStatus

    registry = ModelRegistry.get_instance()

    # Check if an update is available (without downloading)
    check_result = registry.check_for_updates()
    if check_result.success and check_result.status == RegistryUpdateStatus.UPDATE_AVAILABLE:
        print("A registry update is available!")
        # Optionally ask the user if they want to update
        user_consents = ask_user_for_consent()  # Your implementation
        if user_consents:
            update_result = registry.refresh_from_remote()
            if update_result.success:
                print("Registry updated successfully")

    # Or directly update without checking first
    result = registry.refresh_from_remote()
    if result.success:
        if result.status == RegistryUpdateStatus.UPDATED:
            print("Registry updated successfully")
        elif result.status == RegistryUpdateStatus.ALREADY_CURRENT:
            print("Registry is already up to date")
    else:
        print(f"Failed to update registry: {result.message}")

Command Line Utilities
----------------------

The library provides command line utilities for managing the model registry:

Update Registry
~~~~~~~~~~~~~~~~~~~

The ``openai-structured-refresh`` command (implemented in ``scripts/update_registry.py``) provides a user-friendly way to update and validate the model registry. Its options are listed under `Updating Registry`_ above.

Cache Metadata
~~~~~~~~~~~~~~~~~~~

The registry uses cache metadata files to optimize network requests:

- Metadata is stored in ``.yml.meta`` files alongside the registry files
- Contains HTTP caching headers like ``ETag`` and ``Last-Modified``
- Enables conditional requests that only download when content has changed
- Reduces bandwidth usage and improves performance
- Automatically managed by the registry

.. code-block:: python

    # The metadata is automatically used when checking for updates
    result = registry.check_for_updates()

    # It's also used when refreshing from remote
    result = registry.refresh_from_remote()

    # The metadata file path is derived from the registry file
    # For example: models.yml -> models.yml.meta
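
For readers unfamiliar with conditional requests, the mechanism looks roughly like this with the standard library. The metadata keys ``etag`` and ``last_modified`` and both function names are assumptions made for this sketch; the actual layout of the ``.meta`` file is not specified here.

.. code-block:: python

    import urllib.error
    import urllib.request


    def conditional_headers(meta):
        """Build HTTP validators from cached metadata (a plain dict in this sketch)."""
        headers = {}
        if meta.get("etag"):
            headers["If-None-Match"] = meta["etag"]
        if meta.get("last_modified"):
            headers["If-Modified-Since"] = meta["last_modified"]
        return headers


    def fetch_if_changed(url, meta):
        """Return the new body as bytes, or None when the server answers 304."""
        request = urllib.request.Request(url, headers=conditional_headers(meta))
        try:
            with urllib.request.urlopen(request, timeout=10) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code == 304:  # Not Modified: the cached copy is still current
                return None
            raise


    print(conditional_headers({"etag": '"abc123"'}))
    # {'If-None-Match': '"abc123"'}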

Update Fallback Models
~~~~~~~~~~~~~~~~~~~~~~

The ``scripts/update_fallbacks.py`` script updates the fallback models in ``model_registry.py`` to match the configuration in ``models.yml``:

.. code-block:: bash

    python scripts/update_fallbacks.py

This script:

1. Reads the current ``models.yml`` configuration
2. Generates Python code for fallback models
3. Updates the fallback section in ``model_registry.py``
4. Maintains proper indentation and formatting
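
One common way to implement this kind of in-place code generation is to rewrite the text between two marker comments. The sketch below assumes such markers exist; the marker strings, the ``FALLBACK_MODELS`` name, and both helper functions are hypothetical and may not match what ``scripts/update_fallbacks.py`` actually does.

.. code-block:: python

    import pprint

    BEGIN = "# BEGIN FALLBACKS"  # hypothetical marker
    END = "# END FALLBACKS"      # hypothetical marker


    def render_fallbacks(models):
        """Render the models dict as a Python assignment statement."""
        return "FALLBACK_MODELS = " + pprint.pformat(models)


    def replace_section(source, body):
        """Replace the lines between BEGIN and END, reusing the marker's indentation."""
        lines = source.splitlines()
        stripped = [line.strip() for line in lines]
        begin, end = stripped.index(BEGIN), stripped.index(END)  # ValueError if absent
        marker = lines[begin]
        indent = marker[: len(marker) - len(marker.lstrip())]
        new_block = [indent + line for line in body.splitlines()]
        return "\n".join(lines[: begin + 1] + new_block + lines[end:]) + "\n"


    SOURCE = (
        "def _fallback():\n"
        "    # BEGIN FALLBACKS\n"
        "    FALLBACK_MODELS = {}\n"
        "    # END FALLBACKS\n"
        "    return FALLBACK_MODELS\n"
    )
    models = {"new-model-2024-03-01": {"context_window": 128000}}
    print(replace_section(SOURCE, render_fallbacks(models)))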

The script is used in two ways:

1. Automatically via GitHub Actions:

   - Triggered when ``models.yml`` changes in the ``main`` or ``next`` branches
   - Creates a PR with the updates
   - Labels the PR as "automated pr" and "dependencies"

2. Manually by developers:

   - Run locally to test changes
   - Verify fallback models match configuration
   - Debug configuration issues

Error Handling:

- Validates file existence
- Reports clear error messages
- Exits with status code 1 on failure

Example workflow:

1. Update ``models.yml`` with new model:

   .. code-block:: yaml

       dated_models:
         new-model-2024-03-01:
           context_window: 128000
           max_output_tokens: 16384
           supports_structured: true
           supports_streaming: true
           supported_parameters:
             - ref: numeric_constraints.temperature
             - ref: numeric_constraints.top_p

2. Run update script:

   .. code-block:: bash

       python scripts/update_fallbacks.py

3. Verify changes in ``model_registry.py``
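
The ``ref`` entries in the YAML above point into the parameter constraints by dotted path. A minimal sketch of how such a reference could be resolved follows; ``resolve_ref`` and the sample ``constraints`` dictionary are illustrative assumptions, not the registry's actual loader or data.

.. code-block:: python

    def resolve_ref(ref, constraints):
        """Follow a dotted reference such as 'numeric_constraints.temperature'."""
        node = constraints
        for key in ref.split("."):
            try:
                node = node[key]
            except (KeyError, TypeError):
                raise KeyError(f"Unknown constraint reference: {ref!r}") from None
        return node


    # Illustrative stand-in for the parsed parameter_constraints.yml
    constraints = {
        "numeric_constraints": {
            "temperature": {"type": "numeric", "min_value": 0.0, "max_value": 2.0},
            "top_p": {"type": "numeric", "min_value": 0.0, "max_value": 1.0},
        }
    }
    supported_parameters = [
        {"ref": "numeric_constraints.temperature"},
        {"ref": "numeric_constraints.top_p"},
    ]

    # Map each parameter name (the last path segment) to its resolved constraint.
    resolved = {
        entry["ref"].rsplit(".", 1)[-1]: resolve_ref(entry["ref"], constraints)
        for entry in supported_parameters
    }
    print(sorted(resolved))  # ['temperature', 'top_p']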

Generate Default Models
~~~~~~~~~~~~~~~~~~~~~~~

The model registry automatically generates default model configurations when external configuration files are unavailable:

1. Built-in fallbacks provide core model support
2. Ensures library works without external files
3. Matches the structure in ``models.yml``

To update the default models:

1. Modify ``models.yml`` with new configuration
2. Run the update script
3. Commit changes to both files