Documentation for ExperimentsManagerWithMinIOBackend

Overview

ExperimentsManagerWithMinIOBackend is a wrapper around MLFlow for fine-tuning experiments whose models are stored in MinIO. It connects to the MLFlow server, manages the models kept in a MinIO bucket, and adds error handling and retry policies specific to MinIO operations.

Parameters

  • tracking_uri (str): URL of the MLFlow server.
  • minio_credentials (dict): Credentials to connect to MinIO.
  • main_metric (str): Primary metric to determine the best model.
  • plugin_name (str): Name of the fine-tuning method.
  • accumulators (list): List of metrics accumulators for logging.
  • is_loss (bool): Set to True if the main metric is a loss, meaning lower values are better and the metric should be minimized.
  • n_top_runs (int): Maximum number of top runs to consider.
  • requirements (list): Extra requirements for MLFlow model logging.
  • retry_config (dict): Retry policy configuration.
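
The interaction between main_metric, is_loss, and n_top_runs can be illustrated with a small ranking sketch. This is not the class's actual code: the helper name top_runs and the plain dict of metric values are assumptions made for illustration only.

```python
from typing import Dict, List


def top_runs(metric_by_run: Dict[str, float], is_loss: bool, n_top_runs: int) -> List[str]:
    """Rank run IDs by their main metric and keep at most n_top_runs of them.

    Hypothetical helper: when is_loss is True, lower values rank first;
    otherwise higher values rank first.
    """
    ranked = sorted(
        metric_by_run.items(),
        key=lambda item: item[1],
        reverse=not is_loss,  # descending order when the metric is maximized
    )
    return [run_id for run_id, _ in ranked[:n_top_runs]]


accuracy_by_run = {"run-a": 0.90, "run-b": 0.70, "run-c": 0.95}
best_by_accuracy = top_runs(accuracy_by_run, is_loss=False, n_top_runs=2)
```

With is_loss=False the two highest-accuracy runs are kept; passing is_loss=True over the same values would keep the two lowest instead.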

Usage

  • Purpose: Manage fine-tuning experiments with a MinIO storage backend. This class inherits from ExperimentsManager to reuse common tracking features.

Example

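# minio_credentials and MetricsAccumulator are assumed to be defined or imported elsewhere.
manager = ExperimentsManagerWithMinIOBackend(
    tracking_uri="http://mlflow-server",
    minio_credentials=minio_credentials,
    main_metric="accuracy",
    plugin_name="fine-tuning",
    accumulators=[MetricsAccumulator()],
    is_loss=False,  # accuracy is maximized, not minimized
    n_top_runs=10,
)
# Delete the model files stored on MinIO for a run; returns True on success.
deleted = manager._delete_model("run-id", "experiment-id")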

ExperimentsManagerWithMinIOBackend.is_retryable_error

Functionality

Checks whether an exception represents a server-side failure that is worth retrying. An exception is considered retryable only if it is of type ServerError and its status code is in the 5xx range (500 to 599). Any other exception is not retryable.
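
A minimal sketch of this check is shown below. The ServerError class here is a local stand-in with a status_code attribute, defined only so the snippet runs on its own; the real class is expected to use the error type from the MinIO client, and treating both 500 and 599 as inside the range is an assumption.

```python
class ServerError(Exception):
    """Stand-in for the MinIO client's server error type (assumption)."""

    def __init__(self, message: str, status_code: int):
        super().__init__(message)
        self.status_code = status_code


def is_retryable_error(e: Exception) -> bool:
    """Return True only for ServerError instances with a 5xx status code."""
    return isinstance(e, ServerError) and 500 <= e.status_code <= 599


unavailable = is_retryable_error(ServerError("service unavailable", 503))
not_found = is_retryable_error(ServerError("not found", 404))
unrelated = is_retryable_error(ValueError("bad input"))
```

Here only the 503 error is classified as retryable; the 404 error and the unrelated ValueError are not.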

Parameters

  • e (Exception): The exception encountered during an operation.

Returns

  • bool: True if the error is retryable; otherwise, False.

Usage

  • Purpose: To determine if an error from MinIO operations might be temporary and thus eligible for a retry.

Example

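# perform_operation, retry_operation, and handle_failure are placeholders.
try:
    perform_operation()
except Exception as e:
    if manager.is_retryable_error(e):
        retry_operation()  # temporary server-side failure, try again
    else:
        handle_failure()  # permanent failure, do not retry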
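
The pattern above can be generalized into a small retry loop driven by a retryability predicate. This is a sketch under assumptions, not the library's retry mechanism: call_with_retries, max_attempts, and wait_seconds are hypothetical names, and the structure of retry_config is not described in this page.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    is_retryable: Callable[[Exception], bool],
    max_attempts: int = 3,
    wait_seconds: float = 0.0,
) -> T:
    """Call operation, retrying while is_retryable(e) is True.

    Re-raises the exception once attempts are exhausted or when the
    error is not retryable.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as e:
            if attempt == max_attempts or not is_retryable(e):
                raise
            time.sleep(wait_seconds)  # fixed pause between attempts
    raise ValueError("max_attempts must be at least 1")


# Hypothetical flaky operation: fails twice, then succeeds.
attempts = []


def flaky_operation() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = call_with_retries(flaky_operation, lambda e: isinstance(e, ConnectionError))
```

In practice the predicate would be manager.is_retryable_error, so that only temporary server-side failures trigger another attempt.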

ExperimentsManagerWithMinIOBackend._delete_model

Functionality

Deletes model files stored on MinIO using the MLFlow artifact URI. It extracts the object path from the artifact URI and calls the MinIO client to remove the object from the designated bucket. The method logs the outcome and returns True if the deletion was successful, or False if an error occurred.
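
The path extraction and deletion steps can be sketched as follows. This assumes an artifact URI of the form s3://bucket/path and takes the bucket name from the URI, whereas the real class may read the bucket from its own configuration. The helper names split_artifact_uri and delete_model_object are hypothetical; client is any object exposing remove_object(bucket_name, object_name), as the MinIO Python client does.

```python
import logging
from typing import Tuple
from urllib.parse import urlparse

logger = logging.getLogger(__name__)


def split_artifact_uri(artifact_uri: str) -> Tuple[str, str]:
    """Split an s3://bucket/path URI into (bucket, object_path)."""
    parsed = urlparse(artifact_uri)
    return parsed.netloc, parsed.path.lstrip("/")


def delete_model_object(client, artifact_uri: str) -> bool:
    """Remove the object behind artifact_uri; return True on success."""
    bucket, object_path = split_artifact_uri(artifact_uri)
    try:
        client.remove_object(bucket, object_path)
    except Exception as exc:  # the real method may catch narrower error types
        logger.error("Failed to delete %s from %s: %s", object_path, bucket, exc)
        return False
    logger.info("Deleted %s from %s", object_path, bucket)
    return True


bucket, object_path = split_artifact_uri("s3://models/1/abc123/artifacts/model")
```

For the URI above, bucket is "models" and object_path is "1/abc123/artifacts/model".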

Parameters

  • run_id (str): The MLFlow run identifier to locate the model.
  • experiment_id (str): The MLFlow experiment identifier (not used in deletion).

Usage

  • Purpose: To remove stored model files on MinIO for a given MLFlow run.

Example

Given a valid run_id and experiment_id, the method retrieves run info, computes the object path, and attempts to remove the model from the bucket. A successful deletion returns True, while a NoSuchKey error returns False.