Skip to content

mlflow_with_local_backend

Documentation for ExperimentsManagerWithLocalFileSystem

Functionality

The ExperimentsManagerWithLocalFileSystem class serves as a wrapper over the MLflow package to manage fine-tuning experiments using a local file system backend. It extends ExperimentsManager to add local artifact management and deletion capabilities with a retry policy for robustness. The class allows for tracking and managing fine-tuning experiments while facilitating local operations, such as the deletion of model files.

Parameters

  • tracking_uri (str): URL of the MLflow server.
  • main_metric (str): Primary metric name for evaluating models.
  • plugin_name (str): Fine-tuning method identifier.
  • accumulators (list): List of MetricsAccumulator instances for logging metrics.
  • is_loss (bool): Boolean flag indicating if the main metric represents loss.
  • n_top_runs (int): Number of top runs to consider in further tuning steps.
  • requirements (str): Additional requirements for model logging.
  • retry_config: Configuration for retrying failed operations.

Inheritance

Inherits from ExperimentsManager to reuse core experiment management functionality while adding local file system support.

Method: _delete_model

This method deletes model files associated with a given run. It retrieves the artifact URI from MLflow and uses the LocalArtifactRepository to remove all related files from the local file system.

Parameters

  • run_id (str): Identifier of the run whose model files will be deleted.
  • experiment_id (str): Identifier of the experiment to which the run belongs.

Usage

  • Purpose: Clean up local model files after an experiment run, freeing disk space and managing storage.

Example

Create an instance of ExperimentsManagerWithLocalFileSystem to manage an experiment with local storage:

experiments_manager = ExperimentsManagerWithLocalFileSystem(
    tracking_uri="http://localhost:5000",
    main_metric="accuracy",
    plugin_name="finetune",
    accumulators=[accumulator1, accumulator2],
    is_loss=False,
    n_top_runs=10,
    requirements="torch>=1.7.1",
    retry_config=your_retry_config
)

result = experiments_manager._delete_model(
    "some_run_id", "some_experiment_id"
)
if result:
    print("Model files deleted successfully.")
else:
    print("Model file deletion failed.")