Documentation for mark_same_query_and_items and convert_for_triton

mark_same_query_and_items

Functionality

Marks the query and items models as identical by creating a flag file in the query model directory.

Parameters

  • query_model_path (str): The path to the query model directory. A file named "same_query" is created in this directory to signal that the models are the same.

Usage

  • Purpose: To flag that a single model is used for both queries and items.

Example

Simple usage example:

mark_same_query_and_items("/path/to/query/model")
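The flag-file mechanism described above can be sketched in a few lines. This is a minimal illustration, not the library's actual implementation: the function names mark_same_sketch and uses_same_model are hypothetical, and the sketch assumes the flag is an empty file whose only meaning is its presence.

```python
from pathlib import Path

# File name taken from the parameter description above.
FLAG_FILE_NAME = "same_query"


def mark_same_sketch(query_model_path: str) -> Path:
    """Create an empty flag file in the query model directory (hypothetical sketch)."""
    flag_path = Path(query_model_path) / FLAG_FILE_NAME
    # Assumption: create the directory if it does not exist yet.
    flag_path.parent.mkdir(parents=True, exist_ok=True)
    flag_path.touch(exist_ok=True)
    return flag_path


def uses_same_model(query_model_path: str) -> bool:
    """Return True when the flag file is present (hypothetical helper)."""
    return (Path(query_model_path) / FLAG_FILE_NAME).is_file()
```

A consumer could call uses_same_model before loading the items model and reuse the query model when it returns True.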


convert_for_triton

Functionality

Prepares and deploys a model for use with the Triton Inference Server. It dynamically selects a GPU, processes and saves both the query and items models, and creates the configuration files that Triton requires.

Parameters

  • model (EmbeddingsModelInterface): The model interface that provides access to the query and items models.
  • plugin_name: The name used when creating directories and files.
  • model_repo: The path to the repository where model versions are stored.
  • model_version: The version number of the model to be saved.
  • embedding_model_id: A unique identifier for the model.
  • embedding_studio_path: Optional. The root path for the studio; defaults to "/embedding_studio".

Usage

  • Purpose: Deploys a traced model with its corresponding configuration for the Triton Inference Server.

Example

convert_for_triton(model, "plugin_example", "/path/to/repo", 1, "model123", "/embedding_studio")
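To show where the saved files are likely to end up, the sketch below builds the paths that a Triton model repository conventionally uses: a config.pbtxt file in the model directory and the model file in a numbered version subdirectory. It is an illustration under stated assumptions, not the behavior of convert_for_triton itself. The helper triton_model_paths and the model_name argument are hypothetical (the source does not say how directory names are derived from plugin_name and embedding_model_id), and the file name model.pt assumes a traced TorchScript model served with Triton's PyTorch backend.

```python
from pathlib import Path


def triton_model_paths(model_repo: str, model_name: str, model_version: int) -> dict:
    """Return the conventional Triton repository locations for one model (sketch)."""
    model_dir = Path(model_repo) / model_name
    return {
        # Triton reads the model configuration from the model directory.
        "config": model_dir / "config.pbtxt",
        # Assumption: TorchScript model saved under the default file name.
        "weights": model_dir / str(model_version) / "model.pt",
    }


# Hypothetical model name, used only for illustration.
paths = triton_model_paths("/path/to/repo", "plugin_example_query", 1)
```

When the query and items models differ, each would presumably get its own model directory in the repository.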