Documentation for `mark_same_query_and_items` and `convert_for_triton`

`mark_same_query_and_items`

Functionality

Marks the query and items models as identical by creating a flag file in the query model directory. The presence of this file signals that a single model serves both roles.

Parameters

query_model_path (str): The path to the query model directory. A file named "same_query" is created in this directory to signal that the models are the same.

Usage

Purpose: To flag that a single model is used for both queries and items.

Example

Simple usage example:

mark_same_query_and_items("/path/to/query/model")
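
To make the flag-file idea concrete, here is a minimal sketch of how such a marker could be written and later checked. It is not the library's implementation: the names `mark_same_query_and_items_sketch` and `uses_same_model` are hypothetical, and the sketch assumes an empty file is enough because only the file's existence matters. The file name "same_query" comes from the parameter description above.

```python
import os
import tempfile

# File name taken from the parameter description above.
FLAG_FILE_NAME = "same_query"


def mark_same_query_and_items_sketch(query_model_path: str) -> str:
    """Hypothetical writer: create an empty flag file in the query model directory."""
    os.makedirs(query_model_path, exist_ok=True)
    flag_path = os.path.join(query_model_path, FLAG_FILE_NAME)
    with open(flag_path, "w"):
        pass  # assumption: the file is empty, only its existence is checked
    return flag_path


def uses_same_model(query_model_path: str) -> bool:
    """Hypothetical reader: report whether the flag file is present."""
    return os.path.isfile(os.path.join(query_model_path, FLAG_FILE_NAME))


if __name__ == "__main__":
    query_dir = os.path.join(tempfile.mkdtemp(), "query")
    print(uses_same_model(query_dir))   # no flag written yet
    mark_same_query_and_items_sketch(query_dir)
    print(uses_same_model(query_dir))   # flag file now exists
```

A marker file like this lets a consumer decide whether to load one model or two by checking the directory, without parsing any configuration.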

`convert_for_triton`

Functionality

Prepares and deploys a model for use with the Triton Inference Server. It dynamically selects a GPU, then processes and saves both the query and items models, and creates the configuration files Triton needs to serve them.

Parameters

model (EmbeddingsModelInterface): The model interface that provides access to the query and items models.
plugin_name: The name used when creating directories and files.
model_repo: The file path to the repository where model versions are stored.
model_version: The version number of the model to be saved.
embedding_model_id: A unique identifier for the model.
embedding_studio_path: Optional. The root path for the studio; defaults to /embedding_studio.

Usage

Purpose: Deploys a traced model with corresponding configuration for Triton Inference Server.

Example

convert_for_triton(model, "plugin_example", "/path/to/repo", 1, "model123", "/embedding_studio")
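
For orientation, the sketch below shows the kind of on-disk layout Triton expects from a model repository: one directory per model, a numbered version subdirectory, and a `config.pbtxt` next to it. It is only an illustration under stated assumptions, not what `convert_for_triton` actually does. The helper names, the `<plugin_name>_<embedding_model_id>_<role>` naming scheme, the single INT64 input, and the embedding size are all hypothetical. The `pytorch_libtorch` platform and the `INPUT__0` / `OUTPUT__0` naming follow Triton's convention for traced PyTorch models.

```python
import os
import tempfile


def build_triton_config(model_name: str, embedding_dim: int, max_batch_size: int = 8) -> str:
    """Render a minimal config.pbtxt for a traced PyTorch model (assumed shapes)."""
    return (
        f'name: "{model_name}"\n'
        'platform: "pytorch_libtorch"\n'
        f"max_batch_size: {max_batch_size}\n"
        "input [\n"
        "  {\n"
        '    name: "INPUT__0"\n'
        "    data_type: TYPE_INT64\n"
        "    dims: [ -1 ]\n"  # assumption: one variable-length token-id input
        "  }\n"
        "]\n"
        "output [\n"
        "  {\n"
        '    name: "OUTPUT__0"\n'
        "    data_type: TYPE_FP32\n"
        f"    dims: [ {embedding_dim} ]\n"
        "  }\n"
        "]\n"
    )


def write_triton_layout(
    model_repo: str,
    plugin_name: str,
    embedding_model_id: str,
    model_version: int,
    role: str,
    embedding_dim: int,
) -> str:
    """Create <repo>/<model_name>/<version>/ and write config.pbtxt beside it."""
    # Hypothetical naming scheme; the real function may name directories differently.
    model_name = f"{plugin_name}_{embedding_model_id}_{role}"
    version_dir = os.path.join(model_repo, model_name, str(model_version))
    os.makedirs(version_dir, exist_ok=True)
    # A traced model would be saved as model.pt inside version_dir.
    config_path = os.path.join(model_repo, model_name, "config.pbtxt")
    with open(config_path, "w") as f:
        f.write(build_triton_config(model_name, embedding_dim))
    return config_path


if __name__ == "__main__":
    repo = tempfile.mkdtemp()
    for role in ("query", "items"):
        print(write_triton_layout(repo, "plugin_example", "model123", 1, role, 768))
```

Keeping the version number as a directory name is what lets Triton serve several versions of the same model side by side from one repository.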
