Documentation for handle_deployment

Functionality
The handle_deployment method manages the deployment of embedding models to the Triton Inference Server. It retrieves the deployment task, validates the model and its corresponding plugin, verifies that the model exists and is compatible with a supported plugin, enforces deployment limits, downloads the requested model iteration from MLflow, converts it to a Triton-compatible format, and deploys it, using file locking to prevent concurrent deployments.
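The snippet below is a minimal sketch of that workflow, not the actual implementation. The helpers get_deployment_task, is_supported_plugin, count_deployed_models, convert_to_triton, and load_into_triton are hypothetical stand-ins for internal logic, and the lock path and deployment limit are assumed values; only mlflow.artifacts.download_artifacts and filelock.FileLock are real library calls.

```python
from pathlib import Path

import mlflow
from filelock import FileLock

MAX_DEPLOYED_MODELS = 5                 # assumed deployment limit
LOCK_PATH = "/tmp/triton_deploy.lock"   # assumed lock file location


def handle_deployment(task_id: str) -> None:
    # 1. Retrieve the deployment task and the model it references.
    task = get_deployment_task(task_id)      # hypothetical task lookup
    model = task.model

    # 2. Validate the model and verify it uses a supported plugin.
    if model is None or not is_supported_plugin(model.plugin):
        raise ValueError(f"Model or plugin for task {task_id} is not supported")

    # 3. Enforce the deployment limit.
    if count_deployed_models() >= MAX_DEPLOYED_MODELS:
        raise RuntimeError("Deployment limit reached")

    # 4-6. Download the model iteration from MLflow, convert it for Triton,
    # and deploy it, holding a file lock so concurrent deployments cannot interleave.
    with FileLock(LOCK_PATH):
        local_path = mlflow.artifacts.download_artifacts(
            artifact_uri=task.artifact_uri,              # assumed to point at the iteration
            dst_path=str(Path("/models") / model.name),
        )
        triton_dir = convert_to_triton(local_path)       # hypothetical conversion step
        load_into_triton(model.name, triton_dir)         # hypothetical Triton load call
```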
Parameters
task_id
: A string representing the deployment task ID.
Usage
- Purpose: Automate the complete workflow for deploying an embedding model to Triton Inference Server safely and efficiently.
Example
handle_deployment("your_task_id")