inference_deployment_tasks Documentation¶
Method: deploy¶
Functionality¶
Deploy a model using the green deployment strategy. This function receives deployment details and creates a model deployment task using the application context. It forwards the task to a background process.
Parameters¶
- body: The ModelDeploymentRequest containing model details and configuration for deploying a model.
Usage¶
- Purpose: Create and send a task for model deployment, then return a standardized response.
Example¶
Request:
{
"model_id": "abc123",
"config": { ... }
}
Response:
{
"status": "success",
"deployed_model_id": "abc123"
}
Method: get_deploy_task¶
Functionality¶
Retrieves details of a specific deployment task using a task ID. Returns a ModelDeploymentResponse
with the task data if found. If the task is not found, raises an HTTPException (404).
Parameters¶
- task_id: A string, unique ID of the deployment task.
Usage¶
- Purpose: Retrieve information on a deployment task.
Example¶
For task ID "123", a GET request to /deploy/123
will return the deployment task details.
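The lookup-or-404 behavior can be sketched as follows. `HTTPException` here is a stdlib stand-in for `fastapi.HTTPException`, and the task store is a hypothetical dict, not the real application context:

```python
# Minimal sketch of the lookup-or-404 pattern used by this endpoint.
class HTTPException(Exception):
    def __init__(self, status_code: int, detail: str):
        self.status_code = status_code
        self.detail = detail
        super().__init__(detail)

# Illustrative in-memory task store.
deployment_tasks = {"123": {"task_id": "123", "status": "done"}}

def get_deploy_task(task_id: str) -> dict:
    task = deployment_tasks.get(task_id)
    if task is None:
        raise HTTPException(status_code=404, detail="Task not found")
    return task  # formatted as a ModelDeploymentResponse in the real endpoint

print(get_deploy_task("123"))  # {'task_id': '123', 'status': 'done'}
```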
Method: get_model_deploy_status¶
Functionality¶
This function retrieves the deployment details for a given embedding model. It obtains a deployment task based on the model ID and returns a formatted ModelDeploymentResponse
if found. If no task is present, it raises an HTTPException.
Parameters¶
- embedding_model_id: A string representing the model's unique ID.
Usage¶
- Purpose: Retrieve deployment details for a specific embedding model using its unique identifier.
Example¶
A GET request to the endpoint /deploy-status/{embedding_model_id}
returns the status for the specified model.
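Looking up a task by model ID rather than task ID can be sketched like this; the task list, field names, and "newest task wins" rule are assumptions for illustration, not the actual embedding_studio logic:

```python
# Sketch: find the most recent deployment task for a given embedding model ID.
class HTTPException(Exception):
    def __init__(self, status_code: int, detail: str):
        self.status_code = status_code
        self.detail = detail
        super().__init__(detail)

# Hypothetical task records keyed by model rather than task ID.
tasks = [
    {"embedding_model_id": "abc123", "status": "pending", "created_at": 1},
    {"embedding_model_id": "abc123", "status": "deployed", "created_at": 2},
]

def get_model_deploy_status(embedding_model_id: str) -> dict:
    matching = [t for t in tasks if t["embedding_model_id"] == embedding_model_id]
    if not matching:
        raise HTTPException(404, "No deployment task for this model")
    return max(matching, key=lambda t: t["created_at"])  # newest task wins

print(get_model_deploy_status("abc123")["status"])  # deployed
```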
Method: delete¶
Functionality¶
Delete a model by scheduling a deletion task. The function logs the request, creates a deletion task using the provided request body, and dispatches the task through a worker. On success, it returns the updated task details.
Parameters¶
- body: A ModelDeletionRequest object that contains the details required to delete the model.
Usage¶
- Purpose: To submit a deletion request and process the deletion asynchronously using a worker task.
Example¶
from fastapi import FastAPI
from fastapi.testclient import TestClient
from embedding_studio.api.api_v1.internal_endpoints.inference_deployment_tasks import router

# TestClient requires an ASGI app, so mount the router on a FastAPI instance.
app = FastAPI()
app.include_router(router)
client = TestClient(app)
data = {
"model_id": "your_model_id",
"reason": "No longer needed"
}
response = client.post("/delete", json=data)
print(response.json())
Method: get_delete_task¶
Functionality¶
Retrieves the details of a specific deletion task based on the provided task identifier. If found, returns the task details in a formatted response; otherwise, raises an HTTP 404 error.
Parameters¶
- task_id: The unique identifier of the deletion task.
Usage¶
- Purpose: To obtain detailed information about a deletion task.
Example¶
Assuming a task with ID "123", calling:
get_delete_task("123")
returns the formatted details of that deletion task.
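From a caller's perspective, the two possible outcomes (task details or an HTTP 404) can be sketched as a status-code/payload pair; the function name, store, and detail message are hypothetical:

```python
# Hypothetical client-side view of the endpoint's two outcomes.
def fetch_delete_task(task_id: str, store: dict) -> tuple:
    """Return (status_code, payload) the way an HTTP client would see it."""
    task = store.get(task_id)
    if task is None:
        return 404, {"detail": "Deletion task not found"}
    return 200, task

store = {"123": {"task_id": "123", "status": "processing"}}
print(fetch_delete_task("123", store))  # (200, {'task_id': '123', 'status': 'processing'})
print(fetch_delete_task("999", store))  # (404, {'detail': 'Deletion task not found'})
```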
Method: get_model_delete_status¶
Functionality¶
Retrieves deletion task details using the model identifier. It accesses the deployment task context to fetch a deletion task and returns a formatted response. If no task is found, an HTTP exception is raised.
Parameters¶
- embedding_model_id: ID of the model for which the deletion task status is retrieved.
Usage¶
- Purpose: To obtain the current deletion request status for a model by providing its embedding model ID.
Example¶
response = get_model_delete_status("model_id123")
print(response)