jit_trace_manager

Documentation for JitTraceTritonModelStorageManager

Functionality

The JitTraceTritonModelStorageManager class serializes a PyTorch model by applying JIT tracing with an example input. It produces a static computational graph optimized for inference and prepares the model for deployment on Triton Inference Server.
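
To illustrate the underlying mechanism, here is a standalone sketch of JIT tracing with torch.jit.trace; the model and input are illustrative placeholders, not part of this library:

import torch

# Illustrative model; tracing records the operations executed for the
# example input and freezes them into a static TorchScript graph.
class TinyNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(16, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))

model = TinyNet().eval()
example = torch.randn(1, 16)
traced = torch.jit.trace(model, example)  # static graph, loadable without Python code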

Parameters

  • storage_info: Configuration object describing model storage location and naming.
  • do_dynamic_batching: Boolean flag that enables dynamic batching in the generated Triton configuration.

Inheritance

Inherits from TritonModelStorageManager, reusing its directory setup and configuration generation methods.

Usage

  • Purpose: To trace models into static graphs for efficient inference. Tracing records only the operations executed for the example input, so it is best suited to models without data-dependent control flow.

Example

Instantiate JitTraceTritonModelStorageManager and call its _save_model method with the model and a sample input to produce a traced model ready for Triton deployment, as sketched below.
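
A minimal sketch, assuming storage_info has already been constructed for your model repository; the module and input shape are placeholders:

import torch

# storage_info is assumed to be built elsewhere for your repository
# layout; the model and tensor name below are placeholders.
model = torch.nn.Sequential(torch.nn.Linear(16, 4), torch.nn.ReLU()).eval()
manager = JitTraceTritonModelStorageManager(storage_info, do_dynamic_batching=True)
manager._save_model(model, {"input": torch.randn(1, 16)})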


Method: _get_model_artifacts

Functionality

Returns a list of artifact filenames expected in a model directory. For a JIT-traced model, it includes only the traced model file.

Parameters

This method does not take any parameters.

Return Value

A list of strings representing model artifact filenames. It returns ["model.pt"].

Usage

Call this method to determine which files must be present in the model directory for Triton. For this manager, only the JIT-traced model file is expected.

Example
artifacts = manager._get_model_artifacts()
# Expected output: ["model.pt"]

Method: _generate_triton_config_model_info

Functionality

Generates Triton configuration lines for the PyTorch LibTorch backend. This method sets the model name, platform, and maximum batch size using the storage info provided to the manager.

Parameters

This method does not require parameters. It retrieves configuration settings from the instance's _storage_info attribute.

Usage

Purpose: To provide a configuration snippet for Triton when serving a JIT-traced PyTorch model.

Example
manager = JitTraceTritonModelStorageManager(storage_info, do_dynamic_batching=True)
config = manager._generate_triton_config_model_info()
print(config)
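# Illustrative output, assuming the returned lines render as
# config.pbtxt fields; actual values come from storage_info, and
# "pytorch_libtorch" is Triton's platform string for the LibTorch backend:
# name: "my_model"
# platform: "pytorch_libtorch"
# max_batch_size: 8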

Method: _save_model

Functionality

This method saves a PyTorch model using JIT tracing. It converts the model into an optimized serialized form that can be loaded by Triton's LibTorch backend.

Parameters

  • model: The PyTorch nn.Module to trace.
  • example_inputs: A dictionary mapping input names to example tensors.

Usage

  • Purpose: Trace and save a model in a format optimized for inference with Triton.

Example

Assuming a properly prepared model and inputs:

manager._save_model(model, {"input": input_tensor})
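
To sanity-check the result, the traced artifact can be loaded back with TorchScript. The path and input shape here are illustrative; the actual file location is determined by storage_info:

import torch

# Path is illustrative; _save_model writes model.pt under the model
# directory chosen by storage_info.
reloaded = torch.jit.load("model.pt")
output = reloaded(torch.randn(1, 16))  # placeholder shape matching the tracing input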