e5
Documentation for E5ModelSimplifiedWrapper¶
Functionality¶
The E5ModelSimplifiedWrapper class is a simple adapter for E5 models. It wraps a given torch.nn.Module (typically an underlying transformer model) and post-processes its output. In the forward pass, it aggregates token-level embeddings via attention-mask-aware average pooling over the last hidden state, then applies L2 normalization. This ensures that the model produces consistent, unit-length embeddings for use in various text-to-text scenarios.
Motivation¶
The wrapper simplifies integration of diverse E5 model implementations by unifying the output processing mechanism. It abstracts the average pooling and normalization steps, allowing users to focus on higher-level application logic without handling low-level details.
Inheritance¶
E5ModelSimplifiedWrapper inherits from torch.nn.Module, making it fully compatible with PyTorch neural network modules and pipelines.
Functionality of E5ModelSimplifiedWrapper.forward¶
This method performs a forward pass on the simplified E5 model. It runs the underlying transformer, applies the attention mask to the last hidden state, sums the masked token embeddings, divides by the number of valid tokens (average pooling), and L2-normalizes the result.
Parameters¶
input_ids: Tensor of token identifiers for the input sequence.
attention_mask: Tensor indicating valid tokens in the input.
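The pooling and normalization described above can be sketched in plain PyTorch. This is a minimal illustration of masked average pooling followed by L2 normalization, not the wrapper's actual source:

```python
import torch

def average_pool(last_hidden_state: torch.Tensor,
                 attention_mask: torch.Tensor) -> torch.Tensor:
    # Zero out embeddings of padding tokens before summing
    mask = attention_mask.unsqueeze(-1).float()
    summed = (last_hidden_state * mask).sum(dim=1)
    # Divide by the number of valid tokens to get the mean
    counts = mask.sum(dim=1).clamp(min=1e-9)
    pooled = summed / counts
    # L2-normalize so embeddings lie on the unit sphere
    return torch.nn.functional.normalize(pooled, p=2, dim=1)

hidden = torch.randn(2, 4, 8)                     # (batch, tokens, dim)
mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])
emb = average_pool(hidden, mask)
print(emb.shape)                                  # torch.Size([2, 8])
```

Because of the final normalization step, every output row has unit norm, which makes cosine similarity equivalent to a dot product.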
Usage Example¶
import torch
from embedding_studio.embeddings.models.text_to_text.e5 import E5ModelSimplifiedWrapper
# Assuming transformer_model is a pretrained model instance
model = E5ModelSimplifiedWrapper(transformer_model)
input_ids = torch.tensor([[101, 2057, 2024, 102]])
attention_mask = torch.tensor([[1, 1, 1, 1]])
embeddings = model(input_ids, attention_mask)
print(embeddings)
Documentation for TextToTextE5Model¶
Functionality¶
TextToTextE5Model is a wrapper that standardizes the use of E5 models for text-to-text search embeddings. It supports both SentenceTransformer and AutoModel variants, enabling a unified interface for generating embeddings.
Inheritance¶
This class inherits from EmbeddingsModelInterface, ensuring compatibility with the broader embedding framework.
Motivation¶
The class simplifies the management of model and tokenizer setup for E5 models. It hides the underlying complexity and allows users to focus on generating accurate text embeddings.
Usage Example¶
from sentence_transformers import SentenceTransformer
from embedding_studio.embeddings.models.text_to_text.e5 import TextToTextE5Model
model = TextToTextE5Model(SentenceTransformer('intfloat/multilingual-e5-large'))
Methods¶
get_query_model¶
This method returns the E5 model used for processing query inputs. Since both queries and items use the same model, it provides the underlying model instance wrapped by E5ModelSimplifiedWrapper.
- Return Value: Returns a torch.nn.Module instance representing the query model.
get_items_model¶
Returns the model for processing items. Since query and items use the same underlying model, this method returns the E5 model component.
- Parameters: None.
- Purpose: Retrieve the items model for embedding computations.
get_query_model_params¶
Retrieve an iterator over parameters of the query model. This iterator is useful for training and optimization tasks.
- Return Value: Returns an iterator over the model parameters.
get_items_model_params¶
Returns an iterator over the parameters of the items model. Since query and items share the same model, it calls the same parameters iterator as get_query_model_params.
- Parameters: None.
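These parameter iterators plug directly into a PyTorch optimizer for fine-tuning. A minimal sketch of the pattern, using a plain nn.Linear as a hypothetical stand-in for the shared query/items model:

```python
import torch

# Hypothetical stand-in for the shared query/items model (illustration only)
shared_model = torch.nn.Linear(16, 16)

def get_query_model_params():
    # Mirrors the method's contract: an iterator over the model's parameters
    return shared_model.parameters()

# The iterator can be handed straight to an optimizer
optimizer = torch.optim.AdamW(get_query_model_params(), lr=1e-5)
print(len(optimizer.param_groups[0]["params"]))  # 2 (weight and bias)
```

Because queries and items share one model, passing either iterator to the optimizer updates the same underlying weights.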
is_named_inputs¶
This property indicates that the E5 model expects its inputs as named (keyword) arguments. It always returns True.
- Parameters: None.
get_query_model_inputs¶
Creates a sample input dictionary with tokenized text, including input_ids and attention_mask tensors. It uses max_length padding and truncation.
- Parameters:
device: Optional device to place tensors on.
get_items_model_inputs¶
Generates example inputs for the items model used for model tracing.
- Parameters:
device: Optional device where tensors are placed.
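Both methods return a dictionary of tensors suitable for tracing. A rough sketch of the shape of such a dictionary (the sequence length, pad values, and helper name here are illustrative assumptions, not taken from the library):

```python
import torch

def make_example_inputs(device=None, seq_len=8):
    # Illustrative: mimics the structure returned by get_query_model_inputs /
    # get_items_model_inputs, i.e. tokenized text padded to a fixed length
    input_ids = torch.randint(low=5, high=1000, size=(1, seq_len))
    attention_mask = torch.ones(1, seq_len, dtype=torch.long)
    inputs = {"input_ids": input_ids, "attention_mask": attention_mask}
    if device is not None:
        inputs = {k: v.to(device) for k, v in inputs.items()}
    return inputs

example = make_example_inputs()
print(sorted(example))  # ['attention_mask', 'input_ids']
```

A dictionary keyed by argument name matches the named-inputs convention reported by is_named_inputs.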
get_query_model_inference_manager_class¶
This method returns the Triton model storage manager class to be used for query model inference.
- Parameters: None.
get_items_model_inference_manager_class¶
Returns the class that manages items model inference in Triton.
- Parameters: None.
fix_query_model¶
Fixes specific layers of the query model by freezing its embeddings and a given number of bottom encoder layers during fine-tuning.
- Parameters:
num_fixed_layers: The number of bottom encoder layers to fix.
unfix_query_model¶
Unfixes all layers in the query model to enable gradient updates.
- Parameters: None.
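Freezing and unfreezing work by toggling gradient computation on the selected submodules. A minimal sketch of the idea with a toy stack of layers (illustrative only; the real method also freezes the embedding layer):

```python
import torch

# Toy "encoder" stack standing in for the E5 transformer's layers
layers = torch.nn.ModuleList(torch.nn.Linear(8, 8) for _ in range(4))

def fix_layers(num_fixed_layers: int):
    # Freeze the bottom num_fixed_layers layers: no gradient updates
    for layer in layers[:num_fixed_layers]:
        for p in layer.parameters():
            p.requires_grad = False

def unfix_layers():
    # Re-enable gradient updates for every layer
    for p in layers.parameters():
        p.requires_grad = True

fix_layers(2)
print([next(l.parameters()).requires_grad for l in layers])  # [False, False, True, True]
unfix_layers()
print(all(p.requires_grad for p in layers.parameters()))     # True
```

Freezing lower layers keeps general-purpose representations intact while fine-tuning only the upper layers, which reduces memory use and the risk of catastrophic forgetting.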
fix_item_model¶
Fixes a specific number of layers in the item model during fine-tuning.
- Parameters:
num_fixed_layers: Number of layers to freeze from the bottom of the model.
unfix_item_model¶
Unfixes all layers of the item model by enabling gradient updates for every layer.
- Parameters: None.
tokenize¶
The tokenize method converts a text query or a list of queries into a tokenized dictionary format for model processing.
- Parameters:
query: A text query or a list of queries to be tokenized.
forward_query¶
Processes a text query through the model and returns an embedding. It prepends "query: " to the input string before tokenizing.
- Parameters:
query: A string representing the text query to encode.
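The prefixing step matters: E5 models are trained to see search queries marked with a "query: " prefix. A tiny sketch of what forward_query does to the text before tokenization (the tokenization and model call themselves are omitted; the helper name is hypothetical):

```python
def prepare_query_text(query: str) -> str:
    # forward_query prepends "query: " before tokenizing, per the E5 convention
    return "query: " + query

print(prepare_query_text("best hiking boots"))  # query: best hiking boots
```

Skipping this prefix when encoding queries by hand typically degrades retrieval quality, since the model distinguishes queries from passages by this marker.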
forward_items¶
Processes a list of text items by tokenizing them and running the tokens through the underlying E5 model.
- Parameters:
items: A list of strings, where each string is a text item to encode.