e5

Documentation for E5ModelSimplifiedWrapper

Functionality

The E5ModelSimplifiedWrapper class is a thin adapter for E5 models. It wraps a given torch.nn.Module (typically the underlying transformer model) and post-processes its output. In the forward pass, it averages the token-level embeddings of the last hidden state over the positions marked valid by the attention mask, then applies L2 normalization. The result is one unit-length embedding per input sequence, suitable for text-to-text search.

Motivation

The wrapper simplifies integration of diverse E5 model implementations by unifying the output processing mechanism. It abstracts the average pooling and normalization steps, allowing users to focus on higher-level application logic without handling low-level details.

Inheritance

E5ModelSimplifiedWrapper inherits from torch.nn.Module, making it fully compatible with PyTorch neural network modules and pipelines.

Functionality of E5ModelSimplifiedWrapper.forward

This method runs a forward pass through the wrapped model. It takes the last hidden state of the underlying transformer, zeroes out the positions excluded by the attention mask, sums the remaining token embeddings, divides by the number of valid tokens, and L2-normalizes the result.

Parameters

  • input_ids: Tensor of token identifiers for the input sequence.
  • attention_mask: Tensor indicating valid tokens in the input.
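
The pooling steps described above can be sketched in a few lines of PyTorch. This is a minimal illustration, assuming the wrapped model returns a last hidden state of shape (batch, sequence, hidden); the function name masked_mean_pool is hypothetical and is not part of the library.

```python
import torch
import torch.nn.functional as F


def masked_mean_pool(
    last_hidden_state: torch.Tensor, attention_mask: torch.Tensor
) -> torch.Tensor:
    """Average token embeddings over valid positions, then L2-normalize."""
    # (batch, seq) -> (batch, seq, 1) so the mask broadcasts over the hidden dim.
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    # Padding positions are multiplied by zero and drop out of the sum.
    summed = (last_hidden_state * mask).sum(dim=1)
    # Number of valid tokens per sequence; clamp guards against division by zero.
    counts = mask.sum(dim=1).clamp(min=1.0)
    return F.normalize(summed / counts, p=2, dim=1)


# One sequence of three tokens; the last token is padding and must be ignored.
hidden = torch.tensor([[[1.0, 0.0], [3.0, 4.0], [100.0, 100.0]]])
mask = torch.tensor([[1, 1, 0]])
pooled = masked_mean_pool(hidden, mask)
print(pooled)  # approximately [[0.7071, 0.7071]]
```

The mean of the two valid tokens is [2, 2], and dividing by its L2 norm gives a unit-length vector, so the padded token has no influence on the embedding.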

Usage Example

import torch
from embedding_studio.embeddings.models.text_to_text.e5 import E5ModelSimplifiedWrapper

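# Assumes transformer_model is a pretrained model instance whose output
# exposes last_hidden_state (for example, a Hugging Face AutoModel).
model = E5ModelSimplifiedWrapper(transformer_model)
model.eval()

input_ids = torch.tensor([[101, 2057, 2024, 102]])
attention_mask = torch.tensor([[1, 1, 1, 1]])

# Call the module directly rather than .forward() so PyTorch hooks run.
with torch.no_grad():
    embeddings = model(input_ids, attention_mask)
print(embeddings)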

Documentation for TextToTextE5Model

Functionality

TextToTextE5Model is a wrapper that standardizes the use of E5 models for text-to-text search embeddings. It supports both SentenceTransformer and AutoModel variants, enabling a unified interface for generating embeddings.

Inheritance

This class inherits from EmbeddingsModelInterface, ensuring compatibility with the broader embedding framework.

Motivation

The class simplifies the management of model and tokenizer setup for E5 models. It hides the underlying complexity and allows users to focus on generating accurate text embeddings.

Usage Example

from sentence_transformers import SentenceTransformer
from embedding_studio.embeddings.models.text_to_text.e5 import TextToTextE5Model

model = TextToTextE5Model(SentenceTransformer('intfloat/multilingual-e5-large'))

Methods

get_query_model

Returns the model used to process query inputs. Because queries and items share one model, this is the underlying model instance wrapped in E5ModelSimplifiedWrapper.

  • Return Value: Returns a torch.nn.Module instance representing the query model.

get_items_model

Returns the model for processing items. Since query and items use the same underlying model, this method returns the E5 model component.

  • Parameters: None.
  • Purpose: Retrieve the items model for embedding computations.

get_query_model_params

Returns an iterator over the parameters of the query model. This iterator is useful for training and optimization tasks, for example when constructing an optimizer.

  • Return Value: Returns an iterator over the model parameters.

get_items_model_params

Returns an iterator over the parameters of the items model. Since query and items share the same model, the result covers the same parameters as get_query_model_params.

  • Parameters: None.
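
Because both parameter getters cover one shared model, only one of them needs to be handed to an optimizer. The sketch below illustrates this with a small torch.nn.Linear as a hypothetical stand-in for the shared E5 model; the two module-level functions mimic the getters described above and are not the library's implementation.

```python
import torch
from torch import nn

# Hypothetical stand-in for the single model shared by queries and items.
shared_model = nn.Linear(4, 2)


def get_query_model_params():
    # Each call returns a fresh iterator over the shared model's parameters.
    return shared_model.parameters()


def get_items_model_params():
    # Same underlying parameters, because the model is shared.
    return shared_model.parameters()


# Pass the parameters once; adding the items iterator too would duplicate them.
optimizer = torch.optim.AdamW(get_query_model_params(), lr=1e-5)

# One illustrative optimization step on a dummy loss.
loss = shared_model(torch.ones(1, 4)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```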

is_named_inputs

This property indicates whether the E5 model expects its inputs as named (keyword) arguments. It always returns True.

  • Parameters: None.

get_query_model_inputs

Creates a sample input dictionary with tokenized text, including input_ids and attention_mask tensors. It uses max_length padding and truncation.

  • Parameters: device: Optional device to place tensors on.
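
To make the shape of such a sample input concrete, here is a minimal sketch of max_length padding and truncation. It assumes token ids are already available and that the padding id is 0; the helper name build_sample_inputs is hypothetical, and the real method relies on the model's own tokenizer.

```python
from typing import Dict, List, Optional

import torch


def build_sample_inputs(
    token_ids: List[int],
    max_length: int,
    pad_id: int = 0,  # assumed padding id, for illustration only
    device: Optional[str] = None,
) -> Dict[str, torch.Tensor]:
    """Truncate or pad one sequence to exactly max_length tokens."""
    ids = token_ids[:max_length]  # truncation
    pad = max_length - len(ids)   # number of padding slots to append
    return {
        "input_ids": torch.tensor([ids + [pad_id] * pad], device=device),
        "attention_mask": torch.tensor([[1] * len(ids) + [0] * pad], device=device),
    }


inputs = build_sample_inputs([101, 2057, 2024, 102], max_length=8)
print(inputs["input_ids"].shape)       # torch.Size([1, 8])
print(inputs["attention_mask"].sum())  # tensor(4)
```

Because the keys match the forward parameters input_ids and attention_mask, a dictionary like this can be unpacked as keyword arguments, which is what named inputs means in practice.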

get_items_model_inputs

Generates example inputs for the items model used for model tracing.

  • Parameters: device: Optional device where tensors are placed.

get_query_model_inference_manager_class

This method returns the Triton model storage manager class to be used for query model inference.

  • Parameters: None.

get_items_model_inference_manager_class

Returns the class that manages items model inference in Triton.

  • Parameters: None.

fix_query_model

Freezes part of the query model for fine-tuning: the embeddings and a given number of bottom encoder layers stop receiving gradient updates.

  • Parameters: num_fixed_layers: The number of bottom encoder layers to fix.

unfix_query_model

Unfixes all layers in the query model to enable gradient updates.

  • Parameters: None.
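
Freezing and unfreezing of this kind usually comes down to toggling requires_grad on the relevant parameters. The sketch below shows the idea on a toy module, assuming a BERT-like layout with embeddings and encoder.layer attributes; ToyEncoder, freeze_bottom_layers, and unfreeze_all are hypothetical names, not the library's code.

```python
from torch import nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in with a BERT-like layout (embeddings + encoder.layer)."""

    def __init__(self, num_layers: int = 4, dim: int = 8) -> None:
        super().__init__()
        self.embeddings = nn.Embedding(100, dim)
        self.encoder = nn.Module()
        self.encoder.layer = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(num_layers)]
        )


def freeze_bottom_layers(model: ToyEncoder, num_fixed_layers: int) -> None:
    """Disable gradients for the embeddings and the first num_fixed_layers layers."""
    if num_fixed_layers > len(model.encoder.layer):
        raise ValueError("num_fixed_layers exceeds the number of encoder layers")
    for param in model.embeddings.parameters():
        param.requires_grad = False
    for layer in model.encoder.layer[:num_fixed_layers]:
        for param in layer.parameters():
            param.requires_grad = False


def unfreeze_all(model: nn.Module) -> None:
    """Re-enable gradient updates for every parameter."""
    for param in model.parameters():
        param.requires_grad = True


toy = ToyEncoder()
freeze_bottom_layers(toy, num_fixed_layers=2)
trainable = [name for name, p in toy.named_parameters() if p.requires_grad]
print(trainable)  # only parameters of encoder.layer.2 and encoder.layer.3 remain
```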

fix_item_model

Fixes a specific number of layers in the item model during fine-tuning.

  • Parameters: num_fixed_layers: Number of layers to freeze from the bottom of the model.

unfix_item_model

Unfixes all layers of the item model by enabling gradient updates for every layer.

  • Parameters: None.

tokenize

Converts a text query or a list of queries into the dictionary of tokenized inputs that the model consumes.

  • Parameters: query: A text query or a list of queries to be tokenized.

forward_query

Processes a text query through the model and returns an embedding. It prepends "query: " to the input string before tokenizing.

  • Parameters: query: A string representing the text query to encode.
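
The prefixing step is simple enough to show directly. This is a minimal sketch of what happens before tokenization; the helper name add_query_prefix is hypothetical and only illustrates the behavior described above.

```python
QUERY_PREFIX = "query: "


def add_query_prefix(query: str) -> str:
    """Prepend the E5 query prefix before the text is tokenized."""
    return f"{QUERY_PREFIX}{query}"


print(add_query_prefix("red running shoes"))  # query: red running shoes
```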

forward_items

Processes a list of text items by tokenizing them and running the tokens through the underlying E5 model.

  • Parameters: items: A list of strings, where each string is a text item to encode.