Skip to content

clickstream_augmentation_applier

Documentation for ClickstreamQueryAugmentationApplier

Functionality

The ClickstreamQueryAugmentationApplier class is designed to augment clickstream inputs by applying a transformation to each query within FineTuningInput objects. It generates additional inputs with transformed queries to expand the dataset for fine-tuning. This method duplicates the original inputs with transformed queries, then returns a combined list of original and augmented inputs.

Motivation

Handling clickstream data requires simulating query variations. By applying an augmentation strategy, the class helps create diverse inputs that can improve model training and evaluation.

Inheritance

This class does not explicitly inherit from any other class; it is a basic object in Python.

Usage

  • Purpose: Enrich existing clickstream data by augmenting queries through a data transformation strategy.

Parameters

  • inputs: A list of FineTuningInput objects, each containing a query string to be augmented.

Returns

  • A list of FineTuningInput objects that includes both the original and augmented inputs.

Example

Assuming you have a list of FineTuningInput objects called inputs and an augmentation strategy (e.g., YourAugmentationStrategy or AugmentationWithRandomSelection):

from embedding_studio.embeddings.augmentations.clickstream_augmentation_applier import ClickstreamQueryAugmentationApplier
from embedding_studio.embeddings.features.fine_tuning_input import FineTuningInput

# Create your augmentation strategy instance
augmentation = YourAugmentationStrategy()  # or AugmentationWithRandomSelection(...)

# Initialize the applier
applier = ClickstreamQueryAugmentationApplier(augmentation)

# Get the augmented inputs
augmented_inputs = applier.apply_augmentation(inputs)