Documentation for prepare_data
¶
Functionality¶
The prepare_data
method prepares fine-tuning data from clickstream inputs. It processes session data, downloads the required files, retrieves queries, and filters out any failed downloads. Finally, it splits the inputs into training and testing sets to generate ranking data.
Parameters¶
fine_tuning_inputs
: A list of clickstream sessions or dictionaries.converter
: A function that converts sessions into the fine-tuning input format.clickstream_splitter
: A function that splits the inputs into training and testing sets.query_retriever
: A function that retrieves query items for the inputs.loader
: A function that loads item data based on the provided metadata.items_set_manager
: A function that organizes items and prepares the final dataset.
Usage¶
- Purpose: The primary purpose of
prepare_data
is to prepare data for fine-tuning ranking models by processing and filtering clickstream data.
Example¶
Here is a basic usage example:
ranking_data = prepare_data(
inputs, converter, splitter,
query_retriever, loader, manager
)
The output of this method is a RankingData
object that contains the training and testing datasets.