plugin
Documentation for FineTuningMethod¶
Functionality¶
FineTuningMethod is an abstract base class that acts as a plugin for fine-tuning methods. It defines several abstract methods for preprocessing, data loading, query retrieval, experiment management, and inference client creation.
Motivation¶
This class provides a unified structure for implementing fine-tuning strategies. By forcing subclasses to implement core functionality, it ensures consistency and ease of extension for custom tuning methods.
Inheritance¶
FineTuningMethod inherits from Python's ABC, requiring subclasses to override methods such as get_items_preprocessor, get_data_loader, get_query_retriever, get_manager, and get_inference_client_factory.
Example¶
A subclass of FineTuningMethod must implement all abstract methods to provide concrete behavior for specific fine-tuning tasks.
Documentation for FineTuningMethod.get_items_splitter¶
Functionality¶
Return an ItemSplitter instance. By default, it returns an instance of NoSplitter, which means no splitting operation is applied on items.
Parameters¶
None.
Returns¶
- Instance of ItemSplitter: Normally, a NoSplitter instance is returned.
Usage¶
- Purpose: Retrieve the default item splitter for fine-tuning methods.
Example¶
splitter = fine_tuning_instance.get_items_splitter()
if isinstance(splitter, NoSplitter):
print("No splitter is applied.")
Documentation for FineTuningMethod.get_items_preprocessor¶
Functionality¶
This method returns an instance of ItemsDatasetDictPreprocessor. Subclasses must implement this method to provide custom preprocessing logic for fine-tuning methods.
Parameters¶
This method does not take any parameters.
Usage¶
- Purpose - Provides a custom items preprocessor for fine-tuning. The returned preprocessor is used to transform datasets before training.
Example¶
class CustomFineTuning(FineTuningMethod):
def get_items_preprocessor(self):
# Implement custom preprocessing
return CustomItemsPreprocessor()
Documentation for FineTuningMethod.get_data_loader¶
Functionality¶
This method is intended to return a DataLoader instance that is used to load and preprocess data for fine-tuning tasks. Subclasses should implement it to provide a data loading mechanism appropriate for their context.
Parameters¶
This method does not take any parameters.
Usage¶
- Purpose: Retrieve a DataLoader instance to manage dataset loading during fine-tuning.
Example¶
class MyFineTuningMethod(FineTuningMethod):
def get_data_loader(self):
# Initialize and return a DataLoader instance
return MyDataLoader()
Documentation for FineTuningMethod.get_query_retriever¶
Functionality¶
Return a QueryRetriever instance that retrieves query data for fine-tuning. Subclasses must override this method to supply a custom QueryRetriever.
Parameters¶
- self: The instance of FineTuningMethod.
Usage¶
- Purpose: To obtain a query retriever for managing query operations during fine-tuning.
Example¶
class MyFineTuningMethod(FineTuningMethod):
def get_query_retriever(self) -> QueryRetriever:
return MyQueryRetriever(config)
Documentation for FineTuningMethod.get_manager¶
Functionality¶
This method is designed to return an instance of the ExperimentsManager, which handles experiment tracking and management during fine-tuning. It ensures that any subclass provides a mechanism to log and track experiment details.
Parameters¶
This method does not require any parameters.
Usage¶
Subclasses of FineTuningMethod must implement this method to return a valid ExperimentsManager instance. This instance will handle tasks such as logging experiment metrics and managing experiment configurations.
Example¶
class MyFineTuningMethod(FineTuningMethod):
def get_manager(self) -> ExperimentsManager:
from embedding_studio.experiments.experiments_tracker import ExperimentsManager
return ExperimentsManager()
Documentation for FineTuningMethod.get_inference_client_factory¶
Functionality¶
This method returns a TritonClientFactory instance that is used to create clients for inference operations in the fine-tuning workflow. Subclasses of FineTuningMethod must implement this method to provide a valid TritonClientFactory instance.
Parameters¶
None.
Usage¶
- Purpose: To obtain an inference client factory for starting inference operations during fine-tuning.
Example¶
class MyFineTuningMethod(FineTuningMethod):
def get_inference_client_factory(self) -> TritonClientFactory:
return MyTritonClientFactory()
Documentation for FineTuningMethod.upload_initial_model¶
Functionality¶
Uploads the initial model to the items_set. This abstract method must be implemented by subclasses, ensuring that the initial model is provided for further fine-tuning processes.
Parameters¶
This method does not accept any parameters.
Usage¶
- Purpose - Define the logic for uploading the initial model prior to the fine-tuning process.
Example¶
class MyFineTuningMethod(FineTuningMethod):
def upload_initial_model(self) -> None:
# Example: Implement model upload logic here
Documentation for FineTuningMethod.get_fine_tuning_builder¶
Functionality¶
Returns a FineTuningBuilder instance that is used to set up and launch the fine-tuning process. This process leverages user feedback collected in the clickstream sessions to enhance the model.
Parameters¶
clickstream
: A list of SessionWithEvents containing user feedback. This data is used to adjust the model during fine-tuning.
Usage¶
Use this method to obtain a FineTuningBuilder for initiating the fine-tuning process. The builder can then be configured with additional options as required.
Example¶
builder = plugin_instance.get_fine_tuning_builder(clickstream)
builder.configure(...)
Documentation for FineTuningMethod.get_embedding_model_info¶
Functionality¶
This method returns an instance of EmbeddingModelInfo for a given model ID. It uses the plugin's meta information and search index details to populate the EmbeddingModelInfo object.
Parameters¶
id
: A string representing the unique identifier for an embedding model in the model storage system.
Usage¶
- Purpose - Generates the embedding model info including its name, dimensions, metric type, aggregation type, and vector database settings.
Example¶
For example, given a fine-tuning method instance and a valid model ID:
model_info = fine_tuning_method.get_embedding_model_info("model_id")
Documentation for FineTuningMethod.get_search_index_info¶
Functionality¶
Returns a SearchIndexInfo instance that defines the parameters of the vectordb index used for search operations. Subclasses should implement this method to provide the correct configuration for the index.
Parameters¶
- None
Usage¶
- Purpose: Configure search index parameters for an embedding model.
Example¶
def get_search_index_info(self) -> SearchIndexInfo:
return SearchIndexInfo(
dimensions=128,
metric_type="cosine",
metric_aggregation_type="avg",
hnsw={"ef_construction": 200}
)
Documentation for FineTuningMethod.get_vectors_adjuster¶
Functionality¶
Returns a VectorsAdjuster instance. This method defines a procedure for adjusting or transforming vectors generated during fine-tuning. Subclasses must implement this method to provide a custom vector adjustment logic.
Parameters¶
- None
Usage¶
- Purpose - To specify how vectors are modified after fine-tuning.
Example¶
In a subclass, implement the method as follows:
class MyFineTuningMethod(FineTuningMethod):
def get_vectors_adjuster(self) -> VectorsAdjuster:
return MyCustomVectorsAdjuster()
Documentation for FineTuningMethod.get_vectordb_optimizations¶
Functionality¶
Returns a list of vectordb optimizations that are pending or applicable. This method should be implemented by subclasses to provide custom optimization strategies for vector databases.
Parameters¶
None
Usage¶
- Purpose - To retrieve optimization settings for a vector database. Subclasses implement this method to return a list of outstanding optimizations.
Example¶
class MyFineTuningMethod(FineTuningMethod):
def get_vectordb_optimizations(self) -> List[Optimization]:
return [Optimization('optimize_index')]
Documentation for CategoriesFineTuningMethod¶
Functionality¶
CategoriesFineTuningMethod is an abstract fine-tuning class for categories. It defines methods for selecting categories and setting parameters like maximum similar categories and margin. This design ensures a standard approach for category-based tuning in plugins.
Motivation¶
This class enforces a uniform interface for category fine-tuning, aiding consistent development of category optimization within the project. It clarifies responsibilities and reduces implementation errors.
Inheritance¶
CategoriesFineTuningMethod extends FineTuningMethod. It inherits common methods for preprocessing, data loading, query retrieval, manager handling, and inference client creation. Subclasses must implement category-specific methods.
Example¶
Below is an outline of a subclass implementation:
class MyCategoriesTuning(CategoriesFineTuningMethod):
meta = PluginMeta(...)
def get_items_preprocessor(self):
# custom preprocessor code
pass
def get_data_loader(self):
# load data here
pass
def get_query_retriever(self):
# retrieve queries
pass
def get_manager(self):
# return manager
pass
def get_inference_client_factory(self):
# return inference factory
pass
def get_category_selector(self):
# return category selector
pass
def get_max_similar_categories(self):
return 10
def get_max_margin(self):
return 0.5
Usage¶
Extend CategoriesFineTuningMethod to create new plugins that require category-based fine-tuning. Ensure all abstract methods are implemented.
Documentation for CategoriesFineTuningMethod.get_category_selector¶
Functionality¶
This method returns an instance of AbstractSelector. It allows implementers to define custom category selection logic for fine-tuning methods. Subclasses must implement this method to provide the logic needed to select appropriate categories.
Parameters¶
- (None) The method does not take parameters apart from self.
Usage¶
Override this abstract method in your subclass to return an instance of AbstractSelector. This instance is used during fine-tuning to choose relevant categories.
Example¶
class MyFineTuningMethod(CategoriesFineTuningMethod):
def get_category_selector(self) -> AbstractSelector:
# Return an instance of a custom selector
return MyCategorySelector()
Documentation for CategoriesFineTuningMethod.get_max_similar_categories¶
Functionality¶
Returns the maximum number of similar categories to be retrieved. This abstract method should be implemented by concrete classes to control the retrieval process in a fine-tuning setting.
Parameters¶
This method does not accept any parameters.
Usage¶
Purpose - Limit and control the number of similar categories retrieved for further fine-tuning.
Example¶
class CustomCategoriesMethod(CategoriesFineTuningMethod):
def get_category_selector(self):
# implementation details
pass
def get_max_similar_categories(self) -> int:
return 5
def get_max_margin(self) -> float:
return 0.3
Documentation for CategoriesFineTuningMethod.get_max_margin¶
Functionality¶
Returns a float that defines the maximum margin or distance/similarity threshold for category retrieval.
Parameters¶
This method takes no parameters.
Usage¶
Implement this method in your subclass to return a float that signifies the allowed maximum margin when comparing categories.
Example¶
class MyFineTuningMethod(CategoriesFineTuningMethod):
def get_category_selector(self):
# provide implementation
def get_max_similar_categories(self):
return 5
def get_max_margin(self):
return 0.75
Documentation for PluginManager¶
Functionality¶
PluginManager manages plugin discovery, registration, and retrieval for fine-tuning methods. It loads all plugins that extend the FineTuningMethod interface and makes them available through a simple API. This design enables dynamic extensibility for the application.
Motivation¶
This class supports dynamic loading of alternative fine-tuning methods. It allows developers to extend system capabilities without altering core logic, promoting modularity and ease of integration.
Inheritance¶
PluginManager does not inherit from any other class. However, the plugins registered must be subclasses of FineTuningMethod, ensuring a consistent interface.
Usage¶
- Instantiate PluginManager.
- Call discover_plugins(directory) to load plugins from a given folder.
- Retrieve a plugin using get_plugin(name) or inspect available plugins via the plugin_names property.
Example¶
pm = PluginManager()
pm.discover_plugins("plugins")
plugin = pm.get_plugin("example_plugin")
if plugin:
# Use plugin methods as needed
pass
Documentation for PluginManager.plugin_names¶
Functionality¶
Returns a list of registered plugin names from the internal manager. It retrieves the keys from the plugins dictionary.
Parameters¶
- None
Usage¶
- Purpose: Get a list of plugin names for discovery and usage.
Example¶
plugin_manager = PluginManager()
names = plugin_manager.plugin_names
print(names)
Documentation for PluginManager.get_plugin¶
Functionality¶
Returns a plugin instance for a given plugin name from the internal registry. If the plugin is not registered, it returns None.
Parameters¶
name
: A string identifier for selecting a plugin.
Usage¶
- Purpose: Retrieve and use a plugin based on its name.
Example¶
plugin = plugin_manager.get_plugin('example_plugin')
if plugin:
plugin.do_something()
Documentation for PluginManager.discover_plugins¶
Functionality¶
This method dynamically discovers and loads plugin classes extending FineTuningMethod from a given directory. It imports Python modules, checks for valid plugin meta information, and registers plugins configured in the settings.
Parameters¶
directory
: A string specifying the path to search for plugin modules. Only files ending with ".py" (except "init.py") are considered.
Usage¶
- Purpose: Load and register plugin classes for fine-tuning.
Example¶
Assuming plugins are located in a folder named "plugins":
from embedding_studio.core.plugin import PluginManager
manager = PluginManager()
manager.discover_plugins("plugins")
Documentation for PluginManager._register_plugin¶
Functionality¶
Registers a plugin in the manager. It performs a type check to ensure the provided plugin is an instance of FineTuningMethod and stores it in the plugins dictionary using plugin.meta.name as the key. If the plugin is not a valid instance, a ValueError is raised.
Parameters¶
plugin
: The plugin instance to register. It must be an instance of FineTuningMethod and contain a meta attribute.
Usage¶
- Purpose - To register a plugin so it can be later retrieved by its name using get_plugin.
Example¶
manager = PluginManager()
manager._register_plugin(plugin_instance)