Embedding Studio

Turning your embedding model into a search engine.

Embeddings-in-the-loop

Embedding Studio adapts search results in near real time and runs an automatic fine-tuning procedure.

Smart search query parser

Combine structured and unstructured search using our LLM for natural-language query parsing.

Collect user search sessions

Use collected clicks for a fine-tuning procedure that saves resources and improves accuracy.

Python ecosystem

Use Embedding Studio together with your preferred Python libraries.

Plugin-based system

Connect your own Vector DB, Data Source and Embedding model by implementing a Python plugin.

Use your models

Import an embedding model for your data type.

Open Source

All our code is open sourced under the Apache 2.0 license.

Easy deployment

Our goal is to keep things simple for customers: you can deploy and calibrate Embedding Studio without much effort.

MLflow Native

By default, we use MLflow to track fine-tuning progress.

Tasks solved with Embedding Studio

Embedding Studio is a framework that lets you transform your vector database into a feature-rich search engine.

Semantic search engine

Your users are looking for images inspired by the True Detective or Breaking Bad series, or for blog posts about vector search.

Using multimodal embeddings like CLIP or BLIP, a vector database like Qdrant or Milvus, and Embedding Studio, you can easily let your users do exactly that, without hashtags, classifiers, or classic text indexes.

Similarity search engine

Your IT company has reached the stage where users are submitting tons of support tickets, bug reports, and customer requests, and your support team is digging through tons of logs and messages to find similar cases and their solutions.

Using an embedding model like sentence-transformers/all-MiniLM-L6-v2, a vector database, and Embedding Studio, you can easily implement internal ticket search and grouping, fine-tuned to your specific domain.
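The core of such a ticket search is nearest-neighbour ranking over embeddings. Below is a minimal sketch of that idea, assuming a toy bag-of-words `embed` function in place of a real model such as all-MiniLM-L6-v2, and an in-memory list in place of a vector database. All names here (`embed`, `cosine`, `most_similar`, `TICKETS`) are hypothetical and are not part of Embedding Studio.

```python
from __future__ import annotations

import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a real embedding model: a sparse bag-of-words vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query: str, tickets: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Return the top_k tickets ranked by similarity to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(t)), t) for t in tickets]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:top_k]


# Made-up example tickets, for illustration only.
TICKETS = [
    "login page returns error 500 after password reset",
    "invoice pdf export is missing the company logo",
    "error 500 on login when using single sign-on",
]

results = most_similar("cannot login error 500", TICKETS)
for score, ticket in results:
    print(f"{score:.2f}  {ticket}")
```

With a real embedding model the ranking would also capture paraphrases that share no words with the query, which is where domain fine-tuning is meant to help.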

Structured and unstructured search

It is very rare for a company to use unstructured search as is. Someone searching for "brick red houses san francisco area for april" clearly wants houses in San Francisco available for a month-long rental in April first, and brick-red houses only second. Companies therefore need to mix structured and unstructured search.

The Embedding Studio team decided to explore instruction fine-tuning of an LLM for zero-shot query parsing. This closes the initial gap, when a company has neither rules nor collected data yet, and in the future may even remove the need for exhausting manual rule implementation.
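To make the mix of structured and unstructured search concrete, here is a minimal sketch of what happens once a query has been parsed: hard filters narrow the candidates, and the remaining free text only ranks them. The `parsed` dictionary is a hand-written assumption about what a parser might return, not actual Embedding Studio output, and the word-overlap score is a simple stand-in for vector similarity. The `Listing` class and the sample data are hypothetical as well.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Listing:
    title: str
    city: str
    available_months: set
    description: str


# Assumed parser output for "brick red houses san francisco area for april".
parsed = {
    "filters": {"city": "San Francisco", "month": "April"},
    "free_text": "brick red houses",
}

# Made-up listings, for illustration only.
LISTINGS = [
    Listing("Victorian on Hayes", "San Francisco", {"April", "May"},
            "brick red victorian house with garden"),
    Listing("Modern loft", "San Francisco", {"April"},
            "white concrete loft downtown"),
    Listing("Brick cottage", "Oakland", {"April"},
            "brick red houses style cottage"),
    Listing("Sunset bungalow", "San Francisco", {"June"},
            "brick red bungalow"),
]


def matches(listing: Listing, filters: dict) -> bool:
    """Structured part: every filter must hold exactly."""
    return (listing.city == filters["city"]
            and filters["month"] in listing.available_months)


def overlap_score(free_text: str, description: str) -> int:
    """Unstructured part: word overlap as a stand-in for vector similarity."""
    return len(set(free_text.lower().split()) & set(description.lower().split()))


def search(parsed: dict, listings: list[Listing]) -> list[Listing]:
    """Filter first, then rank the survivors by the free-text score."""
    candidates = [l for l in listings if matches(l, parsed["filters"])]
    return sorted(
        candidates,
        key=lambda l: overlap_score(parsed["free_text"], l.description),
        reverse=True,
    )


hits = search(parsed, LISTINGS)
print([l.title for l in hits])
```

The Oakland cottage matches the free text best but is excluded by the city filter, which reflects the priority the example query implies: location and dates first, colour second.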

Structured search / unstructured queries

Not needing any unstructured search underneath your search system is fine, especially if you run a service like taxi or booking, because your customers usually want to arrive at exactly 221B Baker Street, London, and not somewhere merely similar. In that case you will probably try to implement a search query parser.

Embedding Studio can help here too, in two ways:

1. A zero-shot query parser, which needs nothing except a filters schema.
2. A query-to-filters mapper, built on the capabilities of vector databases and Embedding Studio, which improves session by session.
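The two ideas above can be illustrated with a toy sketch. It assumes a filters schema given as a plain dictionary of field names to allowed values, and it replaces both the LLM parser and the vector lookup with exact token matching. The `FilterMapper` class and its `learn` and `parse` methods are hypothetical and only show the shape of the behaviour: parsing works from the schema alone, and mappings learned from user sessions extend it over time.

```python
from __future__ import annotations


class FilterMapper:
    """Maps query tokens to structured filters (toy, single-word matching only)."""

    def __init__(self, schema: dict[str, list[str]]):
        # Cold start: only the schema's own values are known.
        self.lookup: dict[str, tuple[str, str]] = {
            value.lower(): (field, value)
            for field, values in schema.items()
            for value in values
        }

    def learn(self, phrase: str, field: str, value: str) -> None:
        """Record a phrase-to-filter mapping observed in a user session."""
        self.lookup[phrase.lower()] = (field, value)

    def parse(self, query: str) -> dict[str, str]:
        """Return the filters implied by the query tokens known so far."""
        filters: dict[str, str] = {}
        for token in query.lower().split():
            if token in self.lookup:
                field, value = self.lookup[token]
                filters[field] = value
        return filters


# Hypothetical schema for a ride-booking service.
schema = {"city": ["London", "Paris"], "car_class": ["economy", "premium"]}
mapper = FilterMapper(schema)

before = mapper.parse("cheap ride london")   # only the schema value matches

# A session shows that users who type "cheap" end up choosing economy cars.
mapper.learn("cheap", "car_class", "economy")
after = mapper.parse("cheap ride london")

print(before)
print(after)
```

In a real setup the exact-match lookup would be replaced by a similarity search over embedded filter values, so that unseen synonyms can also be mapped.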