Collect clickstream¶
A clickstream is a sequence of user search sessions. A search session is a series of user clicks on search results. Clickstreams are used to gather user feedback and improve the model based on it.
Search session¶
A search session is a series of user clicks on search results. A search session is identified by a unique identifier
session_id
.
Typically, the workflow involves the creation of a search session, followed by populating it with user clicks. Subsequently, the session is closed and submitted for fine-tuning. It's worth noting that the same search session can be submitted for fine-tuning multiple times, including for different algorithms. This flexibility allows for iterative refinement and enhancement of the search model.
To register a new search session, send a POST request to the /api/v1/clickstream/session
endpoint:
curl -X POST 'http://localhost:5000/api/v1/clickstream/session' \
-H 'Content-Type: application/json' \
-d '{
"session_id": "c327c30bd3db459093e5f5d254bdd144,
"search_query": "user search query",
"search_results":[
{
"object_id": "2f827cbb9fe544cdaca44b2dafd37785",
"rank": 0.3
"meta": {}
},
{
"object_id": "e000f005dc1343ea8ddf234c557832a5"
},
{
"object_id": "96365929fa5b4286b3a04639cbc7c1d5"
}
],
"search_meta": {},
"user_id" : "b604bbb1a57a418d925ebc9d615de99a",
"created_at": 1703376873,
}'
where:
session_id
- unique search session identifiersearch_query
- user's search querysearch_results
- list of search result object ids with optional rank and metasearch_meta
- (optional) meta-information about the search query (any data)user_id
- (optional) your user identifiercreated_at
- (optional) utc timestamp of session start. Value by default: current time on server
Events¶
Events refer to any occurrences that you can register within a search session. For instance, these can include user clicks or other user interactions. In the context of a search session, events serve as recorded actions that contribute to improvement of the search experience.
To add events (clicks) to registered search session, send a POST request to the /api/v1/clickstream/session/events
endpoint:
curl -X POST 'http://localhost:5000/api/v1/clickstream/session/events' \
-H 'Content-Type: application/json' \
-d '{
"session_id": "search_session_id",
"events": [
{
"event_id": "472cc4232af2449a97317d0057c9a2cf",
"object_id": "2f827cbb9fe544cdaca44b2dafd37785",
"meta": {},
"event_type": "click",
"created_at": 1703376873
},
{
"event_id": "124b6bc39e9f43e7b5be6b992d94f505",
"object_id": "2f827cbb9fe544cdaca44b2dafd37785",
"meta": {},
"created_at": 1703376874
},
{
"event_id": "b0f4adcac63b4a35a02a45f06b0b1cb9",
"object_id": "e000f005dc1343ea8ddf234c557832a5",
}
]
}'
where
session_id
- id of registered search sessionevents
- list of new session events with:event_id
- event identifier (must be unique for this session)object_id
- search result object idmeta
- (optional) any meta dataevent_type
- (optional) event type. Value by default:click
created_at
- (optional) utc timestamp of event. Value by default: current time on server
After you have gathered the required amount of user feedback, you can release the session batch and use it for
fine-tuning.
To do this, send a POST request to the /api/v1/clickstream/internal/batch/release
endpoint:
curl -X POST '127.0.0.1:5000/api/v1/clickstream/internal/batch/release' \
-H 'Content-Type: application/json' \
-d '{
"release_id": "090bda77a5e048d59a1e310dc20fbd6d"
}'
where release_id
- unique identifier of release (idempotency key).
In response, you will receive:
{
"batch_id": "65844a671089823652b83d43"
}
where batch_id
- unique identifier of clickstream session batch - it can be used to run fine-tuning