Refactored the API design to center on the model resource. The first draft can be found here
For faiss, we will add support for indices that require training. This change introduces a new resource: models. A model is an empty, trained native library index that can be used to initialize another native library index during ingestion. Each model is stored as a document in the model system index, which has the following mapping:
{
"state": keyword,
"created_timestamp": date,
"description": keyword,
"error": keyword,
"model_blob": binary,
"engine": keyword,
"space_type": keyword,
"dimension": integer
}
state — Model state. One of CREATED, FAILED, or TRAINING.
created_timestamp — Time at which the model was created.
description — Free-text description a user can provide to add details about the model.
error — Message explaining to the user why the model is in the FAILED state.
model_blob — Base64-encoded representation of the model.
engine — Engine this model was created by.
space_type — Space this model was built with.
dimension — Dimension this model supports.
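As an illustration, the fields above could be mirrored on the client side as follows. This is a minimal sketch and not the plugin's implementation: the ModelState and ModelDocument names and the from_source helper are hypothetical, and only the field names and the three states are taken from the mapping above.

```python
from dataclasses import dataclass
from enum import Enum


class ModelState(Enum):
    """The three states listed for the `state` field."""
    CREATED = "CREATED"
    FAILED = "FAILED"
    TRAINING = "TRAINING"


@dataclass
class ModelDocument:
    """Hypothetical client-side view of one document in the model system index."""
    state: ModelState
    created_timestamp: str
    description: str
    error: str
    model_blob: str  # Base64-encoded model
    engine: str
    space_type: str
    dimension: int

    @classmethod
    def from_source(cls, source: dict) -> "ModelDocument":
        # ModelState(...) raises ValueError for any state outside the three above.
        # Defaulting the text fields to "" is an assumption of this sketch.
        return cls(
            state=ModelState(source["state"]),
            created_timestamp=source["created_timestamp"],
            description=source.get("description", ""),
            error=source.get("error", ""),
            model_blob=source.get("model_blob", ""),
            engine=source["engine"],
            space_type=source["space_type"],
            dimension=int(source["dimension"]),
        )
```

Parsing the state into an enum means an unexpected state value fails early rather than being silently carried around as a string.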
GET /_plugins/_knn/models/{model_id}?<filter_field_1>&<filter_field_2>
{
"my_model_id": {
"state": "CREATED",
"created_timestamp": "10-31-21 02:02:02",
"description": "Model trained with dataset X",
"error": "",
"model_blob": "cdscsacsadcsdca",
"engine": "faiss",
"space_type": "l2",
"dimension": 128
},
...
}
model_id — [Optional] Specify which model to return information for. If not specified, all model information will be returned.
filter_field — Fields to include. If not specified, all fields are returned.
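A client might assemble the GET path along these lines. This is a sketch under assumptions: build_get_model_path is a hypothetical helper, and it reads the `?<filter_field_1>&<filter_field_2>` notation above as bare field names joined with `&`, which the final API may not do.

```python
from typing import Iterable, Optional
from urllib.parse import quote

MODELS_ENDPOINT = "/_plugins/_knn/models"


def build_get_model_path(model_id: Optional[str] = None,
                         filter_fields: Iterable[str] = ()) -> str:
    """Return the request path for the GET model API (hypothetical helper)."""
    path = MODELS_ENDPOINT
    # model_id is optional: without it, the request asks for all models.
    if model_id is not None:
        path += "/" + quote(model_id, safe="")
    # Assumption: filter fields are passed as bare names in the query string.
    fields = [quote(field, safe="") for field in filter_fields]
    if fields:
        path += "?" + "&".join(fields)
    return path
```

For example, build_get_model_path("my_model_id", ["state", "engine"]) yields a path that requests only the state and engine fields of one model.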
DELETE /_plugins/_knn/models/{model_id}
{
"acknowledged": true
}
model_id — [Required] Model to delete
PUT /_plugins/_knn/models/{model_id}
{
"description": "Model trained with dataset X",
"model_blob": "cdscsacsadcsdca",
"engine": "faiss",
"space_type": "l2",
"dimension": 128
}
{
"acknowledged": true
}
POST /_plugins/_knn/models
{
"description": "Model trained with dataset X",
"model_blob": "cdscsacsadcsdca",
"engine": "faiss",
"space_type": "l2",
"dimension": 128
}
{
"model_id": "my_model_identifier"
}
description — [Optional] Model description a user can provide to add additional details about a model.
model_blob — Base64-encoded representation of the model.
engine — Engine this model was created by.
space_type — Space this model was built with.
dimension — Dimension this model supports.
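The upload body for the PUT and POST requests above could be built as in the following sketch. build_upload_body is a hypothetical helper, and how the serialized model bytes are obtained from the native library is outside its scope; it only shows the Base64 step and the optional description.

```python
import base64
import json
from typing import Optional


def build_upload_body(model_bytes: bytes, engine: str, space_type: str,
                      dimension: int, description: Optional[str] = None) -> str:
    """Return the JSON body for uploading an already-trained model (hypothetical helper)."""
    body = {
        # model_blob carries the serialized model as Base64 text.
        "model_blob": base64.b64encode(model_bytes).decode("ascii"),
        "engine": engine,
        "space_type": space_type,
        "dimension": dimension,
    }
    # description is optional, so it is only sent when provided.
    if description is not None:
        body["description"] = description
    return json.dumps(body)
```

The same body would be sent with PUT when the caller chooses the model_id and with POST when the identifier is returned in the response.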
PUT /_plugins/_knn/models/{model_id}/_train?preference=<node_id>
{
"train_index": "train-index-name",
"train_field": "train-field-name",
"dimension": 16,
"method": {
"name":"ivf",
"engine":"faiss",
"space_type": "l2",
"parameters":{
"ncentroids":128,
"coarse_quantizer":{
"name":"ivf",
"parameters":{
"ncentroids":15
}
},
"encoder":{
"name":"pq",
"parameters":{
"code_size":8
}
}
}
}
}
{
"acknowledged": true
}
POST /_plugins/_knn/models/_train?preference=<node_id>
{
"train_index": "train-index-name",
"train_field": "train-field-name",
"dimension": 16,
"method": {
"name":"ivf",
"engine":"faiss",
"space_type": "l2",
"parameters":{
"ncentroids":128,
"coarse_quantizer":{
"name":"ivf",
"parameters":{
"ncentroids":15
}
},
"encoder":{
"name":"pq",
"parameters":{
"code_size":8
}
}
}
}
}
{
"model_id": "my_model_identifier"
}
node_id — User's preference for node to execute training.
train_index — OpenSearch index from which to pull the training data.
train_field — Field of train_index from which to pull training data.
dimension — Dimension the model should be built for.
method — Method definition to produce the model.
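The two training requests differ only in the verb and in whether the caller supplies the model_id. The sketch below captures that. build_train_request is a hypothetical helper, and it treats the preference parameter as optional, which the description above does not state explicitly.

```python
import json
from typing import Optional, Tuple
from urllib.parse import quote


def build_train_request(train_index: str, train_field: str, dimension: int,
                        method: dict, model_id: Optional[str] = None,
                        node_id: Optional[str] = None) -> Tuple[str, str, str]:
    """Return (HTTP verb, path, JSON body) for a training request (hypothetical helper)."""
    if model_id is None:
        # No id supplied: POST, and the response carries the generated model_id.
        verb, path = "POST", "/_plugins/_knn/models/_train"
    else:
        # Caller-chosen id: PUT to the model's own _train endpoint.
        verb, path = "PUT", f"/_plugins/_knn/models/{quote(model_id, safe='')}/_train"
    # Assumption: preference is omitted when the user has no node preference.
    if node_id is not None:
        path += "?preference=" + quote(node_id, safe="")
    body = {
        "train_index": train_index,
        "train_field": train_field,
        "dimension": dimension,
        "method": method,
    }
    return verb, path, json.dumps(body)
```

Passing the method definition through as a plain dict keeps the helper independent of which parameters a given method accepts.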