@jmazanec15
Created September 14, 2021 20:55
Former (2) API Description for faiss model support

Update on Proposed APIs

The API design has been refactored to center on the model resource. The first draft can be found here

For faiss, we will introduce additional functionality to add support for faiss indices that require training. With this change, we introduce a new resource: models. A model is an empty, trained native library index that can be used to initialize another native library index during ingestion. A model will be stored as a document in the model system index, which has the following mapping:

{
    "state": keyword,
    "created_timestamp": date,
    "description": keyword, 
    "error": keyword,
    "model_blob": binary,
    "engine": keyword,
    "space_type": keyword,
    "dimension": int
} 

state — Model state. One of CREATED, FAILED, or TRAINING.

created_timestamp — Time at which the model was created.

description — Model description a user can provide to add additional details about a model.

error — Message provided to user to communicate why model is in failed state.

model_blob — Base64 encoded representation of the model.

engine — Engine this model was created by.

space_type — Space this model was built with.

dimension — Dimension this model supports.
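
As a rough illustration of the fields above, a model document could be represented on the client side as follows. This is only a sketch: the names `ModelState`, `ModelDocument`, and `parse_model_document` are hypothetical and not part of the plugin, and treating `description`, `error`, and `model_blob` as optional when parsing is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class ModelState(Enum):
    """The three states listed for the "state" field."""
    CREATED = "CREATED"
    FAILED = "FAILED"
    TRAINING = "TRAINING"


@dataclass
class ModelDocument:
    """Hypothetical client-side view of one model document."""
    state: ModelState
    created_timestamp: str
    description: str
    error: str
    model_blob: str  # Base64 text, as described for "model_blob"
    engine: str
    space_type: str
    dimension: int


def parse_model_document(source: dict) -> ModelDocument:
    """Build a ModelDocument from a dict shaped like the mapping above.

    Raises ValueError if "state" is not one of the documented states.
    """
    return ModelDocument(
        state=ModelState(source["state"]),
        created_timestamp=source["created_timestamp"],
        # Assumption: these three fields may be absent or empty.
        description=source.get("description", ""),
        error=source.get("error", ""),
        model_blob=source.get("model_blob", ""),
        engine=source["engine"],
        space_type=source["space_type"],
        dimension=int(source["dimension"]),
    )
```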

Get

GET /_plugins/_knn/models/{model_id}?<filter_field_1>&<filter_field_2>

{
    "my_model_id": {
        "state": "CREATED",
        "created_timestamp": "10-31-21 02:02:02",
        "description": "Model trained with dataset X", 
        "error": "",
        "model_blob": "cdscsacsadcsdca",
        "engine": "faiss",
        "space_type": "l2",
        "dimension": 128
    },
    ...
} 

model_id — [Optional] Specify which model to return information for. If not specified, all model information will be returned.

filter_field — Fields to include. If not specified, all fields are returned.
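
The request line above can be assembled mechanically. The helper below is a minimal sketch, not part of the plugin: `build_get_model_path` is a hypothetical name, and it assumes that omitting `model_id` means requesting the bare `/_plugins/_knn/models` path and that filter fields are passed as bare query keys, as the request template suggests.

```python
from typing import Optional, Sequence
from urllib.parse import quote

MODELS_PATH = "/_plugins/_knn/models"


def build_get_model_path(model_id: Optional[str] = None,
                         filter_fields: Sequence[str] = ()) -> str:
    """Return the path and query string for a model GET request."""
    path = MODELS_PATH
    if model_id is not None:
        # Percent-encode the ID so it cannot break the path.
        path += "/" + quote(model_id, safe="")
    if filter_fields:
        # Assumption: each filter field is a bare query key with no value.
        path += "?" + "&".join(quote(f, safe="") for f in filter_fields)
    return path
```

For example, `build_get_model_path("my_model_id", ["state", "dimension"])` yields `/_plugins/_knn/models/my_model_id?state&dimension`.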

Delete

DELETE /_plugins/_knn/models/{model_id}

{
    "acknowledged": true
}

model_id — [Required] Model to delete.

Upload

PUT /_plugins/_knn/models/{model_id}
{
    "description": "Model trained with dataset X", 
    "model_blob": "cdscsacsadcsdca",
    "engine": "faiss",
    "space_type": "l2",
    "dimension": 128
}

{
    "acknowledged": true
}


POST /_plugins/_knn/models
{
    "description": "Model trained with dataset X", 
    "model_blob": "cdscsacsadcsdca",
    "engine": "faiss",
    "space_type": "l2",
    "dimension": 128
}

{
    "model_id": "my_model_identifier"
}

description — [Optional] Model description a user can provide to add additional details about a model.

model_blob — Base64 encoded representation of the model.

engine — Engine this model was created by.

space_type — Space this model was built with.

dimension — Dimension this model supports.
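
Judging by the two responses above, PUT appears to store the model under a caller-chosen ID, while POST appears to have the plugin generate one. The sketch below builds either request under that reading. `build_upload_request` is a hypothetical helper, and no request is actually sent.

```python
import base64
import json
from typing import Optional, Tuple


def build_upload_request(model_bytes: bytes,
                         engine: str,
                         space_type: str,
                         dimension: int,
                         description: Optional[str] = None,
                         model_id: Optional[str] = None) -> Tuple[str, str, str]:
    """Return (http_method, path, json_body) for a model upload."""
    body = {
        # model_blob is the Base64 text form of the serialized model.
        "model_blob": base64.b64encode(model_bytes).decode("ascii"),
        "engine": engine,
        "space_type": space_type,
        "dimension": dimension,
    }
    if description is not None:
        body["description"] = description  # documented as optional

    if model_id is None:
        # Assumption: POST lets the plugin pick the ID.
        method, path = "POST", "/_plugins/_knn/models"
    else:
        method, path = "PUT", f"/_plugins/_knn/models/{model_id}"
    return method, path, json.dumps(body)
```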

Train

PUT /_plugins/_knn/models/{model_id}/_train?preference=<node_id>
{
  "train_index": "train-index-name",
  "train_field": "train-field-name",
  "dimension": 16,
  "method": {
      "name":"ivf",
      "engine":"faiss",
      "space_type": "l2",
      "parameters":{
         "ncentroids":128,
         "coarse_quantizer":{
            "name":"ivf",
            "parameters":{
                "ncentroids":15
            }
         },
         "encoder":{
            "name":"pq",
            "parameters":{
                "code_size":8
            }
         }
      }
  }
}

{
    "acknowledged": true
}

POST /_plugins/_knn/models/_train?preference=<node_id>
{
  "train_index": "train-index-name",
  "train_field": "train-field-name",
  "dimension": 16,
  "method": {
      "name":"ivf",
      "engine":"faiss",
      "space_type": "l2",
      "parameters":{
         "ncentroids":128,
         "coarse_quantizer":{
            "name":"ivf",
            "parameters":{
                "ncentroids":15
            }
         },
         "encoder":{
            "name":"pq",
            "parameters":{
                "code_size":8
            }
         }
      }
  }
}

{
    "model_id": "my_model_identifier"
}

node_id — User's preference for node to execute training.

train_index — OpenSearch index from which to pull the training data.

train_field — Field of train_index from which to pull training data.

dimension — Dimension the model should be built for.

method — Method definition to produce the model.
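
A training request can be put together from the parameters above. The following is a sketch under assumptions rather than a client for the plugin: `build_train_request` is a hypothetical helper, it treats `preference` as optional, and it only constructs the request without sending it. Serializing the body with `json.dumps` also guarantees valid JSON.

```python
import json
from typing import Optional, Tuple
from urllib.parse import quote


def build_train_request(train_index: str,
                        train_field: str,
                        dimension: int,
                        method: dict,
                        model_id: Optional[str] = None,
                        node_id: Optional[str] = None) -> Tuple[str, str, str]:
    """Return (http_method, path, json_body) for a train request."""
    body = {
        "train_index": train_index,
        "train_field": train_field,
        "dimension": dimension,
        "method": method,
    }
    path = "/_plugins/_knn/models"
    if model_id is not None:
        path += "/" + quote(model_id, safe="")
    path += "/_train"
    if node_id is not None:
        # Assumption: the preference parameter may be omitted.
        path += "?preference=" + quote(node_id, safe="")
    # PUT when the caller supplies the ID, POST otherwise.
    http_method = "PUT" if model_id is not None else "POST"
    return http_method, path, json.dumps(body)


# Method definition mirroring the example request above.
example_method = {
    "name": "ivf",
    "engine": "faiss",
    "space_type": "l2",
    "parameters": {
        "ncentroids": 128,
        "coarse_quantizer": {"name": "ivf", "parameters": {"ncentroids": 15}},
        "encoder": {"name": "pq", "parameters": {"code_size": 8}},
    },
}
```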
