DS1 Text Embeddings Inference
Text Embedding Webserver
License: Proprietary - Takara.AI
Text Embeddings Inference
decode
POST /decode
Decode input ids
Body parameter
{
"ids": [
0
],
"skip_special_tokens": true
}
Parameters
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| body | body | DecodeRequest | true | none |
Example responses
200 Response
[
"test"
]
400 Response
{
"error": "Batch is empty",
"error_type": "empty"
}
413 Response
{
"error": "Batch size error",
"error_type": "validation"
}
422 Response
{
"error": "Tokenization error",
"error_type": "tokenizer"
}
Responses
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Decoded ids | DecodeResponse |
| 400 | Bad Request | Batch is empty | ErrorResponse |
| 413 | Payload Too Large | Batch size error | ErrorResponse |
| 422 | Unprocessable Entity | Tokenization error | ErrorResponse |
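As an illustration of the request shape above, the following Python sketch builds a `POST /decode` request without sending it. The base URL `http://localhost:8080`, the JSON content type header, and the helper name `build_decode_request` are assumptions made for illustration; the source does not state where the server listens.

```python
import json
import urllib.request

# Assumption: host and port are placeholders, not taken from the source.
BASE_URL = "http://localhost:8080"


def build_decode_request(ids, skip_special_tokens=True):
    """Build (but do not send) a POST /decode request."""
    body = {"ids": ids, "skip_special_tokens": skip_special_tokens}
    return urllib.request.Request(
        url=f"{BASE_URL}/decode",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_decode_request([0])
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it. On success the body should be a JSON list of decoded strings, as in the 200 example.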
embed
POST /embed
Get Embeddings. Returns a 424 status code if the model is not an embedding model.
Body parameter
{
"dimensions": 0,
"inputs": "string",
"normalize": true,
"prompt_name": "string",
"truncate": false,
"truncation_direction": "Left"
}
Parameters
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| body | body | EmbedRequest | true | none |
Example responses
200 Response
[
[
0,
1,
2
]
]
400 Response
{
"error": "Batch is empty",
"error_type": "empty"
}
413 Response
{
"error": "Batch size error",
"error_type": "validation"
}
422 Response
{
"error": "Tokenization error",
"error_type": "tokenizer"
}
424 Response
{
"error": "Inference failed",
"error_type": "backend"
}
429 Response
{
"error": "Model is overloaded",
"error_type": "overloaded"
}
Responses
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Embeddings | EmbedResponse |
| 400 | Bad Request | Batch is empty | ErrorResponse |
| 413 | Payload Too Large | Batch size error | ErrorResponse |
| 422 | Unprocessable Entity | Tokenization error | ErrorResponse |
| 424 | Failed Dependency | Embedding Error | ErrorResponse |
| 429 | Too Many Requests | Model is overloaded | ErrorResponse |
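A minimal sketch of assembling an `EmbedRequest` body in Python follows. The helper name `build_embed_payload` is hypothetical. Leaving unset optional fields out of the body, so that the server applies its own defaults, is a design choice of this sketch rather than something the source requires.

```python
import json


def build_embed_payload(inputs, normalize=True, truncate=None,
                        truncation_direction=None, prompt_name=None,
                        dimensions=None):
    """Assemble an EmbedRequest body, leaving unset optional fields out."""
    # The TruncationDirection schema lists exactly these two values.
    if truncation_direction not in (None, "Left", "Right"):
        raise ValueError("truncation_direction must be 'Left' or 'Right'")
    payload = {"inputs": inputs, "normalize": normalize}
    optional = {
        "truncate": truncate,
        "truncation_direction": truncation_direction,
        "prompt_name": prompt_name,
        "dimensions": dimensions,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


print(json.dumps(build_embed_payload(["first text", "second text"], truncate=True)))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the body of `POST /embed`.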
health
GET /health
Health check method
Example responses
503 Response
{
"error": "unhealthy",
"error_type": "unhealthy"
}
Responses
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Everything is working fine | None |
| 503 | Service Unavailable | Text Embeddings Inference is down | ErrorResponse |
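The two documented outcomes of `GET /health` can be mapped to a boolean as sketched below. The function names and the `timeout` default are assumptions. Note that `urllib.request.urlopen` raises `HTTPError` for a 503, so the sketch reads the status from the exception in that case.

```python
import urllib.error
import urllib.request


def interpret_health(status):
    """Map a /health status code to True (healthy) or False (down)."""
    if status == 200:
        return True
    if status == 503:
        return False
    raise ValueError(f"unexpected status from /health: {status}")


def check_health(base_url, timeout=2.0):
    """Probe base_url + '/health'; connection failures propagate as URLError."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return interpret_health(resp.status)
    except urllib.error.HTTPError as err:
        # Non-2xx responses arrive here; 503 means the service reports itself down.
        return interpret_health(err.code)


print(interpret_health(200), interpret_health(503))
```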
get_model_info
GET /info
Text Embeddings Inference endpoint info
Example responses
200 Response
{
"auto_truncate": true,
"docker_label": "null",
"max_batch_requests": 0,
"max_batch_tokens": 2048,
"max_client_batch_size": 32,
"max_concurrent_requests": 128,
"max_input_length": 512,
"model_dtype": "float16",
"model_id": "thenlper/gte-base",
"model_sha": "fca14538aa9956a46526bd1d0d11d69e19b5a101",
"model_type": {
"classifier": {
"id2label": {
"0": "LABEL"
},
"label2id": {
"LABEL": 0
}
}
},
"sha": "null",
"tokenization_workers": 4,
"version": "0.5.0"
}
Responses
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Served model info | Info |
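One possible use of the `Info` payload is sketched below: reading which kind of model is served and splitting a list of inputs into client-side batches. That `max_client_batch_size` caps the number of inputs per request is an assumption inferred from the field name, and both helper names are hypothetical. The `int()` call is defensive in case a deployment serializes the limit as a string.

```python
def model_kind(info):
    """Return the single key of model_type: 'classifier', 'embedding' or 'reranker'."""
    (kind,) = info["model_type"].keys()
    return kind


def chunk_inputs(texts, info):
    """Split texts into lists no longer than max_client_batch_size."""
    size = int(info["max_client_batch_size"])
    return [texts[i:i + size] for i in range(0, len(texts), size)]


# Trimmed-down Info payload with illustrative values.
info = {"max_client_batch_size": 2, "model_type": {"embedding": {"pooling": "cls"}}}
print(model_kind(info), chunk_inputs(["a", "b", "c"], info))
```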
metrics
GET /metrics
Prometheus metrics scrape endpoint
Example responses
200 Response
"string"
Responses
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Prometheus Metrics | string |
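The response body is plain text in the Prometheus exposition format. A simplified parser is sketched below; it assumes samples carry no trailing timestamp, and the metric names in the sample text are invented for illustration, not taken from the server.

```python
def parse_metrics(text):
    """Parse 'name{labels} value' lines into a dict, skipping comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a HELP/TYPE comment
        parts = line.rsplit(None, 1)  # the value is the last whitespace-separated field
        if len(parts) != 2:
            continue
        samples[parts[0]] = float(parts[1])
    return samples


sample = (
    "# HELP demo_requests_total Hypothetical counter\n"
    "# TYPE demo_requests_total counter\n"
    'demo_requests_total{method="embed"} 3\n'
    "demo_queue_size 0\n"
)
print(parse_metrics(sample))
```

For production scraping, a Prometheus server or an established client library is the more robust choice.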
similarity
POST /similarity
Get Sentence Similarity. Returns a 424 status code if the model is not an embedding model.
Body parameter
{
"inputs": {
"sentences": [
"What is Machine Learning?"
],
"source_sentence": "What is Deep Learning?"
},
"parameters": {
"prompt_name": "string",
"truncate": false,
"truncation_direction": "Left"
}
}
Parameters
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| body | body | SimilarityRequest | true | none |
Example responses
200 Response
[
0,
1,
0.5
]
400 Response
{
"error": "Batch is empty",
"error_type": "empty"
}
413 Response
{
"error": "Batch size error",
"error_type": "validation"
}
422 Response
{
"error": "Tokenization error",
"error_type": "tokenizer"
}
424 Response
{
"error": "Inference failed",
"error_type": "backend"
}
429 Response
{
"error": "Model is overloaded",
"error_type": "overloaded"
}
Responses
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Sentence Similarity | SimilarityResponse |
| 400 | Bad Request | Batch is empty | ErrorResponse |
| 413 | Payload Too Large | Batch size error | ErrorResponse |
| 422 | Unprocessable Entity | Tokenization error | ErrorResponse |
| 424 | Failed Dependency | Embedding Error | ErrorResponse |
| 429 | Too Many Requests | Model is overloaded | ErrorResponse |
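The sketch below builds a `SimilarityRequest` body and pairs returned scores with their sentences. It assumes the response holds one score per entry in `sentences`, in request order, which the source does not state explicitly. Both helper names are hypothetical.

```python
def build_similarity_payload(source_sentence, sentences, **parameters):
    """Assemble a SimilarityRequest body; 'parameters' is added only if given."""
    payload = {"inputs": {"source_sentence": source_sentence, "sentences": sentences}}
    if parameters:
        payload["parameters"] = parameters
    return payload


def rank_sentences(sentences, scores):
    """Pair each sentence with its score and sort from most to least similar."""
    if len(sentences) != len(scores):
        raise ValueError("expected one score per sentence")
    return sorted(zip(sentences, scores), key=lambda pair: pair[1], reverse=True)


sentences = ["What is Machine Learning?", "How do I bake bread?"]
print(build_similarity_payload("What is Deep Learning?", sentences, truncate=True))
print(rank_sentences(sentences, [0.9, 0.1]))  # scores here are made up
```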
tokenize
POST /tokenize
Tokenize inputs
Body parameter
{
"add_special_tokens": true,
"inputs": "string",
"prompt_name": "string"
}
Parameters
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| body | body | TokenizeRequest | true | none |
Example responses
200 Response
[
[
{
"id": 0,
"special": false,
"start": 0,
"stop": 2,
"text": "test"
}
]
]
400 Response
{
"error": "Batch is empty",
"error_type": "empty"
}
413 Response
{
"error": "Batch size error",
"error_type": "validation"
}
422 Response
{
"error": "Tokenization error",
"error_type": "tokenizer"
}
Responses
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Tokenized ids | TokenizeResponse |
| 400 | Bad Request | Batch is empty | ErrorResponse |
| 413 | Payload Too Large | Batch size error | ErrorResponse |
| 422 | Unprocessable Entity | Tokenization error | ErrorResponse |
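The 200 response is a list of token lists, one per input. As a sketch, the hypothetical helper below extracts the ids from such a response, optionally dropping tokens flagged as special, which yields lists in the shape that `/decode` accepts.

```python
def token_ids(tokenize_response, include_special=True):
    """Return one list of token ids per tokenized input."""
    return [
        [tok["id"] for tok in sequence if include_special or not tok["special"]]
        for sequence in tokenize_response
    ]


# The 200 example from this section, as a Python literal.
response = [[{"id": 0, "special": False, "start": 0, "stop": 2, "text": "test"}]]
print(token_ids(response))
```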
openai_embed
POST /v1/embeddings
OpenAI compatible route. Returns a 424 status code if the model is not an embedding model.
Body parameter
{
"dimensions": 0,
"encoding_format": "float",
"input": "string",
"model": "string",
"user": "string"
}
Parameters
| Name | In | Type | Required | Description |
|---|---|---|---|---|
| body | body | OpenAICompatRequest | true | none |
Example responses
200 Response
{
"data": [
{
"embedding": [
0.1
],
"index": 0,
"object": "embedding"
}
],
"model": "thenlper/gte-base",
"object": "list",
"usage": {
"prompt_tokens": 512,
"total_tokens": 512
}
}
400 Response
{
"message": "Batch is empty",
"type": "empty"
}
413 Response
{
"message": "Batch size error",
"type": "validation"
}
422 Response
{
"message": "Tokenization error",
"type": "tokenizer"
}
424 Response
{
"message": "Inference failed",
"type": "backend"
}
429 Response
{
"message": "Model is overloaded",
"type": "overloaded"
}
Responses
| Status | Meaning | Description | Schema |
|---|---|---|---|
| 200 | OK | Embeddings | OpenAICompatResponse |
| 400 | Bad Request | Batch is empty | OpenAICompatErrorResponse |
| 413 | Payload Too Large | Batch size error | OpenAICompatErrorResponse |
| 422 | Unprocessable Entity | Tokenization error | OpenAICompatErrorResponse |
| 424 | Failed Dependency | Embedding Error | OpenAICompatErrorResponse |
| 429 | Too Many Requests | Model is overloaded | OpenAICompatErrorResponse |
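A small sketch of reading an `OpenAICompatResponse` follows. The helper names are hypothetical. Sorting by `index` is a precaution in case items arrive out of order, and `int()` tolerates integer-like strings.

```python
def embeddings_in_order(response):
    """Return the embeddings from 'data', ordered by their 'index' field."""
    items = sorted(response["data"], key=lambda item: int(item["index"]))
    return [item["embedding"] for item in items]


def total_tokens(response):
    """Return usage.total_tokens as an int."""
    return int(response["usage"]["total_tokens"])


# Illustrative payload in the documented shape; values are made up.
response = {
    "data": [
        {"embedding": [0.2], "index": 1, "object": "embedding"},
        {"embedding": [0.1], "index": 0, "object": "embedding"},
    ],
    "model": "thenlper/gte-base",
    "object": "list",
    "usage": {"prompt_tokens": 512, "total_tokens": 512},
}
print(embeddings_in_order(response), total_tokens(response))
```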
Schemas
ClassifierModel
{
"id2label": {
"0": "LABEL"
},
"label2id": {
"LABEL": 0
}
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| id2label | object | true | none | none |
| » additionalProperties | string | false | none | none |
| label2id | object | true | none | none |
| » additionalProperties | integer | false | none | none |
DecodeRequest
{
"ids": [
0
],
"skip_special_tokens": true
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| ids | InputIds | true | none | none |
| skip_special_tokens | boolean | false | none | Whether to skip special tokens (defaults to true if not provided) |
DecodeResponse
[
"test"
]
Properties
None
EmbedRequest
{
"dimensions": 0,
"inputs": "string",
"normalize": true,
"prompt_name": "string",
"truncate": false,
"truncation_direction": "Left"
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| dimensions | integer¦null | false | none | The number of dimensions that the output embeddings should have. If not set, the original shape of the representation will be returned instead. |
| inputs | Input | true | none | none |
| normalize | boolean | false | none | Whether to normalize embeddings (defaults to true if not provided) |
| prompt_name | string¦null | false | none | The name of the prompt to use for encoding. If not set, no prompt is applied. Must be a key in the sentence-transformers configuration prompts dictionary. For example, if prompt_name is "query" and the prompts dictionary is {"query": "query: ", ...}, then the sentence "What is the capital of France?" is encoded as "query: What is the capital of France?" because the prompt text is prepended to any text to encode. |
| truncate | boolean¦null | false | none | none |
| truncation_direction | TruncationDirection | false | none | none |
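The `prompt_name` behaviour described in the table can be mimicked client-side as sketched below. The function `apply_prompt` is hypothetical and only illustrates the prepending rule; the server performs this step itself when `prompt_name` is set.

```python
def apply_prompt(text, prompt_name, prompts):
    """Prepend the prompt registered under prompt_name; None means no prompt."""
    if prompt_name is None:
        return text
    if prompt_name not in prompts:
        raise KeyError(f"unknown prompt_name: {prompt_name!r}")
    return prompts[prompt_name] + text


prompts = {"query": "query: "}
print(apply_prompt("What is the capital of France?", "query", prompts))
```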
EmbedResponse
[
[
0,
1,
2
]
]
Properties
None
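The `normalize` flag of `EmbedRequest` controls whether returned vectors are normalized. The source does not say which norm is used; the sketch below assumes L2 (unit length) normalization, which is a common convention, and shows how a raw vector could be normalized client-side when `normalize` is false.

```python
import math


def l2_normalize(vector):
    """Scale vector to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


print(l2_normalize([3.0, 4.0]))
```

With unit-length vectors, the dot product of two embeddings equals their cosine similarity.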
Embedding
[
0.1
]
Properties
oneOf
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | [number] | false | none | none |
xor
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | string | false | none | none |
EmbeddingModel
{
"pooling": "cls"
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| pooling | string | true | none | none |
EncodingFormat
"float"
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | string | false | none | none |
Enumerated Values
| Property | Value |
|---|---|
| anonymous | float |
| anonymous | base64 |
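With `encoding_format` set to `base64`, the `Embedding` schema allows a string in place of a list of numbers. The source does not document the byte layout. The sketch below assumes little-endian 32-bit floats, the convention used by the OpenAI API, and should be verified against real responses before use.

```python
import base64
import struct


def decode_base64_embedding(encoded):
    """Decode a base64 string into floats, assuming little-endian float32."""
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("byte length is not a multiple of 4")
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


# Round trip with values that float32 represents exactly.
encoded = base64.b64encode(struct.pack("<3f", 0.5, -1.0, 2.0)).decode("ascii")
print(decode_base64_embedding(encoded))
```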
ErrorResponse
{
"error": "string",
"error_type": "Unhealthy"
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| error | string | true | none | none |
| error_type | ErrorType | true | none | none |
ErrorType
"Unhealthy"
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | string | false | none | none |
Enumerated Values
| Property | Value |
|---|---|
| anonymous | Unhealthy |
| anonymous | Backend |
| anonymous | Overloaded |
| anonymous | Validation |
| anonymous | Tokenizer |
| anonymous | Empty |
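A client might branch on the error type as sketched below. The enumerated values here are capitalized while the example responses earlier use lowercase, so the comparison is case-insensitive. Treating only `overloaded` as retryable, the exponential backoff schedule, and the fallback to a `type` key (as shown in the `/v1/embeddings` error examples) are choices of this sketch, not documented server behaviour.

```python
RETRYABLE = {"overloaded"}  # assumption: only overload errors are worth retrying


def error_type_of(body):
    """Read the error type from either documented error shape, lowercased."""
    raw = body.get("error_type") or body.get("type") or ""
    return raw.lower()


def should_retry(body):
    """Return True if the error body signals a transient condition."""
    return error_type_of(body) in RETRYABLE


def backoff_delays(attempts, base=0.5):
    """Exponential delays in seconds: base, 2*base, 4*base, ..."""
    return [base * 2 ** i for i in range(attempts)]


print(should_retry({"error": "Model is overloaded", "error_type": "overloaded"}))
print(backoff_delays(3))
```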
Info
{
"auto_truncate": true,
"docker_label": "null",
"max_batch_requests": 0,
"max_batch_tokens": 2048,
"max_client_batch_size": 32,
"max_concurrent_requests": 128,
"max_input_length": 512,
"model_dtype": "float16",
"model_id": "thenlper/gte-base",
"model_sha": "fca14538aa9956a46526bd1d0d11d69e19b5a101",
"model_type": {
"classifier": {
"id2label": {
"0": "LABEL"
},
"label2id": {
"LABEL": 0
}
}
},
"sha": "null",
"tokenization_workers": 4,
"version": "0.5.0"
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| auto_truncate | boolean | true | none | none |
| docker_label | string¦null | false | none | none |
| max_batch_requests | integer¦null | false | none | none |
| max_batch_tokens | integer | true | none | none |
| max_client_batch_size | integer | true | none | none |
| max_concurrent_requests | integer | true | none | Router Parameters |
| max_input_length | integer | true | none | none |
| model_dtype | string | true | none | none |
| model_id | string | true | none | Model info |
| model_sha | string¦null | false | none | none |
| model_type | ModelType | true | none | none |
| sha | string¦null | false | none | none |
| tokenization_workers | integer | true | none | none |
| version | string | true | none | Router Info |
Input
"string"
Properties
oneOf
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | InputType | false | none | none |
xor
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | [InputType] | false | none | none |
InputIds
[
0
]
Properties
oneOf
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | [integer] | false | none | none |
xor
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | [array] | false | none | none |
InputType
"string"
Properties
oneOf
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | string | false | none | none |
xor
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | [integer] | false | none | none |
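Read together, the `Input` and `InputType` schemas accept a single string, a single list of integers, or a list mixing such items. The sketch below classifies a value accordingly. The function names are hypothetical, the reading of an integer list as pre-tokenized ids is an interpretation, and reporting an empty list as a batch is a choice of this sketch since the schema is ambiguous there.

```python
def is_input_type(value):
    """True for a string or a list of integers (one InputType item)."""
    if isinstance(value, str):
        return True
    return isinstance(value, list) and all(
        isinstance(v, int) and not isinstance(v, bool) for v in value
    )


def classify_input(value):
    """Return 'single' or 'batch' for a valid Input, else raise TypeError."""
    if value == []:
        return "batch"  # ambiguous in the schema; treated as an empty batch here
    if is_input_type(value):
        return "single"
    if isinstance(value, list) and all(is_input_type(v) for v in value):
        return "batch"
    raise TypeError("not a valid Input value")


print(classify_input("hello"), classify_input([1, 2, 3]), classify_input(["a", [1, 2]]))
```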
ModelType
{
"classifier": {
"id2label": {
"0": "LABEL"
},
"label2id": {
"LABEL": 0
}
}
}
Properties
oneOf
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | object | false | none | none |
| » classifier | ClassifierModel | true | none | none |
xor
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | object | false | none | none |
| » embedding | EmbeddingModel | true | none | none |
xor
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | object | false | none | none |
| » reranker | ClassifierModel | true | none | none |
OpenAICompatEmbedding
{
"embedding": [
0.1
],
"index": 0,
"object": "embedding"
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| embedding | Embedding | true | none | none |
| index | integer | true | none | none |
| object | string | true | none | none |
OpenAICompatErrorResponse
{
"code": 0,
"error_type": "Unhealthy",
"message": "string"
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| code | integer(int32) | true | none | none |
| error_type | ErrorType | true | none | none |
| message | string | true | none | none |
OpenAICompatRequest
{
"dimensions": 0,
"encoding_format": "float",
"input": "string",
"model": "string",
"user": "string"
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| dimensions | integer¦null | false | none | none |
| encoding_format | EncodingFormat | false | none | none |
| input | Input | true | none | none |
| model | string¦null | false | none | none |
| user | string¦null | false | none | none |
OpenAICompatResponse
{
"data": [
{
"embedding": [
0.1
],
"index": 0,
"object": "embedding"
}
],
"model": "thenlper/gte-base",
"object": "list",
"usage": {
"prompt_tokens": 512,
"total_tokens": 512
}
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| data | [OpenAICompatEmbedding] | true | none | none |
| model | string | true | none | none |
| object | string | true | none | none |
| usage | OpenAICompatUsage | true | none | none |
OpenAICompatUsage
{
"prompt_tokens": 512,
"total_tokens": 512
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| prompt_tokens | integer | true | none | none |
| total_tokens | integer | true | none | none |
SimilarityInput
{
"sentences": [
"What is Machine Learning?"
],
"source_sentence": "What is Deep Learning?"
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| sentences | [string] | true | none | A list of strings which will be compared against the source_sentence. |
| source_sentence | string | true | none | The string that you wish to compare the other strings with. This can be a phrase, sentence, or longer passage, depending on the model being used. |
SimilarityParameters
{
"prompt_name": "string",
"truncate": false,
"truncation_direction": "Left"
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| prompt_name | string¦null | false | none | The name of the prompt to use for encoding. If not set, no prompt is applied. Must be a key in the sentence-transformers configuration prompts dictionary. For example, if prompt_name is "query" and the prompts dictionary is {"query": "query: ", ...}, then the sentence "What is the capital of France?" is encoded as "query: What is the capital of France?" because the prompt text is prepended to any text to encode. |
| truncate | boolean¦null | false | none | none |
| truncation_direction | TruncationDirection | false | none | none |
SimilarityRequest
{
"inputs": {
"sentences": [
"What is Machine Learning?"
],
"source_sentence": "What is Deep Learning?"
},
"parameters": {
"prompt_name": "string",
"truncate": false,
"truncation_direction": "Left"
}
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| inputs | SimilarityInput | true | none | none |
| parameters | SimilarityParameters¦null | false | none | none |
SimilarityResponse
[
0,
1,
0.5
]
Properties
None
SimpleToken
{
"id": 0,
"special": false,
"start": 0,
"stop": 2,
"text": "test"
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| id | integer(int32) | true | none | none |
| special | boolean | true | none | none |
| start | integer¦null | false | none | none |
| stop | integer¦null | false | none | none |
| text | string | true | none | none |
TokenizeInput
"string"
Properties
oneOf
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | string | false | none | none |
xor
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | [string] | false | none | none |
TokenizeRequest
{
"add_special_tokens": true,
"inputs": "string",
"prompt_name": "string"
}
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| add_special_tokens | boolean | false | none | Whether to add special tokens (defaults to true if not provided) |
| inputs | TokenizeInput | true | none | none |
| prompt_name | string¦null | false | none | The name of the prompt to use for encoding. If not set, no prompt is applied. Must be a key in the sentence-transformers configuration prompts dictionary. For example, if prompt_name is "query" and the prompts dictionary is {"query": "query: ", ...}, then the sentence "What is the capital of France?" is encoded as "query: What is the capital of France?" because the prompt text is prepended to any text to encode. |
TokenizeResponse
[
[
{
"id": 0,
"special": false,
"start": 0,
"stop": 2,
"text": "test"
}
]
]
Properties
None
TruncationDirection
"Left"
Properties
| Name | Type | Required | Restrictions | Description |
|---|---|---|---|---|
| anonymous | string | false | none | none |
Enumerated Values
| Property | Value |
|---|---|
| anonymous | Left |
| anonymous | Right |
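The source does not define what the two directions mean. The sketch below assumes the convention of the Hugging Face tokenizers library: `Left` drops tokens from the start of the sequence and keeps the end, while `Right` drops tokens from the end and keeps the start. The function `truncate_ids` is hypothetical and only illustrates that reading.

```python
def truncate_ids(ids, max_length, direction):
    """Keep at most max_length ids, dropping from the named side (assumed semantics)."""
    if max_length < 0:
        raise ValueError("max_length must be non-negative")
    if direction == "Right":
        return ids[:max_length]
    if direction == "Left":
        return ids[max(len(ids) - max_length, 0):]
    raise ValueError("direction must be 'Left' or 'Right'")


ids = [1, 2, 3, 4, 5]
print(truncate_ids(ids, 3, "Left"), truncate_ids(ids, 3, "Right"))
```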