API Examples

This page demonstrates practical examples of using the DS1 embedding model through different interfaces and languages.

Exploring the API

Viewing the OpenAPI Schema

DS1 provides a complete OpenAPI specification at /api-doc/openapi.json on your deployed endpoint. You can retrieve and explore it using standard HTTP requests.

Using wget:

bash
wget https://your-sagemaker-endpoint-url/api-doc/openapi.json

Using curl:

bash
curl https://your-sagemaker-endpoint-url/api-doc/openapi.json | jq

Using Python:

python
import requests

endpoint_url = "https://your-sagemaker-endpoint-url"
schema = requests.get(f"{endpoint_url}/api-doc/openapi.json").json()

print(schema)

This schema documents all available endpoints, request formats, and response structures for DS1.
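Once retrieved, the parsed schema can be walked to list the available routes. The helper below assumes only the standard OpenAPI layout (a top-level "paths" object mapping paths to HTTP methods); the sample fragment is illustrative, not DS1's actual schema:

```python
def list_endpoints(schema):
    """Return (METHOD, path, summary) tuples from a parsed OpenAPI schema dict."""
    endpoints = []
    for path, methods in schema.get("paths", {}).items():
        for method, spec in methods.items():
            endpoints.append((method.upper(), path, spec.get("summary", "")))
    return endpoints

# Minimal illustrative fragment of an OpenAPI schema
sample_schema = {
    "paths": {
        "/embed": {"post": {"summary": "Generate embeddings"}},
        "/v1/embeddings": {"post": {"summary": "OpenAI-compatible embeddings"}},
    }
}

for method, path, summary in list_endpoints(sample_schema):
    print(f"{method} {path} - {summary}")
```

In practice you would pass the `schema` dict fetched in the snippet above instead of `sample_schema`.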

Python with Boto3 (AWS SageMaker)

The most common way to interact with DS1 is through AWS SageMaker endpoints using the Boto3 library.

Basic Usage

python
import boto3
import json

# Initialize SageMaker runtime client
sagemaker_runtime = boto3.client('sagemaker-runtime', region_name='eu-west-2')

# Your DS1 endpoint name
ENDPOINT_NAME = 'your-ds1-endpoint-name'

# Sample text to embed
texts = ["What is machine learning?", "How does deep learning work?"]

# Create request payload
payload = {
    "inputs": texts
}

# Invoke the endpoint
response = sagemaker_runtime.invoke_endpoint(
    EndpointName=ENDPOINT_NAME,
    ContentType='application/json',
    Body=json.dumps(payload)
)

# Parse the response
result = json.loads(response['Body'].read().decode())
embeddings = result if isinstance(result, list) else result.get("embeddings", [])

print(f"Generated {len(embeddings)} embeddings")
print(f"Embedding dimension: {len(embeddings[0])}")

Batch Processing

python
def embed_texts_batch(texts, batch_size=32):
    """
    Embed multiple texts in batches for optimal performance
    """
    all_embeddings = []
    
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        payload = {"inputs": batch}
        
        response = sagemaker_runtime.invoke_endpoint(
            EndpointName=ENDPOINT_NAME,
            ContentType='application/json',
            Body=json.dumps(payload)
        )
        
        result = json.loads(response['Body'].read().decode())
        embeddings = result if isinstance(result, list) else result.get("embeddings", [])
        all_embeddings.extend(embeddings)
        
        print(f"Processed batch {i//batch_size + 1}/{(len(texts)-1)//batch_size + 1}")
    
    return all_embeddings

# Example usage
documents = [
    "Document 1 content here...",
    "Document 2 content here...",
    "Document 3 content here...",
]

embeddings = embed_texts_batch(documents, batch_size=32)

OpenAI Compatible API

DS1 provides an OpenAI-compatible endpoint at POST /v1/embeddings for seamless integration with existing OpenAI clients.

Using OpenAI Python Library

python
from openai import OpenAI

# Initialize client pointing to your SageMaker endpoint
client = OpenAI(
    api_key="your-api-key",
    base_url="https://your-sagemaker-endpoint-url"
)

# Create embeddings using OpenAI-compatible interface
response = client.embeddings.create(
    model="DS1-EN-V1",
    input=["What is artificial intelligence?", "Explain machine learning"]
)

# Extract embeddings
embeddings = [item.embedding for item in response.data]
print(f"Generated {len(embeddings)} embeddings")

Direct HTTP Requests

For language-agnostic integration, you can make direct HTTP requests to the SageMaker endpoint.

cURL Example

bash
curl -X POST https://your-sagemaker-endpoint-url/embed \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": ["Hello world", "How are you?"]
  }'

Response Format

json
[
  [0.123, -0.456, 0.789, ...],
  [0.234, -0.567, 0.890, ...]
]

Error Handling

DS1 returns standard HTTP error codes with descriptive error messages:

Status Code | Meaning              | Example
----------- | -------------------- | ---------------------------
200         | Success              | Valid embeddings returned
400         | Bad Request          | Empty batch or invalid input
413         | Payload Too Large    | Batch size exceeds limits
422         | Unprocessable Entity | Tokenization error
424         | Failed Dependency    | Model inference failed
429         | Too Many Requests    | Service overloaded
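Transient 429 (overload) responses are usually handled with retries and exponential backoff. The sketch below is generic: `flaky` is a hypothetical stand-in for your real invoke call, and `RuntimeError` stands in for whatever exception your client raises on overload:

```python
import time

def invoke_with_retry(invoke_fn, max_retries=3, base_delay=1.0):
    """Retry a zero-argument callable on overload errors, doubling the delay each attempt."""
    for attempt in range(max_retries + 1):
        try:
            return invoke_fn()
        except RuntimeError:
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Toy demonstration: fails twice, then succeeds
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429: Too Many Requests")
    return "ok"

print(invoke_with_retry(flaky, base_delay=0.01))  # "ok" after two retries
```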

Handling Errors in Python

python
import json
from botocore.exceptions import ClientError

try:
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType='application/json',
        Body=json.dumps({"inputs": texts})
    )
    result = json.loads(response['Body'].read().decode())
except ClientError as e:
    error_code = e.response['Error']['Code']
    error_message = e.response['Error']['Message']
    print(f"Error {error_code}: {error_message}")
except json.JSONDecodeError:
    print("Failed to parse response")

Performance Tips

  • Batch Requests: Group multiple texts into a single request rather than calling the endpoint once per text
  • Connection Reuse: Reuse a single SageMaker client instance across requests rather than creating a new client per call
  • Batch Size: Test batch sizes between 16 and 32 texts per request for optimal throughput
  • Caching: Store computed embeddings to avoid redundant API calls
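The caching tip above can be sketched as a small in-memory layer keyed by a hash of the input text. `EmbeddingCache` and `fake_embed` are hypothetical names for illustration; a production system would typically back this with Redis or a database:

```python
import hashlib

class EmbeddingCache:
    """In-memory embedding cache; only uncached texts hit the embed function."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn  # callable: list[str] -> list[list[float]]
        self.store = {}

    def get(self, texts):
        missing = [t for t in texts if self._key(t) not in self.store]
        if missing:
            for text, vector in zip(missing, self.embed_fn(missing)):
                self.store[self._key(text)] = vector
        return [self.store[self._key(t)] for t in texts]

    @staticmethod
    def _key(text):
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Toy embedder that records how many texts each call receives
calls = []
def fake_embed(texts):
    calls.append(len(texts))
    return [[float(len(t))] for t in texts]

cache = EmbeddingCache(fake_embed)
cache.get(["hello", "world"])
cache.get(["hello", "again"])  # only "again" reaches the embedder
print(calls)  # [2, 1]
```

In a real deployment, `embed_fn` would wrap the batched SageMaker call shown earlier.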

For detailed API specifications, see the OpenAPI Reference.