# API Examples
This page demonstrates practical examples of using the DS1 embedding model through different interfaces and languages.
## Exploring the API

### Viewing the OpenAPI Schema
DS1 provides a complete OpenAPI specification at `/api-doc/openapi.json` on your deployed endpoint. You can retrieve and explore it using standard HTTP requests.
Using wget:

```shell
wget https://your-sagemaker-endpoint-url/api-doc/openapi.json
```

Using curl:

```shell
curl https://your-sagemaker-endpoint-url/api-doc/openapi.json | jq
```

Using Python:
```python
import requests

endpoint_url = "https://your-sagemaker-endpoint-url"
schema = requests.get(f"{endpoint_url}/api-doc/openapi.json").json()
print(schema)
```

This schema documents all available endpoints, request formats, and response structures for DS1.
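Once retrieved, the schema's `paths` object maps each route to its supported operations. A minimal sketch of walking it, using an inline OpenAPI-style fragment in place of a live endpoint (the routes and summaries below are illustrative, not the actual DS1 schema):

```python
# Illustrative OpenAPI 3.x fragment; a real DS1 schema would come from
# GET /api-doc/openapi.json as shown above.
schema = {
    "paths": {
        "/embed": {"post": {"summary": "Generate embeddings"}},
        "/v1/embeddings": {"post": {"summary": "OpenAI-compatible embeddings"}},
    }
}

def list_routes(schema):
    """Return (METHOD, path) pairs documented in an OpenAPI schema."""
    return [
        (method.upper(), path)
        for path, operations in schema.get("paths", {}).items()
        for method in operations
    ]

for method, path in list_routes(schema):
    print(method, path)
```

The same loop works on the live schema object returned by the `requests` snippet above.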
## Python with Boto3 (AWS SageMaker)
The most common way to interact with DS1 is through AWS SageMaker endpoints using the Boto3 library.
### Basic Usage
```python
import boto3
import json

# Initialize SageMaker runtime client
sagemaker_runtime = boto3.client('sagemaker-runtime', region_name='eu-west-2')

# Your DS1 endpoint name
ENDPOINT_NAME = 'your-ds1-endpoint-name'

# Sample text to embed
texts = ["What is machine learning?", "How does deep learning work?"]

# Create request payload
payload = {
    "inputs": texts
}

# Invoke the endpoint
response = sagemaker_runtime.invoke_endpoint(
    EndpointName=ENDPOINT_NAME,
    ContentType='application/json',
    Body=json.dumps(payload)
)

# Parse the response
result = json.loads(response['Body'].read().decode())
embeddings = result if isinstance(result, list) else result.get("embeddings", [])
print(f"Generated {len(embeddings)} embeddings")
print(f"Embedding dimension: {len(embeddings[0])}")
```

### Batch Processing
```python
def embed_texts_batch(texts, batch_size=32):
    """
    Embed multiple texts in batches for optimal performance
    """
    all_embeddings = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        payload = {"inputs": batch}
        response = sagemaker_runtime.invoke_endpoint(
            EndpointName=ENDPOINT_NAME,
            ContentType='application/json',
            Body=json.dumps(payload)
        )
        result = json.loads(response['Body'].read().decode())
        embeddings = result if isinstance(result, list) else result.get("embeddings", [])
        all_embeddings.extend(embeddings)
        print(f"Processed batch {i//batch_size + 1}/{(len(texts)-1)//batch_size + 1}")
    return all_embeddings

# Example usage
documents = [
    "Document 1 content here...",
    "Document 2 content here...",
    "Document 3 content here...",
]
embeddings = embed_texts_batch(documents, batch_size=32)
```

## OpenAI Compatible API
DS1 provides an OpenAI-compatible endpoint at `POST /v1/embeddings` for seamless integration with existing OpenAI clients.
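The request body follows the OpenAI embeddings format. A representative payload (the model name and inputs are illustrative):

```json
{
  "model": "DS1-EN-V1",
  "input": ["What is artificial intelligence?", "Explain machine learning"]
}
```

If DS1 mirrors the OpenAI API, the response wraps each vector in a `data` array of `{"object": "embedding", "index": ..., "embedding": [...]}` entries, which is what the OpenAI client below unpacks.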
### Using OpenAI Python Library
```python
from openai import OpenAI

# Initialize client pointing to your SageMaker endpoint
client = OpenAI(
    api_key="your-api-key",
    base_url="https://your-sagemaker-endpoint-url"
)

# Create embeddings using OpenAI-compatible interface
response = client.embeddings.create(
    model="DS1-EN-V1",
    input=["What is artificial intelligence?", "Explain machine learning"]
)

# Extract embeddings
embeddings = [item.embedding for item in response.data]
print(f"Generated {len(embeddings)} embeddings")
```

## Direct HTTP Requests
For language-agnostic integration, you can make direct HTTP requests to the SageMaker endpoint.
### cURL Example
```shell
curl -X POST https://your-sagemaker-endpoint-url/embed \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": ["Hello world", "How are you?"]
  }'
```

### Response Format
```json
[
  [0.123, -0.456, 0.789, ...],
  [0.234, -0.567, 0.890, ...]
]
```

## Error Handling
DS1 returns standard HTTP error codes with descriptive error messages:
| Status Code | Meaning | Example |
|---|---|---|
| 200 | Success | Valid embeddings returned |
| 400 | Bad Request | Empty batch or invalid input |
| 413 | Payload Too Large | Batch size exceeds limits |
| 422 | Unprocessable Entity | Tokenization error |
| 424 | Failed Dependency | Model inference failed |
| 429 | Too Many Requests | Service overloaded |
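A `429` or `424` is often transient, so retrying with exponential backoff is a common pattern. A minimal sketch; the `invoke` callable is illustrative and would wrap one of the endpoint calls shown earlier:

```python
import time

def invoke_with_retries(invoke, max_retries=3, base_delay=1.0):
    """Call `invoke()` and retry transient failures with exponential backoff.

    `invoke` is any zero-argument callable that raises on failure, e.g. a
    lambda around sagemaker_runtime.invoke_endpoint(...).
    """
    for attempt in range(max_retries + 1):
        try:
            return invoke()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries; surface the original error
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

In production you would narrow the bare `except Exception` to retryable errors only (e.g. a `botocore` `ClientError` whose status is 429 or 424), so that genuine client mistakes like a `400` fail fast.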
### Handling Errors in Python
```python
import json
from botocore.exceptions import ClientError

try:
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType='application/json',
        Body=json.dumps({"inputs": texts})
    )
    result = json.loads(response['Body'].read().decode())
except ClientError as e:
    error_code = e.response['Error']['Code']
    error_message = e.response['Error']['Message']
    print(f"Error {error_code}: {error_message}")
except json.JSONDecodeError:
    print("Failed to parse response")
```

## Performance Tips
- Batch Requests: Group multiple texts into a single request rather than calling individually
- Connection Reuse: Keep the SageMaker client connection open across multiple requests
- Batch Size: Test batch sizes between 16 and 32 texts per request for optimal throughput
- Caching: Store computed embeddings to avoid redundant API calls
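The caching tip above can be as simple as an in-memory dictionary keyed by the input text. A minimal sketch; the `embed_fn` parameter is illustrative and would wrap one of the endpoint calls shown earlier:

```python
def make_cached_embedder(embed_fn):
    """Wrap an embedding function with an in-memory cache keyed by text.

    `embed_fn` takes a list of strings and returns a list of vectors in
    the same order, e.g. embed_texts_batch from the Boto3 section.
    """
    cache = {}

    def embed(texts):
        # Only send texts we have not embedded before.
        missing = [t for t in texts if t not in cache]
        if missing:
            for text, vector in zip(missing, embed_fn(missing)):
                cache[text] = vector
        return [cache[t] for t in texts]

    return embed
```

For persistence across processes, the same pattern applies with the dictionary swapped for a key-value store such as Redis or a local database.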
For detailed API specifications, see the OpenAPI Reference.