Text Embeddings

Models

DS1 currently provides one embedding model with the following specifications:

Model Name	Dimension	Context Length	Tokenizer	Description
DS1-EN-V1	512 (L2 normalized)	512 tokens	WordPiece (30,000-token vocabulary)	High-performance English text retrieval model
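
Because the output vectors are L2 normalized, the dot product of two embeddings equals their cosine similarity, so either can be used for ranking. The following is a minimal sketch of that property using only the Python standard library. The vectors are random stand-ins, not real DS1-EN-V1 output, and the helper names (l2_normalize, dot, cosine) are hypothetical and not part of any DS1 client.

```python
import math
import random

DIM = 512  # embedding dimension listed in the table above


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed without assuming unit norm."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


rng = random.Random(0)
# Stand-in vectors: random values, NOT real model embeddings.
query = l2_normalize([rng.gauss(0, 1) for _ in range(DIM)])
doc = l2_normalize([rng.gauss(0, 1) for _ in range(DIM)])

# For unit-norm vectors the two scores agree up to floating-point error.
assert abs(dot(query, doc) - cosine(query, doc)) < 1e-9
```

In practice this means a vector store configured for dot-product search should return the same ranking as one configured for cosine similarity, provided the embeddings are stored unmodified.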

DS1-EN-V1 uses a WordPiece tokenizer with the following characteristics:

Tokenizer Type	Vocabulary Size	Special Tokens
WordPiece	30,000	[PAD], [UNK], [CLS]
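
To illustrate how a WordPiece tokenizer splits text, here is a minimal sketch of the standard greedy longest-match-first procedure. It rests on several assumptions that this page does not confirm: the toy vocabulary stands in for the real 30,000-entry one, continuation pieces carry a "##" prefix, input is lowercased, [CLS] is prepended, and [PAD] fills the sequence to the 512-token context length. The function names wordpiece and encode are hypothetical.

```python
MAX_LEN = 512  # context length listed in the model table

# Toy vocabulary for illustration only (assumption, not the real vocabulary).
VOCAB = {"[PAD]", "[UNK]", "[CLS]", "embed", "##ding", "##s", "text", "search"}


def wordpiece(word, vocab):
    """Split one word into subword pieces by greedy longest match."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # assumed continuation marker
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No piece matches, so the whole word maps to [UNK].
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


def encode(text, vocab=VOCAB, max_len=MAX_LEN):
    """Tokenize, truncate to max_len, and pad with [PAD] (assumed layout)."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    tokens = tokens[:max_len]
    return tokens + ["[PAD]"] * (max_len - len(tokens))
```

With this toy vocabulary, wordpiece("embeddings", VOCAB) returns ['embed', '##ding', '##s'], and a word with no matching pieces becomes ['[UNK]']. Input longer than the context length is simply cut off in this sketch; whether the service truncates or rejects such input is not stated here.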

DS1-EN-V1 is a text-only embedding model optimized for English-language text retrieval and semantic search.

Want to know about additional models? Check out our FAQ.