Text Embeddings

Models

DS1 currently provides one embedding model with the following specifications:

| Model Name | Dimension | Context Length | Tokenizer | Description |
|---|---|---|---|---|
| DS1-EN-V1 | 512 (L2 normalized) | 512 tokens | 30k WordPiece | High-performance English text retrieval model |
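Because the model's output vectors are listed as L2 normalized, the dot product of two embeddings equals their cosine similarity, so either can be used for ranking. The sketch below illustrates this with plain Python lists. The 4-dimensional toy vectors and the helper names (`l2_normalize`, `dot`, `cosine_similarity`) are assumptions made for illustration; they are not part of any DS1 API, and real embeddings would have 512 dimensions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine similarity computed the long way, without assuming unit length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical toy vectors standing in for 512-dimensional embeddings.
query = l2_normalize([1.0, 2.0, 0.0, 2.0])
document = l2_normalize([2.0, 1.0, 1.0, 0.0])

# For unit-length vectors the two scores agree up to floating-point error.
score_dot = dot(query, document)
score_cos = cosine_similarity(query, document)
```

In practice this means a vector index configured for inner-product search should rank results the same way as one configured for cosine similarity, provided the stored vectors are left unmodified.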

Tokenization

The DS1 embedding model uses the WordPiece tokenizer with the following characteristics:

| Tokenizer Type | Vocabulary Size | Special Tokens |
|---|---|---|
| WordPiece | 30,000 | [PAD], [UNK], [CLS] |
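To give a feel for how WordPiece splits text, here is a minimal sketch of the greedy longest-match-first procedure commonly used with WordPiece vocabularies. Several details are assumptions rather than documented DS1 behavior: the tiny `TOY_VOCAB`, the `##` continuation prefix (the BERT convention), lowercasing and whitespace splitting, prepending `[CLS]`, and truncating to the 512-token context length. The real tokenizer uses a 30,000-entry vocabulary and may differ in all of these respects.

```python
# Hypothetical toy vocabulary; the real vocabulary has 30,000 entries.
TOY_VOCAB = {"[PAD]", "[UNK]", "[CLS]", "embed", "##ding", "##s", "text", "search"}


def wordpiece(word, vocab, unk="[UNK]"):
    """Split one word into sub-word pieces by greedy longest match."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # assumed continuation marker
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits: the whole word maps to the unknown token.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


def tokenize(text, vocab, max_tokens=512):
    """Tokenize text and truncate to an assumed 512-token limit."""
    tokens = ["[CLS]"]  # assumed to be prepended to every input
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    return tokens[:max_tokens]


tokens = tokenize("Text embeddings search", TOY_VOCAB)
```

Counting tokens this way, rather than counting words or characters, is what determines whether an input fits within the context length, since a single rare word can expand into several pieces.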

Modality

DS1 is a text-only embedding model optimized for English language text retrieval and semantic search applications.

Want to know about additional models? Check out our FAQ.