Available endpoints and models
The following four endpoints are available:
- /v1/models
- /v1/chat/completions
- /v1/embeddings
- /rerank
For the chat completions endpoint, the model gpt-oss-120b is currently available. This is a pure text model.
The embeddings endpoint uses the model bge-m3, which converts texts into vector representations.
The rerank endpoint uses the model bge-reranker-v2-m3. This model is employed together with bge-m3 and its purpose is to select the most relevant top‑K results from the hits generated by the embedding model.
