Discover our latest hand-picked models and deploy them as dedicated endpoints in minutes.
Explore inference tasks, from text generation to image synthesis.
Browse our curated collections of models grouped by family and use case. Compare and find the right fit for your needs.
DeepSeek
5 items
These cutting-edge models from Chinese AI lab DeepSeek punch well above their weight. They deliver impressive reasoning, coding, and math skills while remaining open and surprisingly affordable to run. Ideal for research assistants, chatbots, and intelligent search applications.
Embeddings
20 items
Convert real-world data into compact numerical representations that capture semantic meaning and syntactic relationships. Perfect for building semantic search engines, RAG pipelines, recommendation systems, and clustering applications.
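As a rough illustration of how embeddings support semantic search, the sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors and document names are made up for the example; in practice the vectors would come from one of the embedding models in this collection and have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Sort (doc_id, score) pairs from most to least similar to the query."""
    scores = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy embeddings; a real model would produce these vectors.
docs = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "gpu-pricing": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stand-in for an embedded question about refunds

ranking = rank_documents(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

The document whose vector points in the direction closest to the query's ranks first, which is the core idea behind semantic search and the retrieval step of a RAG pipeline.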
Gemma
9 items
Google's lightweight open models prove great things come in small packages. Built from the same technology that powers Gemini, they deliver strong reasoning, coding, and conversational skills. Capable enough for serious work, yet compact enough to run almost anywhere.
gpt-oss
3 items
OpenAI's gpt-oss is a family of open-weight language models released under the Apache 2.0 license. They deliver high-quality text generation while staying transparent, customizable, and free for commercial and research use.
Inferentia 2
7 items
Models optimized to run on AWS Inferentia 2, Amazon's purpose-built ML accelerator. Designed for high-throughput, cost-efficient inference at scale. Ideal for production deployments that need the performance of dedicated silicon without the cost of traditional GPU infrastructure.
Llama
8 items
Meta Llama is a versatile suite of open-weight language models that combine efficiency with cutting-edge performance. From lightweight chatbots to research-grade generators, each model delivers fast, context-aware responses while staying accessible and customizable.
llama.cpp
21 items
Discover our curated llama.cpp model collection, optimized for fast, lightweight inference. Powered by llama.cpp's native C/C++ engine, each model runs without Python dependencies, delivering low-latency responses even on modest CPUs or GPUs.
Mistral
6 items
French AI lab Mistral AI delivers world-class open-weight models with top-tier reasoning and instruction-following. Some models gain speed and efficiency from sliding window attention. Built in Europe, they're a natural fit for EU teams prioritizing data sovereignty and regulatory compliance.
OCR
5 items
Optical Character Recognition models that convert images, scanned documents, and PDFs into machine-readable text. Ideal for document digitization, data extraction, and handwriting recognition.
PaliGemma
3 items
Google's family of open vision-language models, pairing a SigLIP vision encoder with a Gemma language model. PaliGemma combines image and text understanding, making it a good fit for image captioning, visual question answering, and document understanding. Compact enough for low-resource environments.
Qwen
30 items
Alibaba's family of open-weight large language models, covering text, code, math, and multimodal tasks. Qwen models come in a wide range of sizes and specializations, consistently ranking among the top open models in their class.