Model Catalogue
Curated local and open-source AI models. From community workhorses to experimental frontier architectures.
Spotlight — June 2026
Alibaba's Mixture-of-Experts flagship activates only 22B of its 235B parameters per token. It offers frontier-class reasoning with remarkable efficiency. Hybrid thinking mode toggles chain-of-thought on or off at inference time. Outperforms GPT-4o on multiple benchmarks while remaining fully locally runnable on high-end consumer rigs.
Based on Hermes-3-Llama-3.1-405B, built for thoughtful discussion and helpful, structured reasoning. Strong multilingual generation, precise instruction-following, and clean formatting across technical and analytical tasks.
The newest Hermes generation in an efficient 70B class. Balanced speed and reasoning depth with controlled verbosity, reliable structured output, and strong code generation, debugging, and optimization.
Phi-Reasoning delivers long, thorough reasoning responses for deep research and complex problem solving. Optimal for in-depth analysis, academic assistance, and technical explanations. Can analyse PDF documents.
A 27B dense hybrid model pairing Gated Delta Networks with attention for long-context efficiency. Strong at coding, reasoning, and agent tasks across 201 languages and dialects under Apache 2.0.
A 12B collaboration between Mistral and NVIDIA, designed to be a drop-in efficient workhorse. State-of-the-art multilingual coverage, large context, and strong instruction following on a single consumer GPU.
A compact, efficiency-first assistant model tuned for fast everyday reasoning, summarization, and chat. Designed to deliver responsive performance on modest hardware while staying lightweight.
Poolside's leading coding agent model for intricate software development and agentic engineering workflows. Excels at repository-scale reasoning, debugging, refactoring, and autonomous development with strong tool-calling.
An advanced model combining deep engineering productivity with speed and emotional intelligence. Excels at autonomously driving end-to-end project execution, debugging, and log analysis. Broad file processing support.
A fast, next-generation model combining engineering productivity with speed. Stands out at autonomously driving end-to-end project execution and handling complex tasks like log analysis and debugging.
A low-cost MoE reasoning model designed for coding and agent performance. Dynamically routes each token through a small subset of specialized experts, paired with a hybrid attention architecture, for strong scalability and fast inference.
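For readers new to expert routing, here is a minimal sketch of how a token might be sent through the top-scoring experts in an MoE layer. It illustrates the general top-k routing idea only and is not this model's actual implementation; the names `route_token` and `moe_forward`, the choice of `top_k=2`, and the toy experts are all assumptions made for the example.

```python
import math


def softmax(scores):
    """Convert raw router scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_scores, top_k=2):
    """Pick the top_k experts for one token and renormalize their weights.

    Hypothetical helper: returns a list of (expert_index, weight) pairs
    whose weights sum to 1.
    """
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, experts, router_scores, top_k=2):
    """Run only the selected experts and blend their outputs by weight."""
    return sum(w * experts[i](x) for i, w in route_token(router_scores, top_k))


# Toy experts: each one just scales its input (an assumption for illustration).
experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 4 * x]
output = moe_forward(1.0, experts, router_scores=[0.0, 5.0, 0.0, 5.0])
```

Only the selected experts run for a given token, which is why an MoE model can hold many parameters while spending compute on just a fraction of them per token.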
A unified iteration consolidating several Mistral models into one system: Magistral's reasoning, Pixtral's multimodal understanding, and Devstral's agentic coding. Handles analysis, software dev, and visual tasks in one workflow.
A cost-effective multimodal foundation model that efficiently processes text, images, and video — prioritizing speed and affordability. Massive 1M-token context with adjustable verbosity and temperature controls.
Start running open models today. No API keys, no subscriptions, no data leaving your machine.