raullenchai/vllm-mlx
OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.
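Because the server advertises OpenAI compatibility, clients would drive it with the standard chat-completions request shape. A minimal sketch of that payload follows; the endpoint path, port, and model id are illustrative assumptions, not taken from this repo:

```python
import json

# Assumed local endpoint for an OpenAI-compatible server; the actual
# host/port vllm-mlx listens on is not stated here.
BASE_URL = "http://localhost:8000/v1/chat/completions"

# Standard OpenAI chat-completions payload shape.
payload = {
    "model": "mlx-community/Llama-3.2-3B-Instruct-4bit",  # hypothetical model id
    "messages": [
        {"role": "user", "content": "Summarize continuous batching in one sentence."}
    ],
    "stream": True,  # streaming is typical for continuous-batching servers
}

# Serialize as the request body an HTTP client would POST to BASE_URL.
body = json.dumps(payload)
print(body)
```

Any OpenAI-compatible client (for example, the official `openai` SDK with a custom `base_url`) could send this same shape, which is what makes tools like Claude Code interoperable with such a server.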
Stars: 2 · Forks: 0 · Contributors: 14 · Language: Python
Score updated Mar 22, 2026