For builders
Just switch your inference URL.
A drop-in replacement for your existing inference setup. Point your code at our endpoint and keep the same request format, the same response shape, and the same streaming behavior.
Works with LangChain, LiteLLM, n8n, and any SDK that speaks the standard chat completions API. Run Llama, Mistral, Gemma, and more.
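Here is a rough sketch of what the switch might look like in plain Python, using only the standard library. The base URL, the API key variable names, and the `/chat/completions` path are assumptions for illustration (the path follows the common chat completions convention); substitute the values from your own account.

```python
import json
import os
import urllib.request

# Hypothetical placeholders: replace with your real endpoint and key.
BASE_URL = os.environ.get("INFERENCE_BASE_URL", "https://inference.example.com/v1")
API_KEY = os.environ.get("INFERENCE_API_KEY", "placeholder-key")


def build_chat_request(base_url, api_key, model, messages, stream=False):
    """Build (but do not send) a chat completions request for base_url."""
    payload = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        # Assumed path; only the base URL changes when you switch providers.
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request):
    """Send a non-streaming request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To try it, build a request with `build_chat_request(BASE_URL, API_KEY, model, messages)` and pass it to `send`. The request body stays identical across providers, so in this sketch moving to a new endpoint means changing `BASE_URL` and nothing else.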