Our Smart LLM Router v2.0 is a query complexity classifier that automatically routes each incoming message to the optimal AI provider.
Simple queries like "What are your hours?" are routed to Groq's Llama 3 model, which runs on Groq's free tier with sub-100ms latency. Complex queries that require reasoning, nuanced understanding, or multi-step logic are routed to Anthropic's Claude 3.5 Sonnet or OpenAI's GPT-4o.
The result? A 60% reduction in AI costs for our tenants while maintaining the same quality of responses. The classifier runs as a lightweight first pass before the main LLM call, adding only ~15ms of latency.
Here's how it works under the hood:
1. **Query Classification** — A lightweight model analyzes the incoming message for complexity signals: question type, required context depth, and domain specificity.
2. **Provider Selection** — Based on the classification, the router selects the optimal provider from the configured pool (Groq, Anthropic, OpenAI, Google).
3. **Failover Logic** — If the primary provider fails or times out, the router automatically retries with the next-best provider.
4. **Cost Tracking** — Every request is logged with token usage and cost, giving tenants full visibility into their AI spend.
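To make step 1 concrete, here is a minimal sketch of a complexity classifier. The real router uses a lightweight model; the heuristic signals below (message length, question-type keywords, multi-part questions) are illustrative assumptions, not the actual feature set.

```python
import re

# Hypothetical reasoning-keyword signal; the production classifier is a
# lightweight model, not a keyword match.
COMPLEX_HINTS = re.compile(r"\b(why|how|compare|explain|plan|analyze)\b", re.I)

def classify(message: str) -> str:
    """Return 'simple' or 'complex' from coarse complexity signals."""
    score = 0
    if len(message.split()) > 30:      # long messages imply deeper context
        score += 1
    if COMPLEX_HINTS.search(message):  # question type hints at reasoning
        score += 1
    if message.count("?") > 1:         # multi-part questions need multi-step logic
        score += 1
    return "complex" if score >= 2 else "simple"
```

A heuristic like this is what keeps the first pass cheap: a few string checks cost microseconds, well inside the ~15ms budget.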
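Steps 2 and 3 (provider selection and failover) can be sketched as a preference-ordered loop. The provider names come from the post, but `ROUTES`, `ProviderError`, and `call_provider` are hypothetical placeholders for the configured pool and the real SDK calls.

```python
# Assumed preference order per query tier; in the product this comes from
# the tenant's configured provider pool.
ROUTES = {
    "simple":  ["groq", "google"],
    "complex": ["anthropic", "openai"],
}

class ProviderError(Exception):
    """Raised on a provider failure or timeout (placeholder)."""

def call_provider(name: str, message: str) -> str:
    """Placeholder for the real provider SDK call."""
    raise NotImplementedError

def route(message: str, tier: str, call=call_provider) -> str:
    """Try providers in preference order, failing over on error/timeout."""
    last_err = None
    for name in ROUTES[tier]:
        try:
            return call(name, message)
        except ProviderError as err:  # primary failed: retry next-best
            last_err = err
    raise RuntimeError("all providers in pool failed") from last_err
```

Passing `call` as a parameter also makes the failover logic easy to test with a stub provider.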
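Step 4 amounts to an append-only ledger keyed by tenant. This sketch is an assumption about the shape of that log; the per-1K-token prices below are made-up placeholders, not real provider pricing.

```python
import time
from dataclasses import dataclass, field

# Placeholder prices per 1K tokens (illustrative only, not actual rates).
PRICE_PER_1K = {"groq": 0.0, "anthropic": 0.003, "openai": 0.005}

@dataclass
class CostLog:
    entries: list = field(default_factory=list)

    def record(self, tenant: str, provider: str, tokens: int) -> float:
        """Log one request with its token usage and computed cost."""
        cost = tokens / 1000 * PRICE_PER_1K[provider]
        self.entries.append({
            "ts": time.time(), "tenant": tenant,
            "provider": provider, "tokens": tokens, "cost": cost,
        })
        return cost

    def spend(self, tenant: str) -> float:
        """Total AI spend for one tenant, for the visibility dashboard."""
        return sum(e["cost"] for e in self.entries if e["tenant"] == tenant)
```

Because every request is logged with provider and token counts, the same ledger can back both the tenant-facing spend view and the 60% cost-reduction measurement.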
This is available now for all Pro and Enterprise plans. Starter plans use a single provider (configurable in settings).