Smart Routing

Tiny Claw’s smart routing system classifies each query by complexity and routes it to an appropriate LLM provider: simple queries go to cheap models, complex queries to powerful ones. Depending on the query mix and provider pricing, this can cut LLM costs by roughly 60–80% compared with sending everything to a single expensive model.
The design is inspired by ClawRouter, with the classifier simplified from 14 dimensions to 8 to suit Tiny Claw’s lightweight architecture.

How It Works

1. Classify Query: analyze the incoming message across 8 dimensions to compute a complexity score.
2. Determine Tier: map the score to a tier (simple, moderate, complex, or reasoning).
3. Route to Provider: the orchestrator selects an appropriate provider based on tier and availability.
4. Execute & Learn: the query is executed, and routing metrics are tracked for future optimization.

Query Tiers

packages/types/src/index.ts
export type QueryTier = 'simple' | 'moderate' | 'complex' | 'reasoning';

Simple

Greetings, factual lookups, short confirmations.
Examples:
  • “Hello”
  • “What is TypeScript?”
  • “Thanks!”

Moderate

Explanations, code snippets, light analysis.
Examples:
  • “Explain how async/await works”
  • “Write a function to validate email”
  • “What’s the difference between let and const?”

Complex

Multi-step tasks, debugging, architecture design.
Examples:
  • “Refactor this API to use dependency injection”
  • “Debug this TypeScript type error”
  • “Design a caching layer for this service”

Reasoning

Deep analysis, proofs, multi-constraint optimization.
Examples:
  • “Prove this algorithm is O(n log n)”
  • “Compare trade-offs between microservices and monolith for this use case”
  • “Design a distributed consensus protocol”

8-Dimension Classifier

Dimension Weights

packages/router/src/classifier.ts
// Excerpt: the `score` function of each entry is omitted here for brevity.
const DIMENSIONS: Array<{
  name: string;
  weight: number;
  score: (text: string, tokens: number) => DimensionResult;
}> = [
  { name: 'reasoning',    weight: 0.20 },  // Analytical/logical keywords
  { name: 'code',         weight: 0.18 },  // Code syntax/technical terms
  { name: 'multiStep',    weight: 0.15 },  // Sequential task markers
  { name: 'technical',    weight: 0.12 },  // Technical domain terms
  { name: 'promptLength', weight: 0.10 },  // Token count
  { name: 'simple',       weight: 0.10 },  // Simple language (negative score)
  { name: 'constraints',  weight: 0.08 },  // Constraint/requirement keywords
  { name: 'creative',     weight: 0.07 },  // Creative task markers
];
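
The weights above sum to 1.0, so the raw score can be read as a weighted sum of the per-dimension scores. A minimal sketch of that combination follows. It assumes each dimension yields a number in roughly [-1, 1]; the `combine` helper and the `DimensionScores` shape are hypothetical and not taken from classifier.ts.

```typescript
// Hypothetical sketch: combine per-dimension scores into one weighted score.
// Weights are copied from the DIMENSIONS table; everything else is assumed.
const WEIGHTS: Record<string, number> = {
  reasoning: 0.20,
  code: 0.18,
  multiStep: 0.15,
  technical: 0.12,
  promptLength: 0.10,
  simple: 0.10,
  constraints: 0.08,
  creative: 0.07,
};

type DimensionScores = Partial<Record<string, number>>;

function combine(scores: DimensionScores): number {
  let total = 0;
  for (const [name, weight] of Object.entries(WEIGHTS)) {
    total += weight * (scores[name] ?? 0); // dimensions that did not fire contribute 0
  }
  return total;
}

// Sanity check on the table itself: the weights should add up to 1.
const weightSum = Object.values(WEIGHTS).reduce((a, b) => a + b, 0);

// A greeting: only the "simple" dimension fires, with a negative score.
const greeting = combine({ simple: -1 });

// A debugging request: code, technical, and multi-step markers fire.
const debugging = combine({ code: 0.9, technical: 0.6, multiStep: 0.3 });

console.log({ weightSum, greeting, debugging });
```

With these made-up inputs the greeting lands at about -0.10 and the debugging request at about 0.28, which is the kind of spread the tier boundaries are meant to separate.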

Scoring Algorithm

Reasoning dimension (weight: 20%)
Keywords:
  • prove, theorem, derive
  • step by step, chain of thought
  • analyze, evaluate, critique
  • compare and contrast
  • explain why, what causes
Scoring:
  • 0 matches: 0.0
  • 1 match: 0.3
  • 2+ matches: 0.6–1.0
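
The keyword table above can be sketched as a small regex-based scorer. This is an illustration, not the code in classifier.ts: the pattern list mirrors the keywords shown, `scoreReasoning` is a hypothetical name, and the ramp used for two or more matches (0.6 plus 0.2 per extra match, capped at 1.0) is an assumption, since the source only gives the range 0.6–1.0.

```typescript
// Hypothetical sketch of a reasoning-dimension scorer.
const REASONING_PATTERNS: RegExp[] = [
  /\bprove\b/i,
  /\btheorem\b/i,
  /\bderive\b/i,
  /\bstep by step\b/i,
  /\bchain of thought\b/i,
  /\banalyze\b/i,
  /\bevaluate\b/i,
  /\bcritique\b/i,
  /\bcompare and contrast\b/i,
  /\bexplain why\b/i,
  /\bwhat causes\b/i,
];

function scoreReasoning(text: string): { score: number; matches: number } {
  // Count how many distinct keyword patterns occur in the text.
  const matches = REASONING_PATTERNS.filter((re) => re.test(text)).length;
  if (matches === 0) return { score: 0.0, matches };
  if (matches === 1) return { score: 0.3, matches };
  // Assumed ramp: 0.6 at two matches, +0.2 per additional match, capped at 1.0.
  return { score: Math.min(1.0, 0.6 + 0.2 * (matches - 2)), matches };
}

console.log(scoreReasoning('Hello'));
console.log(scoreReasoning('Prove this algorithm is O(n log n)'));
console.log(scoreReasoning('Analyze and evaluate this design step by step'));
```

The patterns carry no `g` flag, so `RegExp.prototype.test` stays stateless and the same pattern list can be reused across calls.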

Tier Boundaries

packages/router/src/classifier.ts
const TIER_BOUNDARIES = {
  simple:   -0.05,  // score < -0.05
  moderate:  0.15,  // score -0.05 to 0.15
  complex:   0.35,  // score 0.15 to 0.35
  // reasoning: score >= 0.35
};
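
A minimal sketch of how a score could be mapped onto these boundaries. The `scoreToTier` helper is hypothetical, and treating each lower bound as inclusive is an assumption (the comments above only pin down `< -0.05` and `>= 0.35`).

```typescript
type QueryTier = 'simple' | 'moderate' | 'complex' | 'reasoning';

// Boundaries copied from the excerpt above.
const TIER_BOUNDARIES = { simple: -0.05, moderate: 0.15, complex: 0.35 };

// Hypothetical helper: walk the boundaries from lowest to highest.
function scoreToTier(score: number): QueryTier {
  if (score < TIER_BOUNDARIES.simple) return 'simple';
  if (score < TIER_BOUNDARIES.moderate) return 'moderate';
  if (score < TIER_BOUNDARIES.complex) return 'complex';
  return 'reasoning';
}

console.log(scoreToTier(-0.1), scoreToTier(0.0), scoreToTier(0.28), scoreToTier(0.35));
```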

Classification Result

packages/router/src/classifier.ts
export interface ClassificationResult {
  tier: QueryTier;           // Determined complexity tier
  score: number;             // Raw weighted score across all dimensions
  confidence: number;        // 0–1 confidence based on distance from boundaries
  signals: string[];         // Human-readable signals that contributed
}

Example Classification

import { classifyQuery } from '@tinyclaw/router';

const result = classifyQuery('Debug this TypeScript type error in the authentication module');

// Result:
// {
//   tier: 'complex',
//   score: 0.28,
//   confidence: 0.70,  // 1 / (1 + exp(-12 * 0.07)); nearest boundary is 0.35
//   signals: [
//     'code (debug, typescript, error)',
//     'technical (authentication)',
//     'multi-step (1 marker)'
//   ]
// }

Provider Orchestrator

The orchestrator manages multiple providers and routes queries based on tier.

Configuration

packages/router/src/orchestrator.ts
export interface OrchestratorConfig {
  providers: ProviderTierConfig[];
  fallbackChain?: string[];
}

export interface ProviderTierConfig {
  provider: Provider;
  tiers: QueryTier[];
  priority: number;
}

Example Setup

import { ProviderOrchestrator } from '@tinyclaw/router';

const orchestrator = new ProviderOrchestrator({
  providers: [
    {
      provider: ollamaCloudSimple,  // kimi-k2.5:cloud
      tiers: ['simple', 'moderate'],
      priority: 1,
    },
    {
      provider: ollamaCloudPowerful, // gpt-oss:120b-cloud
      tiers: ['complex', 'reasoning'],
      priority: 1,
    },
    {
      provider: openaiGPT4,
      tiers: ['reasoning'],
      priority: 2,  // Fallback for reasoning if Ollama unavailable
    },
  ],
  fallbackChain: ['ollama-cloud', 'openai', 'anthropic'],
});

Routing Logic

1. Classify Query: determine the tier using the 8-dimension classifier.
2. Filter Providers: collect all providers that support this tier.
3. Sort by Priority: a lower priority number means higher preference.
4. Check Availability: call provider.isAvailable() on each candidate, in priority order.
5. Select Provider: use the first available provider.
6. Fallback: if no tier-specific provider is available, try the fallback chain.

Route Method

packages/router/src/orchestrator.ts
async route(message: string): Promise<RouteResult> {
  // Step 1: Classify
  const classification = classifyQuery(message);
  
  // Step 2: Filter providers by tier, then sort by priority (lowest number first)
  const candidates = this.config.providers
    .filter(p => p.tiers.includes(classification.tier))
    .sort((a, b) => a.priority - b.priority);
  
  // Step 3: Check availability and select
  for (const candidate of candidates) {
    const available = await candidate.provider.isAvailable();
    if (available) {
      return {
        provider: candidate.provider,
        tier: classification.tier,
        confidence: classification.confidence,
        signals: classification.signals,
      };
    }
  }
  
  // Step 4: Fallback
  for (const providerId of this.config.fallbackChain || []) {
    const fallback = this.findProviderById(providerId);
    if (fallback && await fallback.isAvailable()) {
      return {
        provider: fallback,
        tier: classification.tier,
        confidence: classification.confidence,
        signals: [...classification.signals, 'fallback'],
      };
    }
  }
  
  throw new Error('No available providers for this query tier');
}
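
To try the selection logic without real providers, the same flow can be exercised with stubs. The sketch below is self-contained and makes several assumptions: the `Provider` shape (an `id` plus `isAvailable()`), the `stubProvider` factory, and the `selectProvider` function are all hypothetical stand-ins, and the tier is passed in directly instead of being computed by `classifyQuery`.

```typescript
type QueryTier = 'simple' | 'moderate' | 'complex' | 'reasoning';

// Assumed minimal provider shape; the real Provider interface may differ.
interface Provider {
  id: string;
  isAvailable(): Promise<boolean>;
}

interface ProviderTierConfig {
  provider: Provider;
  tiers: QueryTier[];
  priority: number;
}

// Hypothetical stub factory: a provider that is permanently up or down.
function stubProvider(id: string, up: boolean): Provider {
  return { id, isAvailable: async () => up };
}

async function selectProvider(
  tier: QueryTier,
  providers: ProviderTierConfig[],
  fallbackChain: string[] = [],
): Promise<{ provider: Provider; usedFallback: boolean }> {
  // Tier-specific candidates, lowest priority number first.
  const candidates = providers
    .filter((p) => p.tiers.includes(tier))
    .sort((a, b) => a.priority - b.priority);
  for (const c of candidates) {
    if (await c.provider.isAvailable()) {
      return { provider: c.provider, usedFallback: false };
    }
  }
  // Fallback chain: look providers up by id, ignoring their tier lists.
  for (const id of fallbackChain) {
    const match = providers.find((p) => p.provider.id === id)?.provider;
    if (match && (await match.isAvailable())) {
      return { provider: match, usedFallback: true };
    }
  }
  throw new Error('No available providers for this query tier');
}

const providers: ProviderTierConfig[] = [
  { provider: stubProvider('cheap', true), tiers: ['simple', 'moderate'], priority: 1 },
  { provider: stubProvider('powerful', false), tiers: ['complex', 'reasoning'], priority: 1 },
  { provider: stubProvider('premium', true), tiers: ['reasoning'], priority: 2 },
];

selectProvider('reasoning', providers).then((r) => console.log(r.provider.id, r.usedFallback));
```

Here the priority-1 reasoning provider is down, so the reasoning query falls through to the priority-2 one. A complex query has no live tier-specific provider and succeeds only if the fallback chain names one that is up.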

Cost Savings

Before Smart Routing

All queries go to GPT-4:
Daily usage:
- 500 queries
- 100% go to GPT-4 ($0.03/1K input tokens, $0.06/1K output tokens)
- Average: 500 tokens input + 200 tokens output per query

Cost per query:
  Input:  500 tokens * $0.03 / 1000 = $0.015
  Output: 200 tokens * $0.06 / 1000 = $0.012
  Total:  $0.027 per query

Daily cost: 500 * $0.027 = $13.50
Monthly cost: $13.50 * 30 = $405

After Smart Routing

Queries distributed by tier:
Daily usage:
- 500 queries total
  - 200 simple (40%) → Ollama Cloud (free tier)
  - 150 moderate (30%) → Ollama Cloud (free tier)
  - 120 complex (24%) → gpt-oss:120b-cloud ($0.005/1K)
  - 30 reasoning (6%) → GPT-4 ($0.03/1K)

Costs:
  Simple:    200 * $0.000 = $0.00
  Moderate:  150 * $0.000 = $0.00
  Complex:   120 * $0.004 = $0.48   (700 tokens * $0.005/1K ≈ $0.0035, rounded to $0.004)
  Reasoning:  30 * $0.027 = $0.81

Daily cost: $1.29
Monthly cost: $1.29 * 30 = $38.70

Savings: $405 - $38.70 = $366.30/month (90% reduction)
Actual savings depend on:
  • Your query distribution (more simple queries = more savings)
  • Provider pricing (Ollama Cloud free tier is extremely generous)
  • Provider availability (fallbacks may increase costs)
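
The arithmetic above can be reproduced with a few lines of code, which also makes it easy to plug in your own query mix. This is a sketch under stated assumptions: the `Pricing` shape and `costPerQuery` helper are hypothetical, and the $0.005/1K rate is assumed to apply to input and output tokens alike.

```typescript
// Per-1K-token prices in USD, as quoted in the worked example.
interface Pricing {
  inputPer1K: number;
  outputPer1K: number;
}

// Average tokens per query, from the example.
const TOKENS = { input: 500, output: 200 };

function costPerQuery(p: Pricing): number {
  return (TOKENS.input * p.inputPer1K + TOKENS.output * p.outputPer1K) / 1000;
}

const gpt4: Pricing = { inputPer1K: 0.03, outputPer1K: 0.06 };
// Assumption: one flat rate for both input and output tokens.
const gptOss: Pricing = { inputPer1K: 0.005, outputPer1K: 0.005 };
const freeTier: Pricing = { inputPer1K: 0, outputPer1K: 0 };

const DAYS = 30;
const before = 500 * costPerQuery(gpt4) * DAYS;
const after =
  (200 * costPerQuery(freeTier) +
    150 * costPerQuery(freeTier) +
    120 * costPerQuery(gptOss) +
    30 * costPerQuery(gpt4)) *
  DAYS;
const reduction = 1 - after / before;

console.log({ before, after, reduction });
```

Without rounding the complex-tier cost up to $0.004 per query, the monthly total comes out near $36.90 instead of $38.70. Either way the reduction is about 90% for this particular distribution.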

Provider Registry

The registry manages provider plugins and their tier assignments:
packages/router/src/provider-registry.ts
export interface ProviderRegistry {
  register(config: ProviderTierConfig): void;
  unregister(providerId: string): void;
  getForTier(tier: QueryTier): Provider[];
  getAll(): Provider[];
}

Dynamic Registration

import { createProviderRegistry } from '@tinyclaw/router';

const registry = createProviderRegistry();

// Register Ollama Cloud (built-in)
registry.register({
  provider: ollamaCloud,
  tiers: ['simple', 'moderate', 'complex'],
  priority: 1,
});

// Register OpenAI (plugin)
registry.register({
  provider: openaiProvider,
  tiers: ['complex', 'reasoning'],
  priority: 2,
});

// Register Anthropic (plugin)
registry.register({
  provider: anthropicProvider,
  tiers: ['reasoning'],
  priority: 3,
});
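
For readers who want to see what sits behind the interface, here is a minimal in-memory sketch. It is not the implementation in provider-registry.ts: `createInMemoryRegistry` is a hypothetical name, the `Provider` shape with an `id` field is assumed (implied by `unregister(providerId)`), and returning providers sorted by priority from `getForTier` is a design choice of this sketch.

```typescript
type QueryTier = 'simple' | 'moderate' | 'complex' | 'reasoning';

// Assumed minimal provider shape.
interface Provider {
  id: string;
}

interface ProviderTierConfig {
  provider: Provider;
  tiers: QueryTier[];
  priority: number;
}

interface ProviderRegistry {
  register(config: ProviderTierConfig): void;
  unregister(providerId: string): void;
  getForTier(tier: QueryTier): Provider[];
  getAll(): Provider[];
}

// Hypothetical in-memory registry keyed by provider id.
function createInMemoryRegistry(): ProviderRegistry {
  const entries = new Map<string, ProviderTierConfig>();
  return {
    register(config: ProviderTierConfig): void {
      entries.set(config.provider.id, config); // re-registering replaces the old entry
    },
    unregister(providerId: string): void {
      entries.delete(providerId);
    },
    getForTier(tier: QueryTier): Provider[] {
      return Array.from(entries.values())
        .filter((e) => e.tiers.includes(tier))
        .sort((a, b) => a.priority - b.priority) // lowest number first
        .map((e) => e.provider);
    },
    getAll(): Provider[] {
      return Array.from(entries.values()).map((e) => e.provider);
    },
  };
}

const demo = createInMemoryRegistry();
demo.register({ provider: { id: 'ollama-cloud' }, tiers: ['simple', 'moderate', 'complex'], priority: 1 });
demo.register({ provider: { id: 'openai' }, tiers: ['complex', 'reasoning'], priority: 2 });
demo.register({ provider: { id: 'anthropic' }, tiers: ['reasoning'], priority: 3 });

console.log(demo.getForTier('reasoning').map((p) => p.id));
```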

Confidence Scoring

Confidence indicates how sure the classifier is about the tier assignment:
packages/router/src/classifier.ts
function computeConfidence(score: number): number {
  const boundaries = [-0.05, 0.15, 0.35];
  let minDistance = Infinity;
  
  for (const boundary of boundaries) {
    const distance = Math.abs(score - boundary);
    if (distance < minDistance) minDistance = distance;
  }
  
  // Sigmoid: 1 / (1 + exp(-k * distance)), k=12 for steep curve
  return 1 / (1 + Math.exp(-12 * minDistance));
}
Interpretation:
  • confidence > 0.9: very confident, the score is far from every boundary
  • 0.7 < confidence ≤ 0.9: confident
  • 0.5 < confidence ≤ 0.7: uncertain, near a boundary
  • confidence = 0.5: very uncertain, the score sits exactly on a boundary and could go either way (this is the formula's minimum)
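
A few worked values help calibrate these bands. The sketch below repeats `computeConfidence` as shown above so that it runs on its own; the sample scores are arbitrary. One consequence of the formula as written: the moderate and complex bands are each 0.20 wide, so a score inside them is at most 0.10 from a boundary and its confidence tops out near 0.77. Values above 0.9 only occur for scores well outside the outer boundaries.

```typescript
function computeConfidence(score: number): number {
  const boundaries = [-0.05, 0.15, 0.35];
  let minDistance = Infinity;
  for (const boundary of boundaries) {
    const distance = Math.abs(score - boundary);
    if (distance < minDistance) minDistance = distance;
  }
  // Sigmoid of the distance to the nearest boundary, k = 12.
  return 1 / (1 + Math.exp(-12 * minDistance));
}

const onBoundary = computeConfidence(0.15); // distance 0, exactly 0.5
const midComplex = computeConfidence(0.25); // distance 0.10, about 0.77
const farSimple = computeConfidence(-0.3);  // distance 0.25, about 0.95

console.log({ onBoundary, midComplex, farSimple });
```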

Using Confidence

const result = await orchestrator.route(message);

if (result.confidence < 0.6) {
  // Low confidence — consider bumping to next tier for safety
  console.log('Low confidence, considering tier escalation');
}
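
One way to act on low confidence is to bump the query up a tier before selecting a provider. The helper below is a hypothetical sketch, not part of @tinyclaw/router; the 0.6 threshold simply mirrors the snippet above.

```typescript
type QueryTier = 'simple' | 'moderate' | 'complex' | 'reasoning';

// Tiers ordered from cheapest to most capable.
const TIER_ORDER: QueryTier[] = ['simple', 'moderate', 'complex', 'reasoning'];

// Hypothetical helper: move one tier up when confidence is below the threshold.
function escalateIfUncertain(
  tier: QueryTier,
  confidence: number,
  threshold = 0.6,
): QueryTier {
  if (confidence >= threshold) return tier;
  // Clamp so that 'reasoning' stays at the top tier.
  const next = Math.min(TIER_ORDER.indexOf(tier) + 1, TIER_ORDER.length - 1);
  return TIER_ORDER[next];
}

console.log(escalateIfUncertain('moderate', 0.55)); // 'complex'
console.log(escalateIfUncertain('simple', 0.8));    // 'simple'
```

Escalating trades a little cost for safety: a borderline query is answered by a stronger model than it may strictly need.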

Performance

  • Classification Speed: sub-millisecond classification (pure regex + math)
  • Zero Dependencies: no external APIs, runs 100% offline
  • Adaptive: learns from provider availability and routing metrics
  • Transparent: returns human-readable signals for debugging

Debugging

Signals provide transparency into classification decisions:
const result = classifyQuery(
  'Refactor this API to use dependency injection and add comprehensive unit tests'
);

console.log(result);

// Output:
// {
//   tier: 'complex',
//   score: 0.31,
//   confidence: 0.62,  // 1 / (1 + exp(-12 * 0.04)); nearest boundary is 0.35
//   signals: [
//     'code (api, unit tests)',
//     'technical (api, dependency injection)',
//     'multi-step (2 markers)',
//     'long prompt'
//   ]
// }
