@tinyclaw/matcher
Three-dimensional text matching that handles synonyms, typos, and partial matches without external embedding APIs.
Installation
Overview
The hybrid matcher combines three scoring dimensions:
- Keyword overlap - TF-IDF-like weighting with stop-word filtering
- Fuzzy matching - Levenshtein distance for typo tolerance
- Synonym expansion - Built-in + user-extensible synonym groups
API Reference
createHybridMatcher(config?)
Create a hybrid matcher instance.
Parameters:
minScore - Minimum combined score to consider a match.
Weights for each scoring dimension. Must sum to 1.0.
- keyword - Keyword overlap weight
- fuzzy - Fuzzy matching weight
- synonym - Synonym expansion weight
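The constraint that the three weights sum to 1.0 can be checked before a matcher is built. The following is a minimal sketch, not the package's own validation: the `WeightsExample` type and the `weightsAreValid` helper are hypothetical names introduced for illustration, and the tolerance value is an assumption.

```typescript
// Hypothetical shape mirroring the three documented weight fields.
interface WeightsExample {
  keyword: number;
  fuzzy: number;
  synonym: number;
}

// Returns true when all weights are finite, non-negative, and sum to 1.0
// within a small floating-point tolerance (the tolerance is an assumption).
function weightsAreValid(w: WeightsExample, epsilon: number = 1e-9): boolean {
  const values = [w.keyword, w.fuzzy, w.synonym];
  if (values.some((v) => !Number.isFinite(v) || v < 0)) return false;
  const total = values.reduce((sum, v) => sum + v, 0);
  return Math.abs(total - 1.0) <= epsilon;
}
```

Comparing against a tolerance rather than with `===` avoids rejecting weights such as 0.5, 0.3, 0.2 because of floating-point rounding.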
HybridMatcher
HybridMatcher Interface
Score how well two text strings match semantically.
Returns a MatchResult with:
- score - Combined weighted score (0.0–1.0)
- keywordScore - Keyword overlap sub-score
- fuzzyScore - Fuzzy/Levenshtein sub-score
- synonymScore - Synonym expansion sub-score
findBest
(query: string, candidates: Array<{id: string, text: string}>) => {id: string, result: MatchResult} | null
Find the best match from a list of candidates.
Returns null if no candidate exceeds minScore.
Register a custom synonym group. All words in the group are considered equivalent.
Usage Examples
Basic Scoring
Finding Best Match
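The selection logic that findBest is described as performing (pick the top-scoring candidate, return null if none exceeds minScore) can be sketched without the package. Everything here is an assumption for illustration: `overlapScore` is a stand-in scorer, not the hybrid score, and `findBestSketch` returns a bare `score` number rather than a full MatchResult.

```typescript
interface CandidateExample {
  id: string;
  text: string;
}

// Stand-in scorer (an assumption): fraction of query words present in the candidate text.
function overlapScore(query: string, text: string): number {
  const queryWords = query.toLowerCase().split(/\s+/).filter(Boolean);
  const textWords = new Set(text.toLowerCase().split(/\s+/).filter(Boolean));
  if (queryWords.length === 0) return 0;
  const hits = queryWords.filter((w) => textWords.has(w)).length;
  return hits / queryWords.length;
}

// Returns the highest-scoring candidate, or null when no score exceeds minScore.
function findBestSketch(
  query: string,
  candidates: CandidateExample[],
  minScore: number
): { id: string; score: number } | null {
  let best: { id: string; score: number } | null = null;
  for (const c of candidates) {
    const s = overlapScore(query, c.text);
    if (s > minScore && (best === null || s > best.score)) {
      best = { id: c.id, score: s };
    }
  }
  return best;
}
```

The comparison is strict (`s > minScore`) to follow the wording "exceeds minScore"; whether the package treats a score equal to the threshold as a match is not stated here.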
Custom Synonyms
Built-in Synonyms
The matcher includes 20 synonym groups covering common agent task vocabulary:
Developer roles
- developer, engineer, coder, programmer
Research & analysis
- research, analyze, investigate, study, examine
Content creation
- write, compose, draft, author, create
Design & planning
- design, architect, blueprint, plan, layout
Testing & validation
- test, verify, validate, check, assess
Bug fixing
- fix, repair, patch, resolve, debug
Review processes
- review, evaluate, critique, audit, inspect
Documentation
- document, describe, explain, annotate, record
Optimization
- optimize, improve, enhance, refine, tune
Deployment
- deploy, release, ship, publish, launch
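One way to represent groups like these is a map from each word to a group identifier, so that an equivalence check is two lookups. This is a sketch under stated assumptions, not the package's data structure: `SynonymIndex`, `addGroup`, and `areSynonyms` are hypothetical names, and a word registered in two groups simply keeps the most recent one.

```typescript
// Maps each word to the id of its synonym group; words sharing an id are treated as equivalent.
class SynonymIndex {
  private groupOf = new Map<string, number>();
  private nextGroup = 0;

  // Registers one group of equivalent words (hypothetical method name).
  // A word already present in another group is reassigned to this one.
  addGroup(words: string[]): void {
    const id = this.nextGroup++;
    for (const w of words) this.groupOf.set(w.toLowerCase(), id);
  }

  // True when the words are identical (ignoring case) or belong to the same group.
  areSynonyms(a: string, b: string): boolean {
    const la = a.toLowerCase();
    const lb = b.toLowerCase();
    if (la === lb) return true;
    const ga = this.groupOf.get(la);
    const gb = this.groupOf.get(lb);
    return ga !== undefined && ga === gb;
  }
}

// Two of the groups listed above, registered the same way a custom group would be.
const synonyms = new SynonymIndex();
synonyms.addGroup(["developer", "engineer", "coder", "programmer"]);
synonyms.addGroup(["fix", "repair", "patch", "resolve", "debug"]);
```

Because lookups go through a Map, memory grows with the number of registered words rather than with the texts being compared.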
Scoring Algorithm
1. Tokenization
Text is tokenized by:
- Converting to lowercase
- Removing punctuation
- Splitting on whitespace
- Filtering stop words (“the”, “is”, “and”, etc.)
- Removing tokens shorter than 3 characters
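The five steps above can be sketched as a single function. This is an illustrative version only: the stop-word list is a small placeholder rather than the package's list, and punctuation handling is simplified to ASCII letters and digits.

```typescript
// Placeholder stop-word list (an assumption); the real list is not reproduced here.
const STOP_WORDS = new Set(["the", "is", "and", "a", "an", "of", "to", "in", "for"]);

function tokenize(text: string): string[] {
  return text
    .toLowerCase()                    // 1. convert to lowercase
    .replace(/[^a-z0-9\s]/g, "")      // 2. remove punctuation (ASCII-only simplification)
    .split(/\s+/)                     // 3. split on whitespace
    .filter((t) => !STOP_WORDS.has(t)) // 4. filter stop words
    .filter((t) => t.length >= 3);    // 5. drop tokens shorter than 3 characters
}
```

For example, `tokenize("Fix the login bug, and deploy it!")` keeps `fix`, `login`, `bug`, and `deploy`, and drops the stop words and the two-character token.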
2. Keyword Score
3. Fuzzy Score
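As a rough illustration of Levenshtein-based typo tolerance, the sketch below computes edit distance between tokens and normalizes it to a similarity in the range 0 to 1. The aggregation in `fuzzyScoreSketch` (averaging each query token's best similarity) is an assumption chosen for illustration, and all function names are hypothetical; the package may combine token distances differently.

```typescript
// Classic dynamic-programming Levenshtein distance using two rolling rows.
function levenshtein(a: string, b: string): number {
  if (a === b) return 0;
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      // deletion, insertion, substitution
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity in [0, 1]: 1 for identical tokens, 0 when every character differs.
function tokenSimilarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

// Assumed aggregation: average, over query tokens, of the best similarity to any target token.
function fuzzyScoreSketch(queryTokens: string[], targetTokens: string[]): number {
  if (queryTokens.length === 0 || targetTokens.length === 0) return 0;
  let total = 0;
  for (const q of queryTokens) {
    let best = 0;
    for (const t of targetTokens) best = Math.max(best, tokenSimilarity(q, t));
    total += best;
  }
  return total / queryTokens.length;
}
```

With this normalization a one-character typo in a six-letter word, such as `deply` for `deploy`, still scores 5/6.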
4. Synonym Score
5. Combined Score
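Given that the result carries a combined weighted score alongside three sub-scores, and that the weights must sum to 1.0, the combination can be read as a weighted sum. The sketch below assumes exactly that; `combineScores` and the two `...Sketch` types are hypothetical names, and the clamp to the 0.0 to 1.0 range is a defensive assumption.

```typescript
interface SubScoresSketch {
  keywordScore: number;
  fuzzyScore: number;
  synonymScore: number;
}

interface MatchResultSketch extends SubScoresSketch {
  score: number;
}

interface CombineWeightsSketch {
  keyword: number;
  fuzzy: number;
  synonym: number;
}

// Weighted sum of the three sub-scores, clamped to [0, 1].
function combineScores(sub: SubScoresSketch, w: CombineWeightsSketch): MatchResultSketch {
  const raw =
    w.keyword * sub.keywordScore +
    w.fuzzy * sub.fuzzyScore +
    w.synonym * sub.synonymScore;
  const score = Math.min(1, Math.max(0, raw));
  return { score, ...sub };
}
```

For instance, sub-scores of 0.5, 0.8, and 1.0 with weights 0.5, 0.3, and 0.2 combine to 0.25 + 0.24 + 0.20 = 0.69. Keeping the sub-scores in the result is what makes the final number explainable.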
Performance
- Tokenization: O(n) where n = text length
- Keyword scoring: O(min(q, t)) where q, t = token counts
- Fuzzy scoring: O(q × t × L) where L = average token length
- Memory: O(synonyms) - constant per matcher instance
- Typical latency: 1-5ms for short texts (<200 chars)
Use Cases
Delegation Reuse
Find existing sub-agents that can handle similar tasks:
Template Matching
Match tasks to role templates:
Comparison with Vector Search
| Feature | Hybrid Matcher | Vector/Embedding Search |
|---|---|---|
| Dependency | Zero (built-in) | OpenAI API or local model |
| Cost | Free | $0.0001/token (OpenAI) |
| Latency | 1–5ms | 50–200ms (API) or 10–50ms (local) |
| Offline | ✅ Fully offline | ❌ Requires API or model |
| Typo tolerance | ✅ Levenshtein-based | ❌ Embedding-dependent |
| Interpretability | ✅ Clear scoring breakdown | ❌ Opaque vector similarity |
Related Packages
- @tinyclaw/delegation - Uses matcher for sub-agent reuse
- @tinyclaw/memory - Uses FTS5 instead for full-text semantic search