Wayfinder Router routes queries between local and hosted LLMs

A new CLI tool routes LLM queries deterministically between local and cloud models without an extra model call, balancing cost, latency, and privacy.

A GitHub project called wayfinder-router (itsthelore) appeared on Hacker News with a Python CLI that assigns prompts to either a local model or a hosted one. The tool examines prompt length, structure, and wording cues instead of calling another model for the decision. It produces a numeric score and a route recommendation in a few microseconds while running completely offline. The approach targets cost control by keeping simple prompts on smaller local hardware and sending only complex ones to paid APIs.

How the routing works

Wayfinder parses the prompt text for explicit markers. It counts tokens and detects headings, bullet lists, fenced code blocks, inline math, and constraint phrases. These features feed a deterministic scoring function that outputs a single value. A prompt that scores above a user-set threshold goes to the larger model; one that scores below it stays local. No training step is needed, and no external API is called when the routing decision is made.
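The scoring idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature names, the regular expressions, the weights, and the default threshold are hypothetical and are not taken from the wayfinder-router source.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Hypothetical weights; the real project's features and weights may differ.
WEIGHTS = {
    "tokens": 0.01,
    "headings": 0.5,
    "bullets": 0.2,
    "code_blocks": 1.0,
    "math": 0.5,
    "constraints": 0.5,
}

# Hypothetical list of constraint phrases.
CONSTRAINT_RE = re.compile(
    r"\b(must|exactly|at least|at most|no more than)\b", re.IGNORECASE
)


def extract_features(prompt: str) -> dict[str, int]:
    """Count structural markers in the prompt with plain regexes."""
    return {
        "tokens": len(prompt.split()),  # whitespace tokens as a rough proxy
        "headings": len(re.findall(r"^#{1,6}[ \t]", prompt, flags=re.MULTILINE)),
        "bullets": len(re.findall(r"^[ \t]*[-*][ \t]", prompt, flags=re.MULTILINE)),
        "code_blocks": prompt.count(FENCE) // 2,  # one block per pair of fences
        "math": len(re.findall(r"\$[^$\n]+\$", prompt)),
        "constraints": len(CONSTRAINT_RE.findall(prompt)),
    }


def score(prompt: str) -> float:
    """Weighted sum of the feature counts; no randomness, no model call."""
    features = extract_features(prompt)
    return sum(WEIGHTS[name] * count for name, count in features.items())


def route(prompt: str, threshold: float = 2.0) -> str:
    """Send prompts scoring above the threshold to the hosted model."""
    return "hosted" if score(prompt) > threshold else "local"
```

Because the function is a pure computation over the prompt text, the same input always yields the same score, which is the property the article attributes to Wayfinder.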

Users supply their own calibration data to adjust the threshold. The repository includes a small benchmark set and a Makefile target that runs the scorer against sample prompts. Because the logic lives in plain Python, the decision remains identical across runs and machines. This removes the run-to-run variance that can appear when a second model makes the routing decision.
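One way to think about calibration is as a search for the cutoff that best agrees with labeled examples. The sketch below assumes each sample pairs a score with a label saying whether the prompt actually needed the hosted model; the function name and data shape are hypothetical, and the project's own benchmark tooling may work differently.

```python
def calibrate_threshold(samples: list[tuple[float, bool]]) -> float:
    """Pick the cutoff that agrees with the most labels.

    Each sample is (score, needs_hosted). A prompt is routed to the hosted
    model when its score is strictly above the cutoff.
    """
    if not samples:
        raise ValueError("need at least one labeled sample")

    scores = sorted({s for s, _ in samples})
    # Candidates: one value below every score, then each observed score.
    candidates = [scores[0] - 1.0] + scores

    def agreement(cutoff: float) -> int:
        # Number of samples whose predicted route matches the label.
        return sum((s > cutoff) == needs_hosted for s, needs_hosted in samples)

    # max() returns the first best candidate, so ties resolve to the lowest cutoff.
    return max(candidates, key=agreement)
```

A team that cares more about cost than quality could replace the agreement count with a weighted objective that penalizes unnecessary hosted calls more heavily.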

Comparison to model-based routers

Other routing systems rely on a classifier or an LLM call to pick the target. RouteLLM trains on preference pairs and still invokes the classifier at runtime. Hosted options such as NotDiamond or OpenRouter Auto keep the routing logic on their servers. Wayfinder differs by shipping only the structural scorer. It therefore avoids added latency, extra token spend, and any requirement to send the original prompt to a third party just to decide where it should run.

The trade-off is that the structural rules may mis-route edge cases that a learned model would catch. The project documentation lists accuracy figures on its benchmark set but does not claim to beat every trained router. For workloads where prompt patterns stay consistent, the fixed rules deliver predictable cost savings without the overhead of another model call.

Practical setup and constraints

Installation follows standard Python packaging via the project's pyproject.toml file. After cloning the repository, a user can run the CLI against a text file or stdin and receive a JSON object containing the score and recommended route. The same command works inside a Docker container or as a lightweight sidecar in a larger pipeline.
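A pipeline that consumes the JSON decision might look like the sketch below. The field names "score" and "route", the route values "local" and "hosted", and the two backend functions are assumptions for illustration; the actual output schema should be checked against the project's documentation.

```python
import json


# Hypothetical backends; real ones would call a local runtime and a hosted API.
def call_local(prompt: str) -> str:
    return f"[local] {prompt}"


def call_hosted(prompt: str) -> str:
    return f"[hosted] {prompt}"


HANDLERS = {"local": call_local, "hosted": call_hosted}


def dispatch(decision_json: str, prompt: str) -> str:
    """Parse the router's JSON decision and forward the prompt accordingly."""
    decision = json.loads(decision_json)
    route = decision["route"]  # assumed field name
    if route not in HANDLERS:
        raise ValueError(f"unknown route: {route!r}")
    return HANDLERS[route](prompt)
```

Rejecting unknown route values explicitly keeps a schema change in the router from silently sending prompts to the wrong backend.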

The scorer currently handles English prose and common code blocks. Prompts that mix languages or use unusual formatting may need threshold retuning. The code is released under an open license and includes unit tests plus a changelog that tracks rule adjustments. No API keys are stored or required for the routing step itself.

FAQs

Does Wayfinder require any model to be loaded for the routing decision?
No. The decision uses only local text analysis and runs without network access or model weights.

Can the threshold be changed after installation?
Yes. The CLI accepts a configuration file or command-line flag that overrides the default cutoff value.

How does calibration on custom data affect results?
Users run the supplied benchmark script on their own prompt corpus and adjust the threshold until the split between local and hosted models matches observed cost or quality needs.
