Hybrid retrieval, and why vector search isn’t enough

Pure vector RAG is the most common first build and the most common reason a decisioning agent cites the wrong passage. Hybrid retrieval runs vector similarity, full-text lexical search, and typed graph traversal over the same query and merges the results, so the passages the agent grounds on are far more likely to be the ones a human reviewer would have pulled.

The failure mode hybrid retrieval fixes

Embed a query, find the nearest chunks, hand them to the model. It works in demos and quietly fails in production the first time a query hinges on an exact token. ‘Is adalimumab covered under this plan?’ depends on the literal drug name and a specific plan clause — both of which a vector search can rank below a fluent-but-irrelevant paragraph about biologics in general.

When the retrieval is wrong, everything downstream is wrong with total confidence. The model reasons beautifully over the wrong passages and cites them. In a chatbot that’s an annoyance; in a claim adjudication it’s a defensible-looking incorrect denial.

Three methods, one query

Vector similarity

Embeddings capture meaning, so a query about ‘turning away a request’ can find a passage about ‘declining authorization’ even with no shared words.
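A minimal sketch of the ranking step, assuming the embeddings already exist. The three-dimensional vectors and chunk names below are hand-made stand-ins for illustration, not output from a real embedding model, and `vector_search` is a hypothetical helper name.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def vector_search(query_vec, chunk_vecs, top_k=3):
    """Return the top_k (chunk_id, score) pairs, most similar first."""
    scored = [
        (chunk_id, cosine_similarity(query_vec, vec))
        for chunk_id, vec in chunk_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical toy embeddings; a real model would produce hundreds of dimensions.
CHUNK_VECS = {
    "declining-authorization": [0.9, 0.1, 0.0],
    "biologics-overview": [0.1, 0.9, 0.2],
    "office-hours": [0.0, 0.1, 0.9],
}
# Stands in for the embedding of 'turning away a request'.
QUERY_VEC = [0.8, 0.2, 0.1]

best_chunk, best_score = vector_search(QUERY_VEC, CHUNK_VECS, top_k=1)[0]
```

Nothing in the ranking looks at the words themselves, which is why paraphrases match and also why an exact drug name earns no special weight.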

Full-text search

Classic lexical ranking (BM25-style) scores passages on the exact tokens in the query, so codes, names, and identifiers surface whenever they appear in the text, regardless of embedding geometry.
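To make the lexical side concrete, here is a small BM25-style scorer in plain Python. It is a sketch under stated assumptions: the tokenizer, the `k1` and `b` defaults, the smoothed IDF variant, and the three sample passages are all illustrative choices, not details of any particular search engine.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split on non-alphanumerics (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Rank docs (id -> text) against query; returns (doc_id, score), best first."""
    if not docs:
        return []
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))  # count each term once per document
    query_terms = set(tokenize(query))
    scores = {}
    for doc_id, tokens in tokenized.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            # Smoothed IDF, always positive: rarer terms weigh more.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / (term_freq[term] + length_norm)
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

# Hypothetical passages for illustration only.
DOCS = {
    "plan-clause-4-2": "Adalimumab is covered under clause 4.2 with prior authorization.",
    "biologics-overview": "Biologics are a class of drugs covered under many plans.",
    "appeals": "Members may appeal a denied request within 60 days.",
}

lexical_ranking = bm25_rank("Is adalimumab covered under this plan?", DOCS)
```

The passage containing the literal drug name ranks first because "adalimumab" appears in only one document and so carries a high IDF weight; the generic biologics paragraph matches only the common words.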

Graph traversal

Typed entities and relations let retrieval walk from a hit to its neighbours: the plan, the covered procedures, the prior decisions on the same subject.
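The walk from a hit to its neighbours can be sketched as a bounded breadth-first traversal over typed edges. The entity IDs, relation names, and the `neighbours_within` helper are hypothetical; a real deployment would query a graph store rather than an in-memory list.

```python
from collections import deque

# Hypothetical typed edges: (source, relation, target).
EDGES = [
    ("claim:1001", "filed_under", "plan:gold"),
    ("plan:gold", "covers", "procedure:infusion"),
    ("claim:1001", "has_prior_decision", "decision:2023-17"),
    ("plan:silver", "covers", "procedure:xray"),
]

def build_adjacency(edges):
    """Index directed edges by source node."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))
    return adjacency

def neighbours_within(start, edges, max_hops=2, allowed_relations=None):
    """Breadth-first walk from start, following directed edges up to max_hops.

    Returns (entity, relation_used, hop_count) tuples in discovery order.
    If allowed_relations is given, only those relation types are followed.
    """
    adjacency = build_adjacency(edges)
    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for relation, target in adjacency.get(node, []):
            if allowed_relations is not None and relation not in allowed_relations:
                continue
            if target in seen:
                continue
            seen.add(target)
            found.append((target, relation, depth + 1))
            queue.append((target, depth + 1))
    return found

context = neighbours_within("claim:1001", EDGES, max_hops=2)
```

Capping the hop count and filtering by relation type keeps the walk from pulling in unrelated entities such as a different plan's procedures.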

How Vihaya uses it

In Vihaya the hybrid layer is the Context Mesh — a tenant-local store that holds typed entities, relations, knowledge chunks, and episodic memory together. Every agent reads from it with all three methods and writes its decisions back, so future runs on the same subject have history to draw on. Implementation details stay between Vihaya and the design-partner customer.

Hybrid retrieval FAQ

What is hybrid retrieval?

Retrieval that runs more than one search method over the same query and merges the results: vector similarity for semantic match, full-text (lexical) search for exact terms, and optionally graph traversal over typed relations. The merged set is typically more precise and more complete than the results of any single method.
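The merge step is not specified here, so the sketch below uses reciprocal rank fusion (RRF), one common way to combine ranked lists without having to calibrate their raw scores against each other. Treat it as an assumption for illustration: the constant `k=60` is a conventional default, and the hit lists are made up.

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each list contributes 1 / (k + rank) for every ID it contains,
    so items ranked well by several methods rise to the top.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical per-method results for one query, best hit first.
vector_hits = ["biologics-overview", "plan-clause-4-2", "appeals"]
lexical_hits = ["plan-clause-4-2", "biologics-overview"]
graph_hits = ["plan-clause-4-2", "decision:2023-17"]

fused = reciprocal_rank_fusion([vector_hits, lexical_hits, graph_hits])
```

In this toy run the plan clause ends up first even though vector search alone ranked the generic biologics paragraph above it, because two of the three methods put the clause at the top.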

Why does vector search miss things?

Vector search ranks by semantic similarity in embedding space. That’s ideal for paraphrase but weak on exact tokens: a passage containing a specific drug name, a clause number, or an account ID can sit far from the query in embedding space even when the query uses the same token. Lexical search matches those tokens exactly.

What does graph traversal add?

Some questions are relationships, not similarities: ‘which policy covers this procedure for this plan tier.’ A typed entity-relation graph lets retrieval start from a hit and walk to related entities — the covered procedures, the plan rules, the prior decisions — picking up context a flat search never would.

How does this affect citations?

Citations are only useful if the retrieved passages are the ones a reviewer would have pulled by hand. Hybrid retrieval raises that hit rate, so the citation attached to a decision points at genuinely load-bearing source text rather than a loosely-similar paragraph.
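One way to put a number on that hit rate is to compare the top-k retrieved passages against the set a reviewer pulled by hand. The function name and the sample IDs below are hypothetical, and this is only one of several reasonable definitions (it is recall at k).

```python
def hit_rate_at_k(retrieved, reviewer_pulled, k=5):
    """Fraction of reviewer-pulled passages found in the top-k retrieved list."""
    wanted = set(reviewer_pulled)
    if not wanted:
        return 0.0  # nothing to find, so report zero rather than divide by zero
    top_k = set(retrieved[:k])
    return len(top_k & wanted) / len(wanted)

# Hypothetical example: the reviewer pulled two passages, retrieval found one in its top 2.
rate = hit_rate_at_k(
    retrieved=["plan-clause-4-2", "biologics-overview", "appeals"],
    reviewer_pulled=["plan-clause-4-2", "decision:2023-17"],
    k=2,
)
```

Tracking this per retrieval method and for the merged list shows whether the hybrid setup is actually surfacing more of the passages reviewers rely on.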
