🔎 DPR: Semantic Search & Passage Retrieval Demo
How it works
This app uses Dense Passage Retrieval (DPR), a neural approach to information retrieval that understands meaning, not just keywords.
Unlike traditional keyword search (BM25 / TF-IDF), DPR encodes questions and passages into dense vector representations and finds the best match via dot-product similarity in embedding space.
💬 Try these example questions
🔁 Dual-Encoder Workflow
| Step | Component | What happens |
|---|---|---|
| 1 | Context Encoder | Each passage in the knowledge base is encoded into a 768-dim vector (done once at startup) |
| 2 | Question Encoder | Your query is encoded into a 768-dim vector (done at search time) |
| 3 | Dot Product | We compute score = Q · Pᵀ between the query vector and every passage vector |
| 4 | Ranking | Passages are ranked by score; the highest score = most semantically relevant |
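The four steps above can be sketched in plain NumPy. The random vectors here are stand-ins (an assumption for illustration) for real DPR encoder outputs; the scoring and ranking logic is exactly the dot-product ranking described in the table:

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: the knowledge base is encoded once into 768-dim vectors
# (placeholder random vectors standing in for context-encoder outputs).
passage_vecs = rng.normal(size=(5, 768))

# Step 2: the query is encoded at search time; we fake a query whose
# meaning is "close to" passage 2 by adding small noise to its vector.
query_vec = passage_vecs[2] + 0.01 * rng.normal(size=768)

# Step 3: one dot product per passage gives a relevance score for each.
scores = passage_vecs @ query_vec  # shape (5,)

# Step 4: rank passages, highest score first.
ranking = np.argsort(scores)[::-1]
print(ranking[0])  # → 2, the passage whose vector best matches the query
```

In the real app, only step 2 onward runs per query; the passage matrix from step 1 is computed at startup and reused, which is what makes dense retrieval fast at search time.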
💡 Key insight: "Who landed on the Moon?" will match the Apollo 11 passage even though the passage never contains the phrase "landed on the Moon" verbatim, because the meaning is captured in the vector space.
Models used:
- 🔵 Question Encoder: facebook/dpr-question_encoder-single-nq-base
- 🟢 Context Encoder: facebook/dpr-ctx_encoder-single-nq-base
- 📄 Paper: Dense Passage Retrieval for Open-Domain Question Answering (Karpukhin et al., EMNLP 2020)