Rather than scoring search results by the actual probability that a document is relevant to the query, BM25 only ranks documents in order of likely relevance. That's because the absolute probability doesn't matter to the results: only the relative order of the documents does. This heuristic makes the algorithm efficient while still providing excellent results.
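To make the ranking idea concrete, here is a minimal sketch of BM25 scoring in Python. It assumes whitespace tokenization, the commonly used defaults `k1=1.5` and `b=0.75`, and a smoothed IDF variant that stays non-negative; the function names `bm25_scores` and `bm25_rank` are made up for this note and don't come from any library.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document (hypothetical helper, not a library API)."""
    # Assumption: lowercase + whitespace split stands in for a real tokenizer.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for d in tokenized:
        df.update(set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed IDF: rare terms contribute more than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1 and is normalized by document length via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def bm25_rank(query, docs, **kwargs):
    """Return document indices ordered from highest to lowest score."""
    scores = bm25_scores(query, docs, **kwargs)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

The scores are unbounded and can exceed 1, so they can't be read as probabilities. Only the ordering returned by `bm25_rank` is meaningful, which is the point of the note above.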
See also:
- Personal indexing service uses BM25 as well as vector search
- BM25 is often used as part of reciprocal rank fusion
Links to this note
- Prompt Engineering for LLMs - Literature Notes
  Notes from reading Prompt Engineering for LLMs by John Berryman and Albert Ziegler.