Anthropic has announced a breakthrough in Retrieval-Augmented Generation (RAG) technology. The new method, dubbed "Contextual Retrieval," addresses a critical limitation of traditional RAG systems by preserving context when retrieving information from large knowledge bases.
The new method combines two key techniques: Contextual Embeddings and Contextual BM25. According to Anthropic's tests, this approach can reduce the number of failed retrievals by 49% compared to traditional RAG systems. When combined with a reranking step, the improvement jumps to a 67% reduction in failed retrievals.
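Anthropic does not spell out here how the two rankings are merged, but a common way to combine a semantic (embedding) ranking with a lexical (BM25) ranking is reciprocal rank fusion. The sketch below is illustrative only: the chunk IDs and result lists are invented, and the fusion method is an assumption rather than Anthropic's documented approach.

```python
# Illustrative sketch: merging an embedding-based ranking with a BM25
# ranking via reciprocal rank fusion (RRF). This is a common fusion
# technique, not necessarily the one Anthropic used.

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge several ranked lists of chunk IDs into one combined ranking."""
    scores = {}
    for ranking in ranked_lists:
        for rank, chunk_id in enumerate(ranking):
            # Each list contributes 1 / (k + rank + 1) to a chunk's score.
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical top results from a contextual-embedding search and a
# contextual BM25 search over the same knowledge base.
embedding_hits = ["chunk_7", "chunk_2", "chunk_9"]
bm25_hits = ["chunk_2", "chunk_7", "chunk_4"]

merged = reciprocal_rank_fusion([embedding_hits, bm25_hits])
```

Chunks that rank highly in both lists (here `chunk_7` and `chunk_2`) rise to the top of the merged ranking, which is the intuition behind combining the two retrieval signals.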
Contextual Retrieval works by prepending chunk-specific explanatory context to each piece of information before it's embedded in the knowledge base. This approach helps maintain the relevance and meaning of individual chunks of text, even when they're separated from their original context.
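The preprocessing step described above can be sketched in a few lines. In this minimal sketch, `summarize_for_chunk` is a placeholder for an LLM call (e.g. to Claude) that writes the chunk-specific context; the function names and the context wording are assumptions for illustration, not Anthropic's actual prompt.

```python
# Minimal sketch of Contextual Retrieval's preprocessing step: prepend a
# short, chunk-specific context string to each chunk before it is
# embedded and indexed.

def summarize_for_chunk(document: str, chunk: str) -> str:
    """Placeholder for an LLM call that situates the chunk in its document."""
    title = document.splitlines()[0]
    return f"This chunk is from the document titled '{title}'."

def contextualize_chunks(document: str, chunks: list[str]) -> list[str]:
    """Return chunks with explanatory context prepended, ready to embed."""
    return [f"{summarize_for_chunk(document, chunk)} {chunk}" for chunk in chunks]

doc = "ACME Q2 2023 Report\nRevenue grew 3% over the previous quarter."
chunks = ["Revenue grew 3% over the previous quarter."]
prepared = contextualize_chunks(doc, chunks)
```

The contextualized strings in `prepared`, rather than the raw chunks, are what get embedded, so a later query about "ACME's Q2 revenue" can still match a chunk that never mentions ACME by name.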
Anthropic has published a cookbook that makes Contextual Retrieval straightforward for developers to implement with its AI model, Claude. The company also highlighted the cost-effectiveness of the method when used with its prompt caching feature, estimating a one-time cost of $1.02 per million document tokens to generate contextualised chunks.
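Using the $1.02-per-million-token figure from the article, the one-time cost of contextualising a corpus is simple to estimate. The corpus size below is an invented example, not a figure from Anthropic.

```python
# Back-of-the-envelope estimate using the article's figure: a one-time
# cost of $1.02 per million document tokens with prompt caching enabled.
COST_PER_MILLION_TOKENS = 1.02

def contextualization_cost(document_tokens: int) -> float:
    """One-time cost in USD to generate contextualised chunks."""
    return document_tokens / 1_000_000 * COST_PER_MILLION_TOKENS

# e.g. a hypothetical 10-million-token knowledge base:
cost = contextualization_cost(10_000_000)  # roughly $10.20
```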
The research team at Anthropic conducted extensive experiments across various knowledge domains, including codebases, fiction, ArXiv papers, and science papers. They found that Contextual Retrieval consistently improved performance across all tested embedding models and retrieval strategies.