Silicon Valley-based Contextual AI, founded by RAG pioneers Douwe Kiela and Amanpreet Singh, has unveiled RAG 2.0, a platform that promises to enhance retrieval-augmented generation (RAG) performance by up to 10 times, potentially transforming enterprise AI applications.

Contextual AI, a startup backed by NVIDIA's NVentures, builds upon the foundational work of CEO Douwe Kiela and CTO Amanpreet Singh, who introduced the concept of retrieval-augmented generation in a seminal 2020 paper.

RAG, which allows large language models (LLMs) to access and incorporate real-time data, has become crucial for enterprises seeking to leverage AI effectively. Kiela explains the motivation behind their work: "When ChatGPT happened, we saw this enormous frustration where everybody recognized the potential of LLMs, but also realised the technology wasn't quite there yet. We knew that RAG was the solution to many of the problems."

The key innovation of RAG 2.0 lies in tightly coupling the retriever architecture with the LLM's generator. This unified approach allows both components to be tuned jointly through backpropagation, resulting in significant gains in precision and response quality.
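Contextual AI has not published the internals of RAG 2.0, but the general idea of joint tuning, as described in Kiela and Singh's original RAG work, can be sketched in miniature: the generator's likelihood is marginalised over a softmax distribution of retriever scores, so one loss drives gradients into both components. All numbers and names below are illustrative toy values, not the company's actual system.

```python
import math

def softmax(scores):
    # Numerically stable softmax over retriever scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical retriever scores for three candidate documents given a query.
retriever_scores = [2.0, 0.5, -1.0]

# Generator likelihood p(answer | query, doc) for each candidate document.
gen_likelihoods = [0.8, 0.3, 0.05]

# Marginal likelihood: p(answer | query) = sum_d p(d | query) * p(answer | query, d)
p_docs = softmax(retriever_scores)
p_answer = sum(p * g for p, g in zip(p_docs, gen_likelihoods))

# Joint negative log-likelihood. In an end-to-end system, the gradient of this
# single loss flows through p_docs back into the retriever's parameters, so the
# retriever and generator are trained together rather than frozen separately.
loss = -math.log(p_answer)
```

Because the retrieval distribution sits inside the loss, a document that helps the generator produce the right answer gets its retriever score pushed up automatically.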

According to Kiela, this integration can lead to dramatic improvements in efficiency: "A 70-billion-parameter model that would typically require significant compute resources could instead run on far smaller infrastructure, one built to handle only 7 billion parameters without sacrificing accuracy."

Contextual AI's platform employs a "mixture of retrievers" approach to tackle the challenge of identifying relevant information across various data formats. This method combines different RAG types with a neural reranking algorithm, enabling the system to efficiently process and prioritise information from diverse sources such as text, video, and PDFs.
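The mixture-of-retrievers pattern can be illustrated with a toy sketch: several retrievers score candidates independently, their pools are merged, and a reranker orders the union. Here simple lexical heuristics stand in for real retrievers, and a fixed weighted sum stands in for a learned neural reranker; none of this reflects Contextual AI's actual implementation.

```python
def keyword_score(query, doc):
    # Lexical retriever stand-in: count of query terms appearing in the doc.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d)

def dense_score(query, doc):
    # Dense retriever stand-in: Jaccard similarity as a toy "embedding" match.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d) if q | d else 0.0

def rerank(query, docs, top_k=2):
    # Pool candidates from both retrievers, then order the union with a
    # combined score standing in for a learned neural reranker.
    candidates = set(docs)  # in practice each retriever contributes its own pool
    scored = [(0.5 * keyword_score(query, d) + 0.5 * dense_score(query, d), d)
              for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]

corpus = [
    "quarterly revenue report for fintech clients",
    "robotics assembly line maintenance manual",
    "fintech revenue growth in the last quarter",
]
top = rerank("fintech quarterly revenue", corpus)
```

In a production system each retriever would be specialised for a modality (text chunks, video transcripts, parsed PDFs), and the reranker would be a trained cross-encoder rather than a hand-weighted sum.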

The startup's technology has attracted significant interest, with the company recently closing an $80 million Series A funding round. As a member of NVIDIA Inception, a program designed to nurture startups, Contextual AI developed its retrievers using NVIDIA's Megatron-LM on a mix of NVIDIA H100 and A100 Tensor Core GPUs hosted in Google Cloud.

Kiela emphasises the versatility of their solution: "Because of its highly optimised architecture and lower compute demands, RAG 2.0 can run in the cloud, on premises or fully disconnected. And that makes it relevant to a wide array of industries, from fintech and manufacturing to medical devices and robotics."

With approximately 50 employees, Contextual AI plans to double its workforce by the end of the year, signalling strong growth prospects for the company and its technology.

As enterprises continue to grapple with the challenges of implementing effective AI solutions, Contextual AI's RAG 2.0 platform represents a significant advancement in the field. By dramatically improving the performance and efficiency of AI systems, this technology could pave the way for more widespread adoption of AI across various industries and use cases.


