Mistral AI and NVIDIA have unveiled Mistral NeMo 12B, a state-of-the-art language model designed for easy customisation and deployment in enterprise applications, offering high performance across diverse tasks.

Guillaume Lample, co-founder and chief scientist of Mistral AI, emphasised the collaboration's significance: "We are fortunate to collaborate with the NVIDIA team, leveraging their top-tier hardware and software. Together, we have developed a model with unprecedented accuracy, flexibility, high-efficiency and enterprise-grade support and security thanks to NVIDIA AI Enterprise deployment."

The 12-billion-parameter model, trained on the NVIDIA DGX Cloud AI platform, boasts a 128K context length, allowing it to process extensive and complex information more coherently and accurately. It excels in multi-turn conversations, maths, common sense reasoning, world knowledge, and coding.

Key features of Mistral NeMo 12B include:

- Released under the Apache 2.0 licence

- Uses FP8 data format for reduced memory size and faster deployment

- Packaged as an NVIDIA NIM inference microservice for easy deployment

- Designed to fit on a single NVIDIA L40S, GeForce RTX 4090, or RTX 4500 GPU
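The single-GPU claim follows from simple arithmetic: in FP8, each parameter occupies one byte, so the 12-billion-parameter weights need roughly 12 GB, leaving headroom within the 24 GB of a GeForce RTX 4090 (actual usage is higher once the KV cache and activations are counted). A back-of-envelope sketch; the VRAM figures below reflect NVIDIA's published specs for these cards and are stated here as assumptions:

```python
# Back-of-envelope memory estimate for serving a 12B-parameter model in FP8.
PARAMS = 12e9          # 12 billion parameters
BYTES_PER_PARAM = 1    # FP8 stores one byte per parameter

weight_gb = PARAMS * BYTES_PER_PARAM / 1e9
print(f"FP8 weights: ~{weight_gb:.0f} GB")

# Assumed VRAM of the GPUs named above, in GB. Note: KV cache and
# activations consume additional memory beyond the raw weights.
gpus = {"L40S": 48, "RTX 4090": 24, "RTX 4500": 24}
for name, vram in gpus.items():
    status = "weights fit" if weight_gb < vram else "too small"
    print(f"{name} ({vram} GB): {status}")
```

The same model stored in FP16 would need roughly 24 GB for weights alone, which is why the FP8 format is what makes single-consumer-GPU deployment plausible.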

The model's development leveraged NVIDIA's full stack, including TensorRT-LLM for accelerated inference performance and the NVIDIA NeMo development platform for building custom generative AI models.

Mistral NeMo 12B is immediately available for testing on ai.nvidia.com, with a downloadable NIM coming soon. Its flexibility allows for deployment in cloud, data centre, or RTX workstation environments.
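NIM microservices expose an OpenAI-compatible HTTP API, so testing the hosted model typically amounts to a standard chat-completions request. A minimal sketch of such a request body; the endpoint URL and model identifier here are illustrative assumptions, so confirm the exact values in NVIDIA's API catalogue on ai.nvidia.com:

```python
import json

# Hypothetical values -- verify the real endpoint and model id
# on ai.nvidia.com before use.
ENDPOINT = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL_ID = "mistralai/mistral-nemo-12b-instruct"  # assumed identifier

payload = {
    "model": MODEL_ID,
    "messages": [
        {"role": "user", "content": "Summarise FP8 inference in one sentence."}
    ],
    "max_tokens": 128,
}
headers = {
    "Authorization": "Bearer $NVIDIA_API_KEY",  # placeholder API key
    "Content-Type": "application/json",
}

# Serialised request body, ready to POST with any HTTP client:
print(json.dumps(payload, indent=2))
```

Because the interface follows the OpenAI chat-completions convention, existing client libraries and tooling built against that API shape can generally be pointed at the NIM endpoint with only a base-URL and key change.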

The launch of Mistral NeMo 12B represents a step forward in enterprise AI capabilities. By combining Mistral AI's expertise with NVIDIA's advanced hardware and software ecosystem, the new model is poised to open up opportunities for companies across various industries.


