Meta's Fundamental AI Research (FAIR) team has announced the release of several new research artifacts and models aimed at advancing machine intelligence and supporting open science. The announcement, made on October 18, includes updates to existing models and the introduction of new technologies across various AI domains.
One of the highlights is the release of Meta Segment Anything Model 2.1 (SAM 2.1), an update to the popular image and video segmentation model. SAM 2.1 boasts improved performance, particularly in handling visually similar and small objects, as well as enhanced occlusion handling. The release includes a new developer suite with training code and web demo components, facilitating easier adoption and customisation by the AI community.
Meta FAIR has also introduced Spirit LM, an open-source multimodal language model that integrates speech and text. This model aims to address limitations in current text-to-speech pipelines by preserving expressive aspects of speech. Two versions have been developed: Spirit LM Base, which uses phonetic tokens, and Spirit LM Expressive, which incorporates pitch and style tokens to capture tonal information.
In an effort to enhance large language model (LLM) efficiency, Meta has released Layer Skip, a solution that speeds up LLM generation without requiring specialised hardware. The release includes inference code and fine-tuned checkpoints for models such as Llama 3, Llama 2, and Code Llama, offering a speedup of up to 1.7x.
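To put the "up to 1.7x" figure in perspective, here is a minimal sketch, assuming it refers to a multiplicative speedup in generation time. The 10-second baseline and the helper names are hypothetical and chosen purely for illustration; they do not come from Meta's release.

```python
def accelerated_time(baseline_seconds: float, speedup: float) -> float:
    """Return the generation time after applying a multiplicative speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_seconds / speedup


def time_saved_fraction(speedup: float) -> float:
    """Return the fraction of the baseline time removed by the speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


# Hypothetical baseline: a response that takes 10 seconds to generate.
baseline = 10.0
best_case = accelerated_time(baseline, 1.7)

print(f"Best-case time: {best_case:.2f} s")          # Best-case time: 5.88 s
print(f"Time saved: {time_saved_fraction(1.7):.0%}")  # Time saved: 41%
```

Because the figure is quoted as "up to", this is a best case; a 1.7x speedup trims roughly two fifths off the generation time rather than halving it.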
The research team has also shared SALSA, a code package enabling researchers to benchmark AI-based attacks on post-quantum cryptography standards. This work aims to validate the security of future cryptographic systems against potential machine learning-based attacks.
Other releases include:
- Meta Lingua: A lightweight codebase for efficient language model training at scale.
- Meta Open Materials 2024: A dataset and models to accelerate inorganic materials discovery.
- MEXMA: A novel pre-trained cross-lingual sentence encoder covering 80 languages.
- Self-Taught Evaluator: A method for generating synthetic preference data to train reward models without human annotations.
Mark Zuckerberg, in a recent open letter cited in the announcement, emphasised the potential of open-source AI to "increase human productivity, creativity, and quality of life" while advancing economic growth and scientific research.