xAI, the artificial intelligence company founded by Elon Musk, announced the release of Grok-1.5V on April 12, 2024.
Grok-1.5V is described as a cutting-edge multimodal large language model capable of understanding and processing both text and visual information. In addition to text, it can interpret a wide range of visual data, including documents, diagrams, charts, screenshots, and photographs.
Grok-1.5V combines natural language processing with advanced computer vision capabilities. This combination supports tasks that draw on both textual and visual information, such as multi-disciplinary reasoning, document comprehension, and real-world spatial understanding.
According to xAI, Grok-1.5V outperforms its peers on a variety of tasks, including the newly introduced RealWorldQA benchmark, which measures a model's ability to understand and reason about real-world spatial relationships.
xAI, founded in July 2023, has quickly established itself as a leader in the field of artificial intelligence. The company's first large language model, Grok-1, was released just four months after the company's founding and was open-sourced in March 2024. With the introduction of Grok-1.5V, xAI continues to push the boundaries of what is possible with AI technology.