Mistral's Large 2: A response to Meta and OpenAI's newest models
Mistral's Large 2 challenges Meta and OpenAI's latest models with its cutting-edge features and performance. Discover how it stands out in the AI landscape.
When it rains, it pours in the realm of frontier AI models. On Wednesday, Mistral unveiled its new flagship model, Large 2, positioning it as a direct competitor to the latest high-performance models from OpenAI and Meta. Mistral claims Large 2 excels in code generation, mathematics, and reasoning, matching the capabilities of its leading counterparts.
This release comes just a day after Meta introduced its latest open-source model, Llama 3.1 405B. Mistral asserts that Large 2 sets a new standard for performance and cost-efficiency among open models, citing benchmarks to support the claim. According to those benchmarks, Large 2 outperforms Llama 3.1 405B on code generation and mathematics despite having only 123 billion parameters, less than a third of its competitor's size.
A significant improvement in Large 2 is its reduced tendency to generate incorrect or fabricated information, a common issue with AI models. Mistral focused on training the model to be more accurate and transparent, recognizing when it doesn't have sufficient information instead of fabricating plausible-sounding responses.
Mistral, a Paris-based AI startup, recently secured $640 million in a Series B funding round led by General Catalyst, reaching a valuation of $6 billion. Despite being a newer player in AI, Mistral is quickly making strides with cutting-edge models. However, it’s worth noting that Mistral’s models, like many others, are not entirely open-source—commercial use requires a paid license. Implementing such large models also demands substantial expertise and infrastructure, which limits accessibility.
Unlike some competing models, neither Mistral Large 2 nor Meta's Llama 3.1 offers multimodal capabilities. OpenAI leads in this area with models that can process both text and images simultaneously, a feature some startups are still striving to incorporate.
Large 2 features a 128,000-token context window, allowing it to process extensive amounts of data in a single prompt, roughly the equivalent of 300 pages of text. The model also offers enhanced multilingual support, covering a broad array of languages including English, French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean, along with 80 coding languages. Notably, Large 2 is designed to produce more concise responses than many leading AI models, which often generate excessively verbose output.
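As a rough sanity check on the "300 pages" figure, the conversion from tokens to pages depends on assumed ratios that Mistral does not publish; the back-of-envelope sketch below uses common rule-of-thumb values for English text.

```python
# Back-of-envelope estimate: how much text fits in a 128k-token context window?
# The ratios below are rough heuristics, not figures published by Mistral.

CONTEXT_WINDOW_TOKENS = 128_000
WORDS_PER_TOKEN = 0.75   # a common heuristic: one token is ~0.75 English words
WORDS_PER_PAGE = 300     # typical single-spaced manuscript page

approx_words = CONTEXT_WINDOW_TOKENS * WORDS_PER_TOKEN
approx_pages = approx_words / WORDS_PER_PAGE

print(f"~{approx_words:,.0f} words, or roughly {approx_pages:,.0f} pages")
# -> ~96,000 words, or roughly 320 pages
```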
You can access Mistral Large 2 via Google Vertex AI, Amazon Bedrock, Azure AI Studio, and IBM watsonx.ai. It is also available on Mistral’s La Plateforme under the name "mistral-large-2407" and can be tested for free on the startup’s ChatGPT competitor, Le Chat.
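For readers who want to try the model programmatically, the sketch below shows one way to call it through Mistral's OpenAI-compatible chat-completions REST endpoint using the "mistral-large-2407" identifier mentioned above. It is a minimal illustration rather than official documentation, and it assumes you have a La Plateforme API key stored in a MISTRAL_API_KEY environment variable.

```python
# Minimal sketch: querying Mistral Large 2 via the chat-completions REST API.
# Assumes a valid La Plateforme API key is available in MISTRAL_API_KEY.
import os
import requests

API_URL = "https://api.mistral.ai/v1/chat/completions"

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-large-2407",  # model identifier on La Plateforme
        "messages": [
            {"role": "user", "content": "Summarize the key features of Mistral Large 2."}
        ],
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

The same request shape applies when the model is served through the listed cloud platforms, though each provider wraps it in its own SDK and authentication scheme.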