Elon Musk's xAI Unveils Grok 3 – An Overview of What Sets It Apart From Other LLMs
The progress made in a year and a half is astonishing.
Elon Musk's xAI has just unveiled Grok 3, the latest iteration of Grok. Grok 3 seems to be proof that scaling laws have not run their course. That view continues to gain ground in Silicon Valley, and several elements support it, starting with the computing capacity dedicated to Grok 3's training.
In the spring of 2024, xAI set out to build a cluster of 100,000 H100 GPUs. Faced with the 18- to 24-month lead times announced by data center suppliers, the company invested in a former Electrolux factory in Memphis, Tennessee. It took around four months to install 100,000 GPUs (120 MW), and three more to increase to 200,000 (250 MW). Tesla Megapack batteries were deployed to back up the generators and absorb the power fluctuations inherent in training Grok 3.
xAI says it is working on a cluster “5 times more powerful”, in this case, 1.2 GW. Elon Musk mentions NVIDIA GB200 accelerator cards, which combine a Grace CPU and Blackwell GPUs.
Reasoning and inference scaling
Inference-time scaling, much discussed in connection with DeepSeek, also appears to benefit Grok 3. It goes hand in hand with the model's reasoning capabilities, which xAI has integrated in the form of a “Big Brain” mode. When activated, the model is given more time and computational resources to deepen its thinking. This reasoning is exposed as a chain of thought accessible to the user, though not in its entirety. The aim is to prevent distillation, that is, using the outputs of one model to train another: a practice of which OpenAI, and with it Washington, accused DeepSeek.
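Distillation, as mentioned here, can be illustrated with a minimal sketch: a student model is trained to match a teacher's output distribution, typically by minimizing a KL-divergence loss over softened logits. The function names, temperature value, and shapes below are illustrative assumptions, not xAI's or DeepSeek's actual setup.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert logits to a probability distribution at a given temperature."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over softened distributions: the core of
    knowledge distillation, where a student learns from a teacher's outputs."""
    p = softmax(teacher_logits, temperature)  # teacher's softened targets
    q = softmax(student_logits, temperature)
    return float(np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean())

# A student that matches the teacher exactly incurs zero loss;
# a mismatched student incurs a positive loss to minimize.
teacher = np.array([[2.0, 0.5, -1.0]])
assert distillation_loss(teacher, teacher) < 1e-9
assert distillation_loss(teacher, np.array([[0.0, 0.0, 0.0]])) > 0.0
```

Hiding part of the chain of thought removes exactly the kind of rich target signal this loss exploits.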
DeepSeek used reinforcement learning almost exclusively (without supervised fine-tuning) to develop the reasoning capabilities of its latest models. xAI does not reveal its recipe but seems to have followed the same path. It does claim, however, to have limited the exercise to math and code problems, with Grok 3 then managing to generalize.
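What makes math and code problems suitable for this kind of reinforcement learning is that answers can be checked automatically. A toy sketch of such a verifiable reward is shown below; this is a generic illustration of the idea, not xAI's or DeepSeek's disclosed implementation.

```python
import re

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Binary reward for a math problem: 1.0 if the last number in the
    model's completion matches the ground truth, else 0.0. This checkable
    signal lets RL proceed without human-labeled preference data."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return 1.0 if numbers and numbers[-1] == ground_truth else 0.0

# Only the completion that ends with the correct answer is rewarded.
assert verifiable_reward("2 + 2 = 4, so the answer is 4", "4") == 1.0
assert verifiable_reward("the answer is 5", "4") == 0.0
```

Generalization to other domains then has to emerge from training on these checkable tasks alone.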
Following on from reasoning capabilities, xAI introduces, like many before it, a “deep search” feature. The promise is by now well known: to perform in a few minutes tasks that could take several hours. The model cites its sources and displays the stages of its progress.
Grok 3 stands out from the competition
xAI announces evaluation results on the AIME (math), GPQA (science), and LiveCodeBench (coding) benchmarks, but does not specify the conditions under which they were obtained. For more meaningful indicators, we can turn to Chatbot Arena. It works as follows: a user submits a query, receives responses from two anonymous models, and votes for the better one. After around 8,000 evaluations, Grok 3's Elo score exceeded 1,400, putting it in the lead.
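Chatbot Arena aggregates these pairwise votes into ratings using the standard Elo update, which can be sketched in a few lines (the K-factor and ratings below are illustrative, not the leaderboard's exact parameters):

```python
def elo_update(r_a: float, r_b: float, a_wins: bool, k: float = 32.0):
    """One Elo update after a head-to-head vote: the winner gains points
    in proportion to how unexpected the result was."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))  # win probability of A
    score_a = 1.0 if a_wins else 0.0
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta

# A win moves the winner's rating up and the loser's down symmetrically;
# upsets against higher-rated models move ratings the most.
new_a, new_b = elo_update(1400.0, 1300.0, a_wins=True)
assert new_a > 1400.0 and new_b < 1300.0
```

Thousands of such updates across many users are what produce a stable leaderboard score.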
Towards a sharper separation between Grok and X
The rollout of Grok 3 on X began on Monday, February 17, 2025, for Premium+ subscribers. It will also be available in the Grok mobile app, currently offered on iOS. The same app will let subscribers take out a dedicated subscription called SuperGrok. Expected to cost $30/month (or $300/year), it will give access to more queries with reasoning and deep search, as well as unlimited image generation.
The most recent versions of Grok 3 will not be available in the mobile app but on the grok.com website. xAI intends to make the mini version of Grok 3 available “free to all in the next few days”. A native voice mode (one that does not pass through the text modality) should follow, before Grok 3 becomes available via the API. As for opening the Grok-2 weights, it is a matter of months, according to xAI.
Grok 3 still suffers from inconsistencies
During its presentation, Grok 3 appeared more at ease with physics simulation (calculating and rendering, in 3D, a viable round-trip trajectory between Earth and Mars) than with creating a video game. The latter was supposed to combine the rules of Tetris (among other things, clearing all complete lines) and Bejeweled (clearing all alignments of three jewels of the same color).
The reasoning mode enables Grok 3 to solve certain problems that state-of-the-art models cannot. For example, determining the amount of computation used to train GPT-2 from the scientific article OpenAI devoted to it. Or, more prosaically, identifying that 9.11 < 9.9 (which is not obvious for LLMs) and that there are three r's in “strawberry”.
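Both of these stumbling blocks are trivial to verify in code, which is partly why they have become standard probes of tokenizer-driven quirks in LLMs:

```python
# Decimal comparison: 9.11 is smaller than 9.9, even though "11" looks
# bigger than "9" when the decimals are read digit by digit.
assert 9.11 < 9.9

# Letter counting: "strawberry" really does contain three r's,
# which models working on subword tokens rather than characters can miss.
assert "strawberry".count("r") == 3
```

The difficulty for an LLM lies in how these strings are tokenized, not in the underlying arithmetic.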
On Simon Willison's (co-creator of Django) well-known test of generating a 2D vector image of a pelican riding a bicycle, Grok 3 does not fare as well as Claude, among others. Humor is not its strong point either.
Impressive results for xAI in just a year and a half of activity
In the space of a year and a half, xAI will have reached, if its announced performance is anything to go by, the level of state-of-the-art models. The company officially launched its activities in July 2023. Five months later, it opened its Grok chatbot in beta. The underlying LLM was Grok-1, built by training Grok-0 (a 33-billion-parameter model) and then improving its coding and, already, its reasoning skills. That brought it up to the level of GPT-3.5, and even Claude 2 in language processing.
At the end of that year, the chatbot arrived on X for Premium+ subscribers. In March 2024, xAI published the weights of Grok-1, in a base version dated October 2023, revealing a Mixture-of-Experts (MoE) architecture, in which specialized sub-networks coexist and are activated depending on the query. Released shortly afterward, Grok-1.5 had, at least on paper, improved problem-solving capabilities. At the same time, the context window was extended from 8k to 128k tokens. xAI then added vision.
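The MoE idea can be sketched minimally: a gating network scores the experts for each input, only the top-k are actually run, and their outputs are combined by the normalized gate weights. Everything below (shapes, expert count, linear experts) is an illustrative toy, not Grok-1's actual code.

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Toy Mixture-of-Experts layer: score all experts, run only the
    top-k, and mix their outputs with softmax-normalized gate weights."""
    scores = x @ gate_w                    # one gating score per expert
    top = np.argsort(scores)[-top_k:]      # indices of the k best-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()               # softmax over the selected experts only
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 4, 8
gate_w = rng.normal(size=(d, n_experts))
# Each "expert" here is just a distinct linear map, fixed via a default arg.
experts = [lambda x, W=rng.normal(size=(d, d)): x @ W for _ in range(n_experts)]
y = moe_forward(rng.normal(size=d), gate_w, experts)
assert y.shape == (d,)
```

The appeal of the design is that only the selected experts' parameters are exercised per query, so total capacity can grow without a proportional rise in per-token compute.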
Then, in the summer of 2024, came the Grok-2 beta, for X Premium subscribers. Its Elo on Chatbot Arena (around 1,280) was below that of GPT-4o and Gemini 1.5. Grok was opened up to all users of the social network in December 2024, with the addition of web search and citations, as well as a new image-generation model.