Samsung's 7-Million Parameter AI Model Beats Giants 10,000 Times Its Size

Imagine spending $100 million to train an AI model with hundreds of billions of parameters, only to watch it fail at solving a basic puzzle. But now imagine a researcher in Montreal building a model with just 7 million parameters—trained on a fraction of the budget—that beats your flagship model at reasoning.

That's exactly what happened this week.

Alexia Jolicoeur-Martineau, a senior AI researcher at Samsung's Advanced Institute of Technology in Montreal, just released the Tiny Recursive Model (TRM)—a neural network so small it contains just 7 million parameters, yet it outperforms models 10,000 times larger on some of the toughest reasoning benchmarks in AI research. According to a paper titled "Less is More: Recursive Reasoning with Tiny Networks," this tiny model achieved 45% accuracy on ARC-AGI-1 and 8% on ARC-AGI-2—benchmarks specifically designed to test abstract reasoning that should be easy for humans but difficult for AI.

Those numbers surpass DeepSeek R1 (671 billion parameters), OpenAI's o3-mini, and Google's Gemini 2.5 Pro on the same tests, despite TRM using less than 0.01% of their parameters. The model also crushed other reasoning challenges: 87.4% accuracy on Sudoku-Extreme and 85.3% on Maze-Hard, setting new state-of-the-art records.

In the GitHub repository where the code was released, Jolicoeur-Martineau's message was pointed:

"The idea that one must rely on massive foundational models trained for millions of dollars by some big corporation in order to achieve success on hard tasks is a trap. Currently, there is too much focus on exploiting LLMs rather than devising and expanding new lines of direction. With recursive reasoning, it turns out that 'less is more': you don't always need to crank up model size in order for a model to reason and solve hard problems. A tiny model pretrained from scratch, recursing on itself and updating its answers over time, can achieve a lot without breaking the bank."

The secret? Rather than scaling up parameters, TRM uses a recursive reasoning process—essentially making the model review and improve its own work in loops, like a person re-reading their own draft and fixing mistakes with each pass. The model uses just two layers and recursively updates both its internal reasoning and its proposed answer up to 16 times per problem.
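To make that loop concrete, here is a minimal PyTorch sketch of the pattern described above. It is an illustration only, not the released TRM code: the class name, the MLP core, the embedding dimension, and the inner step count are assumptions made for readability. Only the overall shape comes from the paper's description: a tiny shared network that repeatedly refines a latent reasoning state and then revises its answer, for up to 16 improvement passes.

```python
import torch
import torch.nn as nn

class TinyRecursiveSketch(nn.Module):
    """Illustrative sketch: a tiny core reused at every reasoning step."""

    def __init__(self, dim: int = 128):
        super().__init__()
        # Tiny shared core: refines the latent reasoning state z from (x, y, z).
        self.core = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )
        # Small head: revises the current answer y from (y, z).
        self.answer_head = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )

    def forward(self, x: torch.Tensor, n_outer: int = 16, n_inner: int = 6):
        # x: embedded puzzle input, shape (batch, dim)
        y = torch.zeros_like(x)  # current answer embedding, improved over time
        z = torch.zeros_like(x)  # latent "scratchpad" reasoning state
        for _ in range(n_outer):          # up to 16 improvement passes
            for _ in range(n_inner):      # refine the reasoning state
                z = self.core(torch.cat([x, y, z], dim=-1))
            y = self.answer_head(torch.cat([y, z], dim=-1))  # revise the answer
        return y


# Usage: refine answer embeddings for a batch of 4 embedded puzzles.
model = TinyRecursiveSketch(dim=128)
answer = model(torch.randn(4, 128))
print(answer.shape)  # torch.Size([4, 128])
```

The key design point is that the same small core is reused at every pass, so the effective depth of the computation grows with the number of recursions while the parameter count stays fixed. That is how such a small network can afford to "think" for many steps.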

But here's the kicker: when the research team tried to make the model bigger by adding more layers, performance actually declined due to overfitting. Smaller really was better.

The timing of this release is particularly notable. While tech giants continue pouring hundreds of millions into training ever-larger models—GPT-4's training reportedly cost over $100 million—Samsung just proved that architectural innovation might matter more than raw scale. The model was trained in under three days on standard GPUs rather than requiring massive data center infrastructure.

Of course, not everyone was ready to declare victory for the little guy. Researchers were quick to point out important context: TRM was specifically designed for structured, grid-based reasoning problems like puzzles and mazes, not general language tasks. It's a specialist, not a generalist. The model was trained on around 1,000 examples per benchmark with heavy data augmentation, and its recursive approach lets it essentially "think longer" on a problem at test time.
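For a sense of how roughly 1,000 examples per benchmark can be stretched into a much larger training set, here is a hedged NumPy sketch of the kind of grid augmentation commonly used for ARC-style puzzles: rotations, reflections, and color relabeling applied identically to an input/target pair. The function name and the specific transforms are illustrative assumptions; the paper's exact augmentation recipe may differ.

```python
import numpy as np

def augment_pair(inp: np.ndarray, out: np.ndarray,
                 rng: np.random.Generator, n_colors: int = 10):
    """Apply one random grid transform identically to an input/target pair."""
    k = int(rng.integers(4))             # 0-3 quarter-turn rotations
    flip = rng.random() < 0.5            # optional mirror
    palette = rng.permutation(n_colors)  # random relabeling of cell colors

    def transform(grid: np.ndarray) -> np.ndarray:
        g = np.rot90(grid, k=k)
        if flip:
            g = np.fliplr(g)
        return palette[g]                # remap every cell through the palette

    return transform(inp), transform(out)


# Usage: one original pair yields many distinct but equivalent training pairs.
rng = np.random.default_rng(0)
grid_in = np.array([[0, 1], [2, 3]])
grid_out = np.array([[3, 2], [1, 0]])
aug_in, aug_out = augment_pair(grid_in, grid_out, rng)
```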

Still, the implications are hard to ignore. If a model with 7 million parameters can beat models with 671 billion parameters at reasoning—even on a narrow set of tasks—what does that say about the industry's $100 million training runs?

The model is already open source, meaning anyone can download, modify, and use it—even for commercial applications. Whether TRM represents a fundamental shift in AI development or just a clever optimization for specific problem types, it's at least raising the question Silicon Valley seems reluctant to ask: Maybe bigger isn't always better.
