They hid the number zero from an artificial intelligence system in the hope that it would discover it on its own; what happened teaches us an important lesson about the future of AI

Published On: July 1, 2026 at 5:00 PM

Can a machine learn mathematics so deeply that it invents a missing idea on its own? A new Princeton University study poses that question in one of the simplest settings imaginable: basic arithmetic without the number zero.

The researchers trained language models based on the GPT-2 architecture on single-digit arithmetic problems that did not include zero, then tested whether those models could handle zero when it appeared later. The answer was a reality check for AI research. The models did not make the leap on their own, although they picked up zero quickly once they were shown a small number of examples.

Why zero matters

Zero may look simple. It is the key on a keyboard that stands for nothing, the balance in an empty account, the mark on the thermometer on a freezing morning, or the answer when nothing is left.

In mathematics, though, zero is not just a blank space. It works as a number, a placeholder, and a concept that changes how arithmetic behaves. That is why it makes a surprisingly sharp test for artificial intelligence.

The study, titled “Nothing from Something: Can a Language Model Discover 0?” asked whether a model could move beyond what it had been shown. In other words, could it infer a new mathematical structure from nearby examples, or would it need direct help?

A test built to be small

The setup was intentionally modest. The team used simple arithmetic rather than advanced algebra or proof writing, because a smaller problem makes the failure or success easier to measure.
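To make the setup concrete, here is a minimal sketch of how a held-out-digit split could be built for single-digit addition. The problem format ("3+4=7"), the choice of addition, and the rule that a problem is held out whenever the digit appears anywhere in it are all assumptions made for illustration, not details taken from the study.

```python
from itertools import product


def split_problems(held_out: str = "0"):
    """Split single-digit addition problems by whether `held_out` appears.

    Hypothetical format and rule: each problem is written as "a+b=c", and
    any problem whose text contains the held-out digit goes to the test side.
    """
    train_set, held_out_set = [], []
    for a, b in product(range(10), repeat=2):
        problem = f"{a}+{b}={a + b}"
        if held_out in problem:
            held_out_set.append(problem)
        else:
            train_set.append(problem)
    return train_set, held_out_set


train_set, held_out_set = split_problems("0")
print(len(train_set), "training problems,", len(held_out_set), "held out")
```

Under this rule, a sum such as 1+9=10 is also held out, because the answer contains a zero even though neither operand does. Whether the researchers treated such cases the same way is not something this sketch can confirm.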

The models were GPT-2-size systems, with 12 layers, 12 attention heads, and about 124 million parameters. Some were pretrained on language, while others were not, which let the researchers test whether exposure to ordinary text helped with mathematical generalization.
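For readers wondering where a figure like 124 million comes from, the rough sketch below counts the weights of a standard GPT-2-small layout. The hidden size of 768, the vocabulary of 50,257 tokens, and the context length of 1,024 are the usual GPT-2-small values and are assumptions here; the study's models may differ in vocabulary or context length.

```python
def gpt2_param_count(n_layer=12, d_model=768, vocab_size=50257, n_ctx=1024):
    """Count parameters in a GPT-2-style decoder with tied output embeddings.

    Defaults are the standard GPT-2-small values (assumed, not from the study).
    The number of attention heads does not change the count: 12 heads simply
    split the 768-wide projections into 64-wide slices.
    """
    embeddings = vocab_size * d_model + n_ctx * d_model
    # Attention: fused query/key/value projection plus output projection.
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Feed-forward: expand to 4x width, then project back.
    feed_forward = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two layer norms per block, each with a scale and a bias vector.
    layer_norms = 2 * 2 * d_model
    final_norm = 2 * d_model
    return embeddings + n_layer * (attention + feed_forward + layer_norms) + final_norm


total = gpt2_param_count()
print(f"{total:,} parameters")
```

Most of the total sits in two places: the token embedding table and the twelve repeated blocks.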

That matters because many AI systems look impressive when they solve problems similar to their training data. The harder question is whether they can step outside that familiar zone. That is where the zero test comes in.


The models hit a wall

When zero was held back during training, the models could not reliably generalize to it at test time. The result held even for models that had been pretrained on language.

That detail matters. Language pretraining gave the models broad exposure to text, patterns, and perhaps everyday ideas like “nothing,” but it did not make them spontaneously discover zero in this arithmetic setting.

It sounds almost too basic. Yet that is the point. If a neural model struggles to infer zero in a clean, simple test, researchers have reason to be careful when making sweeping claims about AI discovering new mathematics.

Language still gave the models a boost

The story does not end with failure. Once the models were given a small number of arithmetic examples involving zero, their performance improved substantially.

For the GPT-2-size model pretrained on filtered OpenWebText, 64 zero examples, equal to 0.64% of the training data, were enough to push test accuracy above 60%. By 1,024 examples, or 10.24% of the training data, the model reached average test accuracy above 90%.

Here is the interesting part. The pretrained model needed about 50% fewer examples than a comparable model without language pretraining to reach the same level of accuracy. In practical terms, language did not hand the model the concept of zero, but it seems to have made the learning path shorter.
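The example counts and percentages reported for the pretrained model can be checked against each other with a few lines of arithmetic. The training-set size computed below is only what those percentages imply; it is an inference, not a figure stated in this article.

```python
from fractions import Fraction

# Reported: 64 zero examples make up 0.64% of the training data.
zero_examples = 64
reported_share = Fraction("0.0064")

# Implied total number of training examples (an inference, not a stated figure).
implied_total = zero_examples / reported_share

# Reported: 1,024 zero examples make up 10.24% of the training data.
larger_share = Fraction(1024) / implied_total

print("implied training set size:", implied_total)
print("share for 1,024 examples:", float(larger_share * 100), "percent")
```

Both reported percentages are consistent with the same training-set size, which suggests the zero examples were counted against a fixed total.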

Was zero special?

The researchers also tested what happened when other digits were removed from training. This helped them ask a sharper question. Was zero uniquely hard, or were edge numbers generally difficult?

The answer was nuanced. In base-10 arithmetic, zero and nine were among the hardest digits to generalize to. In base-8 tests, zero and seven were hardest, with the top digit playing a special role because adding it to any nonzero digit triggers a carry.
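A short sketch shows why the two edge digits behave so differently from the rest when it comes to carrying. It counts, for each digit, how often adding another single digit produces a carry. This is an illustration of the arithmetic only, under the assumption that every partner digit is equally likely; it is not the authors' analysis.

```python
def carry_rate(digit: int, base: int) -> float:
    """Fraction of single-digit partners that cause a carry when added to `digit`."""
    carries = sum(1 for partner in range(base) if digit + partner >= base)
    return carries / base


for base in (10, 8):
    rates = [carry_rate(d, base) for d in range(base)]
    print(f"base {base}:", rates)
```

Zero never produces a carry, while the top digit produces one with every partner except zero. The digits in between sit on a smooth ramp between those two extremes, so the edge digits are the only ones without a neighbor on both sides.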

Middle digits were easier. The authors suggest that the models may be better at interpolating between familiar neighboring digits than extrapolating to numbers at the edge of the system. That is a small technical point, but it says a lot about how these systems may “learn” number patterns.

What this reveals about AI discovery

The study lands at a moment when AI companies and researchers are talking more openly about automated mathematics. Some systems can now perform strongly on difficult contests, and that has sparked serious debate about whether AI could help expand human knowledge.

But benchmarks can be tricky. A model may do well because it has absorbed similar structures during training, not because it has created a new idea from scratch. The Princeton test tries to separate those two things in a very controlled way.

That makes the result useful beyond arithmetic. In science, climate modeling, materials research, energy systems, and other fields, the dream is not just faster pattern matching. It is reliable discovery. The trouble is, true discovery needs more than polished answers.

A small experiment with a large warning

This research does not show that language models are useless at math. It shows something more careful and arguably more important: some kinds of mathematical generalization remain hard, even when the problem looks easy to people.

It also suggests that language may act like scaffolding. It can help a model learn a new concept faster once examples appear, but it may not be enough to make the model invent the concept by itself.

So, can a language model discover zero? In this experiment, not without help. And that simple answer may be one of the most valuable things AI researchers can hear right now.

The study was published on arXiv.



Sonia Ramírez

Journalist with more than 13 years of experience in radio and digital media. I have developed and led content on culture, education, international affairs, and trends, with a global perspective and the ability to adapt to diverse audiences. My work has had international reach, bringing complex topics to broad audiences in a clear and engaging way.
