Thinking Past the Answer: Evaluating Harmful Overthinking in Large Reasoning Models

Abstract

Large Reasoning Models improve performance by producing explicit intermediate reasoning traces with additional test-time compute. However, longer reasoning is not always beneficial. We ask whether a model that has already reached the correct answer continues to refine that answer or instead drifts away from it. To study this, we introduce a prefix-level trajectory evaluation protocol grounded in reasoning sufficiency: the minimum reasoning budget required for a model to first generate the correct answer. This separates verbose overthinking, where additional reasoning is redundant but harmless, from harmful overthinking, where continued reasoning destabilizes an already-correct trajectory. Across multimodal and language-only benchmarks, stopping at the first correct prefix improves accuracy over default reasoning, revealing that current models are limited not only by their ability to reason, but also by their inability to stop at the right time.

Reasoning sufficiency as difficulty

We evaluate a reasoning trace prefix by prefix. For each partial trace, the model is forced to provide an answer. The first prefix that yields the correct answer defines the empirical sufficient reasoning budget for that model and instance. Any reasoning beyond that point is overthinking: it is verbose if the final answer remains correct, and harmful if the final answer becomes incorrect.
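
The protocol above can be illustrated with a minimal Python sketch. It is not the authors' implementation: `force_answer` is a hypothetical callable that returns the model's forced answer for a given prefix, the trace is assumed to be pre-split into steps, and exact string match against the gold answer is an assumed correctness check.

```python
from typing import Callable, List, Optional, Sequence


def evaluate_prefixes(
    steps: Sequence[str],
    force_answer: Callable[[Sequence[str]], str],  # hypothetical model hook
    gold: str,
) -> List[bool]:
    """Correctness of the forced answer after each prefix steps[:k], k = 1..len(steps)."""
    # Assumption: exact string match decides correctness.
    return [force_answer(steps[:k]) == gold for k in range(1, len(steps) + 1)]


def sufficient_budget(prefix_correct: Sequence[bool]) -> Optional[int]:
    """Index of the first prefix whose forced answer is correct, or None if there is none."""
    for i, ok in enumerate(prefix_correct):
        if ok:
            return i
    return None


def classify_trajectory(prefix_correct: Sequence[bool]) -> str:
    """Label a trajectory using the first correct prefix and the final answer."""
    first = sufficient_budget(prefix_correct)
    if first is None:
        return "never_correct"       # no prefix yields the correct answer
    if first == len(prefix_correct) - 1:
        return "exactly_sufficient"  # correctness is first reached at the full trace
    # Reasoning continued past the first correct prefix: overthinking.
    return "verbose" if prefix_correct[-1] else "harmful"
```

Here the last entry of `prefix_correct` stands for the answer after the full trace, so a trajectory such as `[False, True, False]` is labeled harmful and `[False, True, True]` is labeled verbose.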

Key idea. Difficulty should be tied to the minimum compute needed to first reach correctness, not to the total length of a model-generated chain of thought.

Main findings

Reasoning length is a poor proxy for difficulty. Large Reasoning Models often reach the correct answer early, then continue generating long traces that are not required for correctness.
Models frequently reason past correct intermediate states. Stopping at the first correct prefix can substantially outperform default full-length reasoning, showing that additional reasoning can be harmful rather than merely redundant.
Free-form generation exposes harmful overthinking more sharply. Without a fixed answer set, unconstrained reasoning is more likely to drift away from an already-correct answer.
Reasoning trajectories are non-monotonic. After first reaching correctness, the probability of staying correct drops as models continue reasoning.
Efficiency methods reduce verbosity, but not necessarily harmful overthinking. Shorter traces remove wasted computation, yet they do not reliably prevent correctness deviations.
Harmful overthinking also appears in language-only reasoning. The phenomenon is not only caused by visual drift; similar instability emerges on math-heavy and knowledge-heavy language benchmarks.
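
The comparisons behind these findings can be sketched from per-prefix correctness flags. The snippet below is an illustrative assumption, not the paper's evaluation code: each trajectory is a non-empty list of booleans (one per prefix, the last being the full trace), and the function names are hypothetical. Note that stopping at the first correct prefix requires the gold answer, so it is best read as an oracle reference point rather than a deployable stopping rule.

```python
from typing import List, Optional, Sequence


def default_accuracy(trajs: Sequence[Sequence[bool]]) -> float:
    """Accuracy when the model reasons to the end of its full trace."""
    return sum(t[-1] for t in trajs) / len(trajs)


def first_correct_stop_accuracy(trajs: Sequence[Sequence[bool]]) -> float:
    """Accuracy when reasoning stops at the first correct prefix (oracle stopping)."""
    return sum(any(t) for t in trajs) / len(trajs)


def stay_correct_curve(
    trajs: Sequence[Sequence[bool]], max_offset: int
) -> List[Optional[float]]:
    """Fraction of trajectories still correct d prefixes after first reaching correctness."""
    curve: List[Optional[float]] = []
    for d in range(max_offset + 1):
        flags = []
        for t in trajs:
            t = list(t)
            if True not in t:
                continue  # never correct: excluded from the curve
            first = t.index(True)
            if first + d < len(t):
                flags.append(t[first + d])
        # None marks offsets that no trajectory reaches.
        curve.append(sum(flags) / len(flags) if flags else None)
    return curve
```

A gap between `first_correct_stop_accuracy` and `default_accuracy` counts trajectories that were correct at some prefix but wrong at the end, and a curve that falls below 1.0 after offset 0 reflects the non-monotonic behavior described above.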

Why does reasoning become harmful?

In harmful trajectories, the model first reaches a correct answer and then changes it. We analyze the segment from the last correct prefix to the final trace and categorize deviations into visual errors, calculation errors, and logical errors. The dominant causes are logical drift and visual reinterpretation rather than arithmetic mistakes.
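
As a rough sketch of how the analyzed segment could be isolated, the function below returns the steps generated after the last correct prefix of a harmful trajectory. The step-level split of the trace and the name `deviation_segment` are assumptions made for illustration; the categorization into visual, calculation, and logical errors is a separate annotation step that is not modeled here.

```python
from typing import List, Optional, Sequence


def deviation_segment(
    steps: Sequence[str], prefix_correct: Sequence[bool]
) -> Optional[List[str]]:
    """Steps after the last correct prefix, or None if the trajectory is not harmful.

    prefix_correct[i] is the correctness of the forced answer after steps[:i + 1].
    """
    if len(steps) != len(prefix_correct):
        raise ValueError("expected one correctness flag per step")
    # Harmful means: correct at some prefix, incorrect at the end of the trace.
    if not any(prefix_correct) or prefix_correct[-1]:
        return None
    last_ok = max(i for i, ok in enumerate(prefix_correct) if ok)
    return list(steps[last_ok + 1:])
```

For a four-step trace with flags `[False, True, False, False]`, the returned segment is the last two steps, which is where the deviation from the correct answer must have occurred.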

BibTeX

@misc{caldarella2026overthinking,
  title        = {Thinking Past the Answer: Evaluating Harmful Overthinking in Large Reasoning Models},
  author       = {Caldarella, Simone and Talon, Davide and Ricci, Elisa and Aljundi, Rahaf and Mancini, Massimiliano},
  year         = {2026},
  note         = {Preprint}
}
