[CS.AI] SuperThoughts: Revolutionary Reasoning in Superposition

Long Chain-of-Thought (CoT) reasoning enhances problem-solving in large language models (LLMs) but is computationally expensive due to sequential token generation. Recent works have attempted reasoning in continuous latent spaces to bypass discrete token generation, yet they often encounter training instability and struggle to scale to complex, long-horizon tasks due to a lack of supervisory signals.

To address this, we introduce SuperThoughts, which compresses pairs of consecutive CoT tokens into single latent representations and decodes two tokens per step via a lightweight Multi-Token Prediction (MTP) module. This approach preserves discrete token supervision during training while doubling throughput during inference.
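The pairing step can be pictured with a small sketch. The summary does not say how two token embeddings are merged, so the element-wise mean used here is purely an assumption for illustration, and `compress_pairs` is a hypothetical helper name, not something taken from the paper.

```python
from typing import List

Vector = List[float]


def compress_pairs(embeddings: List[Vector]) -> List[Vector]:
    """Merge consecutive token embeddings (t0, t1), (t2, t3), ... into one vector each.

    Assumption: the merge is an element-wise mean. The actual merge used by
    SuperThoughts is not described in the summary. An odd trailing token is
    kept unchanged (the mean of a single vector is itself).
    """
    latents: List[Vector] = []
    for i in range(0, len(embeddings), 2):
        pair = embeddings[i:i + 2]  # one or two vectors
        dim = len(pair[0])
        latents.append([sum(v[d] for v in pair) / len(pair) for d in range(dim)])
    return latents


# Five toy 2-dimensional "token embeddings" become three latent steps.
cot_embeddings = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0], [3.0, 3.0]]
latents = compress_pairs(cot_embeddings)
print(len(cot_embeddings), "tokens ->", len(latents), "latent steps")
```

In this toy setting the sequence the model has to step through is roughly halved, which is where the claimed throughput gain would come from; the MTP module would then be responsible for recovering both tokens from each latent.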

We fine-tune Qwen2.5-Math-1.5B-Instruct, Qwen2.5-Math-7B-Instruct, and Qwen2.5-Math-14B-Instruct, and evaluate on MATH500, AMC, OlympiadBench, and GPQA-Diamond. With a confidence-based adaptive mechanism that reverts to standard one-token decoding when the model is uncertain, SuperThoughts achieves an approximately 20-30% reduction in CoT length at a small cost in accuracy (a drop of 1-2 points on most tasks).
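A rough sketch of how such a confidence gate might behave is shown below. The threshold value, the `pair_conf` scores, and the `adaptive_decode` function are all hypothetical; the summary only states that decoding reverts to the standard mode when the model is uncertain, not how confidence is measured.

```python
from typing import List, Tuple


def adaptive_decode(
    tokens: List[str],
    pair_conf: List[float],
    threshold: float = 0.8,
) -> Tuple[List[str], int]:
    """Simulate confidence-gated two-token decoding.

    tokens:     the sequence a hypothetical model would emit.
    pair_conf:  pair_conf[i] is the (assumed) confidence that tokens i and
                i + 1 can be emitted together in a single step.
    threshold:  assumed cut-off; below it we fall back to one token per step.

    Returns the emitted tokens and the number of decoding steps used.
    """
    output: List[str] = []
    steps = 0
    i = 0
    while i < len(tokens):
        can_pair = i + 1 < len(tokens) and pair_conf[i] >= threshold
        width = 2 if can_pair else 1  # two tokens when confident, else one
        output.extend(tokens[i:i + width])
        i += width
        steps += 1
    return output, steps


tokens = ["x", "=", "3", "+", "4", "="]
pair_conf = [0.95, 0.5, 0.6, 0.9, 0.99, 0.7]
decoded, steps = adaptive_decode(tokens, pair_conf)
print(f"{len(tokens)} tokens in {steps} steps")
```

With these made-up scores, six tokens are emitted in four steps: fewer than standard decoding, but short of the ideal halving. That is consistent in spirit with a reported 20-30% reduction rather than a full 50%.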

Blogger's Review: SuperThoughts offers an efficient approach to long-chain reasoning: it compresses pairs of CoT tokens and uses multi-token prediction to cut the number of decoding steps while keeping discrete token supervision during training. A 20-30% shorter CoT for a 1-2 point accuracy drop on most tasks makes it a noteworthy development in LLM inference efficiency.
