CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

DeepSeek-AI, Shanghai Jiao Tong University, HKUST

Abstract

CodeI/O is a novel approach that transforms code-based reasoning patterns into natural language formats to enhance Large Language Models' reasoning capabilities. Unlike traditional methods that focus on specific skills, our approach systematically extracts universal reasoning primitives while maintaining procedural rigor, enabling better performance across a wide range of reasoning tasks.

Key Features & Contributions

  • 🔄 Universal Transformation: Converts diverse code patterns into natural language Chain-of-Thought rationales
  • 🧠 Syntax-Decoupled: Decouples reasoning from code syntax while preserving logical structure
  • 📊 Multi-Task Enhancement: Improves performance across symbolic, scientific, logic, math, commonsense and code reasoning
  • ✨ Fully-Verifiable: Supports precise prediction verification through cached ground-truth matching or code re-execution (see the sketch after this list)
  • 🚀 Advanced Iteration: Enhanced version (CodeI/O++) with multi-turn revision for better accuracy
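
Because every sample is grounded in executable code, a predicted output can be checked mechanically rather than judged by a model. The snippet below is a minimal sketch of that idea, assuming a hypothetical sample layout and a hypothetical `ref_entrypoint` callable; it illustrates the two verification paths (cached ground truth, then re-execution) and is not the repository's actual verification code.

```python
def verify_output_prediction(predicted_output, sample, ref_entrypoint=None):
    """Check a predicted output against a CodeI/O-style sample.

    `sample` is assumed to hold the sampled input and its cached ground-truth
    output; `ref_entrypoint` is the reference function, used only if we want to
    re-execute instead of trusting the cache. Both names are hypothetical.
    """
    # Fast path: compare against the cached ground-truth output.
    if predicted_output == sample["output"]:
        return True

    # Fallback: re-execute the reference code on the recorded input and compare.
    if ref_entrypoint is not None:
        rerun_output = ref_entrypoint(**sample["input"])
        return predicted_output == rerun_output

    return False


# Toy usage with a hypothetical sample.
sample = {"input": {"nums": [3, 1, 2]}, "output": [1, 2, 3]}
print(verify_output_prediction([1, 2, 3], sample, ref_entrypoint=lambda nums: sorted(nums)))
```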

Core Data Construction Pipeline

We begin by collecting raw code files from various sources and transforming them into a unified format. Next, I/O pairs are sampled from the transformed functions through code execution. Finally, the complete training dataset is assembled from the elements in the unified format together with responses collected from LLMs.


[Figure: Overview of the data construction pipeline]
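
To make the sampling step concrete, the sketch below shows one way I/O pairs could be drawn from a function in the unified format: a reference entrypoint plus an input generator, executed to record ground-truth outputs. Function names such as `main` and `generate_input`, and the record layout, are illustrative assumptions rather than the exact interface used in the paper.

```python
import json
import random


# --- A toy function in the assumed unified format ---------------------------
def main(nums: list) -> dict:
    """Reference entrypoint: returns simple statistics of a list."""
    return {"max": max(nums), "min": min(nums), "sum": sum(nums)}


def generate_input() -> dict:
    """Input generator: samples a random, valid input for `main`."""
    return {"nums": [random.randint(0, 100) for _ in range(random.randint(1, 8))]}


def sample_io_pairs(entrypoint, input_generator, n_pairs: int = 3):
    """Execute the entrypoint on generated inputs to collect I/O pairs."""
    pairs = []
    for _ in range(n_pairs):
        inp = input_generator()
        out = entrypoint(**inp)  # ground truth comes from real execution
        pairs.append({"input": inp, "output": out})
    return pairs


# Each record bundles a natural-language query with executable I/O pairs; a model
# is later asked to predict outputs from inputs (or vice versa) as a CoT rationale.
record = {
    "query": "Given a list of integers, report its max, min, and sum.",
    "io_pairs": sample_io_pairs(main, generate_input),
}
print(json.dumps(record, indent=2))
```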

Data Visualization

[Figure: An example input-output prediction training sample]

Comparison with Baselines

We train the models in a two-stage manner: first on CodeI/O(++) data, then on general SFT data.

This enhances the model's reasoning ability while preserving its general instruction-following ability.
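
At a high level, the two-stage schedule simply means finishing all CodeI/O(++) updates before any general SFT updates, rather than mixing the two datasets. A minimal sketch, assuming a hypothetical `finetune` helper that stands in for a standard SFT loop:

```python
def finetune(model, dataset, epochs=1):
    """Placeholder for a standard supervised fine-tuning loop (hypothetical helper)."""
    ...


def two_stage_training(model, codeio_data, general_sft_data):
    # Stage 1: learn reasoning patterns from CodeI/O(++) samples.
    finetune(model, codeio_data)
    # Stage 2: restore / retain general instruction-following ability.
    finetune(model, general_sft_data)
    return model


# The ablated alternative mixes both sets in a single stage:
# finetune(model, codeio_data + general_sft_data)
```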


[Figure: Main results compared with baselines]

Analysis

We summarize the main analyses from the paper:

  • Key ablations tested and the number of training samples under each condition.
  • The effect of different data formats (highest scores in bold, lowest underlined in the paper's tables).
  • The scaling effect of CodeI/O data in the first-stage training.
  • Data quality comparison w.r.t. different prompts, ablating the synthesis models.
  • The necessity of two-stage training: it is always better than mixed single-stage training.
  • Inconsistent effects of further increasing revision turns on different models.

BibTeX

@article{li2025codeio,
  title={CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction},
  author={Li, Junlong and Guo, Daya and Yang, Dejian and Xu, Runxin and Wu, Yu and He, Junxian},
  journal={arXiv preprint arXiv:2502.07316},
  year={2025}
}