CodeI/O is a novel approach that transforms code-based reasoning patterns into natural language formats to enhance Large Language Models' reasoning capabilities. Unlike traditional methods that focus on specific skills, our approach systematically extracts universal reasoning primitives while maintaining procedural rigor, enabling better performance across diverse reasoning tasks.
We begin by collecting raw code files from various sources and transforming them into a unified format. Next, input-output (I/O) pairs are sampled from the transformed functions via code execution. Finally, the complete training dataset is assembled from the elements in the unified format together with responses collected from LLMs.
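The I/O sampling step can be sketched as follows. This is a minimal illustration, not the actual pipeline code: the `solution` function stands in for a transformed function in the unified format, and `sample_io_pairs` / `input_generator` are hypothetical names.

```python
import random

# Hypothetical example of a transformed function in the unified format:
# a single entrypoint with well-defined inputs and a deterministic output.
def solution(nums):
    """Return the running maximum of a list of integers."""
    out, cur = [], float("-inf")
    for n in nums:
        cur = max(cur, n)
        out.append(cur)
    return out

def sample_io_pairs(fn, input_generator, n_pairs=3, seed=0):
    """Execute fn on randomly generated inputs and record (input, output) pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        inputs = input_generator(rng)
        pairs.append({"input": inputs, "output": fn(*inputs)})
    return pairs

pairs = sample_io_pairs(
    solution,
    input_generator=lambda rng: ([rng.randint(-5, 5) for _ in range(4)],),
)
for p in pairs:
    print(p)
```

The recorded pairs then serve as prediction targets: the model is asked to predict outputs from inputs (or vice versa) in natural language, conditioned on the function.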
We train the models in a two-stage manner: first on CodeI/O(++) data, then on general SFT data. This enhances the model's reasoning ability while preserving its general instruction-following ability.
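The two-stage schedule above can be sketched as below. Everything here is a hypothetical stand-in for illustration (`train_two_stage`, the dataset names, and the toy `record` trainer are not the paper's code); the point is only the ordering of the stages.

```python
# Minimal sketch of the two-stage recipe, assuming a generic SFT routine.
def train_two_stage(model, codeio_data, general_sft_data, train_on):
    # Stage 1: learn reasoning patterns from CodeI/O(++) I/O-prediction data.
    model = train_on(model, codeio_data)
    # Stage 2: standard SFT data to retain instruction following,
    # run after (not mixed with) stage 1.
    model = train_on(model, general_sft_data)
    return model

# Toy demonstration: "training" just records which dataset was seen.
history = []
def record(model, data):
    history.append(data)
    return model

train_two_stage(model=None, codeio_data="CodeI/O++",
                general_sft_data="general-SFT", train_on=record)
print(history)  # → ['CodeI/O++', 'general-SFT']
```

Keeping the stages sequential rather than mixed is the design choice ablated in the tables below.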
Key ablations we tested and the number of training samples under each condition.
The effect of different data formats. Highest scores are in bold; lowest are underlined.
The scaling effect of CodeI/O data in the first stage training.
Data quality comparison across different prompts, ablating the synthesis models.
The necessity of two-stage training: it consistently outperforms mixed single-stage training.
Inconsistent effects of further increasing revision turns across different models.
@article{li2025codeio,
title={CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction},
author={Li, Junlong and Guo, Daya and Yang, Dejian and Xu, Runxin and Wu, Yu and He, Junxian},
journal={arXiv preprint arXiv:2502.07316},
year={2025}
}