Abstract
Modern large language models struggle with complex mathematical reasoning tasks despite their impressive performance across various natural language processing applications. The primary challenge lies in their inability to detect and correct errors during multi-step problem-solving, where early mistakes cascade through subsequent reasoning steps.
This paper presents a practical framework enabling language models to systematically verify and refine their solutions through structured self-correction. Our approach organizes the problem-solving process into three distinct stages: generating initial solutions with explicit identification of critical reasoning steps, analyzing these solutions for potential errors, and producing corrected final answers.
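The three stages described above can be sketched as a simple pipeline. This is a minimal illustration, not the authors' implementation: the `LLM` callable, the `self_correct` function, the `SelfCorrectionResult` container, the prompt wording, and the `CRITICAL:` marker are all assumptions made for this example. Only the stage roles (Generator, Critic, Synthesizer) come from the paper's contribution list.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interface: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


@dataclass
class SelfCorrectionResult:
    draft: str     # stage 1 output: initial solution with flagged critical steps
    critique: str  # stage 2 output: analysis of potential errors
    final: str     # stage 3 output: corrected final answer


def self_correct(problem: str, llm: LLM) -> SelfCorrectionResult:
    """Run one generate -> critique -> synthesize pass (illustrative prompts only)."""
    # Stage 1 (Generator): solve the problem and mark the critical reasoning steps.
    draft = llm(
        "Solve the problem step by step. Prefix each critical reasoning step "
        "with 'CRITICAL:'.\n\n"
        f"Problem: {problem}"
    )
    # Stage 2 (Critic): inspect the draft, focusing on the flagged steps.
    critique = llm(
        "Check the solution below for errors, paying particular attention to "
        "steps marked 'CRITICAL:'.\n\n"
        f"Problem: {problem}\n\nSolution:\n{draft}"
    )
    # Stage 3 (Synthesizer): combine the draft and the critique into a final answer.
    final = llm(
        "Using the critique, write a corrected final answer.\n\n"
        f"Problem: {problem}\n\nSolution:\n{draft}\n\nCritique:\n{critique}"
    )
    return SelfCorrectionResult(draft=draft, critique=critique, final=final)
```

Passing the model as a plain callable keeps the sketch independent of any particular inference library; a real model client or a test stub can be supplied in its place.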
Video Overview
Research Walkthrough
Watch a detailed explanation of the S2C framework, methodology, and key results
Key Results
- 49.9% accuracy on GSM8K
- 60% relative improvement
- 78% error correction rate
Key Contributions
- Three-stage self-correction framework (Generator, Critic, Synthesizer)
- Novel three-phase training methodology combining supervised learning with reinforcement techniques
- Comprehensive experimental validation on GSM8K mathematical reasoning benchmark
- 60% relative improvement in accuracy (from 31.2% to 49.9%)
- Detailed error analysis revealing 78% correction success rate on computational errors
- Superior computational efficiency compared to ensemble methods
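The 60% figure in the list above is a relative gain, which is easy to confuse with the absolute gain in percentage points. The short check below works through the arithmetic using the two accuracies stated in the list (31.2% and 49.9%); the function and variable names are illustrative only.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Gain of `improved` over `baseline`, expressed as a percentage of `baseline`."""
    return (improved - baseline) / baseline * 100.0


baseline_acc = 31.2  # GSM8K accuracy before self-correction, in percent (from the list)
final_acc = 49.9     # GSM8K accuracy after self-correction, in percent (from the list)

absolute_gain = final_acc - baseline_acc                       # percentage points
relative_gain = relative_improvement(baseline_acc, final_acc)  # percent of baseline

print(f"absolute: {absolute_gain:.1f} points, relative: {relative_gain:.1f}%")
# prints "absolute: 18.7 points, relative: 59.9%"
```

The absolute gain is 18.7 percentage points; dividing by the 31.2% baseline gives about 59.9%, which rounds to the reported 60%.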
Contact
Corresponding Authors:
Md Anisur Rahman Chowdhury - engr.aanis@gmail.com
Pratham Patel - patel292@gannon.edu
Co-Authors:
Shahajada Jawar - shahajadajawar@gmail.com
Kefei Wang - wang039@gannon.edu