
LongReD: Mitigating Short-Text Degradation of Long-Context LLMs via Restoration Distillation

Overview

  • We are the first to systematically analyze the causes of short-text capability degradation in long-context models:
    • Distribution Drift
    • Catastrophic Forgetting
  • We propose LongReD, which mitigates the short-text performance decline by simulating the original model's distributions before context extension, via three training objectives (sketched after this list):
    • Long-Text Training
    • Short-Text Distillation
    • Short-to-Long Distillation
  • Experimental results across 17 datasets covering 6 capabilities demonstrate that LongReD achieves better short-text performance while maintaining comparable or even better long-context performance.
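
The sketch below illustrates how the three objectives could combine into a single training loss. It is a minimal sketch assuming HuggingFace-style causal LM interfaces; the function name `longred_loss`, the loss weights, the use of KL divergence on output logits (the paper may instead distill hidden states), and the `skipped_position_ids` mechanism are illustrative assumptions, not this repository's actual API — see the training scripts for the real implementation.

```python
import torch
import torch.nn.functional as F

def longred_loss(student, teacher, long_batch, short_batch,
                 skipped_position_ids, lambda_short=1.0, lambda_s2l=0.1):
    """Combine the three LongReD objectives into one training loss (sketch)."""
    # (1) Long-text training: standard next-token prediction on long
    #     sequences with the context-extended (student) model.
    lm_loss = student(input_ids=long_batch["input_ids"],
                      labels=long_batch["input_ids"]).loss

    # (2) Short-text distillation: keep the extended model's output
    #     distribution on short texts close to the frozen original model's.
    with torch.no_grad():
        teacher_logits = teacher(input_ids=short_batch["input_ids"]).logits
    student_logits = student(input_ids=short_batch["input_ids"]).logits
    kd_short = F.kl_div(F.log_softmax(student_logits, dim=-1),
                        F.softmax(teacher_logits, dim=-1),
                        reduction="batchmean")

    # (3) Short-to-long distillation: run the student on the same short text
    #     but with skipped position ids, so long-range positions are
    #     exercised while the target stays the teacher's short-text
    #     distribution (the exact mechanism here is an assumption).
    s2l_logits = student(input_ids=short_batch["input_ids"],
                         position_ids=skipped_position_ids).logits
    kd_s2l = F.kl_div(F.log_softmax(s2l_logits, dim=-1),
                      F.softmax(teacher_logits, dim=-1),
                      reduction="batchmean")

    return lm_loss + lambda_short * kd_short + lambda_s2l * kd_s2l
```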

Installation

The code is built on the EasyContext framework. To run it, first install the packages from requirements.txt:

pip install -r requirements.txt

Training

We provide four training scripts, covering LongReD-C, LongReD-U, long+cpt, and mix+cpt, that extend the context window of Llama-3-8B to 32K.

You can run the following scripts:

bash ./scripts/train_longred_cream.sh
bash ./scripts/train_longred_uniform.sh
bash ./scripts/train_long_cpt.sh
bash ./scripts/train_mix_cpt.sh

Citation

If you find our work helpful, please cite our paper:

@article{dong2025longred,
  title={LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation},
  author={Dong, Zican and Li, Junyi and Jiang, Jinhao and Xu, Mingyu and Zhao, Wayne Xin and Wang, Bingning and Chen, Weipeng},
  journal={arXiv preprint arXiv:2502.07365},
  year={2025}
}
