Source code for the ACL 2024 paper "Feature-Adaptive and Data-Scalable In-Context Learning".
The code is tested under python==3.8.18, torch==1.12.0, and transformers==4.39.0, though the version requirements are not strict: if the code runs without errors, you are set.
Note: almost all experiments are conducted on a single NVIDIA A800-SXM4-80GB, except for the llama-30B model, which requires two. In addition, bitsandbytes (we use version 0.41.2) is needed for the quantization of llama2-70B.
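A minimal environment-setup sketch, assuming a conda/pip workflow (the installation method itself is up to you; the versions follow the ones listed above):

```bash
# Sketch of an environment setup; the conda/pip workflow is an assumption, not a project requirement.
conda create -n fads-icl python=3.8.18 -y
conda activate fads-icl
pip install torch==1.12.0 transformers==4.39.0
pip install bitsandbytes==0.41.2  # only needed for llama2-70B quantization
```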
Prepare your LLM (gpt2, llama, or llama2) in ./llm/; I personally prefer to download the models myself and configure the local paths in the scripts.
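One way to place a model locally is via the Hugging Face Hub CLI (a sketch; the huggingface-cli tool is an assumption, not something the project requires):

```bash
# Download gpt2-xl into ./llm/gpt2-xl; any other download method works as well.
pip install -U "huggingface_hub[cli]"
huggingface-cli download gpt2-xl --local-dir ./llm/gpt2-xl
```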
Download the datasets and unzip them into ./data/.
The structure of the project looks like:
```
.
├── run_icl.sh
├── run_fads-icl.sh
├── icl.py
├── fads-icl.py
├── utils
│   ├── anchor.py
│   ├── dataset.py
│   ├── __init__.py
│   └── template.py
├── llm
│   └── gpt2-xl
│       ├── config.json
│       ├── merges.txt
│       ├── pytorch_model.bin
│       ├── tokenizer.json
│       └── vocab.json
└── data
    └── sst2
        ├── dev_subsample.jsonl
        ├── test.jsonl
        └── train.jsonl
```
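After unzipping, a quick sanity check that the jsonl files are where the scripts expect them (this just pretty-prints the first training record of sst2):

```bash
# Inspect the first record of the SST-2 training split to verify the data layout.
head -n 1 ./data/sst2/train.jsonl | python -m json.tool
```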
Run FADS-ICL or In-Context Learning as follows; check the configuration in the corresponding script, including the dataset, LLM, seed, etc.
```bash
bash run_fads-icl.sh
```

or

```bash
bash run_icl.sh
```
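If you want to keep the per-seed numbers around for later comparison, you can capture the output of each run (the log path below is arbitrary, just an example):

```bash
# Save the run output to a log file; the directory and file name are arbitrary.
mkdir -p logs
bash run_fads-icl.sh 2>&1 | tee logs/fads-icl_sst2_gpt2-xl_seed1.log
```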
For the SST-2 dataset, you should get exactly the following results for each random seed (possibly invariant to environment differences):
Method | Seed 1 | Seed 2 | Seed 3 | Seed 4 | Seed 5 |
---|---|---|---|---|---|
In-Context Learning (gpt2-xl, 16-shot) | 0.8438 | 0.8125 | 0.7227 | 0.8633 | 0.8242 |
FADS-ICL (gpt2-xl, 16-shot) | 0.9063 | 0.8594 | 0.7344 | 0.9297 | 0.9023 |
FADS-ICL (gpt2-xl, 128-shot) | 0.8945 | 0.8789 | 0.8828 | 0.8906 | 0.8984 |
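To aggregate the per-seed accuracies above into a single number per setting, a one-liner like the following works (values copied from the table; FADS-ICL, gpt2-xl, 16-shot shown):

```bash
# Mean accuracy over the five seeds for FADS-ICL (gpt2-xl, 16-shot).
python -c "print(sum([0.9063, 0.8594, 0.7344, 0.9297, 0.9023]) / 5)"
```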
Full results are listed in the paper (see Table 2 and Table 3).