Modify the names of the baselines and the catalogue. (#115)
* Add files via upload

* Add files via upload

* Add files via upload

* Update README.md

* Rename figure4overview.pdf to overview.pdf

* Update README.md

* Add files via upload

All images in the paper; the format is PNG.

* Update README.md

* Update README.md

* Update README.md

* Delete assets/strategy_compare.pdf

* Delete assets/GPU_memory_usage.pdf

* Delete assets/Exp-Adaptive.pdf

* Delete assets/Multi-LORA.pdf

* Delete assets/Throuput_compare.pdf

* Delete assets/abnormal_and_normal.pdf

* Delete assets/data_distribution.pdf

* Delete assets/early_stop_and_original.pdf

* Delete assets/early_stop_example.pdf

* Delete assets/gpu-memory-utilization.pdf

* Delete assets/overview.pdf

* Delete assets/pad.pdf

* Delete assets/Adaptive_scheduling.png

* Delete assets/pad_example.png

* Delete assets/strategy_compare.png

* Delete assets/minpad.png

* Delete assets/join-accuracy-and-loss.png

* Delete assets/early_stop_example.png

* Delete assets/early_stop_and_original.png

* Delete assets/different_sequence_length.png

* Delete assets/data_distribution.png

* Delete assets/abnormal_and_normal.png

* Delete assets/LoRA_and_MultiLoRA.png

* Delete assets/Exp-Mem.png

* Delete assets/gpu-memory-utilization.png

* Update README.md

* Update README.md

* Modify README: add supported models table and example

* Modify the names of the baselines (e.g., SYNC -> Alpaca-Parallel), modify the catalogue, and add more explanation.
Trilarflagz authored Dec 6, 2023
1 parent 3c6e9db commit 8fd0283
Showing 1 changed file with 28 additions and 11 deletions.
39 changes: 28 additions & 11 deletions README.md
@@ -14,6 +14,7 @@ ASPEN (a.k.a Multi-Lora Fine-Tune) is an open-source framework for fine-tuning L
## Contents

- [Updates](#updates)
- [Supported Models](#Models)
- [Overview](#overview)
- [Getting Started](#Quickstart)
- [Installation](#Installation)
@@ -26,6 +27,21 @@ ASPEN (a.k.a Multi-Lora Fine-Tune) is an open-source framework for fine-tuning L
- Support multiple LLaMA fine-tuning
- On the way, Baichuan

## Models

| | Model | Model size |
|---------------------------------|------------------------------------------------|-----------------|
| <input type="checkbox" checked> | [ChatGLM](https://github.com/THUDM/ChatGLM-6B) | 6B |
| <input type="checkbox" checked> | [ChatGLM2](https://github.com/THUDM/ChatGLM2-6B) | 6B/12B |
| <input type="checkbox"> | [ChatGLM3](https://github.com/THUDM/ChatGLM3) | 6B | |
| <input type="checkbox" checked> | [LLaMA](https://github.com/facebookresearch/llama) | 7B//13B/33B/65B |
| <input type="checkbox" checked> | [LLaMA-2](https://huggingface.co/meta-llama) | 7B/13B/70B |
| <input type="checkbox"> | [Baichuan](https://github.com/baichuan-inc/Baichuan-13B) | 7B/13B |
| <input type="checkbox"> | [Baichuan2](https://github.com/baichuan-inc/Baichuan2) | 7B/13B |

> **Example:** Use our system to fine-tune LLaMA-2 with fewer resources:
> https://www.kaggle.com/code/rraydata/multi-lora-example/notebook

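To make the multi-LoRA idea concrete, here is a minimal sketch of attaching several LoRA adapters to one shared base model with HuggingFace PEFT. It illustrates the concept only and is not ASPEN's API; the model name, adapter names, and LoRA hyper-parameters are placeholders.

```python
# Conceptual sketch (not ASPEN's API): several LoRA adapters sharing one frozen base model.
# Assumes `transformers` and `peft` are installed; names and hyper-parameters are placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# One LoRA config per fine-tuning task; all adapters reuse the same base weights.
task_a = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
task_b = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])

model = get_peft_model(base, task_a, adapter_name="task_a")
model.add_adapter("task_b", task_b)

# Switch the active adapter per batch; only the small LoRA matrices are trained.
model.set_adapter("task_a")
```
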
## Overview

**ASPEN** is a high-throughput LLM fine-tuning framework based on LoRA and QLoRA, compatible with HuggingFace-Transformers LLaMA Models and ChatGLM Models.
@@ -48,45 +64,46 @@ ASPEN requires [PyTorch](https://pytorch.org/) and [NVIDIA CUDA](https://develop

Environment: NVIDIA RTX A6000 with Intel Xeon Silver 4314 on Ubuntu 22.04.3

Baseline: We utilized the widely adopted [Alpaca-LoRA](https://github.com/tloen/alpaca-lora) as a foundation. On a single GPU, we independently ran multiple Alpaca-LoRA processes in parallel (marked as *Baseline@SYNC*) and sequentially (marked as *Baseline@SEQ*), forming two baseline methods for the experiments.
Baseline: We utilized the widely adopted [Alpaca-LoRA](https://github.com/tloen/alpaca-lora) as a foundation. On a single GPU, we independently ran multiple Alpaca-LoRA processes in parallel (marked as *Baseline@Alpaca-Parallel*) and sequentially (marked as *Baseline@Alpaca-Seq*), forming two baseline methods for the experiments. We ran this test on an A100; the rest of the results are based on the same GPU configuration.

#### Training Latency and Throughput

Method|Latency|Throughput
:---:|:---:|:---:
Baseline@SEQ|10.51h|608.41 token/s
Baseline@SYNC|9.85h|649.30 token/s
Baseline@Alpaca-Seq|10.51h|608.41 token/s
Baseline@Alpaca-Parallel|9.85h|649.30 token/s
ASPEN|9.46h|674.58 token/s

We conducted four identical fine-tuning jobs with same dataset and same hyper-parameters, incorporating two baselines and ASPEN. During the experimental process, we collected the completion times for each task in the baseline methods and calculated the time taken by the slowest task as the *Training Latency*. As shown in Table, ASPEN exhibits lower *Training Latency* compared to both baseline methods. Specifically, ASPEN is 9.99% faster than *Baseline@SEQ* and 3.92% faster than *Baseline@SYNC*.
We conducted four identical fine-tuning jobs with the same dataset and the same hyper-parameters, incorporating the two baselines and ASPEN. During the experiments, we collected the completion time of each task in the baseline methods and took the time of the slowest task as the *Training Latency*. As shown in the table, ASPEN exhibits lower *Training Latency* than both baseline methods: it is 9.99% faster than *Baseline@Alpaca-Seq* and 3.92% faster than *Baseline@Alpaca-Parallel*.
<div align="center"><img src="./assets/throughput_compare.png" width="100%"></div>

#### Video Memory Usage
<div align="center"><img src="./assets/GPU_memory_usage.png" width="100%"></div>

We conducted several fine-tuning jobs with same dataset and `batch_size = {2,4, 6, 8}`, incorporating *Baseline@SYNC* and ASPEN.
We conducted several fine-tuning jobs with the same dataset and `batch_size = {2, 4, 6, 8}`, incorporating *Baseline@Alpaca-Parallel* and ASPEN.

*Baseline@SYNC* triggered OOM error after 3 parallel tasks when batch size = 8, while ASPEN can handle twice that amount.
*Baseline@Alpaca-Parallel* triggered an OOM error after 3 parallel tasks when the batch size was 8, while ASPEN can handle twice that number.

#### Batching Strategies

Method|Training Latency|Peak Memory Usage|Average GPU Utilization|Training Throughput
:---:|:---:|:---:|:---:|:---:
Baseline@SEQ|27.73h|10.68GB|79.39%|653.35 token/s
Baseline@Alpaca-Seq|27.73h|10.68GB|79.39%|653.35 token/s
ASPEN@M1|36.82h|23.82GB|96.52%|672.54 token/s
ASPEN@M2|39.14h|23.86GB|96.41%|671.28 token/s
ASPEN@M3|22.97h|23.85GB|95.22%|674.41 token/s

We conducted four fine-tuning jobs with different dataset but same hyper-parameters, incorporating *Baseline@SEQ* and ASPEN.
We conducted four fine-tuning jobs with different datasets but the same hyper-parameters, incorporating *Baseline@Alpaca-Seq* and ASPEN.

During the experiments, we collected the following metrics:
+ *Training Latency* = Job completion time
+ *Throughput* = Number of tokens processed in the model forward pass / training latency (see the sketch after this list)
+ *Memory Usage* = Peak video memory usage
+ *GPU Utilization* = Average GPU utilization
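
To make the *Throughput* definition concrete, the sketch below inverts it to estimate the total number of forward-pass tokens for the ASPEN@M3 row of the table (a derived estimate, not a measured number).

```python
# Invert Throughput = tokens / latency to estimate total forward-pass tokens for ASPEN@M3.
throughput_tok_per_s = 674.41   # token/s, from the batching-strategies table
latency_s = 22.97 * 3600        # 22.97 h converted to seconds
total_tokens = throughput_tok_per_s * latency_s
print(f"~{total_tokens / 1e6:.1f}M tokens")  # roughly 55.8M tokens
```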

All metrics are computed for each job. `M1, M2, M3` represent three batch strategies of ASPEN: *Optimal-Fit, Trivial, and Fast-Fit*. `BASELINE` denotes *Baseline@SEQ*.
All metrics are computed for each job. `M1, M2, M3` represent the three batching strategies of ASPEN: *Optimal-Fit, Trivial, and Fast-Fit*. `BASELINE` denotes *Baseline@Alpaca-Seq*.

The *Optimal-Fit* strategy performs best across all four metrics, while the other two strategies also outperform the baseline on every metric except training latency.
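
The three strategies are only named above, so as a rough intuition the sketch below shows a generic length-bucketing batcher: grouping sequences of similar length keeps the compute wasted on padding small, which is the effect an Optimal-Fit-style strategy targets. The function and parameter names are invented for illustration; this is not ASPEN's implementation.

```python
# Illustrative length-bucketing batcher (NOT ASPEN's Optimal-Fit implementation).
# Sorting by length before slicing into batches keeps per-batch padding small.
from typing import List

def bucket_by_length(seq_lengths: List[int], batch_size: int) -> List[List[int]]:
    """Group sequence indices into batches of similar length to reduce padding."""
    order = sorted(range(len(seq_lengths)), key=lambda i: seq_lengths[i])
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

lengths = [512, 48, 1024, 64, 300, 290, 60, 1000]
for batch in bucket_by_length(lengths, batch_size=2):
    print(batch, [lengths[i] for i in batch])
# Pairs like (48, 60) and (1000, 1024) waste far less padding than randomly mixed batches.
```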
### Use Cases:
@@ -145,7 +162,7 @@ Submit a pull request with a detailed explanation of your changes.
Please cite this repository if you use its code.
```bibtex
@misc{Multi-LoRA,
author = {Zhengmao, Ye\textsuperscript{*} and Dengchun, Li\textsuperscript{*} and Tingfeng, Lan and Yanbo, Liang and Yexi, Jiang and Jie, Zuo and Hui, Lu and Lei, Duan and Mingjie, Tang},
author = {Zhengmao, Ye\textsuperscript{*} and Dengchun, Li\textsuperscript{*} and Jingqi, Tian and Tingfeng, Lan and Yanbo, Liang and Yexi, Jiang and Jie, Zuo and Hui, Lu and Lei, Duan and Mingjie, Tang},
title = {ASPEN: Efficient LLM Model Fine-tune and Inference via Multi-Lora Optimization},
year = {2023},
publisher = {GitHub},
