Korean-Platypus: a model fine-tuned from llama-2-ko on the Korean-Open-platypus dataset.
- 🐳KoR-Orca-Platypus-13B🥮
- 🐳Korean-OpenOrca-13B
- 🐳OpenOrca-KO
- 🐳KOR-OpenOrca-Platypus
This research was conducted by the open-source LLM research consortium of (주)마커 and (주)미디어그룹사람과숲.
| Model | Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2 | Dataset | Base_model |
|---|---|---|---|---|---|---|---|---|
| 🐳KoR-Orca-Platypus-13B | 50.13 | 42.06 | 53.95 | 42.28 | 43.55 | 68.78 | KOR-OpenOrca-Platypus | ko-en-llama2-13b |
| 🐳Korean-OpenOrca-13B | 47.85 | 43.09 | 54.13 | 40.24 | 45.22 | 56.57 | 🐳OpenOrca-KO | ko-en-llama2-13b |
| KoT-Platypus2-13B | 49.55 | 43.69 | 53.05 | 42.29 | 43.34 | 65.38 | KoCoT | KO-platypus2-13B |
| KO-platypus2-13B | 47.90 | 44.20 | 54.31 | 42.47 | 44.41 | 54.11 | KOpen-platypus | ko-en-llama2-13b |
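For reference, the Average column is the simple mean of the five Ko-* benchmark scores. A quick sanity check (my own snippet, not part of the original README):

```python
# Average = mean of the five Ko-* benchmark scores (🐳Korean-OpenOrca-13B row).
korean_openorca = [43.09, 54.13, 40.24, 45.22, 56.57]
print(sum(korean_openorca) / len(korean_openorca))  # 47.85
```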
- 2023.10.14
  - Completed 🐳KoR-Orca-Platypus-13B, fine-tuned from Llama2-13B on the KOR-Orca-Platypus dataset.
  - Reached 1st place on the HuggingFace KO-LLM leaderboard.
- 2023.10.09
  - Completed 🐳Korean-OpenOrca-13B, fine-tuned from Llama2-13B on the OpenOrca-KO dataset.
  - Reached 5th place (3rd) on the HuggingFace KO-LLM leaderboard.
### KO-Platypus
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the 🐳Korean-OpenOrca-13B checkpoint from the HuggingFace Hub
repo = "kyujinpy/Korean-OpenOrca-13B"
OpenOrca = AutoModelForCausalLM.from_pretrained(
    repo,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map='auto'
)
OpenOrca_tokenizer = AutoTokenizer.from_pretrained(repo)
```
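Once the model and tokenizer are loaded, text can be generated with the standard `generate` API. A minimal sketch (not from the original README; the instruction-style prompt format is an assumption):

```python
# Minimal generation sketch; the prompt template below is an assumption, not the official format.
prompt = "### Instruction: 한국의 수도는 어디인가요?\n\n### Response:"
inputs = OpenOrca_tokenizer(prompt, return_tensors="pt").to(OpenOrca.device)

with torch.no_grad():
    output_ids = OpenOrca.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False,
    )
print(OpenOrca_tokenizer.decode(output_ids[0], skip_special_tokens=True))
```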
```python
from datasets import load_dataset

# Load the 🐳OpenOrca-KO dataset for testing (now public on the HuggingFace Hub)
dataset = load_dataset("kyujinpy/OpenOrca-KO")
```
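To sanity-check the data after loading, the standard `datasets` accessors can be used. A small sketch (my addition; the presence of a `train` split is an assumption):

```python
# Quick look at the loaded dataset; assumes a "train" split exists.
print(dataset)                        # available splits and row counts
print(dataset["train"].column_names)  # field names in each example
print(dataset["train"][0])            # first example
```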
- 🐳OpenOrca
- Kopen-Platypus🥮
- 🐳OpenOrca-KO
- Platypus
- llama-2-ko
- ko-en-llama2
- 🐳Korean-OpenOrca-13B
- Make KOR-OpenOrca
- Share HuggingFace repo
- Combine Platypus and OpenOrca datasets
- Make KOR-OpenOrca-Platypus
- Share evaluation results
- Share datasets