[Feat] Add augmentation code for other benchmark datasets #66
Conversation
The comments made the code easy to read. Great work!!
Please take a look at the review~
# Declare an empty dataset that will hold all of the HAE-RAE data
train_agg = pd.read_csv("./data/agg_other_benchmarks/train_agg.csv")
data_haerae_agg = pd.DataFrame(columns=train_agg.columns)
How about wrapping the script's executable statements in an if __name__ == "__main__": block?
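A minimal sketch of what this suggestion could look like, assuming the top-level statements move into a `main()` function. The helper name `build_empty_like`, the `csv_source` parameter, and the sample column names are hypothetical and only for illustration; the actual script reads `./data/agg_other_benchmarks/train_agg.csv`.

```python
import io

import pandas as pd


def build_empty_like(reference: pd.DataFrame) -> pd.DataFrame:
    """Return an empty DataFrame with the same columns as `reference`."""
    return pd.DataFrame(columns=reference.columns)


def main(csv_source) -> pd.DataFrame:
    """Load the train CSV and declare the empty HAE-RAE container."""
    train_agg = pd.read_csv(csv_source)
    data_haerae_agg = build_empty_like(train_agg)
    return data_haerae_agg


if __name__ == "__main__":
    # Stand-in for the real CSV path; the column names are assumptions.
    sample = io.StringIO("id,question,answer\n1,q,a\n")
    print(main(sample).columns.tolist())
```

With the guard in place, importing the module (for example, to reuse its helper functions) no longer triggers file reads as a side effect.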
data_agg_HAE-RAE_total.py
Outdated
# Declare the filename to save to
def get_category_initials(name):
It would be better to move this function's definition up to where the other functions are defined.
Thanks for the helpful feedback~ The change has been made!
📝 Summary
Add augmentation code for the HAE-RAE dataset, which improved scores in the augmentation experiments with other benchmarks.
✅ Checklist
- [ ] Documentation updates are included.
📄 Description
This code reshapes the HAE-RAE dataset per internal category and merges the result into the train dataset.
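The reshape-and-merge flow described here can be sketched roughly as follows. This is not the PR's actual implementation: the train schema (`TRAIN_COLUMNS`), the raw column names (`query`, `label`), the category names, and the functions `reshape_category` and `aggregate` are all assumptions made for illustration, since the real datasets cannot be shared.

```python
import pandas as pd

# Hypothetical train schema; the real column names are not shown in the PR.
TRAIN_COLUMNS = ["id", "question", "answer"]


def reshape_category(raw: pd.DataFrame, category: str) -> pd.DataFrame:
    """Map one category's rows onto the assumed train schema."""
    shaped = pd.DataFrame(
        {
            # Prefix ids with the category so rows stay traceable after merging.
            "id": [f"{category}-{i}" for i in range(len(raw))],
            "question": raw["query"].to_numpy(),
            "answer": raw["label"].to_numpy(),
        }
    )
    return shaped[TRAIN_COLUMNS]


def aggregate(train: pd.DataFrame, per_category: dict) -> pd.DataFrame:
    """Append every reshaped category to the train DataFrame."""
    frames = [reshape_category(df, name) for name, df in per_category.items()]
    return pd.concat([train, *frames], ignore_index=True)


# Toy stand-in data; category names and contents are placeholders.
train = pd.DataFrame({"id": ["t-0"], "question": ["q0"], "answer": ["a0"]})
per_category = {
    "history": pd.DataFrame({"query": ["q1", "q2"], "label": ["a1", "a2"]}),
    "general": pd.DataFrame({"query": ["q3"], "label": ["a3"]}),
}
merged = aggregate(train, per_category)
```

Keeping the per-category reshaping in its own function means each category's format quirks can be handled separately before the single concatenation step.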
💡 Notice (Optional)
The datasets required to run this code cannot be uploaded to the repo due to regulations.
🔗 Related Issue(s)
close #62