
Before fine-tuning, is it necessary to preprocess the data with normalization or other techniques? #161

Open
yAoOw opened this issue Dec 4, 2024 · 3 comments


@yAoOw

yAoOw commented Dec 4, 2024

I used the ETTh1.csv dataset to fine-tune the moirai_1.1_small model. The fine-tuning process went smoothly and produced a checkpoint file named 'epoch=0-step=100.ckpt'. I then used this checkpoint to run eval.py and got a large MSE of 28.866286, and the forecast output looked bad.
I wondered whether the large fluctuations in the raw data were the cause. After applying three different normalization methods to preprocess the raw dataset and rerunning eval.py, I achieved a much better MSE.

My question is: Is there an officially recommended method for data normalization? Thank you!

@chenghaoliu89
Contributor

Hi @yAoOw, I think 100 steps is insufficient for fine-tuning. Could you decrease the learning rate and increase the batch size? In the default pipeline, the convergence termination criterion is based on validation performance. Could you check the MSE on the validation set before and after fine-tuning?
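
The validation-based stopping criterion mentioned above could be sketched roughly like this (illustrative pseudologic only, not the actual uni2ts implementation; the function name and `patience` parameter are assumptions):

```python
# Sketch: stop fine-tuning once validation MSE has not improved for
# `patience` consecutive validation checks. Not the repo's real code.
def should_stop(val_mse_history, patience=3):
    """Return True if the last `patience` validation MSEs show no improvement."""
    if len(val_mse_history) <= patience:
        return False
    best_so_far = min(val_mse_history[:-patience])
    return min(val_mse_history[-patience:]) >= best_so_far
```

With `patience=3`, a history like `[5, 4, 3, 3.1, 3.2, 3.3]` would trigger a stop, while one that is still improving would not.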

@chenghaoliu89
Contributor

Regarding normalization, the Moirai model includes instance normalization. However, the LSF benchmark requires an additional dataset-level normalization, which can be found in

def scale(self, data, start, end):
. If you follow the default fine-tuning pipeline, it will use the raw data for fine-tuning rather than the normalized data.
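
For illustration, a minimal sketch of this kind of dataset-level standard normalization, assuming the common LSF convention of fitting mean/std on the training split only. The 0.7 split ratio and the column names are assumptions, not taken from the repo's `scale` function:

```python
# Sketch: standardize every column using statistics from the training
# split only, then apply them to the full series. Illustrative only.
import numpy as np
import pandas as pd

def scale_lsf(df, train_ratio=0.7):
    """Dataset-level standard normalization fit on the training split."""
    train_end = int(len(df) * train_ratio)
    mean = df.iloc[:train_end].mean()
    std = df.iloc[:train_end].std()
    return (df - mean) / std

# Toy frame standing in for ETTh1.csv (column names are made up here).
rng = np.random.default_rng(0)
df = pd.DataFrame({"HUFL": rng.normal(10, 5, 100),
                   "OT": rng.normal(50, 20, 100)})
scaled = scale_lsf(df)
```

After scaling, the training-split portion of each column has zero mean and unit standard deviation.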

Any suggestion @zqiao11 ?

@zqiao11
Contributor

zqiao11 commented Dec 4, 2024

Hi @yAoOw. For normalization, I think you can follow the standard normalization process for LSF datasets. See the discussion here: #31 (comment). You can implement that approach based on the code @chenghaoliu89 shared above.

In my experience, this normalization method works well. However, since Moirai uses a large context length, fine-tuning on small datasets like ETTh1 and ETTh2 is quite tricky (it is easy to overfit). You can try ETTm1, ETTm2 and Weather, using a small learning rate like 1e-6 to 1e-7.
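
One way to see why a large context makes small datasets tricky: the number of distinct training windows shrinks as context plus horizon grow. A rough sketch (the series and context/prediction lengths below are hypothetical settings for illustration, not Moirai's actual configuration):

```python
# Sketch: how many sliding training windows a series of a given length
# yields for a given context + prediction length. Illustrative only.
def num_windows(series_len, context_len, pred_len, stride=1):
    usable = series_len - context_len - pred_len
    return max(0, usable // stride + 1)

# Hypothetical hourly series of ~17k rows with two context settings:
few = num_windows(17420, 5000, 96)   # large context -> fewer windows
many = num_windows(17420, 512, 96)   # smaller context -> more windows
```

Fewer distinct windows means the model revisits the same data more often per epoch, which is one reason a very small learning rate helps avoid overfitting here.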
