
About the loss settings #60

Open
foxkw opened this issue Mar 19, 2021 · 3 comments

@foxkw

foxkw commented Mar 19, 2021

Hello, I have two questions.

1) In val mode, why is the loss value not averaged, i.e. the loss value that ends up displayed on TensorBoard?
```python
        val_loss += loss.item()
        val_loss_seg += loss_seg.item()
        val_loss_exist += loss_exist.item()

        progressbar.set_description("batch loss: {:.3f}".format(loss.item()))
        progressbar.update(1)

progressbar.close()
iter_idx = (epoch + 1) * len(train_loader)  # keep align with training process iter_idx
tensorboard.scalar_summary("val_loss", val_loss, iter_idx)
```
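To make the logged value comparable across validation sets of different sizes, the accumulated sum could be divided by the number of batches before logging. A minimal sketch of that averaging, with an illustrative helper name and plain-float losses rather than the repo's tensors:

```python
# Illustrative sketch (not the repo's code): average the per-batch
# validation losses before logging, instead of logging the raw sum.

def average_val_loss(batch_losses):
    """Return the mean of per-batch scalar loss values (assumed non-empty)."""
    total = 0.0
    for loss in batch_losses:
        total += loss  # same accumulation as `val_loss += loss.item()`
    return total / len(batch_losses)

# e.g. three validation batches
print(average_val_loss([0.4, 0.6, 0.5]))
```

With this, `tensorboard.scalar_summary("val_loss", ...)` would receive `val_loss / len(val_loader)` rather than the raw sum, so the curve no longer scales with the size of the validation set.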

2) About distributed training
I noticed that with one GPU, train_loss = a, while with two GPUs, train_loss ≈ 2a.
Is that because the losses are summed? If they are summed, shouldn't they be averaged? Something seems off here.
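The doubling behaviour can be reproduced with a small simulation (the function and values below are illustrative, not taken from the repo): summing per-GPU losses scales the logged value with the number of GPUs, while dividing by the world size restores the single-GPU scale.

```python
# Illustrative simulation (not the repo's code): summing vs. averaging
# per-GPU scalar losses, mimicking an all-reduce over world_size workers.

def reduce_losses(per_gpu_losses, average=True):
    """Combine per-GPU loss values; divide by world size when average=True."""
    total = sum(per_gpu_losses)
    return total / len(per_gpu_losses) if average else total

one_gpu = reduce_losses([0.8])                              # a
two_gpus_summed = reduce_losses([0.8, 0.8], average=False)  # about 2a, as observed
two_gpus_averaged = reduce_losses([0.8, 0.8])               # back to a
print(one_gpu, two_gpus_summed, two_gpus_averaged)
```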

Looking forward to your reply!

@harryhan618
Owner

Not averaging the loss should be fine; you have to tune the learning rate anyway, as well as the balance weights between the different losses.

@foxkw
Author

foxkw commented Apr 22, 2021

Not averaging the loss should be fine; you have to tune the learning rate anyway, as well as the balance weights between the different losses.

When training on the CULane dataset, I find that in the first 10 epochs train_loss keeps decreasing while val_loss hovers around some value (e.g. 0.5). Have you run into this? I'm not sure whether it's overfitting.

@harryhan618
Owner

Hyperparameter tuning is quite a pain. I somehow managed to tune a working setting before...
