During distributed training, the LogReport class does not aggregate reports from all processes. This should be fixed because it can cause bugs in extensions or options that depend on the values in the logs, for example EarlyStoppingTrigger or ReduceLROnPlateau.

To address this problem, we need to modify the LogReport class so that it gathers the summary objects from all processes at the trigger point and recomputes the averages over the gathered values. We should also add a "writer_rank" option so that only one process writes the log file.
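A minimal sketch of the proposed behavior, without MPI so it runs standalone: each process keeps a per-key (sum, count) summary, the summaries are gathered at the trigger point, and the global mean is recomputed from the combined sums and counts. The function and variable names here are illustrative, not Chainer's actual API; in a real implementation the gather step would use the communicator (e.g. an allgather), and the write step would be guarded by the writer_rank check.

```python
def merge_summaries(per_process_summaries):
    """Merge per-process {key: (sum, count)} summaries and recompute
    global means -- what LogReport would do after gathering the
    summary objects from all processes at the trigger point."""
    totals = {}
    for summary in per_process_summaries:
        for key, (s, n) in summary.items():
            acc_s, acc_n = totals.get(key, (0.0, 0))
            totals[key] = (acc_s + s, acc_n + n)
    return {key: s / n for key, (s, n) in totals.items()}


# Example: two processes observed different batches of "main/loss".
rank0 = {"main/loss": (1.2, 2)}   # local mean 0.6 over 2 observations
rank1 = {"main/loss": (0.4, 2)}   # local mean 0.2 over 2 observations
merged = merge_summaries([rank0, rank1])
print(merged["main/loss"])  # global mean: (1.2 + 0.4) / 4 = 0.4

# Only one process should write the log file (hypothetical option):
writer_rank = 0
my_rank = 0  # would come from the communicator in practice
if my_rank == writer_rank:
    pass  # serialize `merged` to the log file here
```

Recomputing from (sum, count) pairs rather than averaging the per-process means keeps the result correct even when processes observe different numbers of iterations.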