SAM2 is a video segmentation method and is therefore initialized with a segmentation mask. This mask is obtained by prompting SAM2 with the ground-truth bounding box. In general, this step should be integrated into the tracker class, but we are still working on it (it will be available soon). To keep the integration simple, we provide the initialization masks that were used in the experiments, which were estimated with the SAM2 model.
Why do you use a predicted mask for initialization on box datasets rather than the GT bbox?
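To make the question concrete, here is a minimal sketch of what a box-only initialization would amount to if the GT box were simply rasterized into a mask, and how far that can be from a tighter object mask. The helpers `box_to_mask` and `mask_iou` are hypothetical and not part of SAM2 or the tracker code; the box is assumed to be `(x, y, w, h)` in pixels, and the "object" mask is a toy stand-in for a SAM2-estimated mask.

```python
def box_to_mask(box, height, width):
    """Rasterize an (x, y, w, h) pixel box into a binary mask (list of rows)."""
    x, y, w, h = box
    # Clip the box to the image so boxes that leave the frame stay valid.
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(width, x + w), min(height, y + h)
    return [
        [1 if (x0 <= c < x1 and y0 <= r < y1) else 0 for c in range(width)]
        for r in range(height)
    ]


def mask_iou(a, b):
    """Intersection over union of two binary masks of the same size."""
    pairs = [(p, q) for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b)]
    inter = sum(1 for p, q in pairs if p and q)
    union = sum(1 for p, q in pairs if p or q)
    return inter / union if union else 0.0


# Toy example: a 4x4 GT box around a thin 2x4 object in a 6x6 frame.
box_mask = box_to_mask((1, 1, 4, 4), height=6, width=6)
object_mask = box_to_mask((2, 1, 2, 4), height=6, width=6)  # stand-in for a predicted mask
print(mask_iou(box_mask, object_mask))
```

In this toy case half of the rasterized box is background, so a mask-initialized model seeded with the raw box would start from a target that includes non-object pixels.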