
Autodistill Yolov8 training not resuming after 40th epoch #4

Open
Samyak-Jayaram opened this issue Oct 13, 2024 · 0 comments

Minimum reproducible example

from autodistill_yolov8 import YOLOv8
import os

HOME = os.getcwd()

DATA_YAML_PATH = f"{HOME}/dataset/data.yaml"
TRAINED_MODEL_PATH = f"{HOME}/runs/detect/train/weights/last.pt"

# Load the last checkpoint from the interrupted run and resume it on GPU 0.
model2 = YOLOv8(TRAINED_MODEL_PATH)
model2.train(DATA_YAML_PATH, resume=True, device=[0])
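
For comparison, here is a minimal cross-check (untested sketch, using the paths defined above) that resumes the same checkpoint through the plain ultralytics API, bypassing the autodistill wrapper. If this also stalls at "Closing dataloader mosaic", the problem is upstream of autodistill_yolov8:

# Untested sketch: resume the same checkpoint with ultralytics directly,
# bypassing the autodistill_yolov8 wrapper.
from ultralytics import YOLO

model = YOLO(TRAINED_MODEL_PATH)
model.train(data=DATA_YAML_PATH, resume=True, device=[0])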

Terminal message

Ultralytics YOLOv8.2.103 Python-3.11.5 torch-2.4.1+cu124 CUDA:0 (NVIDIA GeForce GTX 1650, 4096MiB)
engine\trainer: task=detect, mode=train, model=d:\1_DSCE\Major_project\yolo\runs\detect\train\weights\last.pt, data=d:\1_DSCE\Major_project\yolo/dataset/data.yaml, epochs=50, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=[0], workers=8, project=None, name=train2, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=d:\1_DSCE\Major_project\yolo\runs\detect\train\weights\last.pt, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=True, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.0, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, bgr=0.0, mosaic=1.0, mixup=0.0, copy_paste=0.0, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=runs\detect\train2

Resuming training d:\1_DSCE\Major_project\yolo\runs\detect\train\weights\last.pt from epoch 41 to 50 total epochs
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs\detect\train2
Starting training for 50 epochs...
Closing dataloader mosaic
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))

It has been stuck at this point for about an hour. I can resume training on the CPU from the 41st epoch, but that is not feasible: my dataset is large and my CPU overheats.
This hang happens even with a small dataset of around 500 images on the GPU.
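
The log stops right after "Closing dataloader mosaic", which is the point where the train dataloader is rebuilt, so one workaround worth trying (untested; multiprocessing dataloader workers are a known source of hangs on Windows) is to run without worker processes:

# Untested workaround sketch: run the dataloader in the main process.
# Caveat: depending on the Ultralytics version, resume=True may restore
# workers=8 from the checkpoint and ignore this override.
from ultralytics import YOLO

model = YOLO(TRAINED_MODEL_PATH)
model.train(data=DATA_YAML_PATH, resume=True, device=[0], workers=0)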

If I change close_mosaic in the args.yaml file, it is reset back to 10 as soon as I run the training snippet, and I don't know why.
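
My guess (an assumption, not confirmed) is that on resume Ultralytics restores the training arguments stored inside the checkpoint itself rather than reading args.yaml, which would explain why the edit gets overwritten. If so, the stored arguments in last.pt could be patched directly before resuming:

# Untested sketch, assuming Ultralytics checkpoints keep the original
# training arguments under the "train_args" key and read them on resume.
import torch

ckpt = torch.load(TRAINED_MODEL_PATH)
print(ckpt["train_args"]["close_mosaic"])  # currently 10
ckpt["train_args"]["close_mosaic"] = 0     # disable the mosaic-closing phase
torch.save(ckpt, TRAINED_MODEL_PATH)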
