
Autodistill Yolov8 training not resuming after 40th epoch #4

Open
Samyak-Jayaram opened this issue Oct 13, 2024 · 0 comments

Minimum reproducible example

from autodistill_yolov8 import YOLOv8
import os

HOME = os.getcwd()

DATA_YAML_PATH = f"{HOME}/dataset/data.yaml"
TRAINED_MODEL_PATH = f"{HOME}/runs/detect/train/weights/last.pt"

# Load the last checkpoint from the interrupted run and resume it on GPU 0.
model2 = YOLOv8(TRAINED_MODEL_PATH)
model2.train(DATA_YAML_PATH, resume=True, device=[0])
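
For comparison, here is a minimal cross-check (untested sketch, using the paths defined above) that resumes the same checkpoint through the plain ultralytics API, bypassing the autodistill wrapper. If this also stalls at "Closing dataloader mosaic", the problem is upstream of autodistill_yolov8:

# Untested sketch: resume the same checkpoint with ultralytics directly,
# bypassing the autodistill_yolov8 wrapper.
from ultralytics import YOLO

model = YOLO(TRAINED_MODEL_PATH)
model.train(data=DATA_YAML_PATH, resume=True, device=[0])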

Terminal message

Ultralytics YOLOv8.2.103 Python-3.11.5 torch-2.4.1+cu124 CUDA:0 (NVIDIA GeForce GTX 1650, 4096MiB)
engine\trainer: task=detect, mode=train, model=d:\1_DSCE\Major_project\yolo\runs\detect\train\weights\last.pt, data=d:\1_DSCE\Major_project\yolo/dataset/data.yaml, epochs=50, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=[0], workers=8, project=None, name=train2, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=d:\1_DSCE\Major_project\yolo\runs\detect\train\weights\last.pt, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=True, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.0, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, bgr=0.0, mosaic=1.0, mixup=0.0, copy_paste=0.0, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=runs\detect\train2

Resuming training d:\1_DSCE\Major_project\yolo\runs\detect\train\weights\last.pt from epoch 41 to 50 total epochs
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs\detect\train2
Starting training for 50 epochs...
Closing dataloader mosaic
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))

It has been stuck at this point for about an hour. I can resume training on the CPU from the 41st epoch, but that is not feasible: my dataset is large and my CPU overheats.
This hang happens even with a small dataset of around 500 images on the GPU.
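
The log stops right after "Closing dataloader mosaic", which is the point where the train dataloader is rebuilt, so one workaround worth trying (untested; multiprocessing dataloader workers are a known source of hangs on Windows) is to run without worker processes:

# Untested workaround sketch: run the dataloader in the main process.
# Caveat: depending on the Ultralytics version, resume=True may restore
# workers=8 from the checkpoint and ignore this override.
from ultralytics import YOLO

model = YOLO(TRAINED_MODEL_PATH)
model.train(data=DATA_YAML_PATH, resume=True, device=[0], workers=0)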

If I change close_mosaic in the args.yaml file, it is reset back to 10 as soon as I run the training snippet, and I don't know why.
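
My guess (an assumption, not confirmed) is that on resume Ultralytics restores the training arguments stored inside the checkpoint itself rather than reading args.yaml, which would explain why the edit gets overwritten. If so, the stored arguments in last.pt could be patched directly before resuming:

# Untested sketch, assuming Ultralytics checkpoints keep the original
# training arguments under the "train_args" key and read them on resume.
import torch

ckpt = torch.load(TRAINED_MODEL_PATH)
print(ckpt["train_args"]["close_mosaic"])  # currently 10
ckpt["train_args"]["close_mosaic"] = 0     # disable the mosaic-closing phase
torch.save(ckpt, TRAINED_MODEL_PATH)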
