
Epoch - YOLOv2 #1

Open
groot-1313 opened this issue Mar 8, 2018 · 1 comment

Comments


groot-1313 commented Mar 8, 2018

Hi,
(YOLOv2)
Great work here! I was wondering: since the loss starts from a very high value, how many epochs did it take for you to achieve the results you've displayed? And what is the final expected loss at which to stop the training process?

groot-1313 changed the title from "Epoch" to "Epoch - YOLOv2" on Mar 8, 2018
lhk (Owner) commented Mar 9, 2018

Thanks :)

The training is done in the demo IPython notebook. It looks like this:

for i in range(20):
    # Each outer iteration trains for 5 epochs of 6400 samples.
    history = detection_model.fit_generator(train_gen,
                                            steps_per_epoch=6400 // batch_size,
                                            epochs=5,
                                            callbacks=[nan_terminator],
                                            validation_data=val_gen,
                                            validation_steps=1600 // batch_size,
                                            # use_multiprocessing=False,
                                            workers=4,
                                            max_queue_size=24)
    histories.append(history)
    times.append(time.time())

I'm training for 20 × 5 epochs of 6400 samples each, i.e. 640,000 sample presentations in total.
The weird outer loop lets me break out of training (Ctrl+C) and still keep the history of most of it.
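
The notebook relies on Ctrl+C simply killing the fit_generator call that is currently running, while the previously appended histories survive in the notebook's namespace. If you want that behaviour to be explicit, one way (a sketch only; detection_model, train_gen, val_gen, nan_terminator and batch_size come from the notebook and are assumed to already exist) is to wrap the loop in a try/except:

import time

histories, times = [], []
try:
    for i in range(20):
        history = detection_model.fit_generator(train_gen,
                                                steps_per_epoch=6400 // batch_size,
                                                epochs=5,
                                                callbacks=[nan_terminator],
                                                validation_data=val_gen,
                                                validation_steps=1600 // batch_size)
        histories.append(history)
        times.append(time.time())
except KeyboardInterrupt:
    # Everything appended before the interrupt is still available for plotting.
    print("Training interrupted, keeping %d histories" % len(histories))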

Please note: this is just to demonstrate that my loss function works. I'm only retraining the last few layers;
the feature extractor is pre-trained (transfer learning).
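
For illustration only, a minimal Keras sketch of such a setup; the cutoff index and the loss name below are invented, not taken from the repository:

# Hypothetical example: freeze the pre-trained feature extractor and
# train only the last few (detection) layers. The -4 cutoff is made up.
for layer in detection_model.layers[:-4]:
    layer.trainable = False
for layer in detection_model.layers[-4:]:
    layer.trainable = True

# Recompile so the changed trainable flags take effect.
detection_model.compile(optimizer='adam', loss=my_yolo_loss)  # my_yolo_loss: placeholder name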

In general, it is very hard to predict how many training samples will be necessary.
The number varies wildly depending on your data.
I'm training on the original VOC data, which is pretty much what this architecture was made for.
With other datasets, training times go up significantly, by orders of magnitude, because the earlier layers have to be trained too.

The final expected loss is almost impossible to predict.
It depends very much on the network architecture: the YOLO loss looks at every cell defined in the output and sums the loss over all of them, so a different output resolution will change the loss.
Depending on your data, you will also have to tune the weights of the individual loss terms (objectness, class, location, etc.), which changes the loss as well.
I'm afraid I can't give any number at all for this.
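
To make both points concrete, here is a toy sketch (not the repository's actual loss function): the total is a weighted sum over every output grid cell, so both the grid resolution and the per-term weights directly scale the raw loss value you observe.

import numpy as np

def toy_yolo_loss(per_cell_terms, w_obj=5.0, w_class=1.0, w_loc=1.0):
    # per_cell_terms: shape (S, S, 3), holding the objectness, class and
    # localization error of each of the S x S grid cells.
    weights = np.array([w_obj, w_class, w_loc])
    return float(np.sum(per_cell_terms * weights))

# The same per-cell error gives a different total on a 13x13 vs. a 19x19 grid.
print(toy_yolo_loss(np.full((13, 13, 3), 0.1)))  # ~118.3
print(toy_yolo_loss(np.full((19, 19, 3), 0.1)))  # ~252.7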

In my experience, even when you think the curve has flattened and no more improvement can be made, it pays to continue training.
Just train two or three times as long as you would think necessary.
