Multi-threaded data loading and augmentation? #50
Labels

- **improvement**: Something which would improve current status, but not add anything new
- **low priority**: Not urgent and won't degrade with time
Current state

The current process for loading data during training is:

1. `FoldYielder`
2. `BatchYielder`

Either the entire fold is then loaded to device at once, or mini-batches are loaded to device one at a time.

The current process for loading data during predicting is:

1. `FoldYielder`
Problems

- Loading the entire fold to device at once carries a large memory overhead.
- Loading mini-batches to device one at a time can introduce delays during training.
Possible solutions

- Use the `BatchYielder` to load mini-batches to device in the background, reducing the memory overhead whilst not introducing delays.
- Replace the `BatchYielder` with, or have it inherit from, a PyTorch `DataLoader`, which includes multi-threaded workers (although I find that they're slower than single-core...).
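The first option could be sketched roughly as follows, using only the standard library. This is a hypothetical helper, not part of the library: it assumes the `BatchYielder` can be treated as an iterator of mini-batches, and the `background_prefetch` name and `max_prefetch` parameter are invented for illustration. The device transfer / augmentation step would go inside the worker.

```python
import queue
import threading


def background_prefetch(batch_iter, max_prefetch=2):
    """Yield batches from batch_iter, preparing each one in a background thread.

    The worker pulls the next batch (where it could also move data to device
    or apply augmentation) while the training loop consumes the current one.
    At most max_prefetch prepared batches are held at a time, bounding memory.
    """
    q = queue.Queue(maxsize=max_prefetch)
    _DONE = object()  # sentinel marking exhaustion of the source iterator

    def worker():
        for batch in batch_iter:
            q.put(batch)  # blocks when the queue is full, limiting memory use
        q.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        batch = q.get()
        if batch is _DONE:
            break
        yield batch
```

Usage would look like `for batch in background_prefetch(my_batch_yielder): ...`; because the transfer overlaps with the training step, the loop should rarely wait on data, without ever holding the whole fold on device.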