Commit

Update docs/source/getting_started/faqs_and_tips.md
snarayan21 authored Nov 4, 2024
1 parent 254fba7 commit d594912
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/source/getting_started/faqs_and_tips.md
@@ -59,7 +59,7 @@ The `epoch_size` attribute of StreamingDataset is the number of samples per epoc
`StreamingDataset` is the dataset class. It can take in multiple streams, which are just data sources. It combines these streams into a single dataset. `StreamingDataset` does not *stream* data, as continuous bytes; instead, it downloads shard files to enable a continuous flow of samples into the training job. `StreamingDataset` is an `IterableDataset` as opposed to a map-style dataset -- samples are retrieved as needed.

### I wrapped my streaming dataloader with HuggingFace's `accelerate` dataloader wrapper and my run is hanging, what should I do?
- When using HF `accelerate` with `streaming` for training, do not wrap the dataloader as this will cause the run to fail.
+ When using HF Accelerate with Streaming for training, do not wrap the DataLoader, as this can cause hangs during training. `StreamingDataset` is ready for distributed training out of the box and does not need the wrapping that HF Accelerate provides.
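
The advice above can be sketched roughly as follows. This is a minimal illustration, not the project's official recipe: the helper names `make_streaming_loader` and `prepare_for_training`, along with the `remote`, `local`, and `batch_size` values a caller would pass, are hypothetical. The only point it makes is that the model and optimizer go through `Accelerator.prepare`, while the streaming `DataLoader` is left unwrapped.

```python
def make_streaming_loader(remote, local, batch_size):
    """Build a plain PyTorch DataLoader over a StreamingDataset (hypothetical helper)."""
    # Imported lazily so this module can be imported without the
    # streaming/torch dependencies installed.
    from streaming import StreamingDataset
    from torch.utils.data import DataLoader

    dataset = StreamingDataset(
        remote=remote,          # where the shard files live (assumed path)
        local=local,            # local cache directory (assumed path)
        batch_size=batch_size,  # per-device batch size
        shuffle=True,
    )
    return DataLoader(dataset, batch_size=batch_size)


def prepare_for_training(accelerator, model, optimizer, loader):
    """Hand only the model and optimizer to Accelerate (hypothetical helper).

    The loader is deliberately NOT passed to accelerator.prepare(); it is
    returned unchanged.
    """
    model, optimizer = accelerator.prepare(model, optimizer)
    return model, optimizer, loader
```

Because the loader is not prepared by Accelerate, batches are likely not moved to the device automatically, so a training loop built on this sketch would need to move each batch itself (for example with `accelerator.device`).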

## 🤓 Helpful Tips
