Parallelise folder import phase #75

pmonks · 2018-03-15T17:35:50Z

Currently the tool imports folders serially, in a first phase of the import. This allows files to be batched efficiently in the second phase without having to worry about parent folders, which is a significant performance improvement over earlier schemes that processed imports folder by folder.

Unfortunately, because folders are interdependent (i.e. you can't import a child folder until its ancestor tree has been imported), parallelising this phase is more difficult than the file case, and was punted in v2.0 of the tool.

By requiring BulkImportSources to scan directories breadth-first, some level of parallelisation would become possible during the folder import phase: the first-level folders would be imported serially, then each of those folders' sub-folder trees would be imported in parallel.
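
As a rough illustration of the idea (not the tool's actual code), here is a minimal Python sketch. The in-memory `TREE`, the `import_folder` stand-in and the helper names are all hypothetical; the real tool would get its folders from a BulkImportSource and write them to the repository. First-level folders are imported serially, then each first-level folder's sub-tree is walked breadth-first on its own worker thread.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
import threading

# Hypothetical folder tree: folder path -> child folder paths ("" is the root).
TREE = {
    "": ["a", "b"],
    "a": ["a/x", "a/y"],
    "a/x": ["a/x/1"],
    "b": ["b/z"],
}

imported = set()
lock = threading.Lock()

def import_folder(path):
    """Stand-in for the real import; fails if the parent is not there yet."""
    parent = path.rpartition("/")[0]
    with lock:
        if parent and parent not in imported:
            raise RuntimeError(f"parent of {path} not yet imported")
        imported.add(path)

def import_subtree(root):
    """Import everything below `root` breadth-first, on a single thread."""
    queue = deque(TREE.get(root, []))
    while queue:
        folder = queue.popleft()
        import_folder(folder)
        queue.extend(TREE.get(folder, []))

def import_folders(max_workers=4):
    first_level = TREE[""]
    # Step 1: first-level folders, serially.
    for folder in first_level:
        import_folder(folder)
    # Step 2: one task per first-level sub-tree, in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(import_subtree, f) for f in first_level]
        for future in futures:
            future.result()  # re-raise any exception from a worker

import_folders()
```

Because each sub-tree is handled by exactly one thread and is walked breadth-first, a folder's parent is always imported before the folder itself, so no cross-thread coordination is needed beyond the initial serial step.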

There are worst-case scenarios that need some thought (e.g. when there are fewer first-level folders than the optimal number of threads in the thread pool), but in general this should markedly speed up the folder import phase for large folder trees.
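
One possible way to handle the "few first-level folders" case, offered only as a sketch and not something this issue commits to, would be to parallelise per depth level rather than per first-level sub-tree: all folders at depth N are imported in parallel, and depth N+1 only starts once depth N has finished. The `TREE` data and the `import_by_level` helper below are hypothetical names for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tree with a single first-level folder ("" is the root).
TREE = {
    "": ["only"],
    "only": ["only/a", "only/b", "only/c"],
    "only/a": ["only/a/1"],
}

def import_by_level(tree, import_folder, max_workers=4):
    """Import folders one depth level at a time, each level in parallel."""
    level = list(tree.get("", []))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while level:
            # list() waits for every task in this level to finish (and
            # re-raises worker exceptions), acting as a barrier between levels.
            list(pool.map(import_folder, level))
            level = [child for folder in level for child in tree.get(folder, [])]

order = []  # records the order in which folders were "imported"
import_by_level(TREE, order.append)
```

The trade-off is a synchronisation point at every level, so deep, narrow trees would still gain little; wide trees would keep the pool busy regardless of how many first-level folders there are.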

pmonks added the enhancement label Mar 15, 2018
