SimpleFileExtractor
CBRAIN offers a way to mass-extract files out of existing datasets by using the tool SimpleFileExtractor.
To use it, select the set of files (FileCollections) you want to extract from, then provide a list of patterns that match the files inside them. The result is a new FileCollection containing the extracted files.
For example, suppose you have several CivetOutputs but are interested only in some of the thickness and surface files. To avoid downloading everything, you can select those CivetOutputs and launch the SimpleFileExtractor tool.
Note:
You cannot run the tool on more than about 5,000 input file collections at once. If you have a larger input set, run the tool several times, each time on a different subset of at most 5,000 inputs.
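As an illustration of splitting a large input set, here is a minimal Python sketch that cuts a list of file collection IDs into subsets of at most 5,000. The ID values and the `batches` helper are hypothetical and not part of CBRAIN; each resulting subset would then be selected and launched as its own SimpleFileExtractor task.

```python
from typing import Iterator, List, Sequence

# Approximate per-launch limit stated in the note above.
MAX_INPUTS_PER_RUN = 5000


def batches(file_ids: Sequence[int], size: int = MAX_INPUTS_PER_RUN) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` IDs (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(file_ids), size):
        yield list(file_ids[start:start + size])


# Hypothetical example: 12,000 made-up collection IDs need three launches.
all_ids = list(range(1, 12001))
batch_sizes = [len(b) for b in batches(all_ids)]
print(batch_sizes)
```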
Note:
To avoid having to copy the entirety of the original (source) files, the tool will only run if the input data is stored locally. You need to select a version that runs on a particular server, depending on the location of your inputs. The mapping is as follows:
- Use Beluga for files stored on: Local-Beluga
- Use Cedar for files stored on: Local-Beluga
- Use Graham for files stored on: Local-Graham
- Use GrahamPlatform for files stored on: Local-GrahamPlatform
- Use Converter-1 or Converter-2 for files stored on:
  - MainStore
  - NeuroHubStore
  - SFTP-1
  - SFTP-2
  - NeuroHub-UKBB-Civet
  - CONP-VisualWorkingMemory
  - CONP-OpenPreventAD
  - CONP-OpenPreventAD-BIDS-Subjects
  - CONP-BigBrain-3DClassifiedVolumes
  - CONP-BigBrain-3DSurfaces
  - CONP-BigBrain
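The mapping above can be sketched as a small lookup table. This is only a transcription of the list as written, not an interface that CBRAIN provides; the names `STORAGES_BY_SERVER` and `servers_for` are hypothetical.

```python
from typing import Dict, List

# Storage locations served by the Converter-1 / Converter-2 versions.
CONVERTER_STORAGES = [
    "MainStore", "NeuroHubStore", "SFTP-1", "SFTP-2",
    "NeuroHub-UKBB-Civet", "CONP-VisualWorkingMemory",
    "CONP-OpenPreventAD", "CONP-OpenPreventAD-BIDS-Subjects",
    "CONP-BigBrain-3DClassifiedVolumes", "CONP-BigBrain-3DSurfaces",
    "CONP-BigBrain",
]

# Tool version (server) -> storage locations, copied from the list above.
STORAGES_BY_SERVER: Dict[str, List[str]] = {
    "Beluga": ["Local-Beluga"],
    "Cedar": ["Local-Beluga"],
    "Graham": ["Local-Graham"],
    "GrahamPlatform": ["Local-GrahamPlatform"],
    "Converter-1": CONVERTER_STORAGES,
    "Converter-2": CONVERTER_STORAGES,
}


def servers_for(storage: str) -> List[str]:
    """Return the tool versions usable for files on `storage` (hypothetical helper)."""
    servers = [s for s, stores in STORAGES_BY_SERVER.items() if storage in stores]
    if not servers:
        raise ValueError(f"no SimpleFileExtractor version listed for {storage!r}")
    return servers


print(servers_for("MainStore"))
```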
Once the tool is launched with the proper version, you can provide the file patterns of interest in the task parameters.
All the files matching these patterns will be extracted (copied) into a new FileCollection.
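To get a feel for how pattern-based selection behaves, here is a minimal Python sketch. It assumes shell-style glob patterns, which this page does not confirm, so check the task's parameter form for the exact syntax. The file paths, the patterns, and the `select_files` helper are all made up for illustration.

```python
from fnmatch import fnmatchcase
from typing import List


def select_files(paths: List[str], patterns: List[str]) -> List[str]:
    """Keep each path that matches at least one glob-style pattern (hypothetical helper)."""
    return [p for p in paths if any(fnmatchcase(p, pat) for pat in patterns)]


# Hypothetical relative paths inside one input collection.
paths = [
    "subject01/thickness/subject01_thickness_left.txt",
    "subject01/surfaces/subject01_surface_left.obj",
    "subject01/logs/subject01.log",
]

# Assumed glob-style patterns; the syntax the tool really accepts may differ.
patterns = ["*/thickness/*.txt", "*/surfaces/*.obj"]

selected = select_files(paths, patterns)
print(selected)
```

In this sketch only the thickness and surface files are kept, and the log file is left out.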