From a recent analysis, my AiiDA repository of 31 GB contains approximately:
7 GB (23%) of retrieved Raspa restart files
6 GB (20%) of Raspa output .data files
Other files related to Raspa (cif, simulate.input, _aiidasubmit.sh, and force field .def files) take up less than 1 GB, so there is no need to worry about them for now.
I suggest taking action to reduce the size of the repository, given the large number of RaspaCalculation jobs that are called during our work chains.
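For anyone who wants to repeat this kind of breakdown on their own repository, here is a minimal sketch. It is not the script used for the numbers above. The filename patterns (`restart_*`, `*.data`) and the helper names `size_by_category` and `shares` are assumptions for illustration and should be adapted to the actual repository layout.

```python
import os
from collections import defaultdict
from fnmatch import fnmatch

# Hypothetical filename patterns; adjust them to the files actually stored.
CATEGORIES = {
    "restart": "restart_*",
    "output": "*.data",
}


def size_by_category(root, categories=CATEGORIES):
    """Sum file sizes in bytes under `root`, grouped by the first matching pattern."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            size = os.path.getsize(os.path.join(dirpath, name))
            # Files that match no pattern are counted as "other".
            label = next(
                (lab for lab, pat in categories.items() if fnmatch(name, pat)),
                "other",
            )
            totals[label] += size
    return dict(totals)


def shares(totals):
    """Convert absolute sizes into fractions of the grand total."""
    grand = sum(totals.values())
    return {label: size / grand for label, size in totals.items()} if grand else {}
```

Calling `shares(size_by_category(path))` on the repository folder gives the fraction of disk space per category.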
Output files
The output files need more thought, because they contain different types of data:
check values that are derived directly from the input (or from default parameters), which we might want to keep for reproducibility
information about the cycles, whose printing is controlled by PrintEvery and which one checks to see what is going on in the calculation. We generally use 10 checkpoints, so this is not a huge amount of data and we should keep it.
outputs that are useless for some calculations (e.g., null Gibbs/Widom results when doing GCMC calculations). We cannot suppress them without changing the Raspa code.
Therefore, the output files need more discussion before we can reduce their disk usage.
Restart files
I suggest not retrieving them by default but using them locally, as we do, e.g., for cube files in aiida-ddec. However, I would leave an option for the user to retrieve them: this is needed for saturated systems that require a long equilibration, or for the simulated annealing work chains of aiida-lsmo, where a cif file is computed from the restart file (https://github.com/lsmo-epfl/aiida-lsmo/blob/6c32eebdaefd11dc226d849415e363c6222a475a/aiida_lsmo/workchains/sim_annealing.py#L88-L106).
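To make the proposal concrete, here is a minimal sketch of an opt-in switch for restart files. It is not the aiida-raspa API: the function names, the `retrieve_restart` flag, and the path patterns (`Output/System_*/*.data`, `Restart/System_*/restart_*`) are assumptions for illustration and should be checked against the actual working directory of a calculation.

```python
from fnmatch import fnmatch


def build_retrieve_list(retrieve_restart=False):
    """Return glob patterns of files to copy back from the compute resource.

    `retrieve_restart` is a hypothetical user option: restart files stay on
    the remote machine unless it is set to True.
    """
    # Assumed Raspa output location; always retrieved.
    patterns = ["Output/System_*/*.data"]
    if retrieve_restart:
        # Assumed Raspa restart location; only retrieved on request.
        patterns.append("Restart/System_*/restart_*")
    return patterns


def should_retrieve(path, patterns):
    """Check whether a remote relative path matches any retrieve pattern."""
    return any(fnmatch(path, pattern) for pattern in patterns)
```

With the default, `should_retrieve("Restart/System_0/restart_x", build_retrieve_list())` is false, so a restart file is only used locally. A work chain that needs the restart file, such as the simulated annealing case, would pass `retrieve_restart=True`.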