Germán Felipe Giraldo Villa edited this page Aug 4, 2021 · 4 revisions

Data creation

All data blocks that the agent creates are subscribed by RucioInjector's insertBlockRules(). The rules bind the block files to the storage site where they were created (usually T0_CH_CERN_Disk).

Data subscription

The T0 configuration defines up to 3 endpoint sites to which every new dataset (Rucio container) created by the agent is sent. RucioInjector's insertContainerRules() takes care of creating the rules to transfer the data to those destinations.

Data deletion

Data that's no longer needed is deleted by RucioInjector's deleteBlocks(). Deletable blocks are first fetched by querying the database. The query checks for the following conditions:

  • Block status is 'Closed'.
  • Block's container has been subscribed.
  • Block's container subscription is configured to delete its blocks (Always true for T0).
  • Block has not been deleted.
  • All workflows using the block are marked as completed in DBS Buffer.
  • All workflows using the block have been archived.
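
The conditions above can be read as a single predicate over a block record. The following is only a minimal sketch of that logic in Python: the Block class, its field names, and the helper functions are hypothetical and do not reflect the actual database schema or the query used by RucioInjector.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical in-memory view of one block row (not the real schema)."""
    name: str
    status: str                  # e.g. "Open" or "Closed"
    container_subscribed: bool   # the block's container has been subscribed
    delete_blocks_enabled: bool  # subscription is configured to delete blocks
    deleted: bool                # the block has already been deleted
    workflows_completed: bool    # all workflows completed in DBS Buffer
    workflows_archived: bool     # all workflows have been archived


def is_deletable(block: Block) -> bool:
    """Return True only if every condition from the list above holds."""
    return (
        block.status == "Closed"
        and block.container_subscribed
        and block.delete_blocks_enabled
        and not block.deleted
        and block.workflows_completed
        and block.workflows_archived
    )


def find_deletable(blocks):
    """Return the names of all blocks that pass the predicate."""
    return [b.name for b in blocks if is_deletable(b)]
```

In this sketch a block that fails any single condition, for example one that is still open or whose workflows are not yet archived, is simply left out of the result and would be picked up by a later pass.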

Right now, T0 uses an archive delay of 168h (here), which means the data is not deleted for at least 7 days after it is created. Note that this delay starts counting after the last workflow that uses the file has completed, so in practice data stays on T0 Disk for more than 7 days. For example, RAW data is used by PromptReco workflows, which start 48h after a run is inserted, so RAW data is usually deleted at least 9 days after it is created.
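
The timing described above is simple date arithmetic. Here is a minimal sketch, assuming the 168h archive delay and the 48h PromptReco delay quoted in the text; the constant and function names are illustrative only, and the RAW estimate is a lower bound because it treats PromptReco as finishing the moment it starts.

```python
from datetime import datetime, timedelta

# Values taken from the text above; the names are illustrative.
ARCHIVE_DELAY = timedelta(hours=168)
PROMPTRECO_DELAY = timedelta(hours=48)


def earliest_deletion(last_workflow_completed: datetime) -> datetime:
    """Earliest deletion time, counted from the last workflow's completion."""
    return last_workflow_completed + ARCHIVE_DELAY


def earliest_raw_deletion(run_inserted: datetime) -> datetime:
    """Lower bound for RAW data: PromptReco start plus the archive delay."""
    return earliest_deletion(run_inserted + PROMPTRECO_DELAY)


run_inserted = datetime(2021, 8, 1)
lower_bound = earliest_raw_deletion(run_inserted)
print(lower_bound - run_inserted)  # 9 days, 0:00:00
```

Any real PromptReco processing time is added on top of this bound, which is why the text says "at least" 9 days.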

The query retrieves a list of blocks that match those conditions. After that, RucioInjector checks whether each block has been transferred to its final destination. If it has been, then the block rule that was defined during block creation is deleted using the --purge-replicas option.
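
The final step can be sketched as a loop over the candidate blocks. This is a hedged illustration and not the RucioInjector code: FakeRuleClient is a stand-in written for this example, the dictionary keys are hypothetical, and the method name delete_replication_rule with a purge_replicas argument is assumed here to mirror the Rucio Python client, which should be checked against the Rucio documentation.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRuleClient:
    """Stand-in for a Rucio client; it only records the calls it receives."""
    deleted: list = field(default_factory=list)

    def delete_replication_rule(self, rule_id, purge_replicas=None):
        # Assumed signature, modeled on the Rucio client (unverified here).
        self.deleted.append((rule_id, purge_replicas))
        return True


def delete_transferred_blocks(client, blocks):
    """Delete the block rule of every block that reached its destination.

    `blocks` is an iterable of dicts with the hypothetical keys
    "name", "rule_id" and "transferred".
    """
    removed = []
    for block in blocks:
        if not block["transferred"]:
            # Not yet at its final destination: keep the rule for now.
            continue
        client.delete_replication_rule(block["rule_id"], purge_replicas=True)
        removed.append(block["name"])
    return removed
```

Blocks that are skipped keep their original rule, so they remain protected at the creation site until a later pass finds them transferred.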
