Add support for viewing data flow maps #26

Open
14 tasks
anjackson opened this issue Jan 10, 2025 · 0 comments
Labels: enhancement

anjackson commented Jan 10, 2025

As part of the work to understand how digital preservation gets done in real institutions, I'm trying to understand the overall flow of information in these cases, i.e. tracing the path of the bitstreams from system to system at the overall organisational level.

I've come up with an approach based on space-time diagramming combined with Metro map styling. It takes a sequence of data events and visualises them in quite a nice way. Some more features are still needed:

  • More diagrams, including very simple OAIS patterns, UKWA, etc., and a way to flick between institutions and workflows.
  • When you click on one, update a panel that gives a place for lots of lovely gnarly detail and links. Or possibly copy these popovers.
  • Allow places to include 'boundaries', and use the underlying library to snake a 'river' through the diagram.
  • Support 'via' somehow?
  • If it works well, consider making it a reusable library, possibly integrating an editor into the 'practices' subsite.
  • Consider using the 45° lines for things that take a while, as an alternative corner layout.
  • Set up as an npm module, e.g. d3-dataflow-tube-map, to make reuse easier.
  • Add 'domain' to places and switch the river there, maybe adding a vertical "pseudo line" to make that clear? Adding a domain adds a line along the side; adding a next domain adds a river and switches sides. So adding boundaries, in effect. Use community as bounds.
  • Station names all have to be unique rather than just being labels. That should be changed.
  • The shiftCoords are currently forced to be the same for both ends of a move/copy, which makes some alignments difficult. Need some way to let the start be the same as the parent, but then let the final shift be different.
  • Multiple source-target pairs at the same timestamp would be useful sometimes.
  • Allow space/time spacing to be changed in the workflow config.
  • Nice error if a target of e.g. a delete does not exist.
  • Start/end labels are not working very well. End labels work fine if there's no start. Maybe use a different approach to laying out the locations on the 'in' side (on the suggested domain bracket instead).
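
The "nice error" item in the list above could be handled by a validation pass that runs before layout. What follows is only a minimal TypeScript sketch of that idea. The event shape (`FlowEvent`, with `type`, `source` and `targets`) and the function name `validateEvents` are assumptions made for illustration and are not taken from the existing code.

```typescript
// Hypothetical event shape, for illustration only; the real event format may differ.
type FlowEvent = {
  type: "start" | "move" | "copy" | "derive" | "delete";
  source?: string;   // item an operation reads from, e.g. "sip@ingest"
  targets: string[]; // items created (start/move/copy/derive) or removed (delete)
};

// Walk the events in order, tracking which items currently exist, and collect
// readable messages instead of failing later during layout.
function validateEvents(events: FlowEvent[]): string[] {
  // A prototype-free object used as a set of "data@location" ids.
  const present: Record<string, true> = Object.create(null);
  const errors: string[] = [];

  events.forEach((ev, index) => {
    const where = `event ${index + 1} (${ev.type})`;

    if (ev.type === "delete") {
      for (const target of ev.targets) {
        if (!(target in present)) {
          errors.push(`${where}: cannot delete "${target}" because it does not exist at this point`);
        }
        delete present[target];
      }
      return;
    }

    // Every operation except "start" needs an existing source item.
    if (ev.type !== "start") {
      if (ev.source === undefined || !(ev.source in present)) {
        errors.push(`${where}: source "${ev.source}" does not exist at this point`);
      }
    }

    // A move removes the source; copy and derive leave it in place.
    if (ev.type === "move" && ev.source !== undefined) {
      delete present[ev.source];
    }
    for (const target of ev.targets) {
      present[target] = true;
    }
  });

  return errors;
}
```

An empty result means the sequence is consistent. Collecting messages rather than throwing on the first problem is a design choice here, so that a workflow author could see every broken reference in one pass.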

Bigger ideas:

# Data types and descriptions:
data sip "Submission Information Package" color="#ff0000"

# Locations where data can be stored:
location producer "Producer"
location ingest "Ingest Storage"
location archive "Archival Storage"
location access "Access Storage"

# Domains where locations are maintained:
domain dc "Designated Community"
domain ar "The Archive"
domain man "Management"

# Then the sequence of events in this dataflow...

# We start by transferring a package from an external party:
start [email protected]
move [email protected] [email protected] "Transfer to the archive"
space

# We then prepare the item for ingest to the archival storage system:
derive sip@ingest aip@ingest "Generate AIP from SIP"
copy aip@ingest aip@archive "Copy to archival storage"
# And delete the temporary files:
delete sip@ingest, aip@ingest
space

# When access is requested, we generate an access copy:
copy aip@archive aip@access "Retrieve the AIP"
derive aip@access dip@access "Generate the DIP"
copy dip@access [email protected] "Send the DIP"
delete aip@access, dip@access

# And we're done:
end
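
The event sequence above is line-oriented, so it could probably be read with a small tokenizer. The following is a minimal TypeScript sketch under stated assumptions that the issue does not confirm: each line is a keyword followed by bare tokens, an optional quoted label, and optional key="value" pairs, and `#` only marks whole-line comments. The names `FlowStatement`, `tokenizeLine` and `parseDataflow` are hypothetical.

```typescript
// Hypothetical statement shape, for illustration only; not taken from the project.
type FlowStatement = {
  keyword: string;                 // e.g. "data", "move", "delete", "space"
  args: string[];                  // bare tokens such as "sip@ingest"
  label?: string;                  // standalone quoted string, if present
  options: Record<string, string>; // key="value" pairs such as color
};

// Split a line into tokens. Quoted strings and key="value" pairs stay whole;
// whitespace and commas separate the remaining bare tokens.
function tokenizeLine(line: string): string[] {
  return line.match(/[^\s",=]+="[^"]*"|"[^"]*"|[^\s,]+/g) ?? [];
}

function parseDataflow(source: string): FlowStatement[] {
  const statements: FlowStatement[] = [];
  for (const rawLine of source.split("\n")) {
    const line = rawLine.trim();
    // Only whole-line comments are recognised, so a "#" inside a quoted
    // value such as color="#ff0000" is left alone.
    if (line === "" || line.charAt(0) === "#") continue;

    const [keyword = "", ...rest] = tokenizeLine(line);
    const statement: FlowStatement = { keyword, args: [], options: {} };

    for (const token of rest) {
      const option = /^([^\s",=]+)="([^"]*)"$/.exec(token);
      if (option) {
        const [, key = "", value = ""] = option;
        statement.options[key] = value;
      } else if (token.charAt(0) === '"') {
        statement.label = token.slice(1, -1); // strip the surrounding quotes
      } else {
        statement.args.push(token);
      }
    }
    statements.push(statement);
  }
  return statements;
}
```

With this sketch, a line such as `delete sip@ingest, aip@ingest` becomes a statement with keyword `delete` and two args. Error handling (for example unterminated quotes or unknown keywords) is left out, as is the step that turns statements into the event list the diagram consumes, since that format is not described here.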
Projects
Status: In Progress