Hello, I found this repo while investigating whether there were any existing solutions for a problem I'm working on. I work in the water industry and we deal with networks of manholes and pipes:

^^ Sample of some data from our corporate GIS systems.

Essentially these boil down to directed graphs. The data itself looks something similar to:

In order to calculate the risks of certain things happening in our network, we often need to cascade information down the directed graph. Example of one such calculation (highly simplified):

Respective data flow:

Is this something that can be run/calculated in Cascading? The data flow specifically looks very similar to the diagrams presented in the demonstration talk. My major concern is that I'm unsure how (if at all) these calculations could then be applied to each node/object in the graph in an optimal manner, while also dealing with loops in the graph etc. Is this handled by

The other issue is that in our case we won't only want to cascade down the directed graph, but also up, e.g. to calculate the time until treatment:

I.e. timeTilTreatment is based on downstream velocity, which in turn is calculated from upstream population. I think this may be handled by Cascading, assuming there is the correct adapter to the network data...

Any thoughts would be greatly appreciated! :)
Replies: 1 comment 1 reply
FWIW, Cascading isn't a graph API but a data flow API, frequently used to perform complex work on very large datasets on an Apache Hadoop cluster. Either the corpus of data or an individual unit of work could be too large to fit into memory on a single machine. In that case, you could chain graph operations together into a flow by partitioning the work as streams.

But I suspect you could perform each operation in memory, and if so, and you're using Java, I'd look at using JGraphT as a basis for the graph.

This may or may not be helpful, but it isn't immediately obvious to me that Cascading would be helpful for 'cascading' events through a graph.
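To make the in-memory suggestion a bit more concrete, here is a minimal sketch of pushing a value down a directed network in topological order (Kahn's algorithm), using only the Java standard library. Everything in it is hypothetical: the class name `NetworkCascade`, the method `accumulateDownstream`, and the "sum the upstream population" rule are stand-ins, since the actual calculation from the question isn't shown here. It also assumes no flow splits that later rejoin; on such a network the upstream total would be counted once per path.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Cascading or JGraphT.
final class NetworkCascade {

    /**
     * Returns, per node, its own population plus everything upstream of it.
     * `edges` maps a node id to the ids of the nodes directly downstream.
     * Throws IllegalStateException if the network contains a loop.
     */
    static Map<String, Double> accumulateDownstream(
            Map<String, List<String>> edges, Map<String, Double> localPopulation) {

        // Count incoming pipes for every node mentioned anywhere.
        Map<String, Integer> inDegree = new HashMap<>();
        for (String node : localPopulation.keySet()) inDegree.putIfAbsent(node, 0);
        for (Map.Entry<String, List<String>> e : edges.entrySet()) {
            inDegree.putIfAbsent(e.getKey(), 0);
            for (String target : e.getValue()) inDegree.merge(target, 1, Integer::sum);
        }

        // Start every node at its own population; nodes with no inflow are ready.
        Map<String, Double> total = new HashMap<>();
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            total.put(e.getKey(), localPopulation.getOrDefault(e.getKey(), 0.0));
            if (e.getValue() == 0) ready.add(e.getKey());
        }

        // A node is processed only after all of its upstream nodes have been.
        int processed = 0;
        while (!ready.isEmpty()) {
            String node = ready.remove();
            processed++;
            for (String next : edges.getOrDefault(node, List.of())) {
                total.merge(next, total.get(node), Double::sum);
                if (inDegree.merge(next, -1, Integer::sum) == 0) ready.add(next);
            }
        }

        // Nodes on (or downstream of) a loop never become ready.
        if (processed < inDegree.size()) {
            throw new IllegalStateException("network contains a loop");
        }
        return total;
    }
}
```

With pipes A→C, B→C, C→D and populations A=10, B=20, C=5, this gives C and D an upstream total of 35. Going "up" the network (e.g. for something like timeTilTreatment) would be the same walk over the reversed edges. Loops are only detected here, not resolved; how to break them is a modelling decision, and a graph library such as JGraphT would give you the traversal and cycle-detection pieces rather than hand-rolling them.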