Draft - Bloom Filter Probabilistic Routing #5697
… auto position updates
First thing: this can only be done for version 3.0, because extending the header would be a breaking change. Sounds interesting; I’m curious about the results of the simulations. Note that the interactive simulator is hard to use at scale though, as you’ll be simulating all nodes on one machine. One thing I would change is the number of neighbors per hop. 20 is really a lot, while it’s much more common that you exceed 3 hops (which is actually 2 hops, because you'll need to add the first neighbors already when the hop limit is still 3). I’m also not sure if this overhead (doubling the header size) is worth it if you only base the probability of rebroadcasting on it. With this you could also implement rebroadcasting until you don’t have unique receivers to serve after receiving other nodes’ rebroadcasts.
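The last idea in the comment above, rebroadcasting only while some of your own neighbors are still unserved, could look roughly like the sketch below. It uses exact Python sets instead of a Bloom filter to keep the logic visible, and the names `PendingRebroadcast`, `on_overheard_rebroadcast`, and `should_rebroadcast` are hypothetical, not taken from the firmware or this PR.

```python
class PendingRebroadcast:
    """Tracks which direct neighbors still need a packet (hypothetical helper)."""

    def __init__(self, my_neighbors, already_covered):
        # Neighbors that were not yet covered when the packet first arrived.
        self.unserved = set(my_neighbors) - set(already_covered)

    def on_overheard_rebroadcast(self, covered_by_other):
        # Another node rebroadcast the same packet; drop the neighbors it reached.
        self.unserved -= set(covered_by_other)

    def should_rebroadcast(self):
        # Rebroadcast only if at least one neighbor is still unserved.
        return bool(self.unserved)


pending = PendingRebroadcast(my_neighbors={1, 2, 3}, already_covered={1})
pending.on_overheard_rebroadcast({2})
print(pending.should_rebroadcast())  # True: neighbor 3 is still unserved
pending.on_overheard_rebroadcast({3, 4})
print(pending.should_rebroadcast())  # False: nothing left to serve
```

With a real Bloom filter the subtraction becomes a membership test per neighbor, so false positives would occasionally cancel a rebroadcast that was still needed.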
Makes sense.
For sure. Will need some real-world testing after simulating the effects. By the way, I'm having trouble with the simulator. Is there anything off the top of your head that might need to change given the expanded header? I changed the config value to 32 bytes and looked through the code, but nothing jumped out at me. For some reason the packet is losing the header value when flowing through the simulator.
This will be the interesting part. In an ideal state, all of a node's current neighbors should be in the filter. But the best we can do is approximate current neighbors, because we don't have frequent heartbeats to say "I'm still here". Right now I'm filtering down to nodes heard directly within the last hour, because that is the minimum interval at which a nodeinfo packet goes out. Open to thoughts on this. I don't think we can be sure that 20 is too much. For example, the Chicago mesh is very dense. I think during their scheduled NETs, 20 is low. Also, we can't pack too many nodes into a fixed-size Bloom filter, or the false positive rate will increase. A node will more often think "My neighbor, node X, was already covered and is in the filter" when it actually is not. A non-zero false positive rate should be fine, but we wouldn't want it to climb too high.
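To put rough numbers on how fast the false positive rate climbs, here is a small sketch using the standard Bloom filter approximation (1 - e^(-kn/m))^k. The 104-bit size and 2 hash functions come from the proposal in this PR; the function name and the sample node counts are only illustrative.

```python
import math


def bloom_false_positive_rate(n_items, m_bits=104, k_hashes=2):
    """Approximate false positive rate after inserting n_items IDs.

    Uses the textbook approximation (1 - e^(-k*n/m))^k, which assumes
    independent, uniformly distributed hash functions.
    """
    if n_items <= 0:
        return 0.0
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


for n in (5, 10, 20, 40):
    print(f"{n:>2} node IDs -> FPR ~ {bloom_false_positive_rate(n):.3f}")
```

Under this approximation, 20 node IDs in a 104-bit filter already give a false positive rate of roughly 10%, and the rate keeps rising as more hops add their neighbors.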
Probabilistic forwarding can be aggressive but still leave room for some level of potential duplication. I view it as being almost inversely proportional to the false positive rate. For as often as we may accidentally think we cover a node, we allow some potential for rebroadcast. But your idea is super interesting. Would have to think about that.
I also responded to your question here: #5629 (comment) The header is not sent in "raw" form to the simulator (or any client app); the fields are parsed and then added to the MeshPacket protobuf message. So you either have to add your field to that protobuf message and forward it in the simulator, or, as a workaround, send it with the payload and parse it from there.
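As a rough illustration of the payload workaround, the filter bytes could be prepended to the payload on the sending side and split off again on the receiving side. This is only a sketch: `pack_payload` and `unpack_payload` are hypothetical helpers, not simulator or firmware functions, and the 13-byte length is the filter size from the proposal.

```python
from typing import Tuple

FILTER_LEN = 13  # bytes; the 104-bit Bloom filter size from the proposal


def pack_payload(bloom_filter: bytes, payload: bytes) -> bytes:
    """Prepend the filter bytes to the application payload."""
    if len(bloom_filter) != FILTER_LEN:
        raise ValueError(f"filter must be exactly {FILTER_LEN} bytes")
    return bloom_filter + payload


def unpack_payload(data: bytes) -> Tuple[bytes, bytes]:
    """Split received bytes back into (filter, payload)."""
    if len(data) < FILTER_LEN:
        raise ValueError("data is too short to contain a filter")
    return data[:FILTER_LEN], data[FILTER_LEN:]


wire = pack_payload(bytes(FILTER_LEN), b"hello mesh")
bloom, payload = unpack_payload(wire)
print(len(bloom), payload)
```

The trade-off is that the filter then counts against the payload size limit in the simulation instead of the header.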
This change should not be part of this feature; it was already closed.
I will remove. This is not ready to merge right now.
UPDATED APPROACH
This has been a saga of changes based on excellent input from @GUVWAF. I wanted to capture the latest approach because it has evolved quite a bit since the beginning. Here is the "How it works":
Packet Header
How a Node's Coverage Is Maintained:
How a Node's Coverage is Applied:
What hasn't changed:
Main Benefits
Update protobufs and classes
Yes, I think extending the While this might not be so important now that the full Also, have you ever changed the hop limit for Next to this, I’m a bit skeptical about the hard-coded parameters like Looking forward to the new results.
I can vary this as well.
No. I have not. I will try with a higher hop limit on managed flooding, but I don't think the purpose of this test is to see if we can craft a more perfect mesh using hop limits and node roles. The reality is most meshes lack the organization and knowledge to achieve this.
These are necessary to make testing more convenient, and those values have a material impact on the coverage-based algorithm. Each represents a careful mitigation of the edge cases where coverage-based routing is weak due to the lower bandwidth and less frequent updates from neighboring nodes.
The effect of COVERAGE_RATIO_SCALE_FACTOR can be seen here in the linear line in red. It is used to vary how quickly a 100% probability of rebroadcast should be reached. For example, if 20% of my nodes are new for this packet, we want that to rebroadcast. If the scale factor were 1, only 100% new coverage would guarantee a rebroadcast.
UNKNOWN_COVERAGE_REBROADCAST_PROBABILITY is used when a node first boots, or when all of its coverage has fully aged out. Effectively, the node has zero confidence about the other nodes around it. Therefore, we should prefer that it rebroadcasts, because it is an ignorant node for the time being.
RECENCY_THRESHOLD is the amount of time a neighboring node remains in a node's coverage list. As the neighbor nears the threshold, its contribution to the probability decays, because our confidence in it still being nearby decays. Again, this is a mitigation tactic based on our imperfect means of tracking coverage.
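A minimal sketch of how these three constants might combine into a rebroadcast probability is shown below. The numeric values are placeholders (the thread does not state the PR's actual values; 5.0 is picked only so that 20% new coverage maps to 100%, matching the example above), the linear recency decay is an assumed shape, and the function names are hypothetical.

```python
import random

# Placeholder values for illustration only; not the PR's actual constants.
COVERAGE_RATIO_SCALE_FACTOR = 5.0
UNKNOWN_COVERAGE_REBROADCAST_PROBABILITY = 0.8
RECENCY_THRESHOLD_SECS = 3600.0


def recency_weight(age_secs):
    """Assumed linear decay: 1.0 when just heard, 0.0 at the threshold."""
    return max(0.0, 1.0 - age_secs / RECENCY_THRESHOLD_SECS)


def rebroadcast_probability(neighbor_ages, covered):
    """neighbor_ages maps node ID -> seconds since last heard directly.

    covered is the set of node IDs the packet is believed to have reached.
    """
    weights = {nid: recency_weight(age) for nid, age in neighbor_ages.items()}
    total = sum(weights.values())
    if total == 0.0:
        # Freshly booted, or every neighbor has aged out: no usable knowledge.
        return UNKNOWN_COVERAGE_REBROADCAST_PROBABILITY
    new = sum(w for nid, w in weights.items() if nid not in covered)
    # Scale the share of new coverage, capped at a guaranteed rebroadcast.
    return min(1.0, COVERAGE_RATIO_SCALE_FACTOR * new / total)


def should_rebroadcast(neighbor_ages, covered, rng=random.random):
    """Draw once against the computed probability."""
    return rng() < rebroadcast_probability(neighbor_ages, covered)


ages = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0}
print(rebroadcast_probability(ages, covered={1, 2, 3, 4}))  # 1.0: 20% new coverage
```

With a scale factor of 1 in this sketch, the same call would return 0.2, which is the behavior the comment above describes as too conservative.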
I will need to run more tests, but I don't think the LoRa modem presets or packet length should impact this, because all of these deal with coverage, which doesn't change based on those two things. PERIOD may matter more because it is directly tied to establishing coverage knowledge.
Sorry for the delay. Simulations are coming, but running them is becoming very time-consuming. I am almost done with some changes to the batch simulator so I can run these tests at a little more scale and with less manual touch.
It would be good to add more mobile nodes; in many meshes a high (and increasing) percentage of nodes are mobile without position knowledge.
So I've been deep in this for 2+ weeks and I think I need to reset my brain a bit to make sense of the results - so many iterations, so many tests. What I think I've learned so far is that anything that could degrade moment-in-time knowledge of direct neighbors will degrade the performance of the coverage router. I am working to find the inflection point where the size of the mesh is so small that the sensitivity to coverage is too high, but it seems to be a mesh size of < 35. In general, results degrade with the new layers of more "real world" simulation, which is intuitive, because things like mobile nodes without GPS and asymmetric links degrade any given node's ability to sense direct neighbors accurately over time. Here are the baseline results comparing managed flood (3 hops, 16-byte header) with the bloom router (15 hops, 32-byte header) and a few other tunable constants that affect rebroadcast probability. NEW IMAGES COMING....
Well, friends. After a lot of additional testing I have come to the conclusion that the coverage-based router is not viable. Here are the main reasons:
In short, there is a narrow sweet spot where this works, but we can't detect when the network falls outside the efficacy range of the implementation. @GUVWAF - I hope the modifications to the simulator are still useful.
That’s really a shame. It really looked promising at some point, but if it turns out not to give significant improvements in realistic scenarios, then indeed I don’t think it’s worth investigating further. Either way, it’s good to have considered the option, and I really want to thank you for doing this so extensively. The additions to the simulator are definitely useful. If you have time to separate those additions from the bloom filter implementation, that would be great; otherwise I will try to do it sometime.
I will do that. Thank you!
Farewell bloom.
This is a draft for discussion and review:
Proposal
Expand relay_node to 4 bytes and include a 13-byte (104-bit) Bloom filter in the packet header. Each hop adds up to N uint32_t node IDs. The filter uses 2 hash functions.
Why a Bloom Filter?
False Positives - A Known Limitation
Bloom filters can give false positives (claiming “node X is in the set” when it actually isn’t), but they never give false negatives (if it says “not in the set,” then it’s definitely not in the set).
For our routing use case, these occasional false positives are acceptable because we only need a rough idea of which nodes are covered to make probabilistic rebroadcast decisions.
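To make the false-positive and no-false-negative behavior concrete, here is a small, self-contained sketch of a 104-bit Bloom filter with 2 hash functions. The SHA-256-based hashing is an illustrative stand-in (the hash functions used in the PR are not shown in this thread), and the class and method names are hypothetical.

```python
import hashlib


class BloomFilter:
    """Toy 104-bit Bloom filter keyed by uint32 node IDs (illustrative only)."""

    def __init__(self, m_bits=104, k_hashes=2):
        self.m = m_bits
        self.k = k_hashes
        self.bits = 0  # Python int used as a bitset

    def _positions(self, node_id):
        # Derive k bit positions from salted SHA-256 digests.
        # This hashing scheme is an assumption, not the PR's choice.
        data = node_id.to_bytes(4, "little")
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:4], "little") % self.m

    def add(self, node_id):
        for pos in self._positions(node_id):
            self.bits |= 1 << pos

    def might_contain(self, node_id):
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(node_id))

    def to_bytes(self):
        # 104 bits pack into 13 bytes, matching the proposed header field.
        return self.bits.to_bytes((self.m + 7) // 8, "little")


covered = BloomFilter()
node_ids = [0xA1B2C300 + i for i in range(10)]
for nid in node_ids:
    covered.add(nid)

# Every inserted ID is reported as present: no false negatives.
print(all(covered.might_contain(nid) for nid in node_ids))
# IDs that were never inserted may or may not collide with set bits.
probes = range(0x10000000, 0x10000000 + 1000)
print(sum(covered.might_contain(p) for p in probes), "false positives out of 1000")
```

The first check always succeeds for inserted IDs, while the second count varies with the IDs and hash functions chosen, which is why the routing decision treats the filter as a rough estimate of coverage.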
Impact