BUGFIX: [graph] Increase count of related elements fetched by Correlation Graph to 500000 #9240
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #9240 +/- ##
==========================================
- Coverage 65.26% 65.26% -0.01%
==========================================
Files 630 630
Lines 60311 60311
Branches 6777 6771 -6
==========================================
- Hits 39361 39359 -2
- Misses 20950 20952 +2
Thanks for your contributions! Imagine you have a widely used entity like "APT28" or "USA", present in thousands of reports. GQL will wait for the complete resolution of this huge graph, which will take a long time. Rendering the graph would suffer a lot too.
I took this work a step further and have the correlation traversal algorithm traverse all related nodes of type |
That's a good point. Part of our work on #3227 for the next milestone addressed this issue. It has been merged in #9175. Note that in the details view, the correlated containers are detected through common observables and indicators. To be consistent, we also aligned the graph view. Now, concerning the limitation and performance issues, this raised a lot of discussion. We think the best solution is a heavy refactoring of the graph code to fetch data bit by bit, paginating results and loading the graph incrementally.
Thanks! I agree that the best implementation would be a paginated walk that incrementally builds the graph by making smaller requests that don't overload the back-end. I don't understand well enough how to do that for these 2nd-degree related items, however (hence my solution of just setting the limits really, really high). I do think it would help if the front-end could query the size of the "total set" ahead of time, so that it can render a progress meter during the incremental graph build-out and end users know when data is still being retrieved. The partial graph could be shown in real time, with additions displayed as they're fetched from the server. Additionally, a "stop" and/or "pause" button that halts the data fetch/graph build would also be helpful. For larger graphs, if the end user sees the data they're interested in at some point during the paginated ingest, they could stop or pause the data fetch and then explore the portion of the graph that's been built up to that point.
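To make the idea above concrete, here is a minimal TypeScript sketch of a paginated loader with a progress callback and a stop flag. Every name in it (`Page`, `FetchPage`, `StopToken`, `loadIncrementally`, `fakeFetcher`) is hypothetical and not taken from the OpenCTI codebase, and it assumes the server can report a `totalCount` with each page, which may not be cheap to compute in practice.

```typescript
// Hypothetical shapes for illustration only; not the project's actual API.
interface Page<T> {
  items: T[];
  endCursor: string | null; // null when there are no more pages
  totalCount: number; // assumption: the server reports the full set size
}

type FetchPage<T> = (after: string | null, first: number) => Promise<Page<T>>;

interface StopToken {
  stopped: boolean; // a "stop" button handler would set this to true
}

interface Progress<T> {
  loaded: number;
  total: number;
  batch: T[]; // the newly fetched items, ready to be added to the graph
}

// Fetch page after page, reporting progress after each one and
// checking the stop flag before every request.
async function loadIncrementally<T>(
  fetchPage: FetchPage<T>,
  pageSize: number,
  onProgress: (p: Progress<T>) => void,
  stop: StopToken,
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  do {
    if (stop.stopped) break;
    const page: Page<T> = await fetchPage(cursor, pageSize);
    all.push(...page.items);
    onProgress({ loaded: all.length, total: page.totalCount, batch: page.items });
    cursor = page.endCursor;
  } while (cursor !== null);
  return all;
}

// In-memory stand-in for the back-end, for demonstration only.
function fakeFetcher(data: string[]): FetchPage<string> {
  return async (after, first) => {
    const start = after === null ? 0 : Number(after);
    const end = Math.min(start + first, data.length);
    return {
      items: data.slice(start, end),
      endCursor: end < data.length ? String(end) : null,
      totalCount: data.length,
    };
  };
}
```

With this shape, the UI could draw each `batch` as it arrives and derive a progress meter from `loaded / total`; a pause button would need an extra resume mechanism that this sketch leaves out.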
That's where we want to go next, we're aligned on that :) I'm thinking step-by-step queries, all properly paginated. The frontend would query the main container, then its content, then for each item the related containers, etc. Knowing the size of the full graph in advance is challenging, but at least we can give visual feedback with all the info we can get, as soon as we get it.
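As a rough illustration of that step-by-step walk, the TypeScript sketch below pages through a container's content and then, for each object, pages through the other containers that reference it, emitting each correlation as soon as it is found. The `GraphApi` interface and its two methods are assumptions made up for this example; the real queries would be GraphQL operations with their own shapes.

```typescript
interface Page {
  ids: string[];
  endCursor: string | null; // null when there are no more pages
}

// Hypothetical API surface, invented for this sketch.
interface GraphApi {
  contentOf(containerId: string, after: string | null, first: number): Promise<Page>;
  containersOf(objectId: string, after: string | null, first: number): Promise<Page>;
}

interface Edge {
  objectId: string;
  containerId: string;
}

// Consume every page of one paginated query, handing ids over as they arrive.
async function drain(
  fetchPage: (after: string | null) => Promise<Page>,
  onId: (id: string) => void,
): Promise<void> {
  let cursor: string | null = null;
  do {
    const page: Page = await fetchPage(cursor);
    page.ids.forEach((id) => onId(id));
    cursor = page.endCursor;
  } while (cursor !== null);
}

// Step 1: page through the main container's content.
// Step 2: for each object, page through the containers that reference it.
async function walkCorrelations(
  api: GraphApi,
  rootId: string,
  pageSize: number,
  onEdge: (edge: Edge) => void,
): Promise<void> {
  const objectIds: string[] = [];
  await drain(
    (after) => api.contentOf(rootId, after, pageSize),
    (id) => { objectIds.push(id); },
  );
  for (const objectId of objectIds) {
    await drain(
      (after) => api.containersOf(objectId, after, pageSize),
      (containerId) => {
        if (containerId !== rootId) onEdge({ objectId, containerId });
      },
    );
  }
}

// In-memory stand-in for the back-end, for demonstration only.
function inMemoryApi(contents: Record<string, string[]>): GraphApi {
  const paginate = (all: string[], after: string | null, first: number): Page => {
    const start = after === null ? 0 : Number(after);
    const end = Math.min(start + first, all.length);
    return { ids: all.slice(start, end), endCursor: end < all.length ? String(end) : null };
  };
  return {
    contentOf: async (containerId, after, first) =>
      paginate(contents[containerId] ?? [], after, first),
    containersOf: async (objectId, after, first) =>
      paginate(
        Object.keys(contents).filter((c) => (contents[c] ?? []).indexOf(objectId) !== -1),
        after,
        first,
      ),
  };
}
```

Because `onEdge` fires per correlation, the graph can grow on screen while later pages are still loading. The sketch issues requests one at a time; batching or limited concurrency would be a separate design decision.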
The 500 upper limit is very low and results in a lot of correlation graphs displaying no correlations in the chart, while the main page of the Graph says there exist correlations. In some cases this limit was just 20.
For correlation graphs, include any cases, groupings, or reports as correlations, regardless of the current container type
This limit was still at 20, resulting in empty graphs when correlations were expected.
The 500 upper limit is very low and results in a lot of correlation graphs displaying no correlations in the chart, while the main page of the Graph says correlations exist. In some cases this limit was just 20. This fixes many of the reported problems where related reporting is seen on the Overview page of an analysis/case/etc. but the Correlation Graph is blank (because it isn't fetching all nodes, and the correlations don't show up in the first 500 fetched). In some scenarios, the graph does show some items, but is missing other expected ones.
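The failure mode can be reproduced with a small TypeScript sketch. The data model here (`RelatedElement`, `sharedWith`) is invented for illustration and does not mirror the actual query results; it only shows how a `first` cap can hide every correlation when the shared elements fall outside the first page.

```typescript
// Hypothetical model: each related element lists the other containers
// that also reference it.
interface RelatedElement {
  id: string;
  sharedWith: string[];
}

// Return the distinct correlated containers visible when only the
// first `first` elements are fetched.
function visibleCorrelations(elements: RelatedElement[], first: number): string[] {
  const containers: string[] = [];
  for (const el of elements.slice(0, first)) {
    for (const c of el.sharedWith) {
      if (containers.indexOf(c) === -1) containers.push(c);
    }
  }
  return containers;
}

// 600 related elements; only the one at index 550 is shared with another report.
const sampleElements: RelatedElement[] = [];
for (let i = 0; i < 600; i++) {
  sampleElements.push({ id: "element-" + i, sharedWith: i === 550 ? ["report-B"] : [] });
}

console.log(visibleCorrelations(sampleElements, 500)); // []
console.log(visibleCorrelations(sampleElements, 500000)); // [ 'report-B' ]
```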
Proposed changes
first: 500000 when fetching first-degree and second-degree related nodes for Correlation Graphs
Related issues
Checklist