Add A/B comparison support #445
Comments
+1 I have also thought about this - maybe to start with a "compare" tab that allows you to upload two profiles - then shows a table where you can sort by total / self time +/-. A fancier version would be to render a profile diff like this to show which functions got cheaper or more expensive. @jlfwong how opposed are you to large UI additions to the website? Not sure what your philosophy is around keeping the app minimal |
I'm not opposed to large UI additions, with a few caveats...
Of course even if I drag my feet, if you have a version that works, you always have the option of self-hosting a version somewhere for people to play with :) |
@joaospinto @jlfwong I took some time to look into this in the past week - I have a working self-hosted version here if either of you want to play around with it. It would definitely need some cleanup and iteration like you describe above. I'm going to collect some feedback from some engineers I work with, and then maybe I can start a spec and work out the UI details with you. Here are some screenshots of how it currently works:
Initial thoughts:
function getCompareKey(frame: Frame) {
  if (['anonymous', '(unnamed)'].includes(frame.name)) {
    return `${frame.name}:${frame.file}:${frame.line}:${frame.col}`
  }
  // Assume that function names are unique per file if not anonymous
  return `${frame.name}:${frame.file}`
}
@jlfwong maybe you could play around with the UI and we could start discussing the UX? The code itself is probably not worth looking at at the moment. Here are some example profiles that I have been using to test. |
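To make the compare-table idea concrete, here is a minimal sketch of how per-frame rows could be built from two profiles using a key like `getCompareKey` above. Everything in it is an assumption for illustration: `FrameStats`, `CompareRow`, `indexByKey`, and `buildCompareRows` are hypothetical names, and the input is a simplified list with one record per frame rather than speedscope's actual `Frame` class. It is not the code behind the self-hosted version.

```typescript
// Hypothetical, simplified per-frame record (not speedscope's real Frame class).
interface FrameStats {
  name: string
  file?: string
  line?: number
  col?: number
  selfWeight: number
  totalWeight: number
}

// One row of the hypothetical compare table.
interface CompareRow {
  key: string
  selfBefore: number
  selfAfter: number
  selfDelta: number
  totalBefore: number
  totalAfter: number
  totalDelta: number
}

interface Weights {
  self: number
  total: number
}

// Same matching rule as getCompareKey in the thread, restated for this sketch.
function compareKey(frame: FrameStats): string {
  if (['anonymous', '(unnamed)'].indexOf(frame.name) !== -1) {
    return `${frame.name}:${frame.file}:${frame.line}:${frame.col}`
  }
  return `${frame.name}:${frame.file}`
}

// Sum weights of all records that share a key.
function indexByKey(frames: FrameStats[]): Map<string, Weights> {
  const byKey = new Map<string, Weights>()
  frames.forEach(frame => {
    const key = compareKey(frame)
    const entry = byKey.get(key) || {self: 0, total: 0}
    entry.self += frame.selfWeight
    entry.total += frame.totalWeight
    byKey.set(key, entry)
  })
  return byKey
}

// Build rows for every key seen in either profile, largest self-time change first.
function buildCompareRows(before: FrameStats[], after: FrameStats[]): CompareRow[] {
  const a = indexByKey(before)
  const b = indexByKey(after)
  const keys: string[] = []
  a.forEach((_, key) => keys.push(key))
  b.forEach((_, key) => {
    if (!a.has(key)) keys.push(key)
  })
  const rows = keys.map(key => {
    const x = a.get(key) || {self: 0, total: 0}
    const y = b.get(key) || {self: 0, total: 0}
    return {
      key,
      selfBefore: x.self,
      selfAfter: y.self,
      selfDelta: y.self - x.self,
      totalBefore: x.total,
      totalAfter: y.total,
      totalDelta: y.total - x.total,
    }
  })
  rows.sort((r1, r2) => Math.abs(r2.selfDelta) - Math.abs(r1.selfDelta))
  return rows
}
```

A frame that exists in only one profile shows up with a weight of 0 on the other side, so newly added and removed functions still appear in the table. Sorting by absolute self-time change is just one choice; sorting by `totalDelta` would follow the same pattern.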
An alternative would be to have some kind of global toggle to go between "single viewing mode" and "compare mode", and re-using the existing tabs. I did some explorations into red/green flamegraph rendering to show improvements / regressions, but the issue is that if you are looking at a non-aggregated timeseries view of a flamegraph, you really want to know if a specific function call got slower or faster, not if the total time spent in that function got slower or faster. That requires an algorithm for diffing two n-ary trees, which is hard and made my head hurt. It should theoretically be possible though. Here is an example flamegraph diff just matching on frames (showing self time change): There is also a lot more UI complexity introduced by these kinds of graphs, for example:
There is definitely a world in which frame-based flamegraph diffing is good enough (it does give useful signal), even though in certain cases it will mark something as green which is actually slower, or red which is actually faster. It kind of depends on how fast and loose we want to play with correctness / interpretability. @jlfwong maybe we can discuss a bit here first just to get your initial thoughts and then I can flesh out an approach? Definitely going to have less time once I head back to work but would love to work with you on it if you think it's valuable. Also no rush responding, happy new year 🥳 |
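As a rough illustration of the frame-based red/green idea, the sketch below classifies a frame from its aggregated self time in the two profiles and maps that to a color. The names `classifyChange` and `diffColor`, the 5% tolerance, and the HSL values are all assumptions made up for this example, not anything taken from the prototype. Because it works on per-frame totals, it has exactly the limitation described above: it cannot tell whether one specific call got slower or faster.

```typescript
type Change = 'regression' | 'improvement' | 'unchanged'

// Classify a frame by the relative change in its aggregated self time.
// `tolerance` (an assumed default of 5%) treats small changes as noise.
function classifyChange(selfBefore: number, selfAfter: number, tolerance = 0.05): Change {
  const delta = selfAfter - selfBefore
  const baseline = Math.max(selfBefore, selfAfter)
  if (baseline === 0 || Math.abs(delta) / baseline <= tolerance) {
    return 'unchanged'
  }
  return delta > 0 ? 'regression' : 'improvement'
}

// Map the change to a CSS color: red for slower, green for faster, gray otherwise.
// Larger relative changes produce a darker, more prominent color.
function diffColor(selfBefore: number, selfAfter: number, tolerance = 0.05): string {
  const change = classifyChange(selfBefore, selfAfter, tolerance)
  if (change === 'unchanged') {
    return 'hsl(0, 0%, 70%)'
  }
  const baseline = Math.max(selfBefore, selfAfter)
  const magnitude = Math.abs(selfAfter - selfBefore) / baseline // in (tolerance, 1]
  const lightness = Math.round(85 - 45 * magnitude)
  const hue = change === 'regression' ? 0 : 120
  return `hsl(${hue}, 70%, ${lightness}%)`
}
```

Dividing by the larger of the two values keeps the magnitude between 0 and 1 even for frames that appear in only one profile, which avoids a division by zero for newly added functions.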
Here is a draft of the motivation to clarify exactly why I think this would be valuable: Often when we are doing performance optimization work, we are iteratively improving a certain part of a performance trace. This process looks like:
Generally I use the flamechart to get a view of what is slow, and then the sandwich view to see exactly what the total or self time difference was for a given function. This process is suboptimal as you only see changes if they are visually obvious in the flamegraph, or if you know what you’re looking for. This means that you could very well introduce an optimization that improves the thing you are targeting, but slows something else down. Without manually checking every part of the profile to see what got slower or faster, it’s hard to make sure that you aren’t introducing a regression. Additionally, it’s just not as convenient as having a view that tells you exactly what got slower or faster between profiles. Ideally we could make this process simpler by integrating a view into the speedscope UI that natively allows for comparing two profiles. This would reduce the burden of switching between tabs and also surface exactly what got slower or faster, instead of just what the user visually notices. I think there is a counterargument that says "it's not that hard just to look" and that this is kind of complicated and there are a lot of ways that incorrect / misleading information could be surfaced, so it's not worth it. From my perspective the compare tab showing the table view is a good middle ground, but open to discussion! |
@jlfwong What do you think about the proposal above? I've been playing around with its compare functionality and it has come in very handy. |
@jlfwong - Would you consider accepting this proposal? |
Hi, I wanted to check in about the progress of the A/B comparison feature for Speedscope that was discussed earlier by @zacharyfmarion and @jlfwong . I saw that there has been some work on a self-hosted version with a "Compare" tab and flamegraph diffing for profiling data. Is there any update on when this feature might be available in the main Speedscope project? Are there any blockers or plans for additional UI iterations? I would also be interested in helping test or provide feedback on the feature if it's ready for review. Looking forward to hearing more about it! Thanks! |
Hi! It would be nice to have an A/B mode for speedscope.
Let me clarify what I mean by this. Suppose I have a program, and gather its initial profiling data. This would be the "A" output. Then I go and make a bunch of changes, and gather some new profiling data. This would be the "B" output. Then I'd want to compare what changed between A and B.
This is a very common use case, and I wonder what your thoughts are about adding some native functionality for this.
Thanks!