Could you help me figure out why my experiment results show rootba is slower than ceres and Schur complement? #2
Comments
Thanks for your interest. I'll have a look shortly.
In general, the relative performance of the different methods can, in our experience, depend a lot on the hardware and the number of CPU cores. One aspect is that, in our experiments, rootba seems to take better advantage of parallelization. Which method is faster also depends a lot on the actual problem. We are not claiming that rootba has better runtime in all situations. That being said, I've tried your script with the current master, compiled with default settings, on two machines, and this is what I get. 2013 Macbook (i7 with 8 virtual cores): Ubuntu 18.04 Desktop (Xeon W-2133 with 12 virtual cores): I'm not sure why you see something qualitatively very different. What hardware are you running on? Two thoughts:
Thank you for helping me!!! I will try it according to your suggestions.
No, I expect a different outcome with a different number of threads. Note that
Hello,
That's a bit strange. Yeah, maybe it is an issue with TBB. Your ceres runtime is similar to the one on my Linux box, but the others are much slower, which is very surprising if it does indeed use multi-threading. Ceres does not use TBB in our configuration AFAIK, so that could make sense. Maybe you can have a look yourself, but otherwise, you could post here your OS and maybe the full output of a fresh If you are using Ubuntu, you can double-check which BLAS is configured (just in case openblas got installed as a dependency of something):
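For anyone gathering the details requested above (OS, core count, installed BLAS variants), here is a minimal Python sketch. It is not part of rootba and is not the command from the comment above. The function name `collect_system_info` and the list of BLAS library names are assumptions made for illustration. It only reports which libraries the system loader can find, not which one a given rootba binary was actually linked against.

```python
import ctypes.util
import os
import platform


def collect_system_info():
    """Gather OS, CPU-count, and BLAS hints for a performance report."""
    # Hypothetical list of common BLAS library names; adjust as needed.
    blas_candidates = ["openblas", "blas", "mkl_rt", "atlas"]
    return {
        "os": f"{platform.system()} {platform.release()}",
        "machine": platform.machine(),
        # os.cpu_count() reports logical (virtual) cores and may be None.
        "logical_cpus": os.cpu_count(),
        # find_library returns a library name/path, or None if not found.
        "blas_found": {
            name: ctypes.util.find_library(name) for name in blas_candidates
        },
    }


if __name__ == "__main__":
    for key, value in collect_system_info().items():
        print(f"{key}: {value}")
```

Pasting this output next to the solver logs makes it easier to compare machines, since the thread above suggests that runtime differences may come from core count or from the threading and BLAS setup.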
Hi, I've been playing with this on a MacBook Air M1 with 8 and 4 threads. Using the
That's very curious. Are you sure you have built all the binaries with the same configuration? Beware that by default Can you try wiping the bin and build folders and recompiling all binaries? If you still see a difference, another thing to confirm is that you are using the same config in all cases. Could you please paste the full command line call and output for all 3 runs?
I am reading your paper "Square Root Bundle Adjustment for Large-Scale Reconstruction" (CVPR 2021). Your idea of using QR decomposition instead of the traditional Schur complement is awesome. I have run your source code, rootba. The result image is shown at the end of the issue. From the picture, we can see that QR-32 (single-precision QR in rootba) is slower than ceres and the Schur complement. I am puzzled by this. Could you help me figure it out?