The route in commaai/openpilot#33840 had its camera bus spam CAN core resets due to ACK errors and a high transmit error count on ignition off.
PR #1502 switched the logic from resetting only once, when an error counter reached 100, to resetting continually while transmit_error_counter > 127. Once that threshold is crossed, every subsequent ACK error resets the CAN core without first having to build up to the old tolerance of 100. This causes the panda's interrupt load to hover around 90%, slowing down or hanging SPI communication with pandad.
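To make the difference concrete, here is a minimal host-side sketch that counts how many core resets each policy would trigger for a run of ACK errors. It is not the panda firmware: the function names and the model are hypothetical. It assumes the old logic reset once per 100 counted errors, that each ACK error raises the transmit error counter by 8 until it settles at 128 (standard CAN behaviour for an unacknowledged transmitter), and that a core reset does not clear the transmit error counter.

```c
#include <stdint.h>

#define OLD_TOLERANCE 100U  // error-count tolerance named in the issue
#define TEC_LIMIT     127U  // transmit error counter threshold named in the issue
#define TEC_STEP      8U    // CAN protocol: a transmit error adds 8 to the TEC
#define TEC_PASSIVE   128U  // with only ACK errors, the TEC stops climbing here

// Old policy (as read from the issue): reset each time 100 errors have accumulated.
uint32_t simulate_old_policy(uint32_t n_ack_errors) {
  uint32_t error_cnt = 0U;
  uint32_t resets = 0U;
  for (uint32_t i = 0U; i < n_ack_errors; i++) {
    error_cnt++;
    if (error_cnt >= OLD_TOLERANCE) {
      resets++;
      error_cnt = 0U;
    }
  }
  return resets;
}

// New policy (as read from the issue): reset on every error while TEC > 127.
// Assumption: the reset leaves the TEC untouched, so the condition stays true.
uint32_t simulate_new_policy(uint32_t n_ack_errors) {
  uint32_t tec = 0U;
  uint32_t resets = 0U;
  for (uint32_t i = 0U; i < n_ack_errors; i++) {
    if (tec < TEC_PASSIVE) {
      tec += TEC_STEP;
    }
    if (tec > TEC_LIMIT) {
      resets++;
    }
  }
  return resets;
}
```

Under these assumptions, 1000 consecutive ACK errors give 10 resets with the old policy and 985 with the new one, which is consistent with the reset spam and high interrupt load described above.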
Things that mitigate this:
Removing the busy-wait delay in panda/board/stm32h7/llfdcan.h significantly lowers the interrupt load, and communication is kept.
Not resetting CAN cores when their transceivers are off due to ignition or power-saving mode.
Resetting the transmit error counter.
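The second and third mitigations can be sketched together as a guard in the error handler. This is only an illustration under stated assumptions: the struct, its fields, and both function names are hypothetical and do not come from the panda source, and clearing the counter is modelled as a plain assignment rather than whatever the hardware actually requires.

```c
#include <stdbool.h>
#include <stdint.h>

#define TEC_LIMIT 127U  // threshold named in the issue

// Hypothetical per-bus state; not the firmware's real data structure.
typedef struct {
  bool transceiver_enabled;  // false while ignition is off or in power-saving mode
  uint32_t tec;              // transmit error counter
  uint32_t resets;           // number of core resets performed
} can_bus_model_t;

// Only consider a reset when the transceiver is actually powered.
bool should_reset_core(const can_bus_model_t *bus) {
  return bus->transceiver_enabled && (bus->tec > TEC_LIMIT);
}

// Called once per error interrupt in this model.
void handle_error_irq(can_bus_model_t *bus) {
  if (should_reset_core(bus)) {
    bus->resets++;
    bus->tec = 0U;  // assumption: the reset also clears the TEC
  }
}
```

In this model a bus whose transceiver is off never resets, and a bus whose transceiver is on resets once and then stays quiet until the counter climbs past the threshold again.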
TODOs:
This should be common, so figure out why it is rare. On another random Bronco route I see 2 CAN core resets on the camera bus (flipped canState0) due to 100 busOffCnts. Also figure out why that case did not spam CAN core resets. Answering these two questions should make it clearer which fix is appropriate.
Code references for the mitigations above:
panda/board/stm32h7/llfdcan.h, line 18 in 1290588
panda/board/power_saving.h, line 36 in 1290588
Giving the CAN IRQs a priority of 1 and setting the SPI1_IRQn and SPI2_IRQn IRQs to -1: panda/board/stm32h7/llfdcan.h, lines 135 to 142 in 1290588
Related PRs that touch this code: