net: Critical Mutex Deadlock #86499

Open
legoabram opened this issue Feb 28, 2025 · 1 comment
Labels
area: Ethernet, bug, platform: STM32

Comments

@legoabram
Collaborator

Describe the bug
Running on an STM32H753VG with Ethernet, there is a condition where the rx_q[0] thread and the sys_work_q thread running rs_timeout can deadlock on a pair of mutexes. The work queue acquires the iface mutex and the rx_q thread acquires the iface TX mutex; each then tries to acquire the mutex the other already holds, so neither can proceed. See the call stacks and iface provided for more details.
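To make the pattern concrete, here is a minimal sketch in plain C that models the two acquisition orders described above and checks whether any pair of locks is taken in both orders. It is not Zephyr code and takes no real locks: the names `LOCK_IFACE`, `LOCK_IFACE_TX`, `record_sequence` and `has_inversion` are hypothetical, chosen only to mirror the two mutexes in this report, and the check covers pairwise inversions only, not longer cycles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock identifiers mirroring the two mutexes in the report. */
enum lock_id { LOCK_IFACE = 0, LOCK_IFACE_TX = 1, LOCK_COUNT };

/* held_before[a][b] is true if some context took lock b while holding a. */
struct lock_order_graph {
    bool held_before[LOCK_COUNT][LOCK_COUNT];
};

/* Record one context's nested acquisition sequence, outermost lock first. */
void record_sequence(struct lock_order_graph *g,
                     const enum lock_id *seq, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(seq[i] < LOCK_COUNT);
        for (size_t j = i + 1; j < n; j++) {
            g->held_before[seq[i]][seq[j]] = true;
        }
    }
}

/* A pair of locks taken in both orders is a potential ABBA deadlock. */
bool has_inversion(const struct lock_order_graph *g)
{
    for (int a = 0; a < LOCK_COUNT; a++) {
        for (int b = a + 1; b < LOCK_COUNT; b++) {
            if (g->held_before[a][b] && g->held_before[b][a]) {
                return true;
            }
        }
    }
    return false;
}

/* Orders as described in the report: work queue takes iface then TX,
 * the RX thread takes TX then iface. */
bool reported_pattern_inverts(void)
{
    struct lock_order_graph g = { 0 };
    const enum lock_id work_q[] = { LOCK_IFACE, LOCK_IFACE_TX };
    const enum lock_id rx_q[] = { LOCK_IFACE_TX, LOCK_IFACE };

    record_sequence(&g, work_q, 2);
    record_sequence(&g, rx_q, 2);
    return has_inversion(&g);
}

/* Same two contexts, but both take iface before TX: no inversion. */
bool consistent_order_inverts(void)
{
    struct lock_order_graph g = { 0 };
    const enum lock_id order[] = { LOCK_IFACE, LOCK_IFACE_TX };

    record_sequence(&g, order, 2);
    record_sequence(&g, order, 2);
    return has_inversion(&g);
}
```

In this model `reported_pattern_inverts()` returns true and `consistent_order_inverts()` returns false. That illustrates why a single global acquisition order for the two mutexes would rule out this kind of deadlock. Whether that is the right fix for the network stack is a separate question.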

To Reproduce
Unfortunately, given the nature of the issue, I don't really have a way to replicate this easily. I do have core dumps of a failed system however, so I can provide any information needed at any time.

Expected behavior
No mutex deadlock?

Impact
This is a massive showstopper. This disables our primary functionality in a way that our WDT can't detect. And since it locks up the work queue as well, it breaks several other operations our device needs. We can still recover failed prototype devices in the field thanks to additional debug mechanisms, but we can't go into production with this bug in place.

Logs and console output

(Three images attached to the original issue; not reproduced here.)

Environment

  • OS: Windows 11
  • Toolchain: Zephyr SDK 0.17
  • Zephyr: 064fcfc
@legoabram added the bug label on Feb 28, 2025
@legoabram
Collaborator Author

It's possible that this issue has already been addressed in a newer version of Zephyr, but we can't afford the time right now to upgrade if we don't know for certain it will fix the problem.
