
sched/spin_lock: rename raw_spin_* to spin_*_notrace #15705

Open
wants to merge 8 commits into base: master

Conversation

anchao (Contributor) commented Jan 27, 2025

Summary

  1. sched/spin_lock: rename raw_spin_* to spin_*_notrace

In the Linux kernel, raw_spin means that preemption is disabled before
the spin lock is taken, not that tracing is disabled. Rename raw_spin to
spin_*_notrace to make the interface semantically consistent.

  2. sched/spinlock: add a new set of interfaces to disable preemption
spin_lock                ->      spin_lock_nopreempt
spin_trylock             ->      spin_trylock_nopreempt
spin_unlock              ->      spin_unlock_nopreempt
spin_lock_irqsave        ->      spin_lock_irqsave_nopreempt
spin_trylock_irqsave     ->      spin_trylock_irqsave_nopreempt
spin_unlock_irqrestore   ->      spin_unlock_irqrestore_nopreempt
  3. sched/spinlock: replace all disable-preemption cases with spin_*_nopreempt
flags = spin_lock_irqsave(&g_can_lock);       ->   flags = spin_lock_irqsave_nopreempt(&g_can_lock);
sched_lock();                                 ->   (removed; folded into the call above)

spin_unlock_irqrestore(&g_can_lock, flags);   ->   (removed; folded into the call below)
sched_unlock();                               ->   spin_unlock_irqrestore_nopreempt(&g_can_lock, flags);
  4. sched/spinlock: remove the disable-preemption semantics from the default spin_lock

In the RTOS environment, spin_lock is simplified as much as possible to improve performance.
At the same time, we provide a set of APIs that disable preemption by default:

spin_lock                ->      spin_lock_nopreempt
spin_trylock             ->      spin_trylock_nopreempt
spin_unlock              ->      spin_unlock_nopreempt
spin_lock_irqsave        ->      spin_lock_irqsave_nopreempt
spin_trylock_irqsave     ->      spin_trylock_irqsave_nopreempt
spin_unlock_irqrestore   ->      spin_unlock_irqrestore_nopreempt

This also has the same intention as RT-Linux.

Signed-off-by: chao an [email protected]

Impact

N/A

Testing

ci-check

@github-actions github-actions bot added Area: Drivers Drivers issues Area: File System File System issues Area: OS Components OS Components issues Size: M The size of the change in this PR is medium labels Jan 27, 2025
nuttxpr commented Jan 27, 2025

[Experimental Bot, please feedback here]

Yes, this PR appears to meet the NuttX requirements, although it could be improved.

Here's a breakdown of why and suggestions for improvement:

Strengths:

  • Clear Summary: The summary explains the "why" (semantic consistency with Linux), the "what" (renaming and adding spinlock functions), and the "how" (new functions disable preemption).
  • Code Examples: Including the code changes as examples is helpful.
  • Signed-off-by: Indicates contributor agreement to the Developer Certificate of Origin.

Weaknesses & Suggestions:

  • Impact Understated: Marking impact as "N/A" is likely incorrect. Even seemingly minor changes to core functions like spinlocks can have significant impacts. Consider these:

    • Impact on user: While the user might not need to change code directly, this changes the semantics of existing functions. This could lead to subtle bugs if users were relying on the old behavior. At a minimum, the PR should explain why user impact is believed to be minimal.
    • Impact on build: Are any Kconfig options affected? Do any existing users need to change their configurations?
    • Impact on hardware: Does this change affect any specific architectures? If it's generic scheduling code, probably not, but it's worth explicitly stating that.
    • Impact on documentation: The documentation definitely needs updating to reflect the new functions and renamed functions. The PR should mention this and ideally include the documentation updates.
    • Impact on compatibility: Does this break any existing code that used raw_spin_*? If so, how will users migrate their code? If not, explicitly state backward compatibility is maintained.
  • Testing Insufficient: "ci-check" is not sufficient testing information. The requirements ask for specific build hosts, targets, and logs. Provide details on the platforms tested and example logs demonstrating the change's correct behavior. Ideally, include a test case that specifically exercises the new preemption-disabling spinlocks.

In short: While the core information is present, this PR needs more detail in the Impact and Testing sections to fully meet the NuttX requirements. Be explicit about potential impacts even if you believe them to be minimal, and provide concrete testing evidence.

@github-actions github-actions bot added Arch: arm Issues related to ARM (32-bit) architecture Arch: arm64 Issues related to ARM64 (64-bit) architecture Arch: avr Issues related to all AVR(8-bit or 32-bit) architectures Arch: mips Issues related to the MIPS architecture Arch: renesas Issues related to the Renesas chips Arch: risc-v Issues related to the RISC-V (32-bit or 64-bit) architecture Arch: simulator Issues related to the SIMulator Arch: x86_64 Issues related to the x86_64 architecture Arch: xtensa Issues related to the Xtensa architecture Size: XL The size of the change in this PR is very large. Consider breaking down the PR into smaller pieces. and removed Size: M The size of the change in this PR is medium labels Jan 27, 2025
raw_spin in the Linux kernel means disable the preemption before
holding the spin lock, not disable trace. Rename raw_spin to
spin_*_notrace to make the interface more semantically consistent.

Signed-off-by: chao an <[email protected]>
spin_lock                ->      spin_lock_nopreempt
spin_trylock             ->      spin_trylock_nopreempt
spin_unlock              ->      spin_unlock_nopreempt
spin_lock_irqsave        ->      spin_lock_irqsave_nopreempt
spin_trylock_irqsave     ->      spin_trylock_irqsave_nopreempt
spin_unlock_irqrestore   ->      spin_unlock_irqrestore_nopreempt

Signed-off-by: chao an <[email protected]>
…qsave, since it will be called in func spin_lock_irqsave."

This reverts commit 8f3a2a6.
…is no need to call it explicitly."

This reverts commit b0af946.
…empt

flags = spin_lock_irqsave(&g_can_lock);       ->   flags = spin_lock_irqsave_nopreempt(&g_can_lock);
sched_lock();                                 ->   (removed; folded into the call above)

spin_unlock_irqrestore(&g_can_lock, flags);   ->   (removed; folded into the call below)
sched_unlock();                               ->   spin_unlock_irqrestore_nopreempt(&g_can_lock, flags);

Signed-off-by: chao an <[email protected]>
…ult spin_lock

In the RTOS environment, spin_lock is simplified as much as possible to improve performance.
At the same time, we provide a set of APIs that disable preemption by default:

spin_lock                ->      spin_lock_nopreempt
spin_trylock             ->      spin_trylock_nopreempt
spin_unlock              ->      spin_unlock_nopreempt
spin_lock_irqsave        ->      spin_lock_irqsave_nopreempt
spin_trylock_irqsave     ->      spin_trylock_irqsave_nopreempt
spin_unlock_irqrestore   ->      spin_unlock_irqrestore_nopreempt

This also has the same intention as RT-Linux.

Signed-off-by: chao an <[email protected]>
@github-actions github-actions bot removed the Arch: x86_64 Issues related to the x86_64 architecture label Jan 28, 2025
acassis (Contributor) commented Jan 28, 2025

@anchao nice work! Please update our Documentation/ ( https://nuttx.apache.org/docs/latest/implementation/critical_sections.html ) to include these spinlock details!

xiaoxiang781216 (Contributor):

  1. sched/spinlock: removed the semantics of disable preemption from default spin_lock

In the RTOS environment, spin_lock is simplified as much as possible to improve performance. At the same time, we provide a set of APIs that disable preemption by default:

spin_lock                ->      spin_lock_nopreempt
spin_trylock             ->      spin_trylock_nopreempt
spin_unlock              ->      spin_unlock_nopreempt
spin_lock_irqsave        ->      spin_lock_irqsave_nopreempt
spin_trylock_irqsave     ->      spin_trylock_irqsave_nopreempt
spin_unlock_irqrestore   ->      spin_unlock_irqrestore_nopreempt

A spinlock needs to hold the sched lock in the non-RT case.

This also has the same intention as RT-Linux.

No, RT-Linux maps spinlock to mutex.

@@ -72,9 +72,9 @@ static FAR struct file *files_fget_by_index(FAR struct filelist *list,
FAR struct file *filep;
irqstate_t flags;

-flags = raw_spin_lock_irqsave(&list->fl_lock);
+flags = spin_lock_irqsave(&list->fl_lock);
Contributor:

Why revert, rather than replacing raw_spin_lock_irqsave with spin_lock_irqsave_notrace?

anchao (author):

Tracing is needed here, thank you.

@@ -278,8 +278,6 @@ spin_lock_nopreempt(FAR volatile spinlock_t *lock)
#ifdef CONFIG_SPINLOCK
static inline_function void spin_lock(FAR volatile spinlock_t *lock)
{
sched_lock();

anchao (author):

Please use spin_lock_nopreempt(); this PR just restores the code behavior to the same as before.

xiaoxiang781216 (Contributor):

https://docs.kernel.org/locking/locktypes.html contains the detailed information; here is the key point: [two screenshots from the Linux locking documentation, omitted]

anchao (author) commented Jan 29, 2025

  1. sched/spinlock: removed the semantics of disable preemption from default spin_lock

In the RTOS environment, spin_lock is simplified as much as possible to improve performance. At the same time, we provide a set of APIs that disable preemption by default:

spin_lock                ->      spin_lock_nopreempt
spin_trylock             ->      spin_trylock_nopreempt
spin_unlock              ->      spin_unlock_nopreempt
spin_lock_irqsave        ->      spin_lock_irqsave_nopreempt
spin_trylock_irqsave     ->      spin_trylock_irqsave_nopreempt
spin_unlock_irqrestore   ->      spin_unlock_irqrestore_nopreempt

A spinlock needs to hold the sched lock in the non-RT case.

Why? If no scheduling occurs in the spin_lock-protected code, why do we need to hold the sched_lock?
In addition, I provide the spin_lock_nopreempt() version; you should use that one, not spin_lock().

This also has the same intention as RT-Linux.

No, RT-Linux maps spinlock to mutex.

After merging this patch, you can change spin_lock to mutex. This PR just removes sched_lock to improve performance.

anchao (author) commented Jan 29, 2025

https://docs.kernel.org/locking/locktypes.html contains the detailed information; here is the key point: [screenshots omitted]

  1. So did you notice that sched_lock() was removed in PREEMPT_RT? This is the starting point of RT. NuttX did not involve sched_lock() in spin_lock() before, so why does your change make the entire system bear the overhead of sched_lock()?
  2. Replacing mutex with spin_lock is the second stage of optimization, which does not conflict with the current PR.
  3. You can look at the spin_lock implementation in other RTOSes; none of them disables preemption (e.g. Zephyr):
    https://github.com/zephyrproject-rtos/zephyr/blob/c66a0bb2a9269dc483079d5cc24ba77b8c332a9a/include/zephyr/spinlock.h#L182-L214

Now we have 4 APIs available to developers:

  1. spin_lock
  2. spin_lock_nopreempt
  3. spin_lock_irqsave
  4. spin_lock_irqsave_nopreempt

If you think the code needs to hold sched_lock, please use
spin_lock_nopreempt() instead of spin_lock().

xiaoxiang781216 (Contributor) commented Jan 29, 2025

https://docs.kernel.org/locking/locktypes.html contains the detailed information; here is the key point: [screenshots omitted]

  1. so did you notice that sched_lock() was removed from PREEMPT_RT? This is the beginning of RT, because nuttx did not have sched_lock() in spin_lock() involved before, why does your change make the entire system bear the overhead of sched_lock()?

Yes, but you first need to change the execution of ALL IRQs from interrupt context to thread context. Before that happens, the sched lock must be taken at the same time we hold a spinlock.

  1. Replacing mutex with spin_lock is the second stage of optimization, which does not conflict with the current PR

Let's emphasize the key points:

  1. A busy-loop lock (e.g. spinlock) must hold the sched lock too
  2. A sleep lock (e.g. mutex) shouldn't hold the sched lock

So you can't do the change in separate steps without breaking the code:

  1. Remove the sched lock first
  2. Change the spinlock to a mutex later

Let's consider a VERY COMMON case in SMP:

  1. thread0 on CPU0 holds a spinlock without holding the sched lock
  2. thread1 on CPU1 wakes up thread2
  3. The OS decides to suspend thread0 and resume thread2 on CPU0
  4. thread1 on CPU1 wants to take the spinlock held by thread0

But it is very hard for thread1 to know when it can finish the busy-wait loop, since thread0 is already suspended.
So it's VERY dangerous to schedule out a thread which holds a spinlock.

  1. You can look at the spin_lock implementation in other RTOS, no one will disable preemption(zephyr)
    https://github.com/zephyrproject-rtos/zephyr/blob/c66a0bb2a9269dc483079d5cc24ba77b8c332a9a/include/zephyr/spinlock.h#L182-L214

Zephyr's k_spin_lock disables IRQs too, but NuttX's spin_lock doesn't disable IRQs.

Now we have 4 APIs available to developers

1. spin_lock 2. spin_lock_nopreempt 3. spin_lock_irqsave 4. spin_lock_irqsave_nopreempt

If you think the code needs to hold sched_lock, please use spin_lock_nopreempt() instead of spin_lock()

No, the correct semantics (same as Linux) are:

  1. spin_lock
  2. spin_lock_preempt
  3. spin_lock_irqsave
  4. spin_lock_irqsave_preempt

Otherwise, most code will suffer random deadlocks or long busy loops like the example I mentioned previously. Only specially designed code can use spin_lock_preempt/spin_lock_irqsave_preempt as expected.
The bottom line is that the most commonly used API should be SMP SAFE.

anchao (author) commented Jan 30, 2025

Yes, but you first need to change the execution of ALL IRQs from interrupt context to thread context. Before that happens, the sched lock must be taken at the same time we hold a spinlock.

Let's emphasize the key points:

  1. A busy-loop lock (e.g. spinlock) must hold the sched lock too
  2. A sleep lock (e.g. mutex) shouldn't hold the sched lock

So you can't do the change in separate steps without breaking the code:

  1. Remove the sched lock first
  2. Change the spinlock to a mutex later

Let's consider a VERY COMMON case in SMP:

  1. thread0 on CPU0 holds a spinlock without holding the sched lock
  2. thread1 on CPU1 wakes up thread2
  3. The OS decides to suspend thread0 and resume thread2 on CPU0
  4. thread1 on CPU1 wants to take the spinlock held by thread0

But it is very hard for thread1 to know when it can finish the busy-wait loop, since thread0 is already suspended. So it's VERY dangerous to schedule out a thread which holds a spinlock.

Let’s first unify our understanding:

  1. There was no deadlock problem before in NuttX's API semantics, and all code worked very well, regardless of SMP or AMP

Right?

  1. The deadlock problem was caused by Xiaomi's changes (from you @xiaoxiang781216 and @hujun260 )
  2. You first replaced enter_critical_section() with spin_lock_irqsave(),
  3. After finding the deadlock, you replaced spin_lock_irqsave() with spin_lock_irqsave() + sched_lock()
  4. In order to simplify the code, you folded sched_lock() into spin_lock_irqsave(). Now you find that such a change will break all ESP devices and the performance of the entire system will be degraded
  5. Look at what you are doing now:
    Prepare to replace all kernel code from spin_lock_irqsave() to raw_spin_lock_irqsave()
    replace spin_lock_irqrestore with raw_spin_lock_irqrestore #15695
    In order to solve the problem of the startup crash, replace all spin_lock_irqsave() with raw_spin_lock_irqsave()
    arch: use raw_spin_[un]lock to replace spin_[un]lock, fix regression b69111d16a2a330fa272af8175c832e08881844b of https://github.com/apache/nuttx/pull/14578 #15691
    esp32: replace spin_lock_irqsave with raw_spin_lock_irqsave. espressif/esp-hal-3rdparty#8

I can't imagine how many board-level and chip codes will need similar adjustments in the future.
How can other developers continue to trust NuttX when the kernel API is so unstable?

Most kernel code does not need to hold sched_lock(); forcing it is unfair to code that does not cause scheduling. New API semantics should be expressed with new APIs instead of by modifying the behavior of the original API.

  1. You can look at the spin_lock implementation in other RTOS, no one will disable preemption(zephyr)
    https://github.com/zephyrproject-rtos/zephyr/blob/c66a0bb2a9269dc483079d5cc24ba77b8c332a9a/include/zephyr/spinlock.h#L182-L214

zephyr's k_spin_lock disable irq too, but nuttx's spin_lock doesn't disable irq.

Then let's remove spin_lock_irqsave() and move the irqsave into spin_lock(), OK?

Now we have 4 APIs available to developers
1. spin_lock 2. spin_lock_nopreempt 3. spin_lock_irqsave 4. spin_lock_irqsave_nopreempt
If you think the code needs to hold sched_lock, please use spin_lock_nopreempt() instead of spin_lock()

No, the correct semantics (same as Linux) are:

  1. spin_lock
  2. spin_lock_preempt
  3. spin_lock_irqsave
  4. spin_lock_irqsave_preempt

Otherwise, most code will suffer random deadlocks or long busy loops like the example I mentioned previously. Only specially designed code can use spin_lock_preempt/spin_lock_irqsave_preempt as expected. The bottom line is that the most commonly used API should be SMP SAFE.

No, keep the current API semantics unchanged, and use new APIs to represent the new features.

xiaoxiang781216 (Contributor):

Yes, but you first need to change the execution of ALL IRQs from interrupt context to thread context. Before that happens, the sched lock must be taken at the same time we hold a spinlock.

Let's emphasize the key points:

  1. A busy-loop lock (e.g. spinlock) must hold the sched lock too
  2. A sleep lock (e.g. mutex) shouldn't hold the sched lock

So you can't do the change in separate steps without breaking the code:

  1. Remove the sched lock first
  2. Change the spinlock to a mutex later

Let's consider a VERY COMMON case in SMP:

  1. thread0 on CPU0 holds a spinlock without holding the sched lock
  2. thread1 on CPU1 wakes up thread2
  3. The OS decides to suspend thread0 and resume thread2 on CPU0
  4. thread1 on CPU1 wants to take the spinlock held by thread0

But it is very hard for thread1 to know when it can finish the busy-wait loop, since thread0 is already suspended. So it's VERY dangerous to schedule out a thread which holds a spinlock.

Let’s first unify our understanding:

  1. There was no deadlock problem before in nuttx's API semantics, and all codes works very well, regardless of SMP or AMP

Right?

No, that isn't true. The critical section is a big lock, which is a well-known SMP issue, and it should be replaced by fine-grained locks step by step:
https://en.wikipedia.org/wiki/Giant_lock
More and more IoT devices have SMP capability, so it's important to improve NuttX's SMP performance.

  1. The deadlock problem was caused by xiaomi's changes(from you @xiaoxiang781216 and @hujun260 )

All known issues were identified and fixed promptly (ALL within 24 hours); do you have any new issue?

  1. You first replaced enter_critical_section() with spin_lock_irqsave(),

Let's emphasize that a critical section isn't equal to a spin lock, since a thread will leave the critical section if it is suspended by the scheduler, but a spinlock doesn't have this capability. That's why many places replace enter_critical_section with spin_lock_irqsave + sched_lock.

  1. After finding the deadlock, you guys replace spin_lock_irqsave() with spin_lock_irqsave()+sched_lock()

Where did you find that we patched sched_lock separately? We changed to spin_lock_irqsave + sched_lock at the same time.

  1. In order to simplify the code, you inside sched_lock() into spin_lock_irqsave().

No, #14578 was created three months ago to fix #9531 and the problem I mentioned before.
But since it took a long time to review this important change, we had to use spin_lock_irqsave + sched_lock before it got merged.

Now you guys find that such a change will break all ESP devices and the performance of the entire system will be degraded

That's why spin_lock and raw_spin_lock exist:

  1. In the normal case, a driver should use spin_lock, which disables scheduling at the same time
  2. In special cases, you can call raw_spin_lock to bypass locking the scheduler

But the default version should be the safe one.

  1. Look at what you are doing now,
    Prepare to replace all kernel codes from spin_lock_irqsave() to raw_spin_lock_irqsave()
    replace spin_lock_irqrestore with raw_spin_lock_irqrestore #15695

Yes, since this place was carefully identified as one where locking the scheduler isn't really needed.

In order to solve the problem of startup crash, replace all spin_lock_irqsave() with raw_spin_lock_irqsave()

This is wrong usage of a spinlock: holding the lock in one thread and unlocking it in another.

arch: use raw_spin_[un]lock to replace spin_[un]lock, fix regression b69111d16a2a330fa272af8175c832e08881844b of https://github.com/apache/nuttx/pull/14578 #15691
esp32: replace spin_lock_irqsave with raw_spin_lock_irqsave. espressif/esp-hal-3rdparty#8

I can't imagine how many board-level and chip codes will need similar adjustments in the future. How can other developers continue to trust NuttX when the kernel API is so unstable?

@hujun260 could we use a new approach to avoid touching esp32 code?

Most kernel code does not need to hold sched_lock(), which is unfair to some code that does not cause scheduling. New API semantics should be expressed with new APIs instead of modifying the behavior of the original API.

Yes, that's why we provide raw_spin_lock or spin_lock_preempt, but the default one (spin_lock) needs to hold sched_lock for safety.

  1. You can look at the spin_lock implementation in other RTOS, no one will disable preemption(zephyr)
    https://github.com/zephyrproject-rtos/zephyr/blob/c66a0bb2a9269dc483079d5cc24ba77b8c332a9a/include/zephyr/spinlock.h#L182-L214

zephyr's k_spin_lock disable irq too, but nuttx's spin_lock doesn't disable irq.

let remove spin_lock_irqsave() and move the irqsave to spin_lock() ok?

Please look at the discussion: #9531

anchao (author) commented Jan 30, 2025

No, that isn't true. The critical section is a big lock, which is a well-known SMP issue, and it should be replaced by fine-grained locks step by step:
https://en.wikipedia.org/wiki/Giant_lock
More and more IoT devices have SMP capability, so it's important to improve NuttX's SMP performance.

I agree with the changes to the critical section, but the current API changes are beyond my understanding and we are heading down the wrong path.

All known issues were identified and fixed promptly (ALL within 24 hours); do you have any new issue?

You have made too many mistakes in this change cycle. This is just one point of my review; there are many more, so I won't share too much.

#15232 (comment)

Let's emphasize that a critical section isn't equal to a spin lock, since a thread will leave the critical section if it is suspended by the scheduler, but a spinlock doesn't have this capability. That's why many places replace enter_critical_section with spin_lock_irqsave + sched_lock.
Where did you find that we patched sched_lock separately? We changed to spin_lock_irqsave + sched_lock at the same time.
No, #14578 was created three months ago to fix #9531 and the problem I mentioned before.
But since it took a long time to review this important change, we had to use spin_lock_irqsave + sched_lock before it got merged.

Let's refrain from unfounded remarks. You intended to replace wo_trace with the raw_ version to avoid the implementation of trace. I pointed out the issues in this approach within PR14943. Nevertheless, you were rather determined to merge the raw_ implementation.

#14943 (comment)

Yes, since this place is identified carefully that the lock scheduler sin't really needed.

No, let us keep the semantics of spin_lock_irqsave: do not extend it with sched_lock, and do not follow Linux, which is an old implementation.

This is the wrong usage of spinlock which hold the lock in one thread and unlock in another thread.

Who is wrong? It is very irresponsible of you to say this. There is nothing wrong with ESP using spin_lock_irqsave; it is your changes that caused the regression. Why do you say it is someone else's fault? It is unreasonable.

@hujun260 could we use a new approach to avoid touching esp32 code?

Stop introducing new APIs; just keep the original API semantics.

Please look at the discussion: #9531

The naming of the API shows what it implements, which is a perfect naming method.

spin_lock_irqsave contains two functions:

1. spin_lock

2. irqsave

Why do you have to make spin_lock_irqsave contain three functions?

1. spin_lock

2. irqsave

3. sched_lock <--- why?

anchao (author) commented Jan 30, 2025

The current API definition is perfect, and each function name identifies the implementation details of that function.

spin_lock: spin lock
spin_lock_nopreempt: spin_lock + sched_lock
spin_lock_irqsave: spin lock + irq save
spin_lock_irqsave_nopreempt: spin_lock + irq save + sched_lock

And your expected API

spin_lock: spin lock + sched_lock <--- why?
spin_lock_preempt: spin lock <--- why?

NuttX is not Linux; please do not introduce Linux's old implementation into NuttX.

xiaoxiang781216 (Contributor) commented Jan 30, 2025

The current API definition is perfect, and the function name can identify the implementation details of each function.

spin_lock: spin lock
spin_lock_nopreempt: spin_lock + sched_lock
spin_lock_irqsave: spin lock + irq save
spin_lock_irqsave_nopreempt: spin_lock + irq save + sched_lock

And your expected API

spin_lock: spin lock + sched_lock <--- why?

So you didn't read my reply carefully; let me copy the example again:

  1. thread0 on CPU0 holds a spinlock without holding the sched lock
  2. thread1 on CPU1 wakes up thread2
  3. The OS decides to suspend thread0 and resume thread2 on CPU0
  4. thread1 on CPU1 wants to take the spinlock held by thread0

But it is very hard for thread1 to know when it can finish the busy-wait loop, since thread0 is already suspended.
So it's VERY dangerous to schedule out a thread which holds a spinlock.

And the issue @patacongo mentioned before: #9531 (comment)
A spin lock can't be used standalone without the sched lock.

spin_lock_preempt: spin lock <--- why?

Same; it is unsafe and should be used with caution.

nuttx is not Linux, please do not introduce the old implementation in Linux into Nuttx

This behavior isn't Linux-specific; ALL OSes that support spin_lock must disable scheduling too, otherwise deadlocks will happen naturally. Please reread the above case again.

anchao (author) commented Jan 30, 2025

So you didn't read my reply carefully; let me copy the example again:

  1. thread0 on CPU0 holds a spinlock without holding the sched lock
  2. thread1 on CPU1 wakes up thread2
  3. The OS decides to suspend thread0 and resume thread2 on CPU0
  4. thread1 on CPU1 wants to take the spinlock held by thread0

But it is very hard for thread1 to know when it can finish the busy-wait loop, since thread0 is already suspended. So it's VERY dangerous to schedule out a thread which holds a spinlock.
And the issue @patacongo mentioned before: #9531 (comment): a spin lock can't be used standalone without the sched lock.

You should use spin_lock_nopreempt() instead of spin_lock()
I have answered this question many times, I will not answer this question again!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
You should use spin_lock_nopreempt() instead of spin_lock()
I have answered this question many times, I will not answer this question again!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

spin_lock_preempt: spin lock <--- why?

same it is unsafe, should be used with the caution.

nuttx is not Linux, please do not introduce the old implementation in Linux into Nuttx

This behavior isn't Linux specific, ALL OS support spin_lock must disable schedule too, otherwise the deadlock will happen naturally. Please reread the above case again.

anchao (author) commented Jan 30, 2025

If your code does not enable the interrupt controller and scheduler at boot time, why does spin_lock require sched_lock and irq_save?

anchao (author) commented Jan 30, 2025

I hope you read my reply carefully. Which of the following two options do you pick?

spin_lock: spin lock
spin_lock_nopreempt: spin_lock + sched_lock
spin_lock_irqsave: spin lock + irq save
spin_lock_irqsave_nopreempt: spin_lock + irq save + sched_lock

or

spin_lock: spin lock + sched_lock
spin_lock_preempt: spin_lock
spin_lock_irqsave: spin lock + irq save + sched_lock
spin_lock_irqsave_preempt: spin_lock + irq save

xiaoxiang781216 (Contributor):

So you didn't read my reply carefully; let me copy the example again:

  1. thread0 on CPU0 holds a spinlock without holding the sched lock
  2. thread1 on CPU1 wakes up thread2
  3. The OS decides to suspend thread0 and resume thread2 on CPU0
  4. thread1 on CPU1 wants to take the spinlock held by thread0

But it is very hard for thread1 to know when it can finish the busy-wait loop, since thread0 is already suspended. So it's VERY dangerous to schedule out a thread which holds a spinlock.
And the issue @patacongo mentioned before: #9531 (comment): a spin lock can't be used standalone without the sched lock.

You should use spin_lock_nopreempt() instead of spin_lock() I have answered this question many times, I will not answer this question again!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! You should use spin_lock_nopreempt() instead of spin_lock() I have answered this question many times, I will not answer this question again!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

OK, could you give me an example of when you can use the spin_lock API? An interrupt can happen outside your control, which means spin_lock CAN'T BE CALLED in any thread context.

anchao (author) commented Jan 30, 2025

The advantage of the first option is that no code needs to be changed, including the ESP third-party library, and the historical API behavior stays the same as before.

For the second option, 90% of the places in the kernel that use spin_lock_irqsave would need to be replaced with spin_lock_irqsave_preempt, and all implementations outside the nuttx repo would also need to be modified again.

anchao (author) commented Jan 30, 2025

So you didn't read my reply carefully; let me copy the example again:

  1. thread0 on CPU0 holds a spinlock without holding the sched lock
  2. thread1 on CPU1 wakes up thread2
  3. The OS decides to suspend thread0 and resume thread2 on CPU0
  4. thread1 on CPU1 wants to take the spinlock held by thread0

But it is very hard for thread1 to know when it can finish the busy-wait loop, since thread0 is already suspended. So it's VERY dangerous to schedule out a thread which holds a spinlock.
And the issue @patacongo mentioned before: #9531 (comment): a spin lock can't be used standalone without the sched lock.

You should use spin_lock_nopreempt() instead of spin_lock() I have answered this question many times, I will not answer this question again!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! You should use spin_lock_nopreempt() instead of spin_lock() I have answered this question many times, I will not answer this question again!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

OK, could you give me an example of when you can use the spin_lock API? An interrupt can happen outside your control, which means spin_lock CAN'T BE CALLED in any thread context.

If your code does not enable the interrupt controller and scheduler at boot time, why does spin_lock require sched_lock and irq_save?

During the bring-up phase, in ISRs, and when the scheduler is not enabled.

anchao (author) commented Jan 30, 2025

OK, could you give me an example of when you can use the spin_lock API? An interrupt can happen outside your control, which means spin_lock CAN'T BE CALLED in any thread context.

There are also some places where more fine-grained control can be used, for example:

irq_save();
...
if (condition)
  spin_lock();
...
if (spin_is_locked())
  spin_unlock();
...
irq_restore();
