-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sched/semaphore: Move POSIX regulated parts of semaphores into libc #11257
Conversation
Continuation of #11165 |
@pussuw besides ostest, it is important to test using the LTP suite to confirm everything is working! |
@acassis do we have the LTP testsuite readily integrated, or how do people typically run it ? I have ran some of the test cases manually but that is quite a bit of manual work. I noticed that nuttx-apps has an application called ltp, and this can be run with the sim:posix_test target. A few questions though:
|
The |
_SEM_XX need to be kept since many libxxx can be called from both kernel and userspace. |
Yes but this is exactly my question, is any API from libxxx allowed to become a cancellation point via e.g. sem_wait ? What is wrong with just calling the nxsem_ version (i.e. the system call) always from libxxx ? Now the nxsem_ APIs are all available to libc |
since the kernel caller don't want to enable cancellation or change error, but the user space caller want. @patacongo could you give more input? |
[This comments are not really relevant, but I'll keep them here for the POSIX references]. POSIX specifies exactly which APIS must be cancellation, which APIs may optionally be cancellation points are required, and all specifies that any other cancellation point function other than those is forbidden. See 2.9.5 Thread Cancellation in https://pubs.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_09.html "An implementation shall not introduce cancellation points into any other functions specified in this volume of IEEE Std 1003.1-2001." If cancellation points can be introduced in any other way, that would be a serious POSIX violation. POSIX lists those functions which must support cancellation points. It is not an option available to the application; those functions must always support cancellation points. Also lets not introduce any non-standard (i.e., non-POSIX or non-Linux) APIs into the application interface. NuttX is a standards based OS and must not support many non-standard interfaces. |
Sorry, I was not in context and didn't respond well when @xiaoxiang781216 asked me to comment. I leave my mostly irrelevant response in place any way because it does have the important POSIX requirements. There is an unresolved problem withe_SEM_XX in the FLAT build most. Since there is only one copy of libc in that configuration and it will use sem_wait() with the cancellation point in all contexts. That is wrong. However, it works perfectly in the PROTECTED and KERNEL builds where there are two versions of libc, one for applications using sem_wait() and one for the kernel using nxsem_wait(). The kernel version works well in the PROTECTED and KERNEL builds. Using sem_wait() in the libc in the FLAT build is wrong most of the time for application calls because it often creates cancellation points where they are forbidden by POSIX. Calling sem_wait() from within in the OS is always wrong since when called from within the kernel. The fact that sem_wait() works at all in the FLAT build is because of nested cancellation point support. Calling nxsem_wait() is usually correct in all build modes provided that the libc function is not relying on sem_wait() to provide the cancellation point (which would be bad, very coupled design). So I think I agree with you. Calling nxsem_wait() from libc always is the cleaner solution if possible (and also fixes the existing problems in regard to using sem_wait()). |
sem_wait() also sets the errno value. nxsem_wait() does not. If all calls to _SEM_WAIT are changed to nxsem_wait() we also need to be certain that the calling libc function does not depend on sem_wait() to set the errno value only when called from the application space. |
Thanks, the most important reason for me doing this change is to get rid of the _SEM macros, if those cannot be removed the work in this PR is futile and the whole PR should just be closed. I already changed all |
…space This moves all the public POSIX semaphore functions into libc and with this most of the user-space logic is also moved; namely cancel point and errno handling. This also removes the need for the _SEM_XX macros used to differentiate which API is used per user-/kernel mode. Such macros are henceforth unnecessary.
It is still not clear to me how we have ran LTP tests in the past ?
|
Yes, Xiaomi run LTP test internally
You can try https://github.com/apache/nuttx/blob/master/boards/sim/sim/sim/configs/posix_test/defconfig
You need run the test programs which is huge (1000+) manually now, we plan to enable LTP test in github action in the next couple month. |
Thanks for the reply, so it is not trivial to run those tests right now since I don't have any ready made test framework for this. Manually testing all of the cases is out of the question. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you include a bloaty output of the total size change of this PR.
Yes! |
Yes, please take a look at sim/sim/sim/configs/posix_test |
should we merge this patch or wait @pussuw 's size report? |
I'll provide the info on Monday @xiaoxiang781216 |
@xiaoxiang781216 I think it is a good idea to wait! NuttX is increasing very fast and I'm afraid that we cannot fit on MCUs with 32KB Flash anymore! |
@acassis wouldn't it be great if we can automate a size impact summary in github-actions. |
The size difference is almost none, tried with a few targets, pre is before this change, post is after: rv-virt:nsh64
sabre-6quad:nsh
icicle:knsh (kernel mode target with system calls)
The application image size for kernel mode targets will be a bit larger, since libc is duplicated / statically linked to each binary. @PetervdPerk-NXP was there some specific target or some specific bloaty flags you wanted me to use. The standard bloaty ouput is basically the same as using size. |
Summary
This moves the POSIX regulated parts of semaphores into libc, i.e. cancellation points and errno handling, meaning some of the work can now be done directly in user space without kernel involvement.
Impact
Architecturally quite significant, otherwise none (assuming no regressions)
Testing
rv-virt:nsh64 (ostest) / knsh64 / ksmp64 (what can be tested)