-
-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(radio): async call handling to prevent GX12 switch lockups #5949
Conversation
nb: the underlying issue is not RM radio related, we have failure xTimerPendFunctionCallFromISR() in all radio, but in place where it is less obvious to see, like per10ms or telemetry |
Add anti stacking protection where needed |
Spent some time this morning working with Raph on this issue. It turns out, as we hinted earlier, down to some priority issues, that unfortunately wont have any easy fix. We will introduce some improvements as part of 3.0. However, we discovered a point: the amount of files in directory does critically matter. In any fopen or fcreate operation, the system has to parse some or all of the fat tables, which on this architecture, takes a LOT of time. It is very important for system integrity to keep the number of files in directory to as low as possible, preferably less than a hundred. It is a real issue for SDlogs function which as been changed since Otx days and now generate a file per invocation, leading to huge number of files in LOGS directory and very severely impacting performance. Also, it should be noted that audio packs that number of tracks does also have a negative impact |
On another project I worked on formatting the SD as exFat gave a big performance boost. There is an option to enable exFat in fatfs. I wonder if this might help. |
Worth a try at the very least, we also need to understand the memory impact, but yes, we should investigate |
exFat is patented by Microsoft. Only the Linux Kernel is licensed for free. Every other use of exFat needs a payed MS license to be legal. |
Ok, I guess that rules it out then ! |
Yeah, we're counting down the days to March 2027...
https://patents.google.com/patent/US20090164440A1/en
…On Mon, 3 Mar 2025, 6:06 am 3djc, ***@***.***> wrote:
Ok, I guess that rules it out then !
—
Reply to this email directly, view it on GitHub
<#5949 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABJ66KJJRIHQCDKCRBEL7ID2SNQEBAVCNFSM6AAAAABX455JBCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOJSHA4DGOBRHA>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
[image: 3djc]*3djc* left a comment (EdgeTX/edgetx#5949)
<#5949 (comment)>
Ok, I guess that rules it out then !
—
Reply to this email directly, view it on GitHub
<#5949 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABJ66KJJRIHQCDKCRBEL7ID2SNQEBAVCNFSM6AAAAABX455JBCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDMOJSHA4DGOBRHA>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
So to summarize: This is for 2.11 and main. We keep ensuring ISR are kept extra short, but we are also ensuring the queue does not get filled by duplicated calls. In case long SD event occur, the BSP (and telem and per10ms) call will be certain to have a stacking slot, and those operation will be done when SD unlock. There will be 2 follow up PR (but only for 3.0) to move sdlogs to UI tack and review per10ms launch |
As a follow up, RM has conducted lab testing on this and found the expander switch issue to be completely fixed with it. They will conduct a larger test on production lines, but feel confident the issues is gone |
Does this PR also apply in general to transmitters with internal memory? I have problems with T-Pro v2 and Bumblebbe that lock up, and the cause seems to be high memory usage when doing "a lot" of telemetry... Lockups fortunately do not affect the RC signal, so you do not lose contact with the model, but nothing else works, and you have to remove the battery to turn off and reset the transmitter! |
No. And doing a lot of telemetry doesn't consume memory. Lua does however. |
Yes, when logging telemetry! I fly RC gliders and log vario/altitude, GPS and all ELRS telemetry with 0.3 sec intervals so as not to miss anything! I apply this to all my transmitters, and have problems only with those with internal memory! So is there a problem with this type of memory!? |
Memory is not used when doing that, SD is (could be internal SD chip or SDcard depending on radio) |
I'm flying gliders with a T15/ELRS (no SD card, internal memory) logging ELRS, GPS, Vario and ESC telemetry at 0.1s log period for nearly a year now. No lock ups, in fact no issues at all). Are you running any Lua scripts (lua widgets, function or mixer scripts)? |
The only "active" LUAs used are ELRS and GPSx9L, as I wrote, I have no problems with transmitters with SD cards, only those with internal memory. |
Not sure what is GPSx9L lua, but the one I found seems to have some kind of issues https://github.com/moschotto/OpenTX_GPS_Telemetry/blob/7282328f2fdf0f37f782cf187ad0725a0a162cfb/x7_x9lite/GPSx9L.lua#L87 trying to open and close file every 100ms |
First, I would like to thanks Radiomaster devs for their investigation and contribution to bring to surface this issue. Great work guys.
Bottom line is that we have an issue with async call handling based on xTimerPendFunctionCallFromISR(). We need some time to understand why we have that issue.
In the meantime, this workaround allows the switch to be read even if the call to xTimerPendFunctionCallFromISR() was not successful.
Another part of EdgeTX relying heavily on xTimerPendFunctionCallFromISR() is telemetry handling, but the protocol is handling it so this is not critical, so the workaround is not applied there.