Yamcs Studio 1.5.9 crashing by itself after time #113

Open
nmaas87 opened this issue Nov 2, 2021 · 16 comments

Comments

@nmaas87

nmaas87 commented Nov 2, 2021

Hi,
I have an instance of Yamcs Studio 1.5.9 running on a current Lubuntu LTS (20.04) VM.
As this instance is for remote monitoring and is kept running for days, it is rarely touched; it just stays connected to the server to show telemetry. I have now realized that it crashes a lot, sometimes after hours, sometimes after a day. I cannot say exactly how long it runs before this occurs, but without any user interaction it will at some point just crash and close by itself.
I tried running it from the command line, but it crashes there as well, without leaving any debug messages after the initial connection ones.
How can I debug this problem, i.e. enable debug messages and find out what causes the issue?

@nmaas87
Author

nmaas87 commented Nov 4, 2021

It looks like this problem is more frequent if you put the runner on 90% zoom.

@nmaas87
Author

nmaas87 commented Nov 4, 2021

OK, correction: it crashes regardless of zoom level. It crashed now after 5 hours at normal (100%) zoom.
And I get no debugging info whatsoever.
Could you give me some info on how to debug this?
I guess a hard crash of an OPS Tool during operations would be bad.

@nmaas87
Author

nmaas87 commented Nov 15, 2021

Any info would be good, this is still an issue, also with older versions.

@fqqb
Member

fqqb commented Nov 15, 2021

Could be memory-related.
You could use VisualVM or similar while the application is running, to see how it behaves.

Or add the following in your ini file (beneath -vmargs, one argument per line) and then analyze the dump afterwards.

-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/tmp/

@nmaas87
Author

nmaas87 commented Nov 16, 2021

Thanks Fabian!
I changed Yamcs Studio.ini (in ~/yamcs-studio-1.5.9 on the yamcs-display machine) to the following:

-startup
plugins/org.eclipse.equinox.launcher_1.6.0.v20200915-1508.jar
--launcher.library
plugins/org.eclipse.equinox.launcher.gtk.linux.x86_64_1.2.0.v20200915-1442
-vm
plugins/org.eclipse.justj.openjdk.hotspot.jre.full.stripped.linux.x86_64_11.0.2.v20200815-0835/jre/bin
-vmargs
-Xmx2048m
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/tmp/
-Declipse.p2.unsignedPolicy=allow
-Duser.timezone=GMT
-Dosgi.requiredJavaVersion=11
-Dorg.eclipse.update.reconcile=false
--add-modules=ALL-SYSTEM

I will let it run (and catch fire) and see when it crashes.
I guess it will create a heap dump in /tmp/ on crash, and then I can send it to you. Or how can I analyze it myself?

Thanks a lot!

@nmaas87
Author

nmaas87 commented Nov 17, 2021

Dear @fqqb - you were spot on!
Yamcs Studio seems to suffer from a Memory Leak.
I could see it grow within 8 hours from 707,85 MB to 1367,61 MB of reserved RAM, and it was then killed by the system, which was severely running out of RAM.

Before Yamcs Studio got closed:

              total        used        free      shared  buff/cache   available
Mem:          1,9Gi       1,8Gi        51Mi        28Mi        70Mi        17Mi

After it got closed:

              total        used        free      shared  buff/cache   available
Mem:          1,9Gi       522Mi       1,3Gi        33Mi       162Mi       1,3Gi

For the moment, to safeguard the mission, which will need Yamcs Studio to run in excess of 10 hours, I increased the VM's RAM to 6 GB and am letting it run to see where we end up (but I have to keep in mind that it could crash hours later when it hits the 2048 MB limit, so I have to see whether I need to increase that as well).

I hope this error gets resolved. Sadly, Yamcs Studio/Java is killed so hard by the OS that it does not write anything to /tmp, so no debug information is available.

@nmaas87
Author

nmaas87 commented Nov 17, 2021

Additional note: within the first hours (when there was more than enough RAM available), Yamcs Studio was consuming RAM at a rate of 1.425 MB per minute without receiving any TM data.
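
Since the OS kills the process before the JVM can write anything, one way to record this kind of growth is to sample the resident set size from outside the process. The following is only a minimal sketch, assuming a Linux `/proc` filesystem; the function names `parse_vmrss_kib` and `log_rss` are made up for illustration and are not part of Yamcs Studio.

```python
import re
import time
from pathlib import Path


def parse_vmrss_kib(status_text: str) -> int:
    """Return the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS line not found in status text")
    return int(match.group(1))


def log_rss(pid: int, interval_s: float = 60.0, samples: int = 10) -> None:
    """Print a timestamped RSS reading for `pid` every `interval_s` seconds."""
    status_file = Path(f"/proc/{pid}/status")
    for _ in range(samples):
        rss_kib = parse_vmrss_kib(status_file.read_text())
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} pid={pid} rss_kib={rss_kib}")
        time.sleep(interval_s)
```

Calling `log_rss(<PID>)` with the PID of the Yamcs Studio `java` process and redirecting the output to a file leaves a trace that survives the crash.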

@xpromache
Member

Can you run
jmap -histo:live <PID> while memory is growing, to check whether there are specific classes whose number of objects is increasing rapidly?

Do you have scripts in displays? I imagine some of those could cause this problem.
Maybe you can try closing all displays and opening one by one to see which one is consuming memory.
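
To spot such classes, two histograms taken some time apart can be compared. The sketch below assumes the usual `jmap -histo` row layout (`num: #instances #bytes class name`); `parse_histo` and `fastest_growing` are hypothetical helper names, and classes with the same name in different modules are not distinguished.

```python
import re

# Matches data rows such as "   1:   123456   9876544  [B (java.base@11.0.2)"
# and skips the header, separator and "Total" lines.
ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")


def parse_histo(text: str) -> dict:
    """Map class name -> (instances, bytes) from `jmap -histo` output."""
    result = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            instances, size, name = match.groups()
            result[name] = (int(instances), int(size))
    return result


def fastest_growing(before: str, after: str, top: int = 10) -> list:
    """Return (class, instance delta, byte delta), largest instance growth first."""
    old, new = parse_histo(before), parse_histo(after)
    deltas = []
    for name, (instances, size) in new.items():
        old_instances, old_size = old.get(name, (0, 0))
        deltas.append((name, instances - old_instances, size - old_size))
    deltas.sort(key=lambda item: item[1], reverse=True)
    return deltas[:top]
```

Feeding it the saved output of two runs, for example one hour apart, lists the classes whose instance counts grew the most in between.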

@fqqb
Member

fqqb commented Nov 17, 2021

Wait, your system has only 2 GB available?
As per the Xmx setting in the ini file, Yamcs Studio's heap is allowed to grow to 2 GB. Either lower that, or indeed make more memory available to your system.
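
A quick way to check this relationship is to compare the `-Xmx` line of the ini file with the machine's physical memory. This is a rough sketch only: the helper names are invented, `/proc/meminfo` is Linux-specific, and the 50% headroom figure is an arbitrary assumption, not a Yamcs Studio recommendation.

```python
import re

UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def xmx_bytes(ini_text: str):
    """Return the last -Xmx value in an Eclipse-style .ini in bytes, or None."""
    found = re.findall(r"^-Xmx(\d+)([kKmMgG]?)\s*$", ini_text, re.MULTILINE)
    if not found:
        return None
    number, unit = found[-1]
    return int(number) * UNITS[unit.lower()]


def mem_total_bytes(meminfo_text: str) -> int:
    """Return MemTotal from the text of /proc/meminfo, in bytes."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1)) * 1024


def heap_fits(ini_text: str, meminfo_text: str, headroom: float = 0.5) -> bool:
    """True if -Xmx is at most `headroom` (assumed fraction) of physical RAM."""
    limit = xmx_bytes(ini_text)
    if limit is None:
        return True  # no explicit -Xmx line, nothing to compare
    return limit <= headroom * mem_total_bytes(meminfo_text)
```

With `-Xmx2048m` on a VM with about 2 GB of RAM the check fails, which matches the first crash reported here; the heap limit alone already equals all of physical memory.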

@nmaas87
Author

nmaas87 commented Nov 17, 2021

Yeah @fqqb, I thought Yamcs was normally configured for the Java default (512 MB), and only noticed the 2G line when I added the debugging options. I am currently trying this with 6 GB of RAM, just to be sure, to see how it goes and whether (and when) it crashes again. I will not have more time for debugging because the mission is coming close, and "good enough" works if I can keep it running for long enough. I am using only one display, but I am using JavaScript and Python scripts in it.

@nmaas87
Author

nmaas87 commented Nov 18, 2021

Short headsup:
I started the run yesterday at 08:10 with 6 GB of RAM
@ 13:24,

MiB Mem :   5938,4 total,   3399,6 free,   1587,0 used,    951,8 buff/cache
MiB Swap:      0,0 total,      0,0 free,      0,0 used.   4098,7 avail Mem
   1261 user      20   0  906956 142684  41532 R  85,4   2,3 267:18.09 ffmpeg
   1329 user      20   0 6512104   1,1g  69480 S   3,3  18,6  38:39.65 java

today @ 08:41 / 24 hours later

MiB Mem :   5938,4 total,   1752,9 free,   3168,5 used,   1017,0 buff/cache
MiB Swap:      0,0 total,      0,0 free,      0,0 used.   2512,2 avail Mem
   1261 user      20   0  906956 143168  41760 S  86,7   2,4   1282:22 ffmpeg
   1329 user      20   0 8125932   2,6g  69552 S  27,6  45,3 291:36.02 java

So Yamcs has already risen above the 2048 MB limit in terms of its reserved memory. I will still let it run and see if and when it crashes (I think I really need to set up by Saturday at the latest for real testing of the overall ground segment; until then it can keep running and gathering data). 24+ hours is OK for my use case, but I think this data indicates that there is a memory leak.

@nmaas87
Author

nmaas87 commented Nov 20, 2021

OK, in the end it filled up all memory again (6 GB of RAM) and then hard-crashed, so sadly the cause is not just too little RAM. Here is some memory logging:


new try with 6 GB RAM
2021-11-17 Wednesday 08:10

2021-11-17 13:24:11
MiB Mem :   5938,4 total,   3399,6 free,   1587,0 used,    951,8 buff/cache
MiB Swap:      0,0 total,      0,0 free,      0,0 used.   4098,7 avail Mem
   1261 user      20   0  906956 142684  41532 R  85,4   2,3 267:18.09 ffmpeg
   1329 user      20   0 6512104   1,1g  69480 S   3,3  18,6  38:39.65 java

2021-11-18 Thursday 08:41:13
MiB Mem :   5938,4 total,   1752,9 free,   3168,5 used,   1017,0 buff/cache
MiB Swap:      0,0 total,      0,0 free,      0,0 used.   2512,2 avail Mem
   1261 user      20   0  906956 143168  41760 S  86,7   2,4   1282:22 ffmpeg
   1329 user      20   0 8125932   2,6g  69552 S  27,6  45,3 291:36.02 java

2021-11-18 16:31:02
MiB Mem :   5938,4 total,   1090,1 free,   3823,7 used,   1024,7 buff/cache
MiB Swap:      0,0 total,      0,0 free,      0,0 used.   1856,8 avail Mem
   1261 user      20   0  906956 143168  41760 R  86,4   2,4   1688:27 ffmpeg
   1329 user      20   0 8822252   3,3g  69552 S  31,2  56,2 432:36.00 java


2021-11-19 Friday 17:00:00
              total        used        free      shared  buff/cache   available
Mem:          5,8Gi       5,6Gi       110Mi        20Mi       106Mi        27Mi
Swap:            0B          0B          0B
user        1329 23.1 87.3 10755564 5312172 ?    Sl   Nov17 788:29 /home/user/yamcs-studio-1.5.9//plugins/org.eclipse.justj.openjdk.hotspot.jre.full.stripped.linux.x86_64_11.0.2.v20200815-0835/jre/bin/java -Xmx2048m -Declipse.p2.unsignedPolicy=allow -Duser.timezone=GMT -Dosgi.requiredJavaVersion=11 -Dorg.eclipse.update.reconcile=false --add-modules=ALL-SYSTEM -jar /home/user/yamcs-studio-1.5.9//plugins/org.eclipse.equinox.launcher_1.6.0.v20200915-1508.jar -os linux -ws gtk -arch x86_64 -showsplash -launcher /home/user/yamcs-studio-1.5.9/Yamcs Studio -name Yamcs Studio --launcher.library /home/user/yamcs-studio-1.5.9//plugins/org.eclipse.equinox.launcher.gtk.linux.x86_64_1.2.0.v20200915-1442/eclipse_11201.so -startup /home/user/yamcs-studio-1.5.9//plugins/org.eclipse.equinox.launcher_1.6.0.v20200915-1508.jar --launcher.overrideVmargs -exitdata 13 -vm /home/user/yamcs-studio-1.5.9//plugins/org.eclipse.justj.openjdk.hotspot.jre.full.stripped.linux.x86_64_11.0.2.v20200815-0835/jre/bin/java -vmargs -Xmx2048m -Declipse.p2.unsignedPolicy=allow -Duser.timezone=GMT -Dosgi.requiredJavaVersion=11 -Dorg.eclipse.update.reconcile=false --add-modules=ALL-SYSTEM -jar /home/user/yamcs-studio-1.5.9//plugins/org.eclipse.equinox.launcher_1.6.0.v20200915-1508.jar
user        1261 85.2  1.8 906956 112944 pts/0   RLl+ Nov17 2906:56 ffmpeg

Fr 19. Nov 17:10:02 CET 2021
              total        used        free      shared  buff/cache   available
Mem:          5,8Gi       5,6Gi       116Mi        20Mi        88Mi        24Mi
Swap:            0B          0B          0B
user        1329 23.1 87.5 10755564 5320868 ?    Sl   Nov17 791:24 /home/user/yamcs-studio-1.5.9//plugins/org.eclipse.justj.openjdk.hotspot.jre.full.stripped.linux.x86_64_11.0.2.v20200815-0835/jre/bin/java -Xmx2048m -Declipse.p2.unsignedPolicy=allow -Duser.timezone=GMT -Dosgi.requiredJavaVersion=11 -Dorg.eclipse.update.reconcile=false --add-modules=ALL-SYSTEM -jar /home/user/yamcs-studio-1.5.9//plugins/org.eclipse.equinox.launcher_1.6.0.v20200915-1508.jar -os linux -ws gtk -arch x86_64 -showsplash -launcher /home/user/yamcs-studio-1.5.9/Yamcs Studio -name Yamcs Studio --launcher.library /home/user/yamcs-studio-1.5.9//plugins/org.eclipse.equinox.launcher.gtk.linux.x86_64_1.2.0.v20200915-1442/eclipse_11201.so -startup /home/user/yamcs-studio-1.5.9//plugins/org.eclipse.equinox.launcher_1.6.0.v20200915-1508.jar --launcher.overrideVmargs -exitdata 13 -vm /home/user/yamcs-studio-1.5.9//plugins/org.eclipse.justj.openjdk.hotspot.jre.full.stripped.linux.x86_64_11.0.2.v20200815-0835/jre/bin/java -vmargs -Xmx2048m -Declipse.p2.unsignedPolicy=allow -Duser.timezone=GMT -Dosgi.requiredJavaVersion=11 -Dorg.eclipse.update.reconcile=false --add-modules=ALL-SYSTEM -jar /home/user/yamcs-studio-1.5.9//plugins/org.eclipse.equinox.launcher_1.6.0.v20200915-1508.jar
user        1261 85.2  1.8 906956 112572 pts/0   SLl+ Nov17 2915:07 ffmpeg
 
Fr 19. Nov 17:22:18 CET 2021
              total        used        free      shared  buff/cache   available
Mem:          5,8Gi       581Mi       5,1Gi        30Mi       129Mi       5,0Gi
Swap:            0B          0B          0B
user        1261 85.0  2.2 906956 136084 pts/0   RLl+ Nov17 2918:11 ffmpeg

I guess for my use case it's OK at the moment, and I will just need to restart Yamcs Studio before going into each dry run, test countdown and the real one, to avoid losing the MCS during flight.
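
The growth rate implied by the log above can be estimated with a straight-line fit through the resident memory of the `java` process. This is a sketch under simplifying assumptions: `1 g` in the `top` output is taken as 1024 MiB, the values are rounded as posted, and `growth_mib_per_hour` is an invented helper name.

```python
from datetime import datetime


def growth_mib_per_hour(samples: list) -> float:
    """Least-squares slope of (timestamp, MiB) samples, in MiB per hour."""
    start = samples[0][0]
    xs = [(stamp - start).total_seconds() / 3600 for stamp, _ in samples]
    ys = [mib for _, mib in samples]
    n = len(samples)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# RES/RSS of the java process (PID 1329) as posted in the log above
SAMPLES = [
    (datetime(2021, 11, 17, 13, 24, 11), 1.1 * 1024),
    (datetime(2021, 11, 18, 8, 41, 13), 2.6 * 1024),
    (datetime(2021, 11, 18, 16, 31, 2), 3.3 * 1024),
    (datetime(2021, 11, 19, 17, 0, 0), 5312172 / 1024),
]

rate = growth_mib_per_hour(SAMPLES)
print(f"{rate:.1f} MiB/h ({rate / 60:.2f} MiB/min)")
```

The fit comes out at roughly 79 MiB per hour, or about 1.3 MiB per minute, which is close to the 1.425 MB per minute observed on the 2 GB VM and suggests that the growth is steady rather than tied to a particular event.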

@Spaceless007

Hi,

I'm having a similar issue with Yamcs Studio. Was any work done towards the goal of dealing with memory leaks?

@fqqb

@fqqb
Member

fqqb commented Nov 17, 2022

@Spaceless007 I would like a heap dump so that I can investigate what is occupying memory. Do you have jmap available? That's a tool that comes with any Java JDK. When you notice things are about to go wrong, but before Studio crashes, take a dump referencing the PID of Yamcs Studio:

jmap -dump:format=b,file=heap_dump.hprof <PID>

And please share that hprof file (or in private: fdi AT spaceapplications.com )

In the absence of a dump, I'll do some tests myself next week in an attempt to reproduce any issue (I'm not currently aware of any).
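
Because the crash tends to happen unattended, the dump could be triggered automatically once the process passes a memory threshold. The following is a hedged sketch: the threshold and the function names are assumptions, `jmap` must be on the `PATH` (it ships with a JDK), and the RSS value is expected to come from whatever monitoring is already in place.

```python
import subprocess


def jmap_dump_command(pid: int, out_file: str = "heap_dump.hprof") -> list:
    """Build the jmap invocation quoted above for the given PID."""
    return ["jmap", f"-dump:format=b,file={out_file}", str(pid)]


def dump_if_above(pid: int, rss_kib: int, limit_kib: int,
                  out_file: str = "heap_dump.hprof") -> bool:
    """Run jmap once RSS reaches `limit_kib`; return True if a dump was taken."""
    if rss_kib < limit_kib:
        return False
    # check=True raises CalledProcessError if jmap exits with an error
    subprocess.run(jmap_dump_command(pid, out_file), check=True)
    return True
```

Choosing the limit well below the point where the OS starts killing processes leaves jmap enough memory and time to write the file.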

@unlikelyzero

We're seeing this ourselves when using a large MDB. Is the expectation that this can be fixed with changes on the end user's side, or is it seen as a memory management problem that can be fixed in Yamcs Studio?

@fqqb
Member

fqqb commented Jun 28, 2023

@unlikelyzero are you able to isolate a cause on your side that eventually causes a crash? I somehow doubt a large MDB is the reason. Maybe a specific platform, display, or widget :-/

Of course I'd like to fix any problems of this kind. However, I ran Yamcs Studio for weeks on end connected to a data source, and all was working well... I was also given some heap dumps from just before a crash, and those were healthy too, suggesting memory is consumed off-heap for whatever reason.
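
The numbers posted earlier in this thread are consistent with off-heap consumption: the resident size of the process ended up far above the configured heap limit. A small worked calculation, using the `ps` output from 2021-11-19 17:00 and the `-Xmx2048m` setting (the function name is made up for illustration):

```python
def off_heap_lower_bound_kib(rss_kib: int, xmx_mib: int) -> int:
    """Smallest amount of resident memory that cannot be Java heap, in KiB."""
    return max(0, rss_kib - xmx_mib * 1024)


rss_kib = 5312172  # RSS column of the java process shortly before it was killed
xmx_mib = 2048     # from -Xmx2048m in the ini file

extra = off_heap_lower_bound_kib(rss_kib, xmx_mib)
print(f"at least {extra} KiB ({extra / 1024 ** 2:.2f} GiB) outside the Java heap")
```

Even if the heap had been completely full, about 3 GiB of resident memory must have been something else (native allocations, thread stacks, metaspace and so on). Starting the JVM with `-XX:NativeMemoryTracking=summary` and querying it with `jcmd <PID> VM.native_memory summary` may help narrow down which category grows, although allocations made by native libraries outside the JVM would not show up there.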
