Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New send queue issue. #70

Open
TheUbMunster opened this issue Jun 18, 2024 · 18 comments
Open

New send queue issue. #70

TheUbMunster opened this issue Jun 18, 2024 · 18 comments

Comments

@TheUbMunster
Copy link

When playing, the send queue can (not always) fill up during high levels of activity (e.g., cap jumping around).

It seems that this issue showed up between release 1.3 and 1.4

Investigating cause, will try to update this thread with my suspicions.

@TheUbMunster
Copy link
Author

This may actually be an indication of a Ryujinx bug, still investigating.

@speyejack
Copy link
Collaborator

I dont remember why, but iirc Ryujinx did not handle the way the TCP/UDP setup worked as something wasn't implemented.

@Istador
Copy link
Contributor

Istador commented Jun 18, 2024

Side note:

Due to this bug I was able to reproduce and fix the memory leak in #63 caused by the full send queue leading to a crash after some time of intensive gameplay.

I was able to trigger a full send queue that way on Ryujinx, but couldn't do so on the Switch and assumed that my weak PC emulating the game was the issue of the emulator not being able to keep processing the send queue fast enough. I haven't tested older versions than 1.4.0 (yet).

The changes from 1.3.0 to 1.4.0 also caused issues for Ryujinx on Mac not being able to connect to a TCP server at all.

@TheUbMunster
Copy link
Author

I've noticed that after stopping the "high levels of activity" (cap jumping etc), it "recovers from the queue backlog" rather slowly. With 100/100 in the send queue, on my really high end PC with really good wired internet etc. it takes about 4 seconds for the send queue to fully empty. Maybe ryujinx's TCP send() function """isn't concurrent""" or just really really slow?

@TheUbMunster
Copy link
Author

Also, piplup was testing a handful of commits between 1.3.0 and 1.4.0 and saw no difference in the queue behavior, which makes me further believe that this might be a ryujinx issue. If I remember in the morning, I'll try it on an older version(s) of ryujinx.

@Istador
Copy link
Contributor

Istador commented Jun 18, 2024

I couldn't reproduce this bug on Ryujinx 1.1.1214 with mod versions v1.2.0 or v1.3.0, but only with v1.4.0. Therefore the issue was introduced somewhere in-between v1.3.0 and v1.4.0.

It's a bit hard to pinpoint the exact commit that introduced the bug.

It's definitely coming from #39 and started to appear with ba7612a. But it could have been introduced by commits earlier in that branch, because they have build errors or don't have the send queue yet, because the send queue was only introduced to the branch in-between by merging d4eabff via d666707.

(Note: the send queue commit d4eabff released with v1.2.0 and didn't had this bug.)

So IMHO this bug was likely introduced somewhere by these changes: 13c873a...ba7612a


Looking at the changes it doesn't really make sense to me why there is any change to the send queue at all.

I mean just looking at the changes in SocketClient::send() the Logger::log() call is now triggered for every packet, but that should have been fixed later with 4171390 by preventing the log for PLAYERINF and HACKCAPINF packets as before (these are the packet types most often send by high cap-jump activity). (IMHO from a clean code perspective the logging code shouldn't be mixed with the socket selection in this way.)

[Edit:] unsuprisingly reverting SocketClient::send() to the previous version didn't fix the issue.

@Istador
Copy link
Contributor

Istador commented Jun 18, 2024

My current hypothesis is that the nn::socket::Poll() call is causing the issue. Likely somehow blocking the socket_log_socket socket for the separate send queue thread.

int result = nn::socket::Poll(pfds, fd_count, this->pollTime);

Commenting out the following polling code and replacing it with a simple return recvTcp(); fixes the send queue issue (though this breaks receiving UDP packets):

const int fd_count = 2;
struct pollfd pfds[fd_count] = {{0}, {0}};
// TCP Connection
pfds[0].fd = this->socket_log_socket;
pfds[0].events = 1;
pfds[0].revents = 0;
// UDP Connection
pfds[1].fd = this->mUdpSocket;
pfds[1].events = 1;
pfds[1].revents = 0;
int result = nn::socket::Poll(pfds, fd_count, this->pollTime);
if (result == 0) {
return true;
} else if (result < 0) {
Logger::log("Error occurred when polling for packets\n");
this->socket_errno = nn::socket::GetLastErrno();
return this->tryReconnect();
}
s32 index = -1;
for (int i = 0; i < fd_count; i++){
if (pfds[i].revents & 1) {
index = i;
break;
}
}
switch (index) {
case 0:
return recvTcp();
case 1:
return recvUdp();
default:
return true;
}


There is a difference between switch and emulators regarding the pollTime parameter used in the call.

#if EMU
this->pollTime = 0;
#else
this->pollTime = -1;
#endif

(Setting it to -1 for emulators prevents Ryujinx from sending anything. I assume 0 means wait and block till you receive something. )

@Istador
Copy link
Contributor

Istador commented Jun 18, 2024

Can we maybe get rid of that combined polling by splitting up the receive thread into two separate threads, one for TCP and one for UDP?

@speyejack
Copy link
Collaborator

My current hypothesis is that the nn::socket::Poll() call is causing the issue. Likely somehow blocking the socket_log_socket socket for the separate send queue thread.

I also remember poll being involved in this issue. socket_log_socket being shared is likely why this->pollTime = -1 only causes packets to be processed when both sending and receiving packets.

(Setting it to -1 for emulators prevents Ryujinx from sending anything. I assume 0 means wait and block till you receive something. )

Actually -1 means wait and block until something happens. While 0 means return immediately and this was where the bug was last time with Ryujinx, it had an incredibly long return time when set to 0. There was an experiemental build that seemed to fix it when I looked into this last, however, I dont know why people would still be affected then. I have since lost track of that issue, but last time Shadow was super helpful in finding the build and filing the bug.

Can we maybe get rid of that combined polling by splitting up the receive thread into two separate threads, one for TCP and one for UDP?

We already have a lot of issues with multithreading surrounding the networking code, so if we can avoid more threads we should. I also dont know if it would fix anything considering it would still be trying to use the two sockets in multiple places and might cause similar blocking.

@TheUbMunster
Copy link
Author

Since polltime = 0 on emu is meant to make it non-blocking, and yet

I've noticed that after stopping the "high levels of activity" (cap jumping etc), it "recovers from the queue backlog" rather slowly. With 100/100 in the send queue, on my really high end PC with really good wired internet etc. it takes about 4 seconds for the send queue to fully empty. Maybe ryujinx's TCP send() function """isn't concurrent""" or just really really slow?

The observed "4 second" behavior is really egregious if true.

Is it possible that ryujinx's non-blocking send is ridiculously slow regardless? I wonder if the behavior changes any noticeable amount if set back to -1.

@speyejack
Copy link
Collaborator

Since polltime = 0 on emu is meant to make it non-blocking, and yet ... The observed "4 second" behavior is really egregious if true.

Yes, that is what seems to be the bug, requesting an immediate timeout seems to cause still have very long delays before returning.

Is it possible that ryujinx's non-blocking send is ridiculously slow regardless?

Potentially, but from Istador's testing, it does seem to be the call to poll which matches up with what I remember about this bug. The send still might be the problem and you could do some investigating to check it out.

I wonder if the behavior changes any noticeable amount if set back to -1.

Istador kindly retested the -1 and it seems to still require a 0:

(Setting it to -1 for emulators prevents Ryujinx from sending anything. I assume 0 means wait and block till you receive something. )

If it still behaves like it previously did, whenever the client sees other players move, the client will start sending and receiving packets. However, as soon as the other players stop, the packets stop.

@Istador
Copy link
Contributor

Istador commented Jun 19, 2024

I wonder if the behavior changes any noticeable amount if set back to -1.

Istador kindly retested the -1 and it seems to still require a 0

Note: I tested with an older Ryujinx version 1.1.1214 (from 2024-02-24) that I can't update, so it might be worth checking with a newer version.

@piplup55
Copy link
Contributor

UDP support is very buggy on ryujinx, i found connecting a switch to the server gets ryujinx to use UDP,
i've already sent all of the required info to a ryujinx member, i'm still unsure why TCP just fills the queue it might be a good idea to test the mod on ldn 3.1.3 since it's based on around 1.1.800

@Istador
Copy link
Contributor

Istador commented Jun 20, 2024

Shadow was super helpful in finding the build and filing the bug.

i've already sent all of the required info to a ryujinx member

Are there any issues that can be checked or followed?

@Istador
Copy link
Contributor

Istador commented Jun 20, 2024

I wonder if the behavior changes any noticeable amount if set back to -1.

Istador kindly retested the -1 and it seems to still require a 0

Note: I tested with an older Ryujinx version 1.1.1214 (from 2024-02-24) that I can't update, so it might be worth checking with a newer version.

Here is a build if someone wants to test -1 for Emulators with a newer Ryujinx version: https://github.com/Istador/SuperMarioOdysseyOnline/actions/runs/9592741445

@TheUbMunster
Copy link
Author

Here is a build if someone wants to test -1 for Emulators with a newer Ryujinx version: https://github.com/Istador/SuperMarioOdysseyOnline/actions/runs/9592741445

Just tested on my machine with Ryujinx 1.1.1336 and the issue is still present, since the packet send queue stuff was once related to "whether or not another player is moving on the server", I'll note that I was the only one online.

@speyejack
Copy link
Collaborator

Just tested on my machine with Ryujinx 1.1.1336 and the issue is still present, since the packet send queue stuff was once related to "whether or not another player is moving on the server", I'll note that I was the only one online.

Could you also test when its set to 0?

Are there any issues that can be checked or followed?

Shadow linked this PR on discord . But according to my observations at the time, it didn't fix it, it only helped. Plus it was the send rather than the poll. So, this might be PR might be unrelated.... Hopefully someone finds a solution with Piplup's reports.

@Istador
Copy link
Contributor

Istador commented Jun 20, 2024

The changes from 1.3.0 to 1.4.0 also caused issues for Ryujinx on Mac not being able to connect to a TCP server at all.

I believe that is also related to the nn::socket::Poll() call. I can't really reproduce it, because I don't have a Mac system, but if I change the code to set the result to 0 always, then I can observe the same behaviour that was reported by Mac users (Debug menu shows Socket Connected but Connecting to Server. is always visible). (Setting it to -1 instead leads to a reconnect loop and a game crash after a few seconds.)

int result = nn::socket::Poll(pfds, fd_count, this->pollTime);
if (result == 0) {
return true;
} else if (result < 0) {
Logger::log("Error occurred when polling for packets\n");
this->socket_errno = nn::socket::GetLastErrno();
return this->tryReconnect();
}

Ryujinx_mac_v1 4 0_kyanthemayan

Sceenshot by kyanthemayan on Discord.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants