Low Throughput Issue with unbound DNS over TLS on Ubuntu 22.04 #1045

Closed
Ji-Peng opened this issue Apr 13, 2024 · 15 comments · Fixed by #1214

Ji-Peng commented Apr 13, 2024

I'd like to report a rather peculiar issue that I encountered approximately a year ago. It seems that the reviewers during the evaluation of my paper artifact also stumbled upon this problem. Therefore, I'm curious to understand the root cause of this unusual issue.

I noticed that when testing unbound DNS over TLS (DoT) throughput on Ubuntu 20.04, the throughput was approximately 1000 connections/second. However, when testing the DoT throughput on Ubuntu 22.04, the throughput was unusually low, averaging only around 20 connections/second.

I utilized the configuration file available at: config_file. Within the same directory, you can find the relevant certificate files, ed25519.key and ed25519.pem. In the configuration file, I enabled data statistics using "statistics-interval: 60" to monitor throughput-related data.

The client I used is available at: DoT Timer. Its behavior is straightforward: within a timing loop, it establishes a TLS connection, queries a domain name, and then closes the connection.
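For readers without access to the linked client, the behavior described above can be sketched roughly as follows. This is a minimal Python illustration, not the DoT Timer code itself: the helper names (`build_query`, `one_connection`, `connections_per_second`, `main`), the address `127.0.0.1`, and the query name `example.com` are all assumptions made for the example. Port 853 is the standard DoT port, and the two-byte length prefix is the usual framing for DNS over TCP and TLS.

```python
import socket
import ssl
import struct
import time


def build_query(name: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query with the 2-byte length prefix used over TCP/TLS."""
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    message = header + qname + struct.pack("!HH", qtype, 1)  # class IN
    return struct.pack("!H", len(message)) + message


def recv_exact(sock, n: int) -> bytes:
    """Read exactly n bytes or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf += chunk
    return buf


def one_connection(host: str, port: int, query: bytes, ctx: ssl.SSLContext) -> bytes:
    """Open a TLS connection, send one query, read one reply, close."""
    with socket.create_connection((host, port), timeout=5) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(query)
            (length,) = struct.unpack("!H", recv_exact(tls, 2))
            return recv_exact(tls, length)


def connections_per_second(do_one, duration: float) -> float:
    """Call do_one() repeatedly for `duration` seconds and return the rate."""
    count = 0
    start = time.monotonic()
    while time.monotonic() - start < duration:
        do_one()
        count += 1
    return count / (time.monotonic() - start)


def main(host: str = "127.0.0.1", port: int = 853) -> None:
    # Assumption: a test server with a self-signed certificate, so
    # certificate verification is disabled. Do not do this in production.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    query = build_query("example.com")
    rate = connections_per_second(
        lambda: one_connection(host, port, query, ctx), duration=10.0
    )
    print(f"{rate:.1f} connections/second")
```

Calling `main()` against a running DoT server prints the measured rate. Because every iteration pays for a full TCP and TLS handshake, any fixed per-connection delay shows up directly in the connections/second figure.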

Both the client and server communicate within an internal network, and my client implementation is very simple. Thus, I suspect the issue does not stem from the client.

I obtained the latest code directly from the master branch of unbound.

I apologize for not being able to provide a readily available machine to help replicate the issue, as this was a problem I encountered a year ago.

Another clue I can offer is that I suspect the issue is not caused by the computational overhead of the public key cryptography algorithms within OpenSSL. I experimented with using different OpenSSL Engines to accelerate these algorithms and found that regardless of the ENGINE configuration, the throughput remained strangely low.

My current workaround is to use Ubuntu 20.04 instead of 22.04.

I understand that the information provided may not be sufficient to replicate the problem. If you need further details, please let me know, and I'd be happy to assist in reproducing the issue.

@gthess gthess self-assigned this Apr 15, 2024
Member

gthess commented Apr 15, 2024

If I understand correctly you are using the exact same version of Unbound built from source on two different systems with different results. Could you share your configure command? If you have a running Unbound at the moment, unbound -V would also include it.

Author

Ji-Peng commented Apr 15, 2024

Thank you for your response. Upon reviewing my previous records, I found that the configuration command I used was ./configure --with-ssl=/root/local-eng25519, specifying only my custom OpenSSL directory.

I happen to have an Ubuntu 22.04 machine available now, so I will attempt to replicate this issue.

Author

Ji-Peng commented Apr 15, 2024

I have now fully replicated the issue. Please let me know what information you need from me.

Author

Ji-Peng commented Apr 15, 2024

I've documented nearly all commands and their corresponding outputs when replicating this issue. Please take a look at the debug_unbound. It contains ample information.

If the above details are still insufficient, please let me know what additional information you need.

Member

gthess commented Apr 15, 2024

Thanks for this! I can already spot something that my coworker also thought would be an issue:

Linked libs: mini-event internal (it uses select), OpenSSL 1.1.1q  5 Jul 2022

Is mini-event also used on a 20.04 system?

Author

Ji-Peng commented Apr 15, 2024

Yes, all configurations are the same on Ubuntu 20.04.

Member

gthess commented Apr 15, 2024

I wasn't clear enough or your answer is not clear to me :)
I meant to ask if you get the same output for the unbound -V command on a 20.04 system.

Author

Ji-Peng commented Apr 15, 2024

Sorry, I don't have an Ubuntu 20.04 system to run unbound on right now. I just remember that the procedure, including all the commands, was exactly the same on the Ubuntu 20.04 system.

Member

gthess commented Apr 15, 2024

No worries, I'll try to replicate on 20.04 and come back to this.

Author

Ji-Peng commented Apr 15, 2024

It's late at night in China now, so tomorrow I will link unbound against libevent and rerun it on my 22.04 system.

Author

Ji-Peng commented Apr 16, 2024

I've linked unbound against libevent and rerun the experimental steps described previously. Here's the configure command: ./configure --with-ssl=/root/local-eng25519 --with-libevent.

The output of the unbound -V command is as follows:

Version 1.19.4

Configure line: --with-ssl=/root/local-eng25519 --with-libevent
Linked libs: libevent 2.1.12-stable (it uses epoll), OpenSSL 1.1.1q  5 Jul 2022
Linked modules: dns64 respip validator iterator

BSD licensed, see LICENSE in source package for details.
Report bugs to unbound-bugs@nlnetlabs.nl or https://github.com/NLnetLabs/unbound/issues

Unfortunately, the final throughput remains the same, still close to 20 connections/second.

Author

Ji-Peng commented Apr 25, 2024

Has there been any further progress regarding this issue?

Member

gthess commented Apr 25, 2024

I verified that mini-event is also used on a default 20.04 system, but I can look further into reproducing this next week.

Member

gthess commented May 3, 2024

I can reproduce the exact same behavior, but I'm not sure what is happening. I am comparing 20.04 and 24.04, by the way; 24.04 is slow.
The last difference I noticed with tcpdump is that your client seems to be sending RST packets on 24.04. I haven't investigated further yet.

Member

gthess commented Jan 10, 2025

The cause was identified as the TCP_NODELAY socket option. More information can be found in PR #1214.
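As background for readers unfamiliar with the option: TCP_NODELAY disables Nagle's algorithm, which may otherwise hold back small writes until earlier data has been acknowledged. When that interacts with delayed ACKs on the peer, each exchange can stall for tens of milliseconds, which matters a great deal for a handshake-heavy workload. The snippet below is a generic Python sketch of setting the option on a TCP socket. It is not the change made in #1214, and `enable_nodelay` is a hypothetical helper name used only for illustration.

```python
import socket


def enable_nodelay(sock: socket.socket) -> bool:
    """Disable Nagle's algorithm on a TCP socket and report whether it is set."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # Read the option back to confirm that the kernel accepted it.
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    nodelay_on = enable_nodelay(sock)
finally:
    sock.close()
```

The option is set per socket, so on a server it would have to be applied to each accepted connection rather than once on the listening socket, unless the platform inherits it across accept.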
