
Help with configs and reproduction of results #4

Open
ngsrinivas opened this issue Jan 15, 2021 · 6 comments

Comments

@ngsrinivas

Hi xdp team,

I'm attempting to replicate part of the results from the XDP CoNEXT '18 paper, starting with the packet-drop and forward-out-the-same-interface XDP programs. I'm using xdp_rxq_info for both, with actions XDP_DROP and XDP_TX, on kernel 5.4.
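
For concreteness, a sketch of the kind of invocation I mean (interface name is a placeholder; the option names below are from the samples/bpf version of the tool and may differ between kernel versions):

# drop every packet on the given interface
sudo ./xdp_rxq_info --dev <ifname> --action XDP_DROP
# bounce packets back out the same interface
sudo ./xdp_rxq_info --dev <ifname> --action XDP_TX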

I've been following instructions from https://github.com/tohojo/xdp-paper/blob/master/benchmarks/bench02_xdp_drop.org

I have a few questions. If I can provide any more info, please let me know.

(i) The XDP paper mentions kernel configurations such as disabling full preemption and retpoline mitigation. Would you happen to have (pointers to) any data that discusses the relative importance of these configs for the Mpps per core?

(ii) The retpoline config for the kernel is easy enough to find (CONFIG_RETPOLINE). Could you please point me to the config option that disables full preemption?

In general, if there is any chance you could provide a diff of your kernel config against the distribution's baseline config, I would be very grateful.

(iii) I couldn't find concrete documentation (with a quick Google search) on PCIe descriptor compression. Apologies if I'm missing something basic; I'd appreciate any pointers on how you accomplished this.

(iv) After following the NIC configuration (RSS, Ethernet flow control, NIC striding, etc., trying a few different RX ring sizes) and IRQ affinity settings described in the various scripts in the benchmarks folder (sketched below), I cannot seem to push single-core drop performance beyond 16 Mpps (the paper reports ~24 Mpps per core) at 64 bytes.

I'm using an AMD EPYC 7452 32-Core Processor (2312.611 MHz).

Do you think the clock frequency alone explains the discrepancy (the XDP paper reports experiments on a 3.6 GHz processor), or should I look elsewhere?
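
The tuning I applied amounts roughly to the following (interface name, IRQ number, and ring size are placeholders, and the exact ethtool options may vary by driver):

# reduce the NIC to a single combined channel so RSS lands on one queue
ethtool -L <ifname> combined 1
# disable Ethernet flow control (pause frames)
ethtool -A <ifname> rx off tx off
# try different RX ring sizes
ethtool -G <ifname> rx 512
# pin the NIC's RX IRQ to a single core
echo 2 > /proc/irq/<irq>/smp_affinity_list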

(v) Further, I'm finding that even with multiple cores, total Mpps across all cores tapers off at around the same value (~16.5 Mpps). Have you observed anything like this? Where should I look to pinpoint the problem?

I would be very grateful for any help with these. Thank you in advance.

Srinivas

@netoptimizer
Contributor

That is a lot of questions... I'll try to answer the ones I can in separate comments.

@netoptimizer
Contributor

(i) The XDP paper mentions kernel configurations such as disabling full preemption and retpoline mitigation. Would you happen to have (pointers to) any data that discusses the relative importance of these configs for the Mpps per core?

Today the kernel has fixes/workarounds for the overhead of retpoline (CONFIG_RETPOLINE).
Thus, the retpoline (compile-time) setting isn't that important for reproducing our results on today's kernels.

XDP was actually not as affected by CONFIG_RETPOLINE as other parts of the kernel.
One XDP area that was affected was DMA mappings combined with XDP_REDIRECT.
Measurements and fixes are documented here: https://github.com/xdp-project/xdp-project/tree/master/areas/dma
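
To check how your own kernel is built and which Spectre v2 mitigation is active at runtime, something like the following should work (the /boot/config path is distro-dependent):

# compile-time setting
grep CONFIG_RETPOLINE= /boot/config-$(uname -r)
# runtime mitigation status reported by the kernel
cat /sys/devices/system/cpu/vulnerabilities/spectre_v2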

@netoptimizer
Contributor

In general, if there is any chance you could provide a diff of your kernel config against the distribution's baseline config, I would be very grateful.

You have to do the diff yourself.

I found the kernel config on my testlab machine, and I have added/committed it to:
https://github.com/xdp-project/xdp-paper/tree/master/benchmarks/ (file: config-4.17.0-rc7-bpf-next-xdp-paper02)

I thought I had already done this, but I guess I forgot.
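
A quick way to produce the diff is the kernel's scripts/diffconfig helper, which compares two config files and prints only the options that differ (paths below are placeholders):

# run from a kernel source tree
./scripts/diffconfig /boot/config-$(uname -r) config-4.17.0-rc7-bpf-next-xdp-paper02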

@netoptimizer
Contributor

(ii) The retpoline config for the kernel is easy enough to find (CONFIG_RETPOLINE). Could you please point me to the config option that disables full preemption?

Hmm... did we write that we disabled preemption?

Looking at the config I just uploaded, it looks like we have enabled preemption:

CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y
# CONFIG_DEBUG_PREEMPT is not set
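
For comparison on your own kernel, the preemption-related options can be checked with something like the following (the /boot/config path is distro-dependent; a kernel built with CONFIG_PREEMPT usually also shows "PREEMPT" in its version string):

# preemption model options in the running kernel's config
grep -E '^CONFIG_PREEMPT' /boot/config-$(uname -r)
# the version string of a fully preemptible kernel typically contains PREEMPT
uname -v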

@netoptimizer
Contributor

(iii) I couldn't find concrete documentation (with a quick Google search) on PCIe descriptor compression. Apologies if I'm missing something basic; I'd appreciate any pointers on how you accomplished this.

I guess you are talking about the driver priv-flags used, e.g., in benchmarks/bench01_baseline.org and bench02_xdp_drop.org:

ethtool --set-priv-flags DEVICE FLAG on|off
ethtool --show-priv-flags DEVICE
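
If you are on an mlx5 NIC as we were, the flag in question is, as far as I recall, the rx_cqe_compress priv-flag (completion-queue-entry compression), e.g.:

# enable RX CQE compression on an mlx5 device (device name is a placeholder)
ethtool --set-priv-flags <ifname> rx_cqe_compress on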

@netoptimizer
Contributor

(iv) After following the NIC configuration (RSS, Ethernet flow control, NIC striding, etc., trying a few different RX ring sizes) and IRQ affinity settings described in the various scripts in the benchmarks folder, I cannot seem to push single-core drop performance beyond 16 Mpps (the paper reports ~24 Mpps per core) at 64 bytes.

I'm using an AMD EPYC 7452 32-Core Processor (2312.611 MHz).

Do you think the clock frequency alone explains the discrepancy (the XDP paper reports experiments on a 3.6 GHz processor), or should I look elsewhere?

You need to investigate whether your CPU supports DDIO/DCA.

Maybe this is related to this issue: xdp-project/xdp-tutorial#170
The issue describes how you can test this...
...can you please help us answer whether DDIO works on AMD EPYC CPUs?

There is a scientific article about DDIO/DCA here: https://www.usenix.org/conference/atc20/presentation/farshin
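
A rough way to check is to watch last-level-cache misses on the core handling RX while the drop test runs; without DDIO/DCA the packet data has to be fetched from DRAM, so the LLC miss rate goes up noticeably (CPU number is a placeholder, and the generic event names below depend on your perf/CPU support):

# count LLC loads/misses on the RX core for 10 seconds
perf stat -C <cpu> -e LLC-loads,LLC-load-misses sleep 10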
