Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
ofproto-dpif-upcall: Pause revalidators when purging.
This issue has been observed when running traffic tests with a dpdk enabled userspace datapath (though those tests are added in a separate series). However, the described issue also affects the kernel datapath which is why this patch is sent separately. A main thread executing the 'revalidator/purge' command could race with revalidator threads that can be dumping/sweeping the purged flows at the same time. This race can be reproduced (with dpif debug logs) by running the conntrack - ICMP related unit tests with the userspace datapath: 2023-10-09T14:11:55.242Z|00177|unixctl|DBG|received request revalidator/purge[], id=0 2023-10-09T14:11:55.242Z|00044|dpif(revalidator17)|DBG|netdev@ovs-netdev: flow_dump ufid:68ff6817-fb3b-4b30-8412-9cf175318294 <empty>, packets:0, bytes:0, used:never 2023-10-09T14:11:55.242Z|00178|dpif|DBG|netdev@ovs-netdev: flow_del ufid:07046e91-30a6-4862-9048-1a76b5a88a5b recirc_id(0),dp_hash(0),skb_priority(0),in_port(2),skb_mark(0), ct_state(0),ct_zone(0),ct_mark(0),ct_label(0), packet_type(ns=0,id=0), eth(src=a6:0a:bf:e2:f3:f2,dst=62:23:0f:f6:2c:75), eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=17,tos=0, ttl=64,frag=no),udp(src=37380,dst=10000), packets:0, bytes:0, used:never ... 2023-10-09T14:11:55.242Z|00049|dpif(revalidator17)|WARN|netdev@ovs-netdev: failed to flow_get (No such file or directory) ufid:07046e91-30a6-4862-9048-1a76b5a88a5b <empty>, packets:0, bytes:0, used:never 2023-10-09T14:11:55.242Z|00050|ofproto_dpif_upcall(revalidator17)|WARN| Failed to acquire udpif_key corresponding to unexpected flow (No such file or directory): ufid:07046e91-30a6-4862-9048-1a76b5a88a5b ... 2023-10-09T14:11:55.242Z|00183|unixctl|DBG|replying with success, id=0: "" To avoid this race, a first part of the fix is to pause (if not already paused) the revalidators while the main thread is purging the datapath flows. Then a second issue is observed by running the same unit test with the kernel datapath. Its dpif implementation dumps flows via a netlink request (see dpif_flow_dump_create(), dpif_netlink_flow_dump_create(), nl_dump_start(), nl_sock_send__()) in the leader revalidator thread, before pausing revalidators: 2023-10-09T14:44:28.742Z|00122|unixctl|DBG|received request revalidator/purge[], id=0 ... 2023-10-09T14:44:28.742Z|00125|dpif|DBG|system@ovs-system: flow_del ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 recirc_id(0),dp_hash(0), skb_priority(0),in_port(2),skb_mark(0),ct_state(0),ct_zone(0), ct_mark(0),ct_label(0),eth(src=a6:0a:bf:e2:f3:f2, dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.1.1.1, tip=10.1.1.2,op=1,sha=a6:0a:bf:e2:f3:f2,tha=00:00:00:00:00:00), packets:0, bytes:0, used:never ... 2023-10-09T14:44:28.742Z|00129|unixctl|DBG|replying with success, id=0: "" ... 2023-10-09T14:44:28.742Z|00006|dpif(revalidator21)|DBG|system@ovs-system: flow_dump ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>, packets:0, bytes:0, used:never ... 2023-10-09T14:44:28.742Z|00012|dpif(revalidator21)|WARN|system@ovs-system: failed to flow_get (No such file or directory) ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>, packets:0, bytes:0, used:never 2023-10-09T14:44:28.742Z|00013|ofproto_dpif_upcall(revalidator21)|WARN| Failed to acquire udpif_key corresponding to unexpected flow (No such file or directory): ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 To avoid evaluating already deleted flows, the second part of the fix is to ensure that dumping from the leader revalidator thread is done out of any pause request. As a result of this patch, the unit test "offloads - delete ufid mapping if device not exist - offloads enabled" does not need to waive the random warning logs when purging dp flows. Fixes: 98bb428 ("tests: Add command to purge revalidators of flows.") Acked-by: Eelco Chaudron <[email protected]> Acked-by: Simon Horman <[email protected]> Signed-off-by: David Marchand <[email protected]> Signed-off-by: Ilya Maximets <[email protected]>
- Loading branch information