Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix warning-sending race #6784

Merged

Conversation

rustyrussell
Copy link
Contributor

This makes us generally less aggressive about disconnecting, so we drain properly. And in the case of a bad reestablish message, we don't disconnect at all, so the peer can error.

@rustyrussell rustyrussell added this to the v23.11 milestone Oct 16, 2023
@rustyrussell rustyrussell force-pushed the guilt/fix-warning-race branch 2 times, most recently from 563df2b to 802df24 Compare October 19, 2023 03:02
@rustyrussell rustyrussell requested a review from cdecker as a code owner October 19, 2023 03:02
@rustyrussell rustyrussell force-pushed the guilt/fix-warning-race branch 3 times, most recently from 771b161 to e1d7902 Compare October 20, 2023 03:23
It definitely needs generated source files.

Signed-off-by: Rusty Russell <[email protected]>
Don't want it to optimize something out!

Signed-off-by: Rusty Russell <[email protected]>
This prepares us for connectd disconnecting more gently: it can
wait for these to all disconnect.

Signed-off-by: Rusty Russell <[email protected]>
This changes some tests which expected us to disconnect.

Signed-off-by: Rusty Russell <[email protected]>
…ng warnings

We generalize the current df-only "aborted" flag (and invert it) to a
"disconnected" flag in the peer status message.

We convert it back to the aborted flag for now inside subd.c, but that's
next.

Signed-off-by: Rusty Russell <[email protected]>
Reverts 6203d25 "pytest: fix flake in upfront warning." now we've actually fixed the cause.

Signed-off-by: Rusty Russell <[email protected]>
We truncate the file on stop(), but don't re-created it on start().

We didn't notice it before, but the net

Signed-off-by: Rusty Russell <[email protected]>
Previously, we would forward the message to a subd, but now we have
the case where the subd is gone, but we're still connected.  If the
peer anything but a reestablish in that state, we drop the connection.

Instead, an error should always make us fail the channel.

Signed-off-by: Rusty Russell <[email protected]>
…ning.

This gives the peer a chance to send an error, which will make us drop
to chain.

Signed-off-by: Rusty Russell <[email protected]>
Fixes: ElementsProject#5818
It was intermittant before: I added a sleep(1) in the code before
sending the error (temporarily) to make it always triggers.

Signed-off-by: Rusty Russell <[email protected]>
We do it here, but it's not necessary, and we also deprive them of the
chance to do so (since we kill them).

Signed-off-by: Rusty Russell <[email protected]>
@rustyrussell rustyrussell force-pushed the guilt/fix-warning-race branch from 61bfdcd to b85431b Compare October 22, 2023 04:07
close() is allowed to lose data, and I saw this in CI:

```
2023-10-22T05:12:36.6576005Z ____________________________ test_permfail_htlc_out ____________________________
2023-10-22T05:12:36.6608511Z [gw2] linux -- Python 3.8.18 /home/runner/.cache/pypoetry/virtualenvs/cln-meta-project-AqJ9wMix-py3.8/bin/python
2023-10-22T05:12:36.6611663Z 
2023-10-22T05:12:36.6614768Z node_factory = <pyln.testing.utils.NodeFactory object at 0x7f381039a5e0>
2023-10-22T05:12:36.6623694Z bitcoind = <pyln.testing.utils.BitcoinD object at 0x7f38103c0400>
2023-10-22T05:12:36.6627092Z executor = <concurrent.futures.thread.ThreadPoolExecutor object at 0x7f38103c0ee0>
2023-10-22T05:12:36.6627701Z 
2023-10-22T05:12:36.6628051Z     def test_permfail_htlc_out(node_factory, bitcoind, executor):
2023-10-22T05:12:36.6631192Z         # Test case where we fail with unsettled outgoing HTLC.
2023-10-22T05:12:36.6634154Z         disconnects = ['+WIRE_REVOKE_AND_ACK', 'permfail']
2023-10-22T05:12:36.6635106Z         l1 = node_factory.get_node(options={'dev-no-reconnect': None})
2023-10-22T05:12:36.6637321Z         # Feerates identical so we don't get gratuitous commit to update them
2023-10-22T05:12:36.6642691Z         l2 = node_factory.get_node(disconnect=disconnects,
2023-10-22T05:12:36.6644734Z                                    feerates=(7500, 7500, 7500, 7500))
2023-10-22T05:12:36.6647205Z     
2023-10-22T05:12:36.6649671Z         l1.rpc.connect(l2.info['id'], 'localhost', l2.port)
2023-10-22T05:12:36.6650460Z         l2.daemon.wait_for_log('Handed peer, entering loop')
2023-10-22T05:12:36.6654865Z         l2.fundchannel(l1, 10**6)
2023-10-22T05:12:36.6655305Z     
2023-10-22T05:12:36.6657810Z         # This will fail at l2's end.
2023-10-22T05:12:36.6660554Z         t = executor.submit(l2.pay, l1, 200000000)
2023-10-22T05:12:36.6662947Z     
2023-10-22T05:12:36.6665147Z         l2.daemon.wait_for_log('dev_disconnect permfail')
2023-10-22T05:12:36.6668530Z         l2.wait_for_channel_onchain(l1.info['id'])
2023-10-22T05:12:36.6671588Z         bitcoind.generate_block(1)
2023-10-22T05:12:36.6674510Z >       l1.daemon.wait_for_log('Their unilateral tx, old commit point')
2023-10-22T05:12:36.6675001Z 
2023-10-22T05:12:36.6675212Z tests/test_closing.py:3027: 
...
2023-10-22T05:12:36.8784390Z lightningd-2 2023-10-22T04:41:04.448Z DEBUG   0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-connectd: dev_disconnect: +WIRE_REVOKE_AND_ACK (WIRE_REVOKE_AND_ACK)
2023-10-22T05:12:36.8786260Z lightningd-2 2023-10-22T04:41:04.452Z INFO    0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: Peer connection lost
2023-10-22T05:12:36.8788076Z lightningd-2 2023-10-22T04:41:04.453Z DEBUG   0266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c03518-channeld-chan#1: Status closed, but not exited. Killing
2023-10-22T05:12:36.8789915Z lightningd-1 2023-10-22T04:41:04.454Z INFO    022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-channeld-chan#1: Peer connection lost
```

Note that l1 doesn't receive WIRE_REVOKE_AND_ACK!

Signed-off-by: Rusty Russell <[email protected]>
@rustyrussell rustyrussell merged commit 1d61edf into ElementsProject:master Oct 23, 2023
38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant