This repository has been archived by the owner on May 26, 2022. It is now read-only.

Calling DialPeer from DialPeer started goroutine deadlocks. #161

Closed
Jorropo opened this issue Feb 24, 2020 · 5 comments

Jorropo commented Feb 24, 2020

While implementing webrtc aside I found a deadlock (it still ends cleanly once the dial times out). It happens when calling DialPeer for the same peer whose dial started your goroutine, for example calling host.NewStream or host.Connect from the Dial function of a transport (only if the peer ID is the same).

It seems like the first Connect (the one that starts the transport Dial) is the problem; the second one just waits on the channel as expected.
I think transport dials are somehow attempted sequentially, which means that for Connect to return to the transport, Dial in the transport would first have to return (this is the deadlock).


Jorropo commented Feb 24, 2020

Issue found: it was on the host side. At an earlier point I was doing a Connect with a single address (the one for my transport), so the host stored that address in the peerstore. Future Connect calls then don't resolve the peer through the DHT, because routedhost only does that if no addresses are available.

This should be solved by peerstore address origin.

Jorropo closed this as completed Feb 24, 2020

Jorropo commented Mar 1, 2020

Even with that fixed, the bug is still here. While implementing #162 I found the cause: if a dial has already been started, newer addresses are not used to start a new dial. This is going to be fixed in #167.

Jorropo reopened this Mar 1, 2020
@Stebalien
Member

There's no way to fix this issue inside the swarm itself. Transports need to avoid recursively trying to dial the same peer.

Stebalien reopened this Mar 2, 2020
@Stebalien
Member

Never mind. I'm not sure how to fix this in the swarm itself, but we need to find some way to detect this.

@Stebalien
Member

I've re-reported this issue in libp2p/go-libp2p#816 to give a clear description of what's happening.
