Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

check queue is not nil before accessing #93

Closed
wants to merge 1 commit into from

Conversation

C-Pro
Copy link
Contributor

@C-Pro C-Pro commented Oct 14, 2024

I have got a panic in production service.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x2157d16]

goroutine 26814443 [running]:
github.com/centrifugal/centrifuge-go.(*cbQueue).pushOrClose(0x0, 0xc0024a4cd8, 0x0)
	/home/runner/go/pkg/mod/github.com/centrifugal/[email protected]/queue.go:64 +0x36
github.com/centrifugal/centrifuge-go.(*cbQueue).push(...)
	/home/runner/go/pkg/mod/github.com/centrifugal/[email protected]/queue.go:55
github.com/centrifugal/centrifuge-go.(*Client).runHandlerSync(0xc001146780, 0xc0024a4cc0)
	/home/runner/go/pkg/mod/github.com/centrifugal/[email protected]/client.go:635 +0xbb
github.com/centrifugal/centrifuge-go.(*Subscription).handlePublication(0xc000814e00, 0xc004855440)
	/home/runner/go/pkg/mod/github.com/centrifugal/[email protected]/subscription.go:590 +0x107
github.com/centrifugal/centrifuge-go.(*Client).handlePush(0xc001146780, 0xc003f4d380)
	/home/runner/go/pkg/mod/github.com/centrifugal/[email protected]/client.go:719 +0x105
github.com/centrifugal/centrifuge-go.(*Client).handle(0xc001146780, 0xc000bd8f00)
	/home/runner/go/pkg/mod/github.com/centrifugal/[email protected]/client.go:682 +0x151
github.com/centrifugal/centrifuge-go.(*Client).readOnce(0xc001146780, {0x2e920e0?, 0xc0012b8cf0?})
	/home/runner/go/pkg/mod/github.com/centrifugal/[email protected]/client.go:618 +0xa9
github.com/centrifugal/centrifuge-go.(*Client).reader(0xc001146780, {0x2e920e0, 0xc0012b8cf0}, 0x1?)
	/home/runner/go/pkg/mod/github.com/centrifugal/[email protected]/client.go:625 +0x5f
created by github.com/centrifugal/centrifuge-go.(*Client).startReconnecting in goroutine 26814431
	/home/runner/go/pkg/mod/github.com/centrifugal/[email protected]/client.go:961 +0x669

I suspect the centrifuge connection was closed and Client.moveToClosed has set c.cbQueue to nil before Subscription.handlePublication callback was executed and tried to push to the queue.

So I am adding a check that queue is not nil before calling push. I suspect there should be a better way involving status field, but I am not sure what the lifecycle is supposed to be, so taking most straightforward approach.

@FZambia
Copy link
Member

FZambia commented Oct 16, 2024

Thanks! I'll take a closer look soon.

Maybe it's possible to build a reproducer using integration tests which are using real Centrifugo instance now:

docker run -p 8000:8000 centrifugo/centrifugo:v5 centrifugo --client_insecure

I.e. publish to some channel and close the client. I think this should be visible when running a test under stress tool for example. If you will have time to try making it - would be nice, if not - I'll try myself as soon as I start looking.

@C-Pro C-Pro mentioned this pull request Oct 17, 2024
@C-Pro
Copy link
Contributor Author

C-Pro commented Oct 17, 2024

Maybe it's possible to build a reproducer using integration tests which are using real Centrifugo instance now:

Added a test in a separate PR #95 to make it easier to check on both master and this branch.
For me it fails in first couple of runs on master and does not fail in the branch of this pr.
I am running multiple times to catch the race with high confidence:

go test -v -count=25 -failfast -timeout=1h -run TestConcurrentPublishSubscribeDisconnect .

@FZambia
Copy link
Member

FZambia commented Oct 19, 2024

Many thanks, already started looking whether there is a better way to fix, if won't find a better approach – will proceed with your changes. Hopefully during next several days.

@FZambia
Copy link
Member

FZambia commented Oct 20, 2024

Opened #96, it already includes #95 - it addresses the issue by not calling callbacks at all after close, waiting for currently queued callbacks to fire.

@C-Pro C-Pro closed this Oct 20, 2024
@C-Pro C-Pro closed this Oct 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants