engine_newPayloadV3 RPC Method call #31011

Open
Eduard-Voiculescu opened this issue Jan 9, 2025 · 4 comments

Eduard-Voiculescu commented Jan 9, 2025

System information

Geth version: geth version
CL client & version: e.g. lighthouse/nimbus/[email protected]
OS & Version: Linux
Commit hash : (if develop)

Expected behaviour

RPC call should not break geth node running.

Actual behaviour

The issue occurred on an op-sepolia node running the firehose tracer. It occurs intermittently, not on every new block. It looks like the firehose tracer breaks when the method engine_newPayloadV3 is called outside of the normal flow between EL <> CL.

Steps to reproduce the behaviour

Hard to reproduce: it occurred randomly after the node had been running as expected for weeks.
Once the node is restarted, it works as expected again without any issues. One hypothesis is that the issue comes from https://github.com/ethereum/go-ethereum/blob/master/beacon/blsync/engineclient.go#L95-L119, where an asynchronous goroutine is running.

Backtrace

RPC method engine_newPayloadV3 crashed: caller expected to not be in transaction state but we were, this is a bug at block #22064228 (f15fe758377e046d2755bd92da341a86ca1e96f83667b2d8613f929c1798d14e) in transaction b03d74a6fd894daf8deae16893a4b2a0e00aed1d9066f167c50334442b2b54e8 (caller=github.com/ethereum/go-ethereum/eth/tracers/firehose.go:413, init=true, inBlock=true, inTransaction=true, inCall=false)
goroutine 210911285 [running]:
github.com/ethereum/go-ethereum/rpc.(*callback).call.func1()
    github.com/ethereum/go-ethereum/rpc/service.go:199 +0x85
panic({0x18858e0?, 0xc0a1b10e40?})
    runtime/panic.go:770 +0x132
github.com/ethereum/go-ethereum/eth/tracers.(*Firehose).panicInvalidState(0xc0224a8000, {0x1bb3fa3, 0x49}, 0xc0a1b10de0?)
    github.com/ethereum/go-ethereum/eth/tracers/firehose.go:1488 +0x51d
github.com/ethereum/go-ethereum/eth/tracers.(*Firehose).ensureInBlockAndNotInTrx(0xc0224a8000)
    github.com/ethereum/go-ethereum/eth/tracers/firehose.go:1430 +0x45
github.com/ethereum/go-ethereum/eth/tracers.(*Firehose).OnBlockEnd(0xc0224a8000, {0x0, 0x0?})
    github.com/ethereum/go-ethereum/eth/tracers/firehose.go:413 +0x10c
github.com/ethereum/go-ethereum/core.(*BlockChain).processBlock.func1()
    github.com/ethereum/go-ethereum/core/blockchain.go:1899 +0x2d
panic({0x18858e0?, 0xc0a1b10dd0?})
    runtime/panic.go:770 +0x132
github.com/ethereum/go-ethereum/eth/tracers.(*Firehose).panicInvalidState(0xc0224a8000, {0x1bb3fa3, 0x49}, 0x7f03e8609648?)
    github.com/ethereum/go-ethereum/eth/tracers/firehose.go:1488 +0x51d
github.com/ethereum/go-ethereum/eth/tracers.(*Firehose).ensureInBlockAndNotInTrx(0xc0224a8000)
    github.com/ethereum/go-ethereum/eth/tracers/firehose.go:1430 +0x45
github.com/ethereum/go-ethereum/eth/tracers.(*Firehose).OnBeaconBlockRootStart(0xc0224a8000)
    github.com/ethereum/go-ethereum/eth/tracers/firehose.go:515 +0xa5
github.com/ethereum/go-ethereum/core.ProcessBeaconBlockRoot({0x66, 0x3, 0x6b, 0x45, 0x9b, 0x4, 0x67, 0x49, 0x46, 0xbc, ...}, ...)
    github.com/ethereum/go-ethereum/core/state_processor.go:240 +0x94
github.com/ethereum/go-ethereum/core.(*StateProcessor).Process(0xc0224900c0, 0xc063fc3600, 0xc0143f9500, {0xc018c65680, 0x0, 0x0, {0x0, 0x0, 0x0}, 0x0, ...})
    github.com/ethereum/go-ethereum/core/state_processor.go:80 +0x67a
github.com/ethereum/go-ethereum/core.(*BlockChain).processBlock(0xc01da4c488, 0xc063fc3600, 0xc0143f9500, {0x0?, 0x28e7598?, 0x37560e0?}, 0x0)
    github.com/ethereum/go-ethereum/core/blockchain.go:1905 +0x2ce
github.com/ethereum/go-ethereum/core.(*BlockChain).insertChain(0xc01da4c488, {0xc05d1c5680, 0x1, 0x1}, 0x0, 0x0)
    github.com/ethereum/go-ethereum/core/blockchain.go:1824 +0x1e57
github.com/ethereum/go-ethereum/core.(*BlockChain).InsertBlockWithoutSetHead(0xc01da4c488, 0xc063fc3600, 0x0)
    github.com/ethereum/go-ethereum/core/blockchain.go:2379 +0xdb
github.com/ethereum/go-ethereum/eth/catalyst.(*ConsensusAPI).newPayload(_, {{0x23, 0x42, 0xa9, 0xa5, 0x10, 0xd8, 0x55, 0xe7, 0x79, ...}, ...}, ...)
    github.com/ethereum/go-ethereum/eth/catalyst/api.go:946 +0x1ab7
github.com/ethereum/go-ethereum/eth/catalyst.(*ConsensusAPI).NewPayloadV3(_, {{0x23, 0x42, 0xa9, 0xa5, 0x10, 0xd8, 0x55, 0xe7, 0x79, ...}, ...}, ...)
    github.com/ethereum/go-ethereum/eth/catalyst/api.go:622 +0x225
reflect.Value.call({0xc0231a3e30?, 0xc023502110?, 0x80?}, {0x1b47329, 0x4}, {0xc087e14180, 0x4, 0xc087e14180?})
    reflect/value.go:596 +0xca6
reflect.Value.Call({0xc0231a3e30?, 0xc023502110?, 0x3?}, {0xc087e14180?, 0x16?, 0x16?})
    reflect/value.go:380 +0xb9
github.com/ethereum/go-ethereum/rpc.(*callback).call(0xc023504960, {0x28de358, 0xc091f70eb0}, {0xc0ae0a6e10, 0x13}, {0xc091f70f00, 0x3, 0x4e14cf?})
    github.com/ethereum/go-ethereum/rpc/service.go:205 +0x36d
github.com/ethereum/go-ethereum/rpc.(*handler).runMethod(0xc01c444000?, {0x28de358?, 0xc091f70eb0?}, 0xc093b604d0, 0x3?, {0xc091f70f00?, 0x7f02c7a04998?, 0xc0a1b10cf0?})
    github.com/ethereum/go-ethereum/rpc/handler.go:568 +0x3c
github.com/ethereum/go-ethereum/rpc.(*handler).handl ... bytes: 866

Could there be an issue with the updateLoop, for example calling engine_newPayloadV3 on a fork or something similar, that could cause the node to panic?

@lightclient
Member

It looks like the panic is coming from your code firehose.go. If you've registered the tracer to run, it will run when you insert a block into the chain - which happens on engine_newPayloadV3. External tracers are not expected to panic, so if they do, we expose that.
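
To illustrate the point above, here is a minimal sketch of how a panic raised inside a tracer hook can surface as an "RPC method ... crashed" message. It is not go-ethereum's code: `callWithRecover` and `insertBlock` are hypothetical names, and the only thing taken from the backtrace is the general shape (a deferred function in the RPC callback that appears to recover the panic).

```go
package main

import "fmt"

// callWithRecover is a hypothetical stand-in for an RPC dispatcher. It runs
// the handler and turns any panic into an error that names the RPC method.
func callWithRecover(method string, handler func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("RPC method %s crashed: %v", method, r)
		}
	}()
	return handler()
}

// insertBlock is a hypothetical block-insertion path. If a tracer hook is
// registered, it is invoked as part of inserting the block.
func insertBlock(tracerHook func()) error {
	if tracerHook != nil {
		tracerHook()
	}
	return nil
}

func main() {
	// The tracer hook panics; the panic unwinds through insertBlock and is
	// reported against the RPC method that triggered the insertion.
	err := callWithRecover("engine_newPayloadV3", func() error {
		return insertBlock(func() { panic("tracer in invalid state") })
	})
	fmt.Println(err)
}
```

In this sketch the RPC method is only the entry point that triggers block insertion; the panic itself originates in the hook.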

@Eduard-Voiculescu
Author

Ok so that's the weird issue I am seeing. The node has been running as expected up to head for weeks on end with no crashes whatsoever.

This is why I have the feeling there is an issue with the async goroutine calling engine_newPayloadV3. The normal call to engine_newPayloadV3 happens during the lifecycle of the geth node itself, right?

So would it be possible for that goroutine to make the call outside the scope of normal block processing?

@lightclient
Member

Can you expand on why you think this is an issue in go-ethereum? The panic is coming from your code here:

https://github.com/streamingfast/go-ethereum/blob/a8d379c482b5ad9efa8e587c028e155fb9401593/eth/tracers/firehose.go#L1305-L1308

You're building a custom view of the call stack in a custom tracer. It's possible your tracer doesn't perfectly represent the true call stack. I don't see how additional calls to engine_newPayloadV3 could affect your representation of the call stack.

We'll need more information from you to diagnose this. Are you even using the engineclient? Geth only uses this for blsync at the moment.

@Eduard-Voiculescu
Author

Ok I will gather more information on this and keep monitoring it. Thank you for your help @lightclient
