-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MDEV-24948: thd_need_wait_reports() hurts performance #3823
base: 10.6
Are you sure you want to change the base?
Conversation
Reduce the cost of thd_rpl_deadlock_check() which is used to do parallel replication conflict detecton. 1. Cache the value of thd_need_wait_reports() in InnoDB trx_t, and only call thd_rpl_deadlock_check() if required. 2. Remove the FL_WAITED flag in GTID. This flag was set if a transaction did any lock wait inside InnoDB on the master. In optimistic parallel replication on the slave, such transaction would not run in parallel with any prior transactions. It is not clear what the value of this mechanism is. If a transaction did not wait on the master, it can still conflict on the slave. And even if it did have to wait on the master, that does not mean that it will need to wait on the slave, or that such wait will cause a conflict or other problem. So waiting for all prior transactions on the slave in FL_WAITED transactions might slow down things rather than speed up. 3. Only request thd_rpl_deadlock_check() calls for non-replication THDs if the --binlog-commit-wait-count option is enabled. Such wait might have to timeout, if the waiting commits are blocking other transactions from joining the group commit, so the wait reports are used to terminate the wait early. After this patch, setting --binlog-commit-wait-count non-zero will only have effect for newly created connections. With these changes, unless the non-default --binlog-commit-wait-count option is used, the overhead of thd_rpl_deadlock_check() is reduced to a single conditional in non-parallel-replication threads. Signed-off-by: Kristian Nielsen <[email protected]>
|
const bool rpl= trx->mysql_thd && innodb_deadlock_detect && | ||
thd_need_wait_reports(trx->mysql_thd); | ||
trx->need_wait_report; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell, we could remove the trx->mysql_thd &&
part. I’d check innodb_deadock_detect
last.
assert_freed(); | ||
trx_sys.rw_trx_hash.put_pins(this); | ||
need_wait_report= false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be cleaner to clear the need_wait_report
in trx_t::commit_cleanup()
or similar, and extend assert_freed()
with ut_ad(!need_wait_report)
.
@@ -544,6 +545,7 @@ void trx_disconnect_prepared(trx_t *trx) | |||
trx->read_view.close(); | |||
trx_sys.trx_list.freeze(); | |||
trx->is_recovered= true; | |||
trx->need_wait_report= false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Zero initialization should be guaranteed by a debug assertion that you should add to trx_create()
.
Reduce the cost of thd_rpl_deadlock_check() which is used to do parallel
replication conflict detecton.
Cache the value of thd_need_wait_reports() in InnoDB trx_t, and only call
thd_rpl_deadlock_check() if required.
Remove the FL_WAITED flag in GTID. This flag was set if a transaction did
any lock wait inside InnoDB on the master. In optimistic parallel
replication on the slave, such transaction would not run in parallel with
any prior transactions. It is not clear what the value of this mechanism is.
If a transaction did not wait on the master, it can still conflict on the
slave. And even if it did have to wait on the master, that does not mean
that it will need to wait on the slave, or that such wait will cause a
conflict or other problem. So waiting for all prior transactions on the
slave in FL_WAITED transactions might slow down things rather than speed up.
Only request thd_rpl_deadlock_check() calls for non-replication THDs if
the --binlog-commit-wait-count option is enabled. Such wait might have to
timeout, if the waiting commits are blocking other transactions from joining
the group commit, so the wait reports are used to terminate the wait early.
After this patch, setting --binlog-commit-wait-count non-zero will only have
effect for newly created connections.
With these changes, unless the non-default --binlog-commit-wait-count option
is used, the overhead of thd_rpl_deadlock_check() is reduced to a single
conditional in non-parallel-replication threads.
Signed-off-by: Kristian Nielsen [email protected]