Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MDEV-24948: thd_need_wait_reports() hurts performance #3823

Draft
wants to merge 1 commit into
base: 10.6
Choose a base branch
from

Conversation

knielsen
Copy link
Member

@knielsen knielsen commented Feb 7, 2025

Reduce the cost of thd_rpl_deadlock_check() which is used to do parallel
replication conflict detecton.

  1. Cache the value of thd_need_wait_reports() in InnoDB trx_t, and only call
    thd_rpl_deadlock_check() if required.

  2. Remove the FL_WAITED flag in GTID. This flag was set if a transaction did
    any lock wait inside InnoDB on the master. In optimistic parallel
    replication on the slave, such transaction would not run in parallel with
    any prior transactions. It is not clear what the value of this mechanism is.
    If a transaction did not wait on the master, it can still conflict on the
    slave. And even if it did have to wait on the master, that does not mean
    that it will need to wait on the slave, or that such wait will cause a
    conflict or other problem. So waiting for all prior transactions on the
    slave in FL_WAITED transactions might slow down things rather than speed up.

  3. Only request thd_rpl_deadlock_check() calls for non-replication THDs if
    the --binlog-commit-wait-count option is enabled. Such wait might have to
    timeout, if the waiting commits are blocking other transactions from joining
    the group commit, so the wait reports are used to terminate the wait early.
    After this patch, setting --binlog-commit-wait-count non-zero will only have
    effect for newly created connections.

With these changes, unless the non-default --binlog-commit-wait-count option
is used, the overhead of thd_rpl_deadlock_check() is reduced to a single
conditional in non-parallel-replication threads.

Signed-off-by: Kristian Nielsen [email protected]

Reduce the cost of thd_rpl_deadlock_check() which is used to do parallel
replication conflict detecton.

1. Cache the value of thd_need_wait_reports() in InnoDB trx_t, and only call
thd_rpl_deadlock_check() if required.

2. Remove the FL_WAITED flag in GTID. This flag was set if a transaction did
any lock wait inside InnoDB on the master. In optimistic parallel
replication on the slave, such transaction would not run in parallel with
any prior transactions. It is not clear what the value of this mechanism is.
If a transaction did not wait on the master, it can still conflict on the
slave. And even if it did have to wait on the master, that does not mean
that it will need to wait on the slave, or that such wait will cause a
conflict or other problem. So waiting for all prior transactions on the
slave in FL_WAITED transactions might slow down things rather than speed up.

3. Only request thd_rpl_deadlock_check() calls for non-replication THDs if
the --binlog-commit-wait-count option is enabled. Such wait might have to
timeout, if the waiting commits are blocking other transactions from joining
the group commit, so the wait reports are used to terminate the wait early.
After this patch, setting --binlog-commit-wait-count non-zero will only have
effect for newly created connections.

With these changes, unless the non-default --binlog-commit-wait-count option
is used, the overhead of thd_rpl_deadlock_check() is reduced to a single
conditional in non-parallel-replication threads.

Signed-off-by: Kristian Nielsen <[email protected]>
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@svoj svoj added the MariaDB Foundation Pull requests created by MariaDB Foundation label Feb 10, 2025
Comment on lines 2090 to +2091
const bool rpl= trx->mysql_thd && innodb_deadlock_detect &&
thd_need_wait_reports(trx->mysql_thd);
trx->need_wait_report;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, we could remove the trx->mysql_thd && part. I’d check innodb_deadock_detect last.

Comment on lines 386 to +388
assert_freed();
trx_sys.rw_trx_hash.put_pins(this);
need_wait_report= false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be cleaner to clear the need_wait_report in trx_t::commit_cleanup() or similar, and extend assert_freed() with ut_ad(!need_wait_report).

@@ -544,6 +545,7 @@ void trx_disconnect_prepared(trx_t *trx)
trx->read_view.close();
trx_sys.trx_list.freeze();
trx->is_recovered= true;
trx->need_wait_report= false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zero initialization should be guaranteed by a debug assertion that you should add to trx_create().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
MariaDB Foundation Pull requests created by MariaDB Foundation
Development

Successfully merging this pull request may close these issues.

4 participants