DDL notifier does not guarantee delivery when internal SQL failed to COMMIT #59055

Open
lance6716 opened this issue Jan 21, 2025 · 1 comment · May be fixed by #59157
Labels
affects-8.5 This bug affects the 8.5.x(LTS) versions. component/ddl This issue is related to DDL of TiDB. severity/major type/bug The issue is confirmed as a bug.

Comments

@lance6716
Contributor

Bug Report

defer func() {
	if err == nil {
		err = errors.Trace(session.Commit(ctx))
	} else {
		session.Rollback()
	}
}()
now := time.Now()
if err = handler(ctx, session.Context, change.event); err != nil {
	return errors.Trace(err)
}
if time.Since(now) > slowHandlerLogThreshold {
	logutil.Logger(ctx).Warn("Slow process event",
		zap.Stringer("handler", handlerID),
		zap.Int64("ddlJobID", change.ddlJobID),
		zap.Int64("subJobID", change.subJobID),
		zap.Stringer("event", change.event),
		zap.Duration("duration", time.Since(now)))
}
newFlag := change.processedByFlag | (1 << handlerID)

If the handler returns without error, the DDL notifier marks the in-memory structure change as processed at line 295. However, processing can still fail afterward, at line 277, when the deferred function commits the transaction. The processed flag is written in the same SQL transaction, so the wrong processed state is not persisted, but the in-memory state of change is used afterward and causes problems:

if change.processedByFlag == n.handlersBitMap {
	s3, err3 := n.sysSessionPool.Get()
	if err3 != nil {
		return errors.Trace(err3)
	}
	sess4Del := sess.NewSession(s3.(sessionctx.Context))
	err3 = n.store.DeleteAndCommit(
		ctx,
		sess4Del,
		change.ddlJobID,
		int(change.subJobID),
	)

1. Minimal reproduce step (Required)

A unit test will be added in the fix PR.

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiDB version? (Required)

@Rustin170506
Member

To reproduce it:

func TestTwoHandlers(t *testing.T) {
	// 1. One always fails
	// 2. One always succeeds
	// Make sure events don't get lost after the second handler succeeds.
	store := testkit.CreateMockStore(t)
	tk := testkit.NewTestKit(t, store)
	tk.MustExec("USE test")
	tk.MustExec("DROP TABLE IF EXISTS " + ddl.NotifierTableName)
	tk.MustExec(ddl.NotifierTableSQL)
	s := notifier.OpenTableStore("test", ddl.NotifierTableName)
	sessionPool := util.NewSessionPool(
		1,
		func() (pools.Resource, error) {
			return tk.Session(), nil
		},
		nil,
		nil,
	)
	n := notifier.NewDDLNotifier(sessionPool, s, 50*time.Millisecond)
	// Always fails
	failHandler := func(_ context.Context, sctx sessionctx.Context, _ *notifier.SchemaChangeEvent) error {
		// Mock a duplicate key error
		_, err := sctx.GetSQLExecutor().Execute(context.Background(), "INSERT INTO test."+ddl.NotifierTableName+" VALUES(1, -1, 'some', 0)")
		return err
	}
	// Always succeeds
	successHandler := func(_ context.Context, _ sessionctx.Context, _ *notifier.SchemaChangeEvent) error {
		return nil
	}
	n.RegisterHandler(2, successHandler)
	n.RegisterHandler(1, failHandler)
	n.OnBecomeOwner()
	tk2 := testkit.NewTestKit(t, store)
	se := sess.NewSession(tk2.Session())
	ctx := context.Background()
	event1 := notifier.NewCreateTableEvent(&model.TableInfo{ID: 1000, Name: ast.NewCIStr("t1")})
	err := notifier.PubSchemeChangeToStore(ctx, se, 1, -1, event1, s)
	require.NoError(t, err)
	require.Never(t, func() bool {
		changes := make([]*notifier.SchemaChange, 8)
		result, closeFn := s.List(ctx, se)
		count, err2 := result.Read(changes)
		require.NoError(t, err2)
		closeFn()
		return count == 0
	}, 5*time.Second, 50*time.Millisecond)
}
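
The bitmap arithmetic behind the deletion check can be worked through with the handler IDs 1 and 2 registered in the test above. This is a sketch under assumptions: bitmapOf and allProcessed are hypothetical helpers, and it assumes handlersBitMap is the OR of 1 << id over all registered handlers, which the issue text does not show. It illustrates the comparison only, not the exact failure path exercised by the test.

```go
package main

import "fmt"

// bitmapOf builds a bitmap with one bit per handler ID
// (assumed to mirror how handlersBitMap is derived).
func bitmapOf(ids ...uint) uint64 {
	var m uint64
	for _, id := range ids {
		m |= 1 << id
	}
	return m
}

// allProcessed mirrors the comparison that guards the delete.
func allProcessed(processedByFlag, handlersBitMap uint64) bool {
	return processedByFlag == handlersBitMap
}

func main() {
	handlers := bitmapOf(1, 2) // 0b110

	// Only handler 2 has durably committed its bit.
	var persisted uint64
	persisted |= 1 << 2

	// Handler 1's bit exists in memory only, because its commit failed.
	inMemory := persisted | 1<<1

	fmt.Printf("%03b %v %v\n",
		handlers,
		allProcessed(persisted, handlers),
		allProcessed(inMemory, handlers))
	// prints "110 false true"
}
```

The persisted flag correctly keeps the event alive, while the stale in-memory flag equals the full bitmap and would allow the event to be deleted before handler 1 has actually processed it.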

@jebter jebter added the component/ddl This issue is related to DDL of TiDB. label Jan 21, 2025