br: add table filter for log restore #57394
base: master
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #57394 +/- ##
================================================
+ Coverage 73.0570% 74.2094% +1.1524%
================================================
Files 1689 1709 +20
Lines 467012 467741 +729
================================================
+ Hits 341185 347108 +5923
+ Misses 104851 98852 -5999
- Partials 20976 21781 +805
br/pkg/stream/table_mapping.go
Outdated
	} else {
		dr.Name = dbInfo.Name.O
	}
	return nil
}

-func (tc *TableMappingManager) parseTableValueAndUpdateIdMapping(dbID int64, value []byte) error {
+func (tm *TableMappingManager) ProcessTableValueAndUpdateIdMapping(dbID int64, tableInfo model.TableInfo) error {
Why not pass it by pointer, i.e. `tableInfo *model.TableInfo`?
And remove `tableInfo.ID = tableReplace.TableID` and `partitions.Definitions[i].ID = newID` below.
Good call. These came from the rewrite method and are not needed here.
}

// collect table history indexed by table id, same id may have different table names in history
if meta.IsTableKey(rawKey.Field) {
else if meta.IsTableKey(rawKey.Field) {
br/pkg/task/restore.go
Outdated
	// check during log backup
	dbName = name
} else {
	log.Warn("did not find db id in full/log backup, "+
Perhaps it's better to return an error here, because reaching this branch must be a code-level error.
If the full backup was taken with `--filter="foo*"` and PiTR doesn't have any filter, we can hit this case. Let me know your thoughts on how to handle it better.
rest LGTM
br/pkg/task/restore.go
Outdated
	return
}

func adjustTablesToRestoreAndCreateTableTracker(
This function is important, so it needs some unit tests.
	tempIDs = append(tempIDs, id)
}

// sort to -1, -2, -4, -8 ... etc
`tempIDs[i] > tempIDs[j]` does not mean `upstream(tempIDs[i]) > upstream(tempIDs[j])`, so is it necessary to sort `tempIDs`?
Right, it doesn't. My idea was to assign IDs in first-seen order, roughly like sorting by time in the log backup, hoping that would be a bit useful when debugging. The order doesn't imply anything else, such as upstream ID ordering.
br/pkg/task/restore.go
Outdated
}

// need to restore the matching table in snapshot restore phase
for _, originalTable := range originalDB.Tables {
I prefer to merge `filterRestoreFiles` and `adjustTablesToRestoreAndCreateTableTracker` without any intermediates. That's because I am worried about the time complexity of the loop.
Good question. We discussed this offline: rename/exchange table should be rare, so it should be fine. If it becomes a problem, we can convert it to a map in the future.
/retest |
}

// write cf doesn't have short value in it
if value == nil {
Should we handle the `DELETE` operations in the write CF? (Which doesn't have a value.)
Right, I don't think it's needed. At this step we are only tracking all the table IDs; if there is a delete, a create table must have happened before it, so that table is already tracked.
br/pkg/restore/log_client/client.go
Outdated
	ctx context.Context,
	s storage.ExternalStorage,
	tableFilter filter.Filter,
	piTRTableFilter *utils.PiTRTableTracker,
Perhaps name the variable `piTRTableTracker`, or rename `PiTRTableTracker` to something like `TracedPiTRTableFilter`? As it stands, `tracker.ContainsDB(foo) -> Filtered Out` looks a little bit strange.
Agreed. I want to avoid using "filter" here because the term is overloaded and can cause confusion. Let me rename everything to be tracker-related. But actually it is `tracker.ContainsDB(foo) -> included`.
func (rc *SnapClient) GetDatabaseMap() map[int64]*metautil.Database {
	dbMap := make(map[int64]*metautil.Database)
	for _, db := range rc.databases {
		dbMap[db.Info.ID] = db
	}
	return dbMap
How about `maps.Clone`?
Oh, it's indexed differently, if you mean cloning `rc.databases`: that one is indexed by DB name, and this one by DB ID.
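A minimal sketch of the distinction, assuming the source map is keyed by database name as described in the reply. The `dbInfo` and `database` types and the `databasesByID` helper are hypothetical simplifications, not the real `metautil.Database` API.

```go
package main

import "fmt"

// dbInfo and database are hypothetical, simplified stand-ins for the
// metadata types used in the hunk above.
type dbInfo struct {
	ID   int64
	Name string
}

type database struct {
	Info *dbInfo
}

// databasesByID re-keys a name-indexed map by database ID. A plain clone
// would keep the string keys, so it cannot replace this loop.
func databasesByID(byName map[string]*database) map[int64]*database {
	byID := make(map[int64]*database, len(byName))
	for _, db := range byName {
		byID[db.Info.ID] = db
	}
	return byID
}

func main() {
	byName := map[string]*database{
		"test": {Info: &dbInfo{ID: 7, Name: "test"}},
		"prod": {Info: &dbInfo{ID: 9, Name: "prod"}},
	}
	byID := databasesByID(byName)
	fmt.Println(byID[7].Info.Name, byID[9].Info.Name)
}
```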
br/pkg/task/restore.go
Outdated
@@ -1655,3 +1823,26 @@ func afterTableRestoredCh(ctx context.Context, createdTables []*snapclient.Creat
	}()
	return outCh
}

func convertMapsToSlices(
How about making a generic function like:
func convertMapToSlice[K comparable, V any](m map[K]V) []V {
	values := make([]V, 0, len(m))
	// maps.Values here is the standard library iterator (package "maps", Go 1.23+).
	for v := range maps.Values(m) {
		values = append(values, v)
	}
	return values
}
And calling it three times?
br/pkg/task/restore.go
Outdated
}
// handle case where current is not in range and original was in range, we need to remove the original from
// restoring
} else if utils.MatchTable(cfg.TableFilter, dbName, start.TableName) {
Just wondering whether the table will be created again when the meta is put again (by the meta entry of `rename`). I didn't find anything that filters out non-matching meta keys.
OK, I'm not sure what this one means. I will follow up offline.
br/pkg/task/restore_test.go
Outdated
	},
}

// Test case 1: Basic table tracking
When there are many similar cases, would you parameterize them? For example:
type Case struct {
	Filter               string
	PiTRHistories        struct{ tid int64; tname string; did int64 }
	ExpectsTables        [][2]int
	ExpectsDBNotExist    []int
	ExpectsTableNotExist []int
}
You may also wrap each case with `t.Run`, so cases can be run independently.
good call!
br/pkg/task/stream.go
Outdated
// gc.ratio-threshold = "-1.0", which represents disable gc in TiKV.
-func KeepGcDisabled(g glue.Glue, store kv.Storage) (RestoreFunc, string, error) {
+func DisableGc(g glue.Glue, store kv.Storage) (RestoreGcFunc, string, error) {
Go prefers capitalizing all letters in acronyms. Since we are renaming anyway, we may as well follow this convention.
-func DisableGc(g glue.Glue, store kv.Storage) (RestoreGcFunc, string, error) {
+func DisableGC(g glue.Glue, store kv.Storage) (RestoreGCFunc, string, error) {
got it, good to know!
Signed-off-by: Wenqi Mou <[email protected]>
What problem does this PR solve?
Issue Number: close #57613
Problem Summary:
Need table filter for PiTR
What changed and how does it work?
The following happens if a custom filter is specified during PiTR.
Performance:
We still scan the log meta KV twice; the only change is that the ID map building step, which already existed, now runs before the snapshot restore.
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.