-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crosscluster/logical: enable automatic bidi replication #138297
Conversation
It looks like your PR touches SQL parser code but doesn't add or edit parser tests. Please make sure you add or edit parser tests if you edit the parser. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
62474e0
to
84d69ee
Compare
…ine tables This patch is required for cockroachdb#138297, where an LDR stream is created from a source that is offline. The subsequent PR will add test coverage to this. Epic: none Release note: none
…ine tables This patch is required for cockroachdb#138297, where an LDR stream is created from a source that is offline. The subsequent PR will add test coverage to this. Epic: none Release note: none
3eee766
to
a254d02
Compare
…ine tables This patch is required for cockroachdb#138297, where an LDR stream is created from a source that is offline. The subsequent PR will add test coverage to this. Epic: none Release note: none
First commit getting reviewed in #138680 |
The stress race failure is likely caused by admission control. In the logs, i see a lot of:
|
a254d02
to
1e13926
Compare
…ine tables This patch is required for cockroachdb#138297, where an LDR stream is created from a source that is offline. The subsequent PR will add test coverage to this. Epic: none Release note: none
1e13926
to
9b5a8e7
Compare
…tables This patch is required for cockroachdb#138297, where an LDR stream is created from a source that is offline. The subsequent PR will add test coverage to this. Epic: none Release note: none
@@ -406,6 +406,26 @@ func (p *partitionedStreamClient) CreateForTables( | |||
return spec, nil | |||
} | |||
|
|||
func (p *partitionedStreamClient) SetupReverseStream( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like SetupReverseStream
is a bit of an odd man out compared to the other methods on the client. The other methods could be conceivably implemented over something like GRPC. SetupReverseStream
on the other hand is accepting a sql command and is basically conn.Exec
with a traced name.
I think this would match the pattern better if it took a struct that described the desired stream and internally constructed the statement.
Alternatively, I think this should be named something that makes it clear what it really is: like ExecStatement
or even expose the underlying pg connection object and run the statement on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good callout. I'll refactor this to be ExecStatement
.
I prefer that over your first suggestion as it's much easier to construct the reverse stream statement during the planning of the original ldr stmt, where we access to the OG parsed statement. Then, since this needs to be called during job execution, i merely need to persist the cmd instead of all the components of command.
jobID := p.ExecCfg().JobRegistry.MakeJobID() | ||
var reverseStreamCmd string | ||
if stmt.CreateTable && options.BidirectionalURI() != "" { | ||
// TODO: validate URI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validating the URI is a bit tricky. We really need the peer cluster to validate it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really need the peer cluster to validate it.
Good point. I can plan to pass it as an extra arg to CreateForTables
, so the og source can validate it. Sounds reasonable?
138765: rowexec: add simple benchmark for index backfills r=rafiss a=rafiss This benchmark is not super realistic, since it runs in memory, but it still should be useful for tracking changes over time. Epic: CRDB-42901 Release note: None 138786: sql: add gauge for object count r=annrpom a=annrpom ### sql: remove redundant metrics Epic: none Release note (backwards-incompatible change): Several metrics are redundant and have been removed. Below is a mapping from removed metrics to an existing, identical metric: ``` [REMOVED] sql.schema_changer.running ---- jobs.schema_change.currently_running [REMOVED] sql.schema_changer.successes ---- jobs.schema_change.resume_completed [REMOVED] sql.schema_changer.retry_errors ---- jobs.schema_change.resume_retry_error [REMOVED] sql.schema_changer.permanent_errors ---- jobs.schema_change.resume_failed ``` --- ### sql: add gauge for object count This patch introduces a metric, `sql.schema_changer.object_count`, that keeps track of the count of descriptors in the cluster. Epic: none Fixes: #134740 Release note (ops change): This patch introduces a metric, `sql.schema_changer.object_count`, that keeps track of the count of objects in the cluster. 138823: sql/externalcatalog: allow ExtractInternalCatalog to include offline tables r=fqazi a=msbutler This patch is required for #138297, where an LDR stream is created from a source that is offline. The subsequent PR will add test coverage to this. Epic: none Release note: none Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Annie Pompa <[email protected]> Co-authored-by: Michael Butler <[email protected]>
This patch introduces two options to CREATE LOGICALLY REPLICATED TABLES: - UNIDIRECTIONAL: to specify a unidectional ldr stream with a fast offline scan - BIDIRECTIONAL ON {uri}: which customers will use to set up a bidirectional stream with one command. The uri provided by the customer connects to the source of the reverse stream, aka the destination of the main stream. The PARENT option will be used internally to set up the reverse stream. Specifically, when the destination sets up the reverse stream, it will pass its jobID as the parent for the reverse stream, which the reverse stream will use to prevent creating duplicate reverse streams and to configure a stream that replicates from an offline source. Customers should not directly use it. Epic: none Release note: none
This destination will use this api to set up a reverse stream from a cursor. Epic: none Release note: none
For mysterious resons, the resolver cannot fetch offline descriptors unless the mutable flag is passed. Epic: none Release note: none
This patch allows a user to create an LDR stream that replicates from an offline table. Epic: none Release note: none
This patch enables a user to create a bidirectional replication stream with one command. When the user runs `CREATE LOGICALLY REPLICATED TABLE tabB FROM tabA ON uriA WITH BIDIRECTIONAL ON uriB`, the following will happen: - on cluster B, during job planning, construct the reverse stream cmd, while we still have access to original LDR cmd. This command will have the following form: `CREATE LOGICAL REPLICATION STREAM FROM TABLE tabB ON 'uriB' INTO TABLE tabA WITH CURSOR = $1, PARENT = '{og job id}';` - on cluster B, create table tabB and begin an offline scan, as normal - on cluster B, once the offline scan completes, but before tabB is published, set up the reverse stream from B to A at a cursor time after the initial scan completes but before the tabB has come online. This cursor time prevents data looping and ensures the replication of all future foreground writes. - on cluster B, steady state replication begins. Epic: none Release note (sql change): when a user runs CREATE LOGICALLY REPLICATED TABLE, they must specify one of the following options: - UNIDIRECTIONAL: setups a unidirectional stream with fast initial scan - BIDIRECTIONAL ON {dest uri}: sets up a bidirectional stream from the original dest to the original source.
9b5a8e7
to
15876c2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
TFTR! bors r=jeffswenson |
This patch enables a user to create a bidirectional replication stream with one
command. When the user runs
CREATE LOGICALLY REPLICATED TABLE tabB FROM tabA ON uriA WITH BIDIRECTIONAL ON uriB
, the following will happen:still have access to original LDR cmd. This command will have the following
form:
CREATE LOGICAL REPLICATION STREAM FROM TABLE tabB ON 'uriB' INTO TABLE tabA WITH CURSOR = $1, PARENT = '{og job id}';
set up the reverse stream from B to A at a cursor time after the initial scan
completes but before the tabB has come online. This cursor time prevents data
looping and ensures the replication of all future foreground writes.
Epic: none
Release note (sql change): when a user runs CREATE LOGICALLY REPLICATED TABLE,
they must specify one of the following options:
dest to the original source.