Add linearizable support to SQL VSchema management #17401

Open
wants to merge 13 commits into main

@mattlord mattlord commented Dec 17, 2024

Description

This PR prevents lost writes when using the VTGate SQL API for VSchema management (see issue).

It also lays the foundation for supporting linearizability guarantees for vschemas within Vitess. Please see the tracking issue for the other known pieces.

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported to release branches
  • If this change is to be back-ported to previous releases, a justification is included in the PR description
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on CI?
  • Documentation was added or is not required

vitess-bot bot commented Dec 17, 2024

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • Ensure there is a link to an issue (except for internal cleanup and flaky test fixes), new features should have an RFC that documents use cases and test cases.

Tests

  • Bug fixes should have at least one unit or end-to-end test, enhancement and new features should have a sufficient number of tests.

Documentation

  • Apply the release notes (needs details) label if users need to know about this change.
  • New features should be documented.
  • There should be some code comments as to why things are implemented the way they are.
  • There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • Is this flag really necessary?
  • Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • vtctl command output order should be stable and awk-able.

@vitess-bot vitess-bot bot added the NeedsBackportReason, NeedsDescriptionUpdate, NeedsIssue, and NeedsWebsiteDocsUpdate labels Dec 17, 2024
@github-actions github-actions bot added this to the v22.0.0 milestone Dec 17, 2024
@mattlord mattlord force-pushed the vschema_topo_version branch 4 times, most recently from 4b4578d to be1ccd1 Compare December 18, 2024 22:11
@mattlord mattlord added the Component: Query Serving and Type: Enhancement labels and removed the NeedsWebsiteDocsUpdate, NeedsIssue, and NeedsBackportReason labels Dec 18, 2024
Signed-off-by: Matt Lord <[email protected]>
@mattlord mattlord force-pushed the vschema_topo_version branch from be1ccd1 to b2e2ba6 Compare December 19, 2024 22:21
@mattlord mattlord force-pushed the vschema_topo_version branch 2 times, most recently from e9319d3 to 1ccc129 Compare January 6, 2025 21:46
Signed-off-by: Matt Lord <[email protected]>
@mattlord mattlord force-pushed the vschema_topo_version branch from 1ccc129 to e02adaf Compare January 6, 2025 22:48

codecov bot commented Jan 13, 2025

Codecov Report

Attention: Patch coverage is 75.00000% with 65 lines in your changes missing coverage. Please review.

Project coverage is 67.70%. Comparing base (71ccd6d) to head (d8c9c53).

Files with missing lines                        Patch %   Lines
go/vt/vtctl/vtctl.go                            31.25%    22 Missing ⚠️
go/vt/vtgate/executorcontext/vcursor_impl.go    60.00%    8 Missing ⚠️
go/cmd/vtcombo/cli/vschema_watcher.go           0.00%     7 Missing ⚠️
go/vt/topotools/vschema_ddl.go                  86.04%    6 Missing ⚠️
go/vt/vtcombo/tablet_map.go                     0.00%     5 Missing ⚠️
go/vt/topo/vschema.go                           90.00%    4 Missing ⚠️
go/vt/vtctl/grpcvtctldserver/server.go          85.71%    4 Missing ⚠️
go/vt/vtgate/vschema_manager.go                 55.55%    4 Missing ⚠️
go/vt/topo/helpers/copy.go                      33.33%    2 Missing ⚠️
go/test/utils/diff.go                           0.00%     1 Missing ⚠️
... and 2 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #17401      +/-   ##
==========================================
- Coverage   67.71%   67.70%   -0.02%     
==========================================
  Files        1584     1584              
  Lines      254721   254799      +78     
==========================================
+ Hits       172473   172499      +26     
- Misses      82248    82300      +52     


This reverts commit cd455a0.

Signed-off-by: Matt Lord <[email protected]>
@mattlord mattlord removed the NeedsDescriptionUpdate label Jan 14, 2025
Signed-off-by: Matt Lord <[email protected]>
@mattlord mattlord force-pushed the vschema_topo_version branch 6 times, most recently from e1122ea to 1e3a496 Compare January 15, 2025 00:34
@mattlord mattlord force-pushed the vschema_topo_version branch from 1e3a496 to f5c6c63 Compare January 15, 2025 01:29
Signed-off-by: Matt Lord <[email protected]>
return err
}
} else {
// Use the cached version as we are in read-only mode
Contributor

When do we get to this code path? Later on we are calling vc.vm.UpdateVSchema() which is expected to write back to the topo, correct? Should we just return an error here, if it is going to fail anyway later on?

@mattlord mattlord Jan 17, 2025

We get here when the topo connection is read-only, e.g. when using --keyspaces_to_watch (see #8988). In that case it's fine to use the cached version because we know that the subsequent ApplyVSchemaDDL() call will fail. I could return an error here rather than make the Apply call, but I left it this way to keep the flow (and unit test) unchanged.

I agree that it's a little awkward, so I'm happy to instead return an error about the topo being in read-only mode when vc.topoServer == nil if others prefer that.

This is the current error that users see:

go/test/endtoend/vtgate/keyspace_watches/keyspace_watch_test.go:        vschemaDDLError = fmt.Sprintf("Error 1105 (HY000): cannot perform Update on keyspaces/%s/VSchema as the topology server connection is read-only",

I'll see if I can maintain that with an earlier failure here.

Contributor Author

@harshit-gangal what do you think? I'll hold off until you have a chance to review. Perhaps the error here doesn't really matter much and I can instead just return:

cannot update the VSchema as the topology server connection is read-only

@GuptaManan100 GuptaManan100 left a comment

One change needs to be made, but otherwise this looks good to me. Approving so that the PR can be merged after the change has been made. ❤️

Comment on lines +32 to +37
// keyspace's vschema.
type KeyspaceVSchemaInfo struct {
	Name string
	*vschemapb.Keyspace
	version Version
}
@GuptaManan100 GuptaManan100 Jan 18, 2025

This struct is a duplicate of an existing one in this same package. There exists KeyspaceInfo in keyspace.go. The struct definition is as follows -

// KeyspaceInfo is a meta struct that contains metadata to give the
// data more context and convenience. This is the main way we interact
// with a keyspace.
type KeyspaceInfo struct {
	keyspace string
	version  Version
	*topodatapb.Keyspace
}

It pretty much has the same information just with a different nomenclature. I think we should remove this and use that struct directly or vice versa.

Contributor Author

KeyspaceInfo is for the topo Keyspace record:

vitess/proto/topodata.proto

Lines 259 to 301 in eaaa206

// A Keyspace contains data about a keyspace.
message Keyspace {
  // OBSOLETE string sharding_column_name = 1;
  reserved 1;

  // OBSOLETE KeyspaceIdType sharding_column_type = 2;
  reserved 2;

  // OBSOLETE int32 split_shard_count = 3;
  reserved 3;

  // OBSOLETE ServedFrom served_froms = 4;
  reserved 4;

  // keyspace_type will determine how this keyspace is treated by
  // vtgate / vschema. Normal keyspaces are routable by
  // any query. Snapshot keyspaces are only accessible
  // by explicit addresssing or by calling "use keyspace" first
  KeyspaceType keyspace_type = 5;

  // base_keyspace is the base keyspace from which a snapshot
  // keyspace is created. empty for normal keyspaces
  string base_keyspace = 6;

  // snapshot_time (in UTC) is a property of snapshot
  // keyspaces which tells us what point in time
  // the snapshot is of
  vttime.Time snapshot_time = 7;

  // DurabilityPolicy is the durability policy to be
  // used for the keyspace.
  string durability_policy = 8;

  // ThrottlerConfig has the configuration for the tablet
  // server's lag throttler, and applies to the entire
  // keyspace, across all shards and tablets.
  ThrottlerConfig throttler_config = 9;

  // SidecarDBName is the name of the Vitess sidecar database
  // used for various system metadata that is stored in each
  // tablet's mysqld instance.
  string sidecar_db_name = 10;
}

KeyspaceVSchemaInfo is for the keyspace's vschema:

// Keyspace is the vschema for a keyspace.
message Keyspace {
  // If sharded is false, vindexes and tables are ignored.
  bool sharded = 1;
  map<string, Vindex> vindexes = 2;
  map<string, Table> tables = 3;
  // If require_explicit_routing is true, vindexes and tables are not added to global routing
  bool require_explicit_routing = 4;

  // foreign_key_mode dictates how Vitess should handle foreign keys for this keyspace.
  ForeignKeyMode foreign_key_mode = 5;

  enum ForeignKeyMode {
    unspecified = 0;
    disallow = 1;
    unmanaged = 2;
    managed = 3;
  }

  // multi_tenant_mode specifies that the keyspace is multi-tenant. Currently used during migrations with MoveTables.
  MultiTenantSpec multi_tenant_spec = 6;
}

Definitely not the same thing 🙂


// TestVSchemaSQLAPIConcurrency tests that we prevent lost writes when we have
// concurrent vschema changes being made via the SQL API.
func TestVSchemaSQLAPIConcurrency(t *testing.T) {
@GuptaManan100 GuptaManan100 Jan 18, 2025

Good test 😍

Labels
Component: Query Serving, Type: Enhancement
Projects
None yet
3 participants