☂️ Enhance and Stabilise Druid E2E tests #782

Open
1 of 18 tasks
unmarshall opened this issue Apr 4, 2024 · 2 comments
Labels: area/dev-productivity, area/quality, area/testing, kind/enhancement

@unmarshall
Contributor

unmarshall commented Apr 4, 2024

What you would like to be added:

The e2e tests for etcd-druid should test only etcd-druid, not etcd-backup-restore. Eventually, all tests specific to backup-restore or etcd should be removed from this repository and moved to the etcd-backup-restore and etcd-wrapper repositories respectively.

  • Use ko to build images so that image builds are faster.

  • Generate all PKI artifacts to be used for e2e tests. This utility should be re-used for any tests (other than e2e tests) that require PKI artifacts.

  • Use the local provider instead of emulators for etcd-druid e2e tests. hostPath can be configured when creating the backup-secret. Kind already comes bundled with rancher's local-path-provisioner, which makes it possible to use a local store as the backup-bucket. As a consequence, hack/e2e-test/infrastructure can be removed. This considerably simplifies the e2e setup and reduces the time it takes. Open question, still to be decided: do we need to test with(out) a backup-bucket?

  • Use separate namespaces so that e2e tests can run concurrently (using Go-native tests).

  • In skaffold.yaml, change the image to just etcd-druid. Invoking kind-up sets up a KIND cluster together with a local registry and automatically integrates the cluster with this registry. This avoids explicit loading of images via kind load and completely removes the need to push dev images to the GCR container registry.

  • Replace ginkgo with native Go tests. We are removing ginkgo usage from druid. A lot of unit tests and some IT tests have already been migrated and will be merged with Druid Refactor to Address Multiple Controller Conflicts #777.

  • For tests that have failed, preserve their namespaces so that developers can debug. For tests that have passed, clean up the respective namespaces.

  • Do not stop the kind cluster at the end of the test run, or at least provide an option to keep it running. Cleanup is mandatory for the Concourse pipeline, but for local runs, where failures need to be analysed, it can be switched off.

  • Need to add proper compaction and copy-backups-task testing to the e2e test suite. (Add copy-backups-task tests to e2e test suite #705)

  • Add secret controller tests to e2e test suite #706

  • Introduce etcd-druid upgrade tests #807

    • Add to CI pipeline with PR branch vs master branch (or previous release)
  • Test backward compatibility to previous druid version (support for downgrade)

  • Test reconciliation after an error in a previous reconciliation that left the etcd cluster unready. This test would catch cases such as the one described in Deleting the etcd-bootstrap configmap leads to etcd reconciliation to never succeed #818

  • Enable the etcd-components webhook and run the e2e tests to check the functionality offered by this webhook.

  • (Performance/Benchmark): Record and publish (as logs) startup times for etcd clusters. Ideally these should be recorded as metrics for all clusters managed by druid on dev/staging/canary/live landscapes. This will help understand any deterioration in the startup times across releases.

  • Make it possible to set a breakpoint in any test and debug it from the IDE.

  • Support fast iterations with quasi hot deploy (Go unfortunately does not support real hot-deploy).

Motivation (Why is this needed?):
E2E tests should be simple, comprehensive, fast and stable.

@unmarshall unmarshall added the kind/enhancement Enhancement, improvement, extension label Apr 4, 2024
@unmarshall unmarshall self-assigned this Apr 4, 2024
@unmarshall unmarshall added area/dev-productivity Developer productivity related (how to improve development) area/quality Output qualification (tests, checks, scans, automation in general, etc.) related area/testing Testing related labels Apr 4, 2024
@shreyas-s-rao shreyas-s-rao changed the title Enhance and Stabilise Druid E2E tests ☂️ Enhance and Stabilise Druid E2E tests Jun 24, 2024
@shreyas-s-rao
Contributor

The manual tests that I generally run before merging large PRs cover different combinations of druid auto-reconcile disabled/enabled, single/multi-node etcds, backups disabled/enabled (with different providers), TLS disabled/enabled, etc., for various scenarios like:

  • etcd creation
  • reconciliation
  • spec changes
  • scale-up of replicas (with different combinations of TLS disabled/enabled)
  • hibernation/unhibernation (scale down to 0 and back up to original replicas)
  • upgrade of druid from old to new version (with checks for etcd status reconciliation, and later spec reconciliation)
  • compaction jobs
  • copy-backups tasks

Ex: list of manual tests executed before merging #777

| TEST NAME | Druid Auto-Reconcile | Single/Multi Node | Backups (provider) | Etcd Client TLS | Etcd Peer TLS | EtcdBR TLS | TEST RESULT |
|---|---|---|---|---|---|---|---|
| Deploy etcd, check reconciliation, hibernate, unhibernate, delete etcd | FALSE | Single | NA | FALSE | FALSE | FALSE | TRUE |
| Deploy etcd, check reconciliation, hibernate, unhibernate, delete etcd | FALSE | Single | NA | TRUE | TRUE | TRUE | TRUE |
| Deploy etcd, check reconciliation, hibernate, unhibernate, delete etcd | FALSE | Single | AWS | TRUE | TRUE | TRUE | TRUE |
| Deploy etcd, check reconciliation, hibernate, unhibernate, delete etcd | FALSE | Multi | NA | FALSE | FALSE | FALSE | TRUE |
| Deploy etcd, check reconciliation, hibernate, unhibernate, delete etcd | FALSE | Multi | NA | TRUE | TRUE | TRUE | TRUE |
| Deploy etcd, check reconciliation, hibernate, unhibernate, delete etcd | FALSE | Multi | AWS | TRUE | TRUE | TRUE | TRUE |
| Deploy etcd, check reconciliation, hibernate, unhibernate, delete etcd | FALSE | Multi | GCP | TRUE | TRUE | TRUE | TRUE |
| Deploy etcd, check reconciliation, hibernate, unhibernate, delete etcd | FALSE | Multi | Azure | TRUE | TRUE | TRUE | TRUE |
| Deploy etcd, check reconciliation, hibernate, unhibernate, delete etcd | FALSE | Multi | Openstack | TRUE | TRUE | TRUE | TRUE |
| Deploy etcd, check reconciliation, hibernate, unhibernate, delete etcd | FALSE | Multi | Local | TRUE | TRUE | TRUE | TRUE |
| Perform etcd spec changes, check if reconciliation triggered | FALSE | Multi | GCP | TRUE | TRUE | TRUE | TRUE |
| Scale-up etcd from single-node non-TLS to multi-node non-TLS, hibernate, unhibernate | FALSE | Single | GCP | FALSE | FALSE | FALSE | TRUE |
| Scale-up etcd from single-node non-TLS to multi-node TLS, hibernate, unhibernate | FALSE | Single | GCP | FALSE | FALSE | FALSE | TRUE |
| Scale-up etcd from single-node TLS to multi-node TLS, hibernate, unhibernate | FALSE | Single | NA | TRUE | TRUE | TRUE | TRUE |
| Upgrade druid from master to #777, check status updates, add reconcile annotation, check reconciliation | FALSE | Multi | GCP | TRUE | TRUE | TRUE | TRUE |
| Deploy etcdcopybackupstask, check success | FALSE | Multi | Local | TRUE | TRUE | TRUE | TRUE |
| Configure compaction with low threshold, populate etcd, check if compaction jobs are triggered and run | FALSE | Single | AWS | TRUE | TRUE | TRUE | TRUE |
| Deploy etcd, check reconciliation, hibernate, unhibernate, delete etcd | TRUE | Multi | GCP | TRUE | TRUE | TRUE | TRUE |
| Perform etcd spec changes, check if reconciliation triggered | TRUE | Multi | GCP | TRUE | TRUE | TRUE | TRUE |

@unmarshall
Contributor Author

#833 introduced namespace separation but this will be completely re-written.
