-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Fix - scylla-node] - Ansible builtin + when condition start node #372
base: master
Are you sure you want to change the base?
Conversation
well the idea looks OK to me, @vladzcloudius ? |
code wise however you should split the builtin into separate commit (since it's refactoring) both patches/commits should have messages prefixed with and the functional change should describe the change and intended behaviour fix of course thinking about it, there might be more corner cases that this change will unblock and that would require more fixes cleanup task is then done afterwards (after all nodes are added) but I see how your idea can speed up the role run for big clusters, so we should consider it |
…des, skip tasks that miss information
…a-ansible-roles into fix-scylla-node-ansible-2.14
…ommentary sections
Hello @tarzanek, Thanks for the feedback, I've split both commits let me know if it's ok for you. Thanks for the prefix tips I've include it. We do use Let me know if you need anything else to allow this PR. |
Hello @tarzanek , Any update about this PR ? Thanks a lot in advance ! |
Please note that we already have another PR under review that will add support for the Also note that this PR couldn't be merged before fixing the The In general, even though this serves as a workaround, relying on |
Hello @igorribeiroduarte , Thanks for your feedback. For the Let me know if it's not clear I can elaborate more. |
@unkls First of all it's awesome that you enjoy Scylla Role. The main problem is that Role has a few invariants that are being validated in regards to the cluster. Another pieces that are probably going to break are the tasks that are replicating the node configuration from the first node in the DC ensuring that all nodes are configured the same way: "Collect IO settings for the first node of every DC". And we there are a few other places that are going to break unless you execute a role on all nodes of the cluster ( And the thing is - there is no much motivation to restrict the Role using If you do that the execution of a Role from the beginning to the end on the cluster that has already been configured takes about 2 minutes - in your example the execution on the whole cluster and on a single node that bootstraps is going to take exactly the same time. Instrumenting a Node Role to support
While this is not impossible but I find the motivation for this unclear while cost of such an instrumentation - quite significant. |
@vladzcloudius I made a local change related to The above change now waits for all "unhealthy" cluster nodes to be 0. This seems to be a useful check since this step executes after the check for the CQL port step. I ran into an issue deploying a test cluster from scratch were the second node of three would never proceed past the "become healthy" step because only one of three nodes were deployed prior to the check. |
Hello Scylladb community,
First thanks for that amazing ansible role !!
I'm currently using your ansible-role to deploy a cluster from scratch and adding a node into a cluster. I'm using ansible
2.16
to match your requirement (2.8
). Currently when I was using your examples I was facing some issues with 2 tasks :Start scylla non-seeds nodes serially
|Start seeders serially
.This issue append when I'm trying to add a node into the cluster.
My error was due to the
Resolve scylla_broadcast_address
task that was setting the fact only for the current node that run the task.Then when we enter into
start_scylla_service dependent tasks
task, it will iterate over all the servers inside our current inventory and try to catch thescylla_broadcast_address
. This variable is empty when we call the role for one node.Examples :
My inventory is like :
I want to add a new node
scylla-4
from the group[new_node]
.My playbook is like :
I call it like :
ansible-playbook scylla.yaml -i inventories/inventory.yml -u devops --diff -l scylla-4 -t new_node -e "@group_vars/scylla_poc"
When I arrive at the task
scylla_broadcast_address
it will perform the set_fact only for my nodescylla-4
. Once I will run the taskStart seeders serially
ansible will iterate overloop: "{{ groups['scylla'] }}"
but by design this should run only the new node.That will iterate like :
And same pattern append with
Start scylla non-seeds nodes serially
task.My fix proposal is to add a new check to avoid crash errors on set_fact empty for
scylla_broadcast_address
.Let me know if you need more to debug/discuss !
Thanks a lot in advance !