Dkron can't be safely used in k8s at the moment #1442
Comments
Possibly fixed in #1446.
Hey, can you try with v4.0.0-beta? This should be fixed by #1446.
Also, there is a significant change in the Dkron k8s Helm chart. @vcastellm, are you going to merge it too?
We are certainly looking forward to Dkron v4, but wouldn't it be a good idea to release a patch version of Dkron 3.2.x (with #1446) so that Dkron can be used in k8s now?
@ivan-kripakov-m10 it would be possible to release a patch version for v3, but I don't see any advantage in it. Can you elaborate on possible use cases for v3 vs v4?
@vcastellm not sure if I'm supposed to use any extra flags, but 4.0.0-beta3 does not fix my issue #1253 (which I believe to be similar to this one). After killing the server (to make it restart), agents report a log like this one
Docker swarm compose (to illustrate configuration):
services:
  server:
    image: dkron/dkron:4.0.0-beta3
    command: agent
    environment:
      #DKRON_NODE_NAME: "{{.Node.Hostname}}"
      DKRON_NODE_NAME: dkron1
      DKRON_DATA_DIR: /ext/data
      DKRON_SERVER: 1
      DKRON_BIND_ADDR: tasks.server:8946
      DKRON_BOOTSTRAP_EXPECT: 1
    deploy:
      mode: replicated
      replicas: 1
  agents:
    image: dkron/dkron:4.0.0-beta3
    command: agent
    environment:
      DKRON_NODE_NAME: "{{.Node.Hostname}}"
      DKRON_RETRY_JOIN: tasks.server
      DKRON_BIND_ADDR: '{{`{{ GetInterfaceIP "eth0" }}:8946`}}'
      DKRON_TAG: 'arch={{.Node.Platform.Architecture}} server=false'
    deploy:
      mode: global
@vcastellm For me, the main advantage is speed. From what I gather, version 4 will bring numerous changes to both the user interface and the backend. Implementing change #1446 and rolling out a release that lets users run Dkron in k8s seems like a more straightforward and quicker task compared to the extensive v4 update.
I converted a Dkron test instance with 3 servers and 2 agents to version 4.0.0-beta4. After that I deleted various pods several times, restarted the servers' StatefulSet, and so on. In all cases, the new pods reconnected correctly to the Dkron cluster, IP changes were handled, and leader election worked.
Hi,
Is there a Helm chart for v4?
@fabltd you can use the Helm chart from the main branch here: https://github.com/distribworks/dkron-helm
Does this install v4? Looking at the code, it looks like v3?
You can change the Dkron version here: https://github.com/distribworks/dkron-helm/blob/c57a99f7cf75d1f49e6290a6351280c57ce21356/dkron/values.yaml#L9
hi @vcastellm!
hi!
Is your feature request related to a problem? Please describe.
At the moment dkron cannot be safely used in k8s because dkron servers cannot handle IP changes.
To reproduce, deploy Dkron using the current Helm chart, shut down the cluster, and redeploy it.
Nodes will try to reconnect to each other using their old IPs, but this will never succeed.
Describe the solution you'd like
I think the consul-like approach can be used: hashicorp/consul#3403
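To illustrate the general idea only: one way to read the consul-like approach is that peers are tracked by a stable node identity rather than by IP address, so a server that comes back with a new IP updates its existing entry instead of being treated as unreachable. The sketch below is a toy model in Python, not Dkron's or Consul's actual implementation; `PeerTable`, `upsert`, the node name, and the addresses are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PeerTable:
    """Toy peer table keyed by a stable node ID instead of an IP address."""

    # Maps node ID -> last known "ip:port" address (hypothetical layout).
    peers: dict[str, str] = field(default_factory=dict)

    def upsert(self, node_id: str, address: str) -> bool:
        """Record a node's address; return True if a known node changed address."""
        previous = self.peers.get(node_id)
        self.peers[node_id] = address
        return previous is not None and previous != address

    def address_of(self, node_id: str) -> str:
        """Return the most recent address recorded for a node."""
        return self.peers[node_id]


table = PeerTable()
table.upsert("dkron-server-0", "10.0.0.5:8946")

# The pod is rescheduled and rejoins under the same name with a new IP.
changed = table.upsert("dkron-server-0", "10.0.0.9:8946")
```

Because the key is the node name, the rejoin overwrites the stale address rather than adding a second peer, which is the property that a table keyed by IP lacks.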
Additional context
I'm not sure whether this is the only problem with Dkron in k8s. There is a hypothesis that two TODOs (one and two) also need to be resolved, but I'm not sure; I will share updates if any appear.
If you know of any other problems, I would suggest making a series of improvements aimed at supporting Dkron in k8s.
I think many people would like to have such an opportunity (I have seen many issues that are related to this in one way or another).