Update restore network fix for checkpointed container to latest bouch… #19
base: cr-combined
Methods for checkpointing and restoring containers were added to the native driver. The LXC driver returns an error message that these methods are not implemented yet.
Signed-off-by: Saied Kazemi <[email protected]>
Conflicts:
daemon/execdriver/native/create.go
daemon/execdriver/native/driver.go
daemon/execdriver/native/init.go
Conflicts:
daemon/execdriver/driver.go
daemon/execdriver/native/create.go

Support was added to the daemon to use the Checkpoint and Restore methods of the native exec driver for checkpointing and restoring containers.
Signed-off-by: Saied Kazemi <[email protected]>
Conflicts:
api/server/server.go
daemon/container.go
daemon/daemon.go
daemon/networkdriver/bridge/driver.go
daemon/state.go
vendor/src/github.com/docker/libnetwork/ipallocator/allocator.go
Conflicts:
api/server/server.go

- C/R is now an EXPERIMENTAL level feature.
- Requires CRIU 1.6 (and builds it from source in the Dockerfile)
- Introduces checkpoint and restore as top level cli methods (will likely change)
Signed-off-by: Ross Boucher <[email protected]>

…er/docker/cr-combined.
Reuse the endpoint of the checkpointed container on restore.
Pass the veth pair name to CRIU when restoring a checkpointed container.
TODO: Add libnetwork API to retrieve ethXXX in the container
Signed-off-by: Hui Kang <[email protected]>
I'll try this out as soon as I get a chance.
  for _, i := range criuOpts.VethPairs {
  	veth := new(criurpc.CriuVethPair)
- 	veth.IfOut = proto.String(i.HostInterfaceName)
+ 	veth.IfOut = proto.String(i.HostInterfaceName + "@docker0")
You should add the @docker0 in docker rather than here.
rename veth name in runconfig/restore.go Signed-off-by: Hui Kang <[email protected]>
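The review suggestion above, attaching the bridge suffix on the Docker side before the veth pairs reach libcontainer, could look roughly like the sketch below. This is not the code from runconfig/restore.go: the `vethPair` type and the `attachToBridge` helper are hypothetical names used only for illustration, and the bridge name is passed in as a parameter rather than hard-coded.

```go
package main

import (
	"fmt"
	"strings"
)

// vethPair is a hypothetical stand-in for the entries in criuOpts.VethPairs.
type vethPair struct {
	ContainerInterfaceName string
	HostInterfaceName      string
}

// attachToBridge returns a copy of pairs with "@<bridge>" appended to each
// host-side interface name. Names that already contain "@" are left as-is,
// and an empty bridge name leaves every entry unchanged.
func attachToBridge(pairs []vethPair, bridge string) []vethPair {
	out := make([]vethPair, len(pairs))
	for i, p := range pairs {
		if bridge != "" && !strings.Contains(p.HostInterfaceName, "@") {
			p.HostInterfaceName += "@" + bridge
		}
		out[i] = p
	}
	return out
}

func main() {
	// "veth1a2b3c" is a made-up host interface name.
	pairs := attachToBridge([]vethPair{{"eth0", "veth1a2b3c"}}, "docker0")
	fmt.Println(pairs[0].ContainerInterfaceName + "=" + pairs[0].HostInterfaceName)
}
```

With the suffix applied by the caller, the loop in libcontainer can keep passing `i.HostInterfaceName` through unchanged.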
@boucher Updated.
Sorry for the delay. This seems to be working for me locally. Unfortunately, I can't really merge it with the libnetwork change. We need to figure out a way to do this that they'll accept.
Unfortunately, I've rebased and this no longer applies again. container.NetworkSettings.EndpointID no longer appears to exist, and the releaseNetwork logic has been moved around quite a bit.
I've pushed an attempted update here, but it has some flaws:
I will look at it soon. Thanks.
e91c518 to 988a915
b584b5a to a6a4511
7c96921 to 7fda470
I am trying to checkpoint and restore a container with an active TCP connection. For this I took the latest code from boucher's cr-combined branch and compiled it with the Experimental flag enabled. I have compiled and installed CRIU version 1.8.
I have a Docker image (TCP server) that contains code listening on a TCP socket, and I run a client that sends messages to the server and waits for each response. The client runs on the same host as the containers.
When I issue a checkpoint, the client sends its message to the server and keeps waiting for the response. Once the server container is restored (the same container, not a new one), the client is unable to send the message and exits. I also found that the eth0 interface of the restored container is not in the running state (the Docker bridge cannot be pinged from inside the container).
The issue does not occur if I run Docker with the --net=host option; in that case checkpoint and restore of the TCP connection works seamlessly.
Is this a known issue, and is there a workaround for it?
06cd8b9 to 9bb9ce0
d80f2fb to 9272300
@amakumar, thanks a lot for your post. I ran into the same issue.
The number of processes matters. Here is the clue I found: if you do not use --net=host, ps auxf shows TWO processes; otherwise there is only ONE process. That is the difference.
https://criu.org/Inheriting_FDs_on_restore
lsof -p [container process id]
myapp 23285 root 1w FIFO 0,9 0t0 262144 pipe
Maybe Docker's native checkpoint/restore does not support, or does not handle well, inheriting FDs on restore.