
ADD: cni support for the docker executor. #586

Open
wants to merge 3 commits into master

Conversation

andreaspeters
Contributor

With this PR I would like to add support for CNI (Container Network Interface) to the Docker executor. As far as possible, I have followed the CNI implementation of the Mesos Containerizer.
The following parameters and their default values are added to the Docker Executor:

  • network_cni_plugins_dir: /usr/lib/cni/
  • network_cni_config_dir: /etc/mesos/cni/net.d

How is CNI used with Docker?

By setting a name on the mesosproto NetworkInfo object which matches the CNI name in the CNI configuration.
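For illustration, a task could select the network by setting `ContainerInfo.network_infos[].name` (a hypothetical fragment in protobuf text format; only the `name` field matters for the matching described here):

```protobuf
container {
  type: DOCKER
  network_infos {
    # Must match the "name" field of a CNI config found in
    # network_cni_config_dir, e.g. "mesos-net" below.
    name: "mesos-net"
  }
}
```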

Here is an example of a CNI configuration:

cat /etc/mesos/cni/net.d/10-mesos-net.conf
{
    "cniVersion": "0.2.0",
    "name": "mesos-net",
    "type": "ipvlan",
    "bridge": "cni0",
    "isGateway": true,
    "ipMasq": true,
    "vlanId": 5,
    "ipam": {
       "type": "host-local",
       "ranges": [
          [
            {
              "subnet": "10.10.0.0/16",
              "rangeStart": "10.10.0.10",
              "rangeEnd": "10.10.0.250"
            }
          ]
        ],
        "routes": [
            { "dst": "0.0.0.0/0" }
        ]
    }
}

How does the Docker Executor work?

If the name of the mesosproto NetworkInfo object has been set, the executor checks whether a CNI configuration exists for it. If this is the case, it is used to create a network interface in the container. If no CNI configuration exists, an error message appears in the stderr file in the sandbox directory. As the name can also be a Docker Network Plugin, the container is started in any case. If there is no Docker Network Plugin with the name, the container runs without an additional network interface.
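The name-to-config matching described above can be sketched as follows. This is an illustrative model, not the actual C++ implementation; the function name and file-extension filter are assumptions:

```python
import json
import os


def find_cni_config(config_dir, network_name):
    """Return the parsed CNI config whose "name" field matches
    network_name, or None if no such config exists (the case where
    the executor logs an error and falls back to Docker network
    plugin lookup)."""
    if not os.path.isdir(config_dir):
        return None
    # Scan config files in lexical order, mirroring typical CNI
    # conventions such as "10-mesos-net.conf".
    for entry in sorted(os.listdir(config_dir)):
        if not entry.endswith((".conf", ".json")):
            continue
        path = os.path.join(config_dir, entry)
        try:
            with open(path) as f:
                config = json.load(f)
        except (OSError, ValueError):
            continue  # unreadable or malformed config; skip it
        if config.get("name") == network_name:
            return config
    return None
```

With the example config from above installed in `/etc/mesos/cni/net.d`, `find_cni_config("/etc/mesos/cni/net.d", "mesos-net")` would return the parsed `10-mesos-net.conf`, while an unknown name returns `None`.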

How can the CNI support be tested?

I have added a test that can be used via the Mesos Test Tool.

mesos-tests --verbose --gtest_filter="DockerCniTest.*"

@andreaspeters
Contributor Author

Friendly ping :-)

Contributor

@bmahler bmahler left a comment


This could use a clearer overview of the approach and its limitations; shouldn't we be using Docker network plugins?

https://docs.docker.com/engine/extend/plugins_network/#use-network-driver-plugins

That seems like the right way to do this? Whereas the approach taken in this patch looks problematic and looks like a hack? We're doing CNI stuff after we've docker run the container?

Comment on lines +335 to +348
Try<std::tuple<Subprocess, std::string>> cni = attachCNI(task);
if (!cni.isError()) {
  Subprocess cniSub = std::get<0>(cni.get());
  std::string cniNetworkName = std::get<1>(cni.get());

  WIFEXITED(cniSub.status().get().get());
  Try<std::string> cniIP = attachCNISuccess(cniSub);
  if (cniIP.isError()) {
    LOG(ERROR) << cniIP.error();
  } else {
    useCNI = true;
    setIPAddresses(cniIP.get(), false);
  }
}
Contributor


hm.. this looks like a bad design? docker run has already been called, and then we attach CNI? This means the workload could have started and failed to perform networking operations? It seems like the correct way to do this is to perform CNI work while setting up the container, whereas this looks like a hack where it's set up after the container has been started already?

Contributor Author


We can only add CNI if the container is already running, since there is no way for us to "infiltrate" the container during startup.

There is a window of up to one second during which the workload could have connection issues. But the container will not fail (crash). That's a point we have to document as a notice.

My personal opinion is: Docker Inc. should add CNI support to their engine. But they will not, and on the other hand they also do not maintain their network plugins. 🤷🏻‍♂️ At least with this CNI implementation we can use a wide range of new network interfaces with our Docker executor and do not have to use unmaintained Docker network plugins.

Comment on lines +906 to +909
// function to attach or detach CNI from/to the container
Try<std::tuple<Subprocess, std::string>> commandCNI(
    const mesos::ContainerInfo& container,
    const string command)
Contributor


feels like these should probably just return a Future of whatever information you want? not clear what the string here would represent

Contributor Author


It returns the CNI name in line 1011. We need it in line 335. But maybe I am also misunderstanding you. 😅

@andreaspeters
Contributor Author

> This could use a clearer overview of the approach and its limitations; shouldn't we be using Docker network plugins?
>
> https://docs.docker.com/engine/extend/plugins_network/#use-network-driver-plugins
>
> That seems like the right way to do this? Whereas the approach taken in this patch looks problematic and looks like a hack? We're doing CNI stuff after we've docker run the container?

There is no official "Docker Inc" way to support CNI. They prefer their own Docker network plugins. The problem with that is that the plugins are pretty old. Some of them are not even in development anymore but are still on the official list. 🤷🏻‍♂️
