In order to support identity mapping and user namespaces, the Moby
project needs to defer the creation of a container's network namespace
to the runtime and hook into the container lifecycle to configure the
network namespace before the user binary is started. The standard way to
do so is by configuring a `createRuntime` OCI lifecycle hook, in which
the OCI runtime executes a specified process in the runtime environment
after the container has been created and before it is started. In the
case of Moby the network namespace needs to be configured from the
daemon process, which necessitates that the hook process communicate
with the daemon process. This is complicated and slow. All the hook
process does is inform the daemon of the container's PID and wait until
the daemon has finished applying the network namespace configuration.
There is an alternative to the `createRuntime` OCI hook which containerd
clients can take advantage of. The `container.NewTask` method is
directly analogous to the OCI create operation, and the `task.Start`
method is directly analogous to the OCI start operation. Any operations
performed between the `NewTask` and `Start` calls are therefore directly
analogous to `createRuntime` OCI hooks, without needing to execute any
external processes! Provide a mechanism for `network.Namespace` instances
to register a callback function which can be used to configure a
container's network namespace instead of, or in addition to,
`createRuntime` OCI hooks.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This allows a frontend to request a specific mode for stubs removal.
By default, if not specified, this will revert to the previous
behaviour. New gateway clients however will set the property to the
desired recursive removal mode.
This property needs to be set for both components that call the
executor: for ExecOp, as well as for the StartContainer API.
Signed-off-by: Justin Chadwell <me@jedevc.com>
Deleting a containerd task whose status is Created fails with a
"precondition failed" error. This is because (aside from Windows) a
process is spawned when the task is created, and deleting the task while
the process is running would leak the process if it were allowed.
Change the deferred `task.Delete` call to pass the `WithProcessKill`
delete option so the cleanup has a chance to succeed in the event that
the `p.Start` call inside `runProcess` returns an error.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This adds `netNSPoolSize` pool options, which allow setting a target
network namespace pool size. buildkitd will create this number of
network namespaces at startup (without blocking). When a container
execution finishes, the network namespace gets returned to the pool. If
the pool goes above the target size, there is a grace period to allow
network namespaces to be reused, and if this passes without reuse, the
extra namespaces will be released.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
This will allow clients to retrieve exit error codes returned during a
solve without parsing the error messages.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
Sets an AppArmor profile on the OCI spec if one is configured on the
worker.
Adds SELinux labels to containers (only added if SELinux is enabled on
the system).
This assumes that the specified AppArmor profile is already loaded on
the system and does not try to load it or even check if it is loaded.
SELinux support requires the `selinux` build tag to be added.
Likewise, `runc` would require both the `apparmor` and `selinux` build
tags.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Vendored go-selinux to v1.8.0
Fixed tests
Signed-off-by: Tibor Vass <tibor@docker.com>
(cherry picked from commit 68bb095353)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Copy this const to a local constant to prevent importing the containerd
client in the front-end.
For consistency, I also updated the executor code to use the same const,
although not strictly needed.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
update run/exec tests for stdin and expected failures
move common tests for runc and containerd to shared tests package
Signed-off-by: Cory Bennett <cbennett@netflix.com>
Refactor the interface to avoid such issues in the future.
BuildKit's own mounts are stateless and unaffected, but
a different mountable implementation could get confused.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
This patch allows downstream code to pass a DNSConfig that is
then used by executor/oci.GetResolvConf.
This would allow the BuildKit-based builder in Docker to honor
the docker daemon's DNS configuration, thus fixing a feature gap
with the legacy builder.
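To illustrate how a caller-supplied DNS configuration might end up in a
container's resolv.conf, here is a minimal sketch. The `dnsConfig` struct,
its field names and `renderResolvConf` are assumptions made for the
example; it only renders the overrides and does not model how
`executor/oci.GetResolvConf` reads or falls back to the host file.

```go
package main

import (
	"fmt"
	"strings"
)

// dnsConfig is a hypothetical stand-in for the DNSConfig passed by
// downstream code. The field names are assumptions.
type dnsConfig struct {
	Nameservers   []string
	SearchDomains []string
	Options       []string
}

// renderResolvConf builds resolv.conf content from cfg. A nil cfg yields
// an empty string, and an empty field contributes no lines.
func renderResolvConf(cfg *dnsConfig) string {
	if cfg == nil {
		return ""
	}
	var b strings.Builder
	if len(cfg.SearchDomains) > 0 {
		fmt.Fprintf(&b, "search %s\n", strings.Join(cfg.SearchDomains, " "))
	}
	for _, server := range cfg.Nameservers {
		fmt.Fprintf(&b, "nameserver %s\n", server)
	}
	if len(cfg.Options) > 0 {
		fmt.Fprintf(&b, "options %s\n", strings.Join(cfg.Options, " "))
	}
	return b.String()
}

func main() {
	cfg := &dnsConfig{
		Nameservers:   []string{"10.0.0.2"},
		SearchDomains: []string{"example.test"},
		Options:       []string{"ndots:2"},
	}
	fmt.Print(renderResolvConf(cfg))
}
```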
Signed-off-by: Tibor Vass <tibor@docker.com>
Note that this mode allows build executor containers to kill (and potentially ptrace) an arbitrary process in the BuildKit host namespace.
This mode should be enabled only when BuildKit is running in a container as an unprivileged user.
Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>