The hardcoded architecture list was little-endian only, causing
seccomp_arch_add() to fail with -EDOM on s390x.
Drop it. It's optional and libseccomp automatically adds the native
architecture when the filter is created.
Fixes: https://github.com/opencontainers/runc/issues/4835
Signed-off-by: Ricardo Branco <rbranco@suse.de>
The idea of commit d1fca8e was right (report errors for non-existent
root, unless using the default root dir) but the logic was inverted.
Fix the logic.
A test case for the default root requires /root/runc to not exist,
which is not always possible.
Reported-by: RedMakeUp <girafeeblue@gmail.com>
Co-authored-by: RedMakeUp <girafeeblue@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Migrate from urfave/cli v1 (maintenance mode) to v3 to benefit from
active development, improved features, and long-term support.
Signed-off-by: lifubang <lifubang@acmcoder.com>
As the runc binary grows in size over time (new features, more
dependencies), some tests start to flake because of low memory limits.
One such test is "runc run (cgroup v2 resources.unified override)";
it fails because of its 1M memory limit:
> runc run failed: unable to start container process: container init was OOM-killed (memory limit too low?)
Increase the limits 4x. Do the same for the "unified only" test.
Fixes issue 5264.
Reported-by: Kevin Berry <kpberry11@gmail.com>
Reported-by: Ricardo Branco <rbranco@suse.de>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Apparently, lima's experimental/fedora-rawhide image does not include
the which rpm, and we don't really want to bother installing it.
Replace "which" with "command -v". Looks like this was the only place;
we already use "command -v" everywhere else.
This should fix lima (experimental/fedora-rawhide) CI.
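For illustration, a minimal sketch of the replacement pattern. The
helper name have_cmd and the command being probed are hypothetical and
not taken from the runc scripts:

```shell
#!/bin/bash
# have_cmd is a hypothetical helper name used only for this sketch.
# "command -v" is a shell builtin, so unlike "which" it does not need
# any extra package to be installed.
have_cmd() {
	command -v "$1" >/dev/null 2>&1
}

# The command name checked here is just an example.
if have_cmd criu; then
	echo "criu found"
else
	echo "criu not found"
fi
```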
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Before commit 7dc24868, when process.env was nil, prepareEnv
returned a flag indicating that HOME was not set, and HOME was then
added. Commit 7dc24868 moved the functionality of adding HOME into
prepareEnv but did not properly handle the nil case. As a result,
runc exec -p with process.json having no env set resulted in
an exec with no HOME set.
Fix this, and add unit and integration tests.
Fixes: 7dc24868 ("libct: switch to numeric UID/GID/groups")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
All existing tests check runc run, and there is not a single runc exec
environment test other than the one in exec.bats.
Add one (no new issues found).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This test checks that "runc exec --env VAR=VAL ..." actually appends
VAR=VAL to the exec's environment.
Add additional checks that:
- process.env from config.json is also inherited;
- HOME is set.
Those checks do not reveal any new issues.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Add a bats integration test for rootfs propagation which expects the
mount propagation to be slave. The test creates an isolated mount
namespace to run in, because it mutates the rootfs propagation.
Signed-off-by: sean <xujihui1985@gmail.com>
Those tests were added by commit 8d180e96 ("Add support for Linux
Network Devices"), apparently by copy-pasting the test cases which
call simple_cr (all four of them).
While different simple_cr tests make sense as they cover different
code paths in runc and/or check for various regressions, the same
variations with netdevice do not make sense, as having a net device
is orthogonal to e.g. bind mount, --debug, or cgroupns.
Remove those.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When RUNC_USE_SYSTEMD is set, tests/rootless.sh uses
ssh -tt rootless@localhost
to run tests as a rootless user. In this case, the local environment is not
passed to the user's ssh session (unless explicitly specified), and so
the tests do not get ROOTLESS_FEATURES.
As a result, idmap-related tests are skipped when running as rootless
using systemd cgroup driver:
integration test (systemd driver)
...
[02] run rootless tests ... (idmap)
...
ok 286 runc run detached ({u,g}id != 0) # skip test requires rootless_idmap
...
Fix this by creating a list of environment variables needed by the
tests, and adding those to the ssh command line (in case of ssh) or
exporting (in case of sudo) so both cases work similarly.
Also, modify disable_idmap to unset variables set in enable_idmap so
they are not exported at all if idmap is not in features.
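A rough sketch of the approach described above. The contents of the
variable list and the helper name env_assignments are assumptions made
for illustration; the actual tests/rootless.sh may differ:

```shell
#!/bin/bash
# Hypothetical list of variables needed by the tests; the real list
# may contain different names.
ENV_VARS=(ROOTLESS_FEATURES RUNC_USE_SYSTEMD)

# env_assignments prints a shell-quoted NAME=value word for every
# variable in ENV_VARS that is currently set. Unset variables are
# omitted, so they do not reach the remote session at all.
env_assignments() {
	local name
	for name in "${ENV_VARS[@]}"; do
		if [ -n "${!name+x}" ]; then
			printf '%s=%q ' "$name" "${!name}"
		fi
	done
}

# ssh case: the assignments are prepended to the remote command line:
#   ssh -tt rootless@localhost "$(env_assignments) <test command>"
# sudo case: the same variables are exported instead.
ROOTLESS_FEATURES=idmap
env_assignments
echo
```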
Fixes: bf15cc99 ("cgroup v2: support rootless systemd")
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
These helpers all make more sense as a self-contained package and moving
them has the added benefit of removing an unneeded libpathrs dependency
(from libcontainer/utils's import of pathrs-lite) from several test
binaries.
Signed-off-by: Aleksa Sarai <aleksa@amutable.com>
Some of the runc integration tests may do things that I would not like
to happen when running them on my development laptop. Examples include
- changing the root mount propagation [1];
- replacing /root/runc [2];
- changing a file in /etc (see checkpoint.bats).
Yet it is totally fine to do all that in a throwaway CI environment,
or inside a Docker container.
Introduce a mechanism to skip specific "unsafe" tests unless an
environment variable, RUNC_ALLOW_UNSAFE_TESTS, is set. Use it
from a specific checkpoint/restore test which modifies
/etc/criu/default.conf.
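A minimal sketch of such a guard. The helper name unsafe_tests_allowed
is hypothetical, and treating an empty value as "not set" is an
assumption of this sketch:

```shell
#!/bin/bash
# unsafe_tests_allowed is a hypothetical helper name for this sketch.
# It succeeds only if RUNC_ALLOW_UNSAFE_TESTS is set to a non-empty
# value (the non-empty requirement is an assumption).
unsafe_tests_allowed() {
	[ -n "${RUNC_ALLOW_UNSAFE_TESTS:-}" ]
}

# In a bats test, a failed check would typically be followed by the
# bats "skip" command; a message is printed here instead so that the
# sketch runs on its own.
if unsafe_tests_allowed; then
	echo "running unsafe test"
else
	echo "skip: unsafe test; set RUNC_ALLOW_UNSAFE_TESTS to run it"
fi
```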
[1]: https://github.com/opencontainers/runc/pull/5200
[2]: https://github.com/opencontainers/runc/pull/5207
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The runtime-spec [1] currently says:
> 6. Runtime's start command is invoked with the unique identifier of
> the container.
> 7. The startContainer hooks MUST be invoked by the runtime. If any
> startContainer hook fails, the runtime MUST generate an error, stop
> the container, and continue the lifecycle at step 12.
> 8. The runtime MUST run the user-specified program, as specified by
> process.
> 9. The poststart hooks MUST be invoked by the runtime. If any
> poststart hook fails, the runtime MUST generate an error, stop the
> container, and continue the lifecycle at step 12.
> ...
> 11. Runtime's delete command is invoked with the unique identifier of
> the container.
> 12. The container MUST be destroyed by undoing the steps performed
> during create phase (step 2).
> 13. The poststop hooks MUST be invoked by the runtime. If any poststop
> hook fails, the runtime MUST log a warning, but the remaining hooks
> and lifecycle continue as if the hook had succeeded.
Currently, we do 9 before 8 (heck, even before 6), which is clearly
against the spec and results in issues like the one described in [2].
Let's move running the poststart hooks to after the user-specified
process has started.
NOTE: this patch only fixes the order and does not implement removing
the container when a poststart hook fails. That part of the spec is
controversial: destroy et al. should probably be, and currently are,
part of "runc delete".
[1]: https://github.com/opencontainers/runtime-spec/blob/main/runtime.md#lifecycle
[2]: https://github.com/opencontainers/runc/issues/5182
Reported-by: ningmingxiao <ning.mingxiao@zte.com.cn>
Reported-by: Erik Sjölund <erik.sjolund@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since switching to Go 1.25 in go.mod, the "detect fd leaks" test fails
like this:
> not ok 57 runc create[detect fd leak as comprehensively as possible]
> # (in test file tests/integration/create.bats, line 76)
> # `[ "$violation_found" -eq 0 ]' failed
> ...
> # Violation: FD 9 -> '/system.slice/runc-test_busybox.scope/cpu.cfs_quota_us'
> # Violation: FD 10 -> '/system.slice/runc-test_busybox.scope/cpu.cfs_period_us'
> ...
This happens because Go 1.25 adds a feature to dynamically set GOMAXPROCS
based on current CPU quota values. This feature can be disabled by setting
GODEBUG=containermaxprocs=0,updatemaxprocs=0
but it is harmless to keep it (except for the above test failure).
Add an exception to the test case.
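A sketch of what such an exception could look like. The helper name
is_go_runtime_fd and the sample input list are made up for this
illustration and are not the actual code from create.bats; the file
names matched are the ones from the violations quoted above:

```shell
#!/bin/bash
# is_go_runtime_fd is a hypothetical helper for this sketch. It succeeds
# if the fd target is one of the cgroup CPU quota files that appear in
# the violations quoted above.
is_go_runtime_fd() {
	case "$1" in
	*/cpu.cfs_quota_us | */cpu.cfs_period_us) return 0 ;;
	*) return 1 ;;
	esac
}

violation_found=0
# Sample fd targets, made up for illustration.
for target in \
	"/system.slice/runc-test_busybox.scope/cpu.cfs_quota_us" \
	"/etc/passwd"; do
	if is_go_runtime_fd "$target"; then
		continue # known exception, not treated as a leak
	fi
	echo "Violation: '$target'"
	violation_found=1
done
```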
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This adds support for WaitKillableRecv seccomp flag
(also known as SCMP_FLTATR_CTL_WAITKILL in libseccomp and
as SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV in the kernel).
This requires:
- libseccomp >= 2.6.0
- libseccomp-golang >= 0.11.0
- Linux kernel >= 5.19
Note that this flag does not make sense without NEW_LISTENER, and
the kernel returns EINVAL when SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV
is set but SECCOMP_FILTER_FLAG_NEW_LISTENER is not set.
For runc this means that .linux.seccomp.listenerPath should also be set,
and some of the seccomp rules should have SCMP_ACT_NOTIFY action. This
is why the flag is tested separately in seccomp-notify.bats.
At the moment the only adequate CI environment for this functionality is
Fedora 43. On all other platforms (including CentOS 10 and Ubuntu 24.04)
the test is skipped with a message similar to this:
> ok 251 runc run [seccomp] (SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV) # skip requires libseccomp >= 2.6.0 and API level >= 7 (current version: 2.5.6, API level: 6)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
SCMP_ACT_KILL terminates the process with a fatal signal, which may
produce a core dump depending on the host configuration.
While this is harmless on ephemeral CI instances, it can leave unwanted
core files on developer or customer systems. It also interferes with
test environments that detect unexpected core dumps.
Signed-off-by: Ricardo Branco <rbranco@suse.de>
When parsing mount options into recAttrSet and recAttrClr,
the code sets attr_clr to individual atime flags (e.g.
MOUNT_ATTR_NOATIME or MOUNT_ATTR_STRICTATIME) when clearing
atime attributes. However, this violates the kernel's
requirement documented in mount_setattr(2)[1]:
> Note that, since the access-time values are an enumeration
> rather than bit values, a caller wanting to transition to a
> different access-time setting cannot simply specify the
> access-time setting in attr_set, but must also include
> MOUNT_ATTR__ATIME in the attr_clr field. The kernel will
> verify that MOUNT_ATTR__ATIME isn't partially set in
> attr_clr (i.e., either all bits in the MOUNT_ATTR__ATIME
> bit field are either set or clear), and that attr_set
> doesn't have any access-time bits set if MOUNT_ATTR__ATIME
> isn't set in attr_clr.
Passing only a single atime flag (e.g. MOUNT_ATTR_NOATIME) in
attr_clr causes mount_setattr() to fail with EINVAL.
This change ensures that whenever an atime mode is updated,
attr_clr includes MOUNT_ATTR__ATIME to properly reset the
entire access-time attribute field before applying the new mode.
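The two checks quoted above can be sketched with plain shell
arithmetic. The constant values are the ones from the kernel's
linux/mount.h; the helper atime_args_valid is hypothetical and is
neither the kernel's nor runc's actual code:

```shell
#!/bin/bash
# Values from the kernel's include/uapi/linux/mount.h.
MOUNT_ATTR__ATIME=$((0x70))
MOUNT_ATTR_NOATIME=$((0x10))
MOUNT_ATTR_STRICTATIME=$((0x20))

# atime_args_valid ATTR_SET ATTR_CLR
# Hypothetical helper mirroring the two checks from the man page.
atime_args_valid() {
	local set_bits=$(($1 & MOUNT_ATTR__ATIME))
	local clr_bits=$(($2 & MOUNT_ATTR__ATIME))
	# MOUNT_ATTR__ATIME must be fully set or fully clear in attr_clr.
	if [ "$clr_bits" -ne 0 ] && [ "$clr_bits" -ne "$MOUNT_ATTR__ATIME" ]; then
		return 1
	fi
	# Access-time bits in attr_set require MOUNT_ATTR__ATIME in attr_clr.
	if [ "$set_bits" -ne 0 ] && [ "$clr_bits" -eq 0 ]; then
		return 1
	fi
	return 0
}

# An individual atime flag in attr_clr is rejected.
atime_args_valid "$MOUNT_ATTR_STRICTATIME" "$MOUNT_ATTR_NOATIME" || echo "EINVAL"
# Clearing the whole access-time field is accepted.
atime_args_valid "$MOUNT_ATTR_STRICTATIME" "$MOUNT_ATTR__ATIME" && echo "ok"
```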
[1] https://man7.org/linux/man-pages/man2/mount_setattr.2.html
Signed-off-by: lifubang <lifubang@acmcoder.com>
We intentionally broke this in commit d40b3439a9 ("rootfs: switch to
fd-based handling of mountpoint targets") under the assumption that most
users do not need this feature. Sadly it turns out they do, and so
commit 3f925525b4 ("rootfs: re-allow dangling symlinks in mount
targets") added a hotfix to re-add this functionality.
This patch adds some much-needed tests for this behaviour, since it
seems we are going to need to keep this for compatibility reasons (at
least until runc v2...).
Co-developed-by: lifubang <lifubang@acmcoder.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
On some systems (e.g., AlmaLinux 8), systemd automatically removes cgroup paths
when they become empty (i.e., contain no processes). To prevent this, we spawn
a dummy process to pin the cgroup in place.
Fix: https://github.com/opencontainers/runc/issues/5003
Signed-off-by: lifubang <lifubang@acmcoder.com>
This was always the intended behaviour but commit 72fbb34f50 ("rootfs:
switch to fd-based handling of mountpoint targets") regressed it when
adding a mechanism to create a file handle to the target if it didn't
already exist (causing the later stat to always succeed).
A lot of people depend on this functionality, so add some tests to make
sure we don't break it in the future.
Fixes: 72fbb34f50 ("rootfs: switch to fd-based handling of mountpoint targets")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This is mostly to improve readability. While at it, make the script more
robust by adding the -e option to the shell. The exception is echo $pid,
which is opportunistic and may fail depending on the order of pids in
the file.
Also, remove the empty comment and a shellcheck annotation.
Fixes: c91fe9ae
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The "runc delete --force [paused container]" test case does not check
the runc pause exit code; if such a check is added, the test fails in
rootless tests, because:
- not all rootless tests have access to cgroups;
- rootless containers don't have a default cgroups path.
To fix, add:
- setup for rootless case;
- require cgroups_freezer;
- runc pause exit code check.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In our bats tests, runc itself is a wrapper which calls the bats run
helper, so using "run runc" is wrong as it results in calling the run
helper twice.
Fixes: 8d180e965
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commands that are not run via the "run" helper (cat, mkdir, __runc)
do not set $status, so it makes no sense to check it.
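The effect can be seen with a greatly simplified stand-in for the bats
"run" helper. The function below is written only for this sketch and
is not the real bats implementation:

```shell
#!/bin/bash
# Simplified stand-in for the bats "run" helper: it records the exit
# code of the command in $status and its combined output in $output.
run() {
	status=0
	output=$("$@" 2>&1) || status=$?
}

run false
echo "after 'run false': status=$status"

# A command executed without "run" does not touch $status, so the value
# printed below is still the one left over from "run false".
true
echo "after plain 'true': status=$status"
```

Checking $status after the plain command would therefore test a stale
value rather than the exit code of that command.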
Fixes: 94505a04, ed548376
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is a bit opinionated, but some comments in integration tests do not
really help to understand the nature of the tests being performed by
stating something very obvious, like
# run busybox detached
runc run -d busybox
To make things worse, these not-so-helpful messages are being
copy/pasted over and over, and that is the main reason to remove them.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Remove the devicemapper driver mentions, as it is no longer
supported by docker (or podman).
2. Remove the test example -- we have plenty of real ones.
3. Add a link to (well written and extensive) bats documentation.
4. Fix capitalization in a sentence.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>