RunPodSandbox unconditionally pre-pulls the pause container image
via ensurePauseImageExists() before starting any sandbox.
However, only the "podsandbox" controller actually uses the pause
image to create a pause container holding namespaces. Shim-based
sandbox controllers (e.g. Kata Containers) manage the sandbox
lifecycle entirely at the shim level and never reference the pause
image.
Add a DisablePauseImagePull flag to the Runtime config that gates
ensurePauseImageExists(). When a sandboxer is not "podsandbox", the
flag skips the unnecessary pre-pull, avoiding wasted network/storage
overhead and reducing sandbox startup latency.
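The gating described above can be sketched roughly as follows. This is
a minimal illustration, not the actual CRI code: the `puller` type and
`prepareSandbox` helper are hypothetical, the `Runtime` struct is
reduced to two fields, and the sketch assumes the skip is decided by
the flag alone.

```go
package main

import "fmt"

// Runtime is reduced to the fields relevant here; the real runtime
// config carries many more options.
type Runtime struct {
	Sandboxer             string
	DisablePauseImagePull bool
}

// puller is a hypothetical stand-in for the service that owns
// ensurePauseImageExists(); it only counts invocations.
type puller struct{ pulls int }

func (p *puller) ensurePauseImageExists() error {
	p.pulls++
	return nil
}

// prepareSandbox (hypothetical helper) pre-pulls the pause image
// unless the runtime config opts out of it.
func prepareSandbox(p *puller, rt Runtime) error {
	if rt.DisablePauseImagePull {
		return nil
	}
	return p.ensurePauseImageExists()
}

func main() {
	p := &puller{}
	_ = prepareSandbox(p, Runtime{Sandboxer: "podsandbox"})
	_ = prepareSandbox(p, Runtime{Sandboxer: "shim", DisablePauseImagePull: true})
	fmt.Println("pause image pulls:", p.pulls)
}
```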
The long-term direction is to offload image pulling entirely to the
controller implementation (shim level); this flag is an incremental
step toward that goal without introducing a breaking behavior change.
Also add unit tests to verify that ensurePauseImageExists is only
invoked for the "podsandbox" sandboxer and correctly skipped otherwise.
Signed-off-by: Alex Lyn <alex.lyn@antgroup.com>
Document the new server plugin configuration blocks for GRPC, TTRPC,
debug, and metrics. Mark the legacy top-level sections as deprecated.
Note that in version 4, the TTRPC plugin is configured independently
from GRPC and uses its own defaults when its plugin block is omitted.
Signed-off-by: Derek McGowan <derek@mcg.dev>
This disables the slow_chown feature (nobody in their right mind
is going to be choosing erofs and want to slowly chown each file),
indicates that we support idmaps if the kernel supports it, and makes
sure to chown the upperdir.
This is more or less exactly how the overlay snapshotter does things,
minus the slow_chown part (which there have been discussions about
dropping altogether at some point anyway).
Signed-off-by: Andrew Halaney <ahalaney@netflix.com>
EROFS has supported a tiny metadata-only image to reference external
blobs since Linux 5.16. This eliminates the need to mount each EROFS
layer one by one and is also useful for VM-based containers (e.g.
nerdbox and Kata Containers).
Similar to LCOW/CimFS, `snapshots.UnpackKeyPrefix` is used to
trigger fsmerge generation (typically < 100 ms) on demand in Prepare().
In the future, we can also generate fsmeta in Commit() of the final
unpacking layer (by introducing an annotation to keep the chainID).
However, in the case of intermediate layer reuse, the Prepare() handling
will still be required.
```toml
[plugins."io.containerd.snapshotter.v1.erofs"]
max_unmerged_layers = 1 # enable fsmerge if image layers >= 2
```
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Add a background stats collector that calculates `UsageNanoCores` for containers and pod sandboxes.
- run in the background every second to collect CPU metrics for all containers and sandboxes (similar to what cAdvisor does)
- keep a rolling buffer of CPU samples and calculate the instantaneous CPU usage rate from consecutive samples
- read pod-level CPU stats from the parent cgroup rather than the pause container
- add cgroupv2 Pressure Stall Information for CPU, memory, and IO
- add missing `Timestamp` and `Interfaces` fields
When Kubernetes runs with `PodAndContainerStatsFromCRI=true`, it expects `UsageNanoCores` to be set in stats responses.
This value represents how much CPU is being used right now (as opposed to `UsageCoreNanoSeconds`, which is cumulative).
To calculate it, we need to compare CPU samples over time, replicating what cAdvisor does.
We can't really test this in CI yet, as some changes in Kubernetes have to land for `--feature-gates=PodAndContainerStatsFromCRI=true`.
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
The otelgrpc.UnaryClientInterceptor and otelgrpc.StreamClientInterceptor
options were deprecated and removed in favor of NewClientHandler.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Added v0.1.0 plugin support to the list of deprecated features in
RELEASES.md. Added a chapter about how to enable and configure the
default validator plugin in NRI.md.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Enabling the IMMUTABLE_FL file attribute causes dirty data to be
flushed synchronously at least on EXT4, which can greatly impact
container launch performance. In contrast, the overlayfs snapshotter
does not use syncfs by default.
Most users may not need IMMUTABLE_FL, so let's make IMMUTABLE_FL optional
to align with the behavior of the overlayfs snapshotter and recover the
original performance.
1. tensorflow
Test commands:
$ nerdctl image pull --snapshotter=X --unpack="false" tensorflow/tensorflow:2.19.0
$ time nerdctl container --snapshotter=X run -d tensorflow/tensorflow:2.19.0 /bin/sh
Results:
overlayfs | 0m18.748s
erofs (no IMMUTABLE_FL) | 0m10.090s
erofs (with IMMUTABLE_FL) | 0m21.074s
2. ubuntu 22.04
Test commands:
$ nerdctl image pull --snapshotter=X --unpack="false" ubuntu:22.04
$ time nerdctl container --snapshotter=X run -d ubuntu:22.04 /bin/sh
Results:
overlayfs | 0m1.147s
erofs (no IMMUTABLE_FL) | 0m0.795s
erofs (with IMMUTABLE_FL) | 0m1.094s
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Derive filesystem UUIDs (`lsblk -o +UUID`) from the OCI layer digests
rather than generating random ones (diffIDs would be better in
principle, but they are not available to differs in advance). This
allows EROFS to uniquely identify each layer using the
content-addressable filesystem UUID.
It can also be used for reproducible builds. To achieve this, configure
`mkfs_options` with `-T0 --mkfs-time` (`--mkfs-time` requires
erofs-utils 1.8+; without it, all inode timestamps will be reset):
``` toml
[plugins."io.containerd.differ.v1.erofs"]
mkfs_options = ["-T0", "--mkfs-time"]
```
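One way to derive a stable UUID from a layer digest is sketched below.
This is an assumption for illustration only, not necessarily how the
differ computes it: `uuidFromDigest` is a hypothetical helper that
hashes the digest string with SHA-256, keeps the first 16 bytes, and
sets the version and variant bits (version 8, the "custom" layout in
RFC 9562) so the result is a well-formed UUID.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// uuidFromDigest (hypothetical) maps a layer digest string to a
// deterministic UUID string: the same digest always yields the same
// UUID.
func uuidFromDigest(layerDigest string) string {
	sum := sha256.Sum256([]byte(layerDigest))
	var u [16]byte
	copy(u[:], sum[:16])
	u[6] = (u[6] & 0x0f) | 0x80 // version 8: custom, application-defined
	u[8] = (u[8] & 0x3f) | 0x80 // RFC 9562 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16])
}

func main() {
	// The digest below is a made-up placeholder, not a real layer.
	fmt.Println(uuidFromDigest("sha256:0123456789abcdef"))
}
```

Because the UUID depends only on the digest, rebuilding the same layer
yields the same filesystem UUID, which is what makes it usable as a
content-addressable identifier.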
Fixes: c73c8e5d52 ("Introduce EROFS differ")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
This simplifies the permissions. If the example is run in the home
directory, some distros create /home/user with permissions for the
owner only, but we need +x permissions for others (technically for the
host user the container is mapped to, but that is trickier in this
example).
/tmp has the right permissions already, so let's just do the example
there.
While we are there, I copied the two commands that create the rootfs
from the runc doc instead of linking there. I also changed config.json
to include the right path, now that it is known.
Having the path fixed makes sure users can't make a mistake when
setting it. This was the cause of #11575 (they were not setting the
rootfs as an absolute path, as documented).
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
While we are there, bash should not be the process, it should be sh. In
the bare-bone image used in the example, bash is not present (or not
present anymore?).
Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
- add a transfer service progress reporter to handle timeouts, plus other test fixes
- fall back to local image pull on configuration conflict
Signed-off-by: Tony Fang <nhfang@amazon.com>
Co-authored-by: Swagat Bora <sbora@amazon.com>
To mark `NetworkPluginBinDir` as DEPRECATED, this PR does the following:
1. Change the config default to use `NetworkPluginBinDirs`;
2. Mark `NetworkPluginBinDir` as deprecated (in config version 3);
3. Add config migration from version 2 to 3, which migrates `bin_dir`
in version 2 to `bin_dirs` in version 3.
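The `bin_dir` to `bin_dirs` migration in step 3 could look roughly like
the sketch below. Only the two key names come from this change; the
`migrateCNIBinDir` helper and the idea of operating on an already
decoded CNI table are assumptions made for illustration.

```go
package main

import "fmt"

// migrateCNIBinDir (hypothetical) rewrites a decoded version-2 CNI
// config table in place: a non-empty string `bin_dir` becomes a
// one-element `bin_dirs` list, and the old key is dropped.
func migrateCNIBinDir(cni map[string]interface{}) {
	v, ok := cni["bin_dir"]
	if !ok {
		return
	}
	delete(cni, "bin_dir")
	dir, isString := v.(string)
	if !isString || dir == "" {
		return
	}
	// Do not clobber an explicit bin_dirs value if one is already set.
	if _, exists := cni["bin_dirs"]; !exists {
		cni["bin_dirs"] = []string{dir}
	}
}

func main() {
	cni := map[string]interface{}{"bin_dir": "/opt/cni/bin"}
	migrateCNIBinDir(cni)
	fmt.Println(cni)
}
```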
Signed-off-by: Jin Dong <djdongjin95@gmail.com>
[wip] add deprecation warning
Signed-off-by: Jin Dong <djdongjin95@gmail.com>