Image config labels are copied onto the container by both the CRI
plugin (BuildLabels) and the client's WithImageConfigLabels option
used by `ctr run`. Labels in the containerd.io/* namespace are
interpreted by containerd itself and labels in the io.cri-containerd*
namespace are interpreted by the CRI plugin. An image config is not a
trusted source for labels in either namespace.
Skip labels in both reserved namespaces when copying labels from an
image config to a container, and warn about each label skipped: an
image that tries to set them may be attempting to alter containerd
behavior. Oversized image labels are already skipped this way by
the CRI plugin.
Labels set explicitly by clients, for example via `ctr run --label`
or in the CRI request, are unaffected.
Verified with the CRI plugin and with `ctr run` against an image
whose config carries labels like these: the labels are no longer
present on the created container and a warning is logged for each.
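The skipping rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: `filterImageLabels`, `isReserved`, the prefix-matching approach, and the example label keys are all assumptions made for the sketch, and the warning is printed with `fmt` rather than through containerd's logger.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// reservedPrefixes lists the two label namespaces named in the text above.
// Matching them by plain string prefix is an assumption of this sketch.
var reservedPrefixes = []string{"containerd.io/", "io.cri-containerd"}

// isReserved reports whether key falls in one of the reserved namespaces.
func isReserved(key string) bool {
	for _, p := range reservedPrefixes {
		if strings.HasPrefix(key, p) {
			return true
		}
	}
	return false
}

// filterImageLabels (hypothetical name) returns the image config labels that
// may be copied to the container, plus the sorted keys that were skipped.
func filterImageLabels(imageLabels map[string]string) (map[string]string, []string) {
	kept := make(map[string]string, len(imageLabels))
	var skipped []string
	for k, v := range imageLabels {
		if isReserved(k) {
			skipped = append(skipped, k)
			continue
		}
		kept[k] = v
	}
	sort.Strings(skipped)
	return kept, skipped
}

func main() {
	// The label keys below are made up for illustration.
	kept, skipped := filterImageLabels(map[string]string{
		"maintainer":                "someone@example.com",
		"containerd.io/example":     "x",
		"io.cri-containerd.example": "y",
	})
	for _, k := range skipped {
		fmt.Printf("warning: skipping reserved image label %q\n", k)
	}
	fmt.Println(len(kept), "label(s) copied")
}
```

Labels supplied explicitly by a client would be merged after this filter, so they are not affected by it.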
Assisted-by: Claude Code
Signed-off-by: Ben Cressey <ben@cressey.org>
Signed-off-by: Samuel Karp <samuelkarp@google.com>
If no snapshotter is specified, container run selects the default
snapshotter.
However, if `os.features` is set, we should always call
`checkSnapshotterSupport()`. This ensures containerd clients
report a clear error:
```
ctr: snapshotter overlayfs does not support platform
{amd64 linux [erofs] } for image sha256:[]
```
instead of the confusing layer extraction error:
```
ctr: apply layer error for "": failed to extract layer sha256:[]:
failed to get stream processor for application/vnd.erofs.layer.v1:
no processor for media-type
```
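A rough model of the decision described above, assuming the support check was previously run only for an explicitly requested snapshotter. The `platform` type, `resolveSnapshotter`, `supports`, and the matching rule are hypothetical stand-ins, not containerd's `checkSnapshotterSupport()`.

```go
package main

import (
	"errors"
	"fmt"
)

// platform is a simplified stand-in for an OCI platform description.
type platform struct {
	OS           string
	Architecture string
	OSFeatures   []string
}

var errUnsupported = errors.New("snapshotter does not support platform")

// supports reports whether have matches want's OS and architecture and
// provides every OS feature that want asks for (an assumed matching rule).
func supports(have platform, want platform) bool {
	if have.OS != want.OS || have.Architecture != want.Architecture {
		return false
	}
	features := make(map[string]struct{}, len(have.OSFeatures))
	for _, f := range have.OSFeatures {
		features[f] = struct{}{}
	}
	for _, f := range want.OSFeatures {
		if _, ok := features[f]; !ok {
			return false
		}
	}
	return true
}

// resolveSnapshotter (hypothetical) picks the default when none is requested.
// It skips the support check only when no snapshotter was requested AND the
// image platform carries no OS features.
func resolveSnapshotter(requested, def string, want platform, supported map[string][]platform) (string, error) {
	name := requested
	if name == "" {
		name = def
	}
	if requested == "" && len(want.OSFeatures) == 0 {
		return name, nil
	}
	for _, p := range supported[name] {
		if supports(p, want) {
			return name, nil
		}
	}
	return "", fmt.Errorf("%w: snapshotter %s, platform %+v", errUnsupported, name, want)
}

func main() {
	supported := map[string][]platform{
		"overlayfs": {{OS: "linux", Architecture: "amd64"}},
	}
	want := platform{OS: "linux", Architecture: "amd64", OSFeatures: []string{"erofs"}}
	_, err := resolveSnapshotter("", "overlayfs", want, supported)
	fmt.Println(err)
}
```

Failing at this point gives the caller an error that names the snapshotter and platform, rather than a later failure during layer extraction.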
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
This PR adds opt-in tracing spans/attributes in CRI image pull and selected sandbox-related paths to improve debugging and correlation (e.g., sandbox.id/pod metadata). If maintainers prefer a smaller diff, I’m happy to split this into a pull-only PR plus follow-ups.
• follow-up after pull-only PR
• focuses on task/metadata/sandbox/cni setup spans
Signed-off-by: Cindy Li <cindyli@pinterest.com>
Fixes: #12700
Instead of pulling in the selinux dependency for every user of the client
library when it is not needed, just inline the one Sprintf call we were
using the library for here.
Signed-off-by: Wade Simmons <wade@wades.im>
Allows management of referrer objects when performing
pull, archive export and archive import.
Referrer objects are linked to their subjects via GC
labels. The label is based on sha256 checksum of the
object instead of incremental numbers as referrers are
not immutable and don't have any strict order.
In an OCI layout, referrers that are not already in the exported
tree are added to the main index.json with the
io.containerd.manifest.subject annotation.
On import, descriptors with that annotation
do not create digest-based images in the image store.
Note that this does not mean all the referrer objects in
the registry are now pulled/exported/imported by default.
The caller of the client pkg functions can choose which
referrer objects should also be handled.
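To illustrate why a digest-derived label key suits an unordered, mutable set of referrers, here is a small sketch. The label prefix and the `referrerLabel` helper are hypothetical; the actual label name and key format used by containerd are not stated in this message.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// labelPrefix is a made-up prefix used only for this illustration.
const labelPrefix = "example.gc.ref.referrer.sha256."

// referrerLabel (hypothetical) derives a GC label key and value from the
// referrer object's bytes. The key depends only on the object itself, so it
// stays stable no matter how many other referrers exist or in what order
// they were added.
func referrerLabel(referrer []byte) (key, value string) {
	sum := sha256.Sum256(referrer)
	hexSum := hex.EncodeToString(sum[:])
	return labelPrefix + hexSum, "sha256:" + hexSum
}

func main() {
	labels := map[string]string{}
	for _, blob := range [][]byte{[]byte("signature"), []byte("sbom")} {
		k, v := referrerLabel(blob)
		labels[k] = v
	}
	// Dropping one referrer removes exactly one key; nothing is renumbered.
	k, _ := referrerLabel([]byte("signature"))
	delete(labels, k)
	fmt.Println(len(labels), "referrer label(s) remain")
}
```

With incrementing suffixes, removing or re-adding a referrer would force the remaining keys to be renumbered or leave gaps; digest-based keys avoid that.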
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Currently the new sandbox returns a sandbox client which will error when
start is called. The new sandbox should also create the sandbox with the
sandbox controller.
Signed-off-by: Derek McGowan <derek@mcg.dev>
- add a transfer service progress reporter to handle timeouts, plus other test fixes
- fall back to local image pull when there is a configuration conflict
Signed-off-by: Tony Fang <nhfang@amazon.com>
Co-authored-by: Swagat Bora <sbora@amazon.com>
Schema 1 (`application/vnd.docker.distribution.manifest.v1+prettyjws`) has been
officially deprecated since containerd v1.7 (PR 6884), and disabled since v2.0 (PR 9765).
Users who have been seeing warnings like `conversion from schema 1 images is deprecated`
now have to rebuild the image with Schema 2 or OCI.
Schema 2 was introduced in Docker 1.10 (Feb 2016), so most users should already be
using Schema 2 or OCI.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
When moving to gRPC 1.64 (commit 63b4688175) the usage of the deprecated
`grpc.DialContext` was replaced with `grpc.NewClient`. However, this
change also required dropping the `WithBlock` option, which made sure
that the connection was actually established before returning.
Now, `grpc.NewClient` doesn't attempt to perform the connection but
defers it to the actual first RPC.
Querying the default runtime on client creation breaks that property
depending on whether the default namespace is set or not.
This commit defers the `runtime` field initialization to the first time
the field is actually needed.
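The deferred initialization can be pictured with a sketch like the one below. It is not the code from this commit: the `client` type, `newClient`, and the `lookup` callback (standing in for the RPC that queries the default runtime) are assumptions, and the runtime name is a placeholder.

```go
package main

import (
	"fmt"
	"sync"
)

// client is a simplified stand-in for the containerd client.
type client struct {
	mu      sync.Mutex
	runtime string
	// lookup stands in for the RPC that queries the default runtime.
	lookup func() (string, error)
}

// newClient performs no RPC, so creating a client never forces a connection.
func newClient(lookup func() (string, error)) *client {
	return &client{lookup: lookup}
}

// Runtime resolves the runtime on first use and caches the result.
// A failed lookup is not cached, so a later call can retry.
func (c *client) Runtime() (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.runtime != "" {
		return c.runtime, nil
	}
	rt, err := c.lookup()
	if err != nil {
		return "", err
	}
	c.runtime = rt
	return rt, nil
}

func main() {
	calls := 0
	c := newClient(func() (string, error) {
		calls++
		return "example.runtime.v1", nil // placeholder runtime name
	})
	fmt.Println("lookups after creation:", calls)
	rt, _ := c.Runtime()
	_, _ = c.Runtime()
	fmt.Println("runtime:", rt, "lookups after two uses:", calls)
}
```

A mutex is used here instead of `sync.Once` so that a transient lookup failure does not get cached permanently; that choice is a design assumption of the sketch.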
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
This implements container restore as described in:
https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/#restore-checkpointed-container-standalone
For detailed step-by-step instructions, also see contrib/checkpoint/checkpoint-restore-cri-test.sh
The code changes are based on changes I have done in Podman around 2018
and CRI-O around 2020.
The history behind restoring container via CRI/Kubernetes probably
requires some explanation. The initial proposal to bring
checkpoint/restore to Kubernetes was looking at pod checkpoint and
restoring and the corresponding CRI changes.
https://github.com/kubernetes-sigs/cri-tools/pull/662
https://github.com/kubernetes/kubernetes/pull/97194
After discussing this topic for about two years another approach was
implemented as described in KEP-2008:
https://github.com/kubernetes/enhancements/issues/2008
"Forensic Container Checkpointing" allowed us to separate checkpointing
from restoring. For the "Forensic Container Checkpointing" it is enough
to create a checkpoint of the container. Restoring is not necessary as
the analysis of the checkpoint archive can happen without restoring the
container.
While thinking about a way to restore a container it was by coincidence
that we started to look into restoring containers in Kubernetes via
Create and Start. The way it was done in CRI-O is to figure out during
Create if the container image is a checkpoint image and if that is true
we are using another code path. The same was implemented now with this
change in containerd.
With this change it is possible to restore the container from a
checkpoint tar archive that is created during checkpointing via CRI.
To restore a container via Kubernetes we convert the tar archive to an
OCI image as described in the kubernetes.io blog post from above. Using
this OCI image it is possible to restore a container in Kubernetes.
At this point I think it should be doable to restore containers in
CRI-O and containerd no matter if they have been created by containerd or
CRI-O. The biggest difference is the container metadata and that can
be adapted during restore.
Open items:
* It is not clear to me why restoring a container in containerd goes
through task/Create(). But as the restore code already exists this
change extended the existing code path to restore a container in
task/Create() to also restore a container through the CRI via
Create and Start.
* Automatic image pulling. containerd does not pull images
automatically if created via the CRI. There is an option in
crictl to pull images before starting, but that uses the CRI
image pull interface. It is still a separate pull and create
operation. Restoring containers from an OCI image is a bit
different. The checkpoint OCI image does not include the base
image, but just a reference to the image (NAME@DIGEST).
Using crictl with pulling will enable the pulling of the
checkpoint image, but not of the base image the checkpoint is
based on. So during preparation of the checkpoint containerd
will automatically pull the base image, but I was not able to find out how
to pull an image in a blocking way in containerd. So there is a for
loop waiting for the container image to appear in the internal
store. I think this probably can be implemented better.
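For the waiting loop mentioned in the last open item, one possible shape is a poll bounded by a context, sketched below. This is not the loop from this change: `waitForImage`, the `exists` callback (standing in for a lookup in the image store), and the intervals are assumptions for illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitForImage (hypothetical) polls exists until it reports true, returns an
// error, or ctx is cancelled or times out.
func waitForImage(ctx context.Context, interval time.Duration, exists func(context.Context) (bool, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		ok, err := exists(ctx)
		if err != nil {
			return err
		}
		if ok {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			// Next polling round.
		}
	}
}

// demo simulates an image that shows up on the third lookup and returns the
// number of lookups performed.
func demo() (int, error) {
	attempts := 0
	exists := func(context.Context) (bool, error) {
		attempts++
		return attempts >= 3, nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	err := waitForImage(ctx, 10*time.Millisecond, exists)
	return attempts, err
}

func main() {
	attempts, err := demo()
	fmt.Println("lookups:", attempts, "err:", err)
}
```

Bounding the loop with a context means a base image that never appears turns into a timeout error instead of an indefinite wait.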
Anyway, this is a first step towards container restore in Kubernetes
when using containerd.
Signed-off-by: Adrian Reber <areber@redhat.com>
Fix the gRPC client dialer not using the timeout passed by the
containerd client timeout option.
Commit 63b4688175 replaced the usage of deprecated `grpc.DialContext`
with `grpc.NewClient`.
However, the `dialer.ContextDialer` relied on the context deadline to
propagate the timeout:
388fb336b0/vendor/google.golang.org/grpc/clientconn.go (L216)
This assumption is now broken, because `grpc.NewClient` doesn't do any
initial connection and defers it to the first RPC usage.
This commit passes the timeout via the `MinConnectTimeout` grpc
connection param, which will be applied to **every** connection attempt
(not just the first).
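The difference between a deadline inherited from the creation-time context and a timeout applied to each attempt can be modelled with the standard library alone. The sketch below is only a model, not grpc-go or containerd code: `dialFunc`, `withAttemptTimeout`, and the fake dialer are made up. In grpc-go itself the corresponding knob is `grpc.WithConnectParams` with `ConnectParams.MinConnectTimeout`.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// dialFunc mirrors the shape of a context-aware dialer.
type dialFunc func(ctx context.Context, addr string) (net.Conn, error)

// withAttemptTimeout (hypothetical) wraps dial so that every call gets its
// own deadline, independent of whether the caller's context carries one.
func withAttemptTimeout(timeout time.Duration, dial dialFunc) dialFunc {
	return func(ctx context.Context, addr string) (net.Conn, error) {
		ctx, cancel := context.WithTimeout(ctx, timeout)
		defer cancel()
		return dial(ctx, addr)
	}
}

// demo runs three attempts with a deadline-free parent context and returns
// how many of them saw a deadline inside the dialer.
func demo() int {
	withDeadline := 0
	fake := func(ctx context.Context, addr string) (net.Conn, error) {
		if _, ok := ctx.Deadline(); ok {
			withDeadline++
		}
		// No real connection is made in this sketch.
		return nil, fmt.Errorf("dial %s: not attempted in this sketch", addr)
	}
	dial := withAttemptTimeout(2*time.Second, fake)
	for i := 0; i < 3; i++ {
		_, _ = dial(context.Background(), "example.sock")
	}
	return withDeadline
}

func main() {
	fmt.Println("attempts that saw a deadline:", demo())
}
```

Because the connection is now established lazily, the context passed at client creation is no longer the one the dialer sees, which is why the timeout has to travel with each attempt.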
Signed-off-by: Paweł Gronowski <pawel.gronowski@docker.com>
The client package provides a WithDialOpts option; however, dial options
passed to it override all defaults that are set in containerd. This makes it
difficult to expand the defaults with custom options, as this requires
copying the defaults, and trying to keep those in sync (e.g. see [moby#48617]).
This patch introduces a new `WithExtraDialOpts` option whose options, unlike
those of `WithDialOpts`, are appended to previous options instead of overriding them.
This allows setting custom options, while maintaining containerd's defaults.
Also unlike `WithDialOpts`, this option can be used multiple times to allow
additional options to be set.
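The override versus append semantics can be illustrated with a small functional-options sketch. Strings stand in for gRPC dial options, and `withDialOpts`, `withExtraDialOpts`, `config`, and `resolve` are simplified hypothetical names, not the client package's actual implementation.

```go
package main

import "fmt"

// config collects what the options set. Strings stand in for dial options.
type config struct {
	dialOpts      []string
	extraDialOpts []string
}

type opt func(*config)

// withDialOpts replaces whatever was set before, and later replaces the
// defaults as well.
func withDialOpts(opts []string) opt {
	return func(c *config) { c.dialOpts = opts }
}

// withExtraDialOpts accumulates, so it can be passed more than once.
func withExtraDialOpts(opts []string) opt {
	return func(c *config) { c.extraDialOpts = append(c.extraDialOpts, opts...) }
}

// resolve returns the effective option list: the defaults (or their
// replacement), followed by all extra options in the order given.
func resolve(defaults []string, opts ...opt) []string {
	var c config
	for _, o := range opts {
		o(&c)
	}
	base := defaults
	if len(c.dialOpts) > 0 {
		base = c.dialOpts
	}
	out := append([]string{}, base...)
	return append(out, c.extraDialOpts...)
}

func main() {
	defaults := []string{"default-backoff", "default-credentials"}
	fmt.Println(resolve(defaults, withDialOpts([]string{"custom"})))
	fmt.Println(resolve(defaults,
		withExtraDialOpts([]string{"interceptor-a"}),
		withExtraDialOpts([]string{"interceptor-b"})))
}
```

With the appending variant, a caller never needs to copy the defaults, so they cannot drift out of sync with containerd.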
[moby#48617]: https://github.com/moby/moby/pull/48617
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We can add the dependency in oss_fuzz_build.sh, since
it's only used for oss-fuzz.
Also change os.MkdirTemp to t.TempDir in fuzz tests.
Signed-off-by: Jin Dong <djdongjin95@gmail.com>
Uses the new github.com/containerd/errdefs/pkg module which is intended
to hold less stable utility functions separately from the stable
github.com/containerd/errdefs error types.
Includes temporary update to hcsshim until a release is cut there
Signed-off-by: Derek McGowan <derek@mcg.dev>
There are a couple of spots where we know exactly how large
the destination buffer should be, so pre-size these to
avoid any reallocs to a higher capacity.
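As a generic illustration of the pattern (not the specific call sites touched here), a slice whose final length is known up front can be created with that capacity so `append` never has to grow it. `keysPresized` is a made-up example function.

```go
package main

import "fmt"

// keysPresized collects the keys of m into a slice whose capacity is set to
// the final size in advance, so the appends below never reallocate.
func keysPresized(m map[string]string) []string {
	out := make([]string, 0, len(m))
	for k := range m {
		out = append(out, k)
	}
	return out
}

func main() {
	m := map[string]string{"a": "1", "b": "2", "c": "3"}
	keys := keysPresized(m)
	fmt.Println(len(keys), cap(keys))
}
```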
Signed-off-by: Danny Canter <danny@dcantah.dev>