19 Commits

Author SHA1 Message Date
Chris Henzie
a0086cfcee Merge commit from fork 2026-06-15 21:26:29 -07:00
Chris Henzie
432a7af299 Merge commit from fork 2026-06-15 21:25:18 -07:00
Brian Goff
8196411f24 cri: make checkpoint restore robust to unexpected archive content
The CRI checkpoint restore path unpacked checkpoint archive/OCI image content
directly into the container's persistent state directory and read files such as
container.log back from it with a symlink-following copy. Checkpoint content is
externally provided, so make restore more defensive about what it unpacks and
how it reads those files back.

Behavior changes:

- Only unpack regular files and directories from the checkpoint archive.

- Unpack checkpoint content into a dedicated <state>/ctrd-restore
  subdirectory created fresh rather than into the state dir itself, so
  checkpoint content cannot collide with containerd's own files (e.g.
  the "status" blob). Restore and cleanup operate on that subdir;
  cleanup is now a single RemoveAll of it.
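
The two behavior changes above can be sketched roughly as follows. This is
a minimal illustration, not containerd's actual code: unpackRegularOnly,
writeFile, sampleArchive and restoreDemo are hypothetical names, and the
sketch assumes directories appear in the archive before the files they
contain.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// unpackRegularOnly (hypothetical) unpacks only regular files and directories.
func unpackRegularOnly(r io.Reader, dest string) ([]string, error) {
	var skipped []string
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return skipped, nil
		} else if err != nil {
			return skipped, err
		}
		name := filepath.Clean(hdr.Name)
		local := filepath.IsLocal(name) // false for absolute or ".." names
		switch {
		case local && hdr.Typeflag == tar.TypeDir:
			err = os.MkdirAll(filepath.Join(dest, name), 0o700)
		case local && hdr.Typeflag == tar.TypeReg:
			err = writeFile(filepath.Join(dest, name), tr)
		default:
			// Symlinks, links, devices, FIFOs, and escaping names are skipped.
			skipped = append(skipped, hdr.Name)
		}
		if err != nil {
			return skipped, err
		}
	}
}

// writeFile creates target (it must not exist yet) in an existing directory.
func writeFile(target string, src io.Reader) error {
	f, err := os.OpenFile(target, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(f, src)
	return err
}

// sampleArchive builds an in-memory tar with a directory, a file, and a symlink.
func sampleArchive() (*bytes.Buffer, error) {
	buf := new(bytes.Buffer)
	tw := tar.NewWriter(buf)
	body := "hello\n"
	hdrs := []*tar.Header{
		{Name: "checkpoint/", Typeflag: tar.TypeDir, Mode: 0o700},
		{Name: "container.log", Typeflag: tar.TypeReg, Mode: 0o600, Size: int64(len(body))},
		{Name: "status", Typeflag: tar.TypeSymlink, Linkname: "/etc/passwd"},
	}
	for _, h := range hdrs {
		if err := tw.WriteHeader(h); err != nil {
			return nil, err
		}
		if h.Typeflag == tar.TypeReg {
			if _, err := tw.Write([]byte(body)); err != nil {
				return nil, err
			}
		}
	}
	return buf, tw.Close()
}

// restoreDemo unpacks the sample into a fresh <state>/ctrd-restore directory.
func restoreDemo() ([]string, string, error) {
	state, err := os.MkdirTemp("", "state-")
	if err != nil {
		return nil, "", err
	}
	defer os.RemoveAll(state) // demo cleanup; also removes ctrd-restore
	restoreDir := filepath.Join(state, "ctrd-restore")
	if err := os.Mkdir(restoreDir, 0o700); err != nil { // fails if it exists
		return nil, "", err
	}
	archive, err := sampleArchive()
	if err != nil {
		return nil, "", err
	}
	skipped, err := unpackRegularOnly(archive, restoreDir)
	if err != nil {
		return nil, "", err
	}
	data, err := os.ReadFile(filepath.Join(restoreDir, "container.log"))
	return skipped, string(data), err
}

func main() { fmt.Println(restoreDemo()) }
```

In this sketch the symlink named "status" is never created on disk, so a
later read of a file inside the restore directory cannot be redirected
outside of it.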

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2026-06-15 15:11:36 -07:00
Samuel Karp
861ffc1097 cri: filter CDI annotations on checkpoint restore
Filter out any annotations on the checkpointed container matching
`cdi.k8s.io/` or exactly `cdi.k8s.io` during restore to prevent
unauthorized device restoration. When an annotation is denied, a warning
log is generated.
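
The matching rule described above might look like the following sketch.
The names isCDIAnnotation and filterCDI are hypothetical and chosen for
illustration only; the real implementation may be structured differently.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// cdiKey is the annotation domain named in the commit message.
const cdiKey = "cdi.k8s.io"

// isCDIAnnotation reports whether key is exactly "cdi.k8s.io" or starts
// with "cdi.k8s.io/". Keys such as "cdi.k8s.io.example/x" do not match.
func isCDIAnnotation(key string) bool {
	return key == cdiKey || strings.HasPrefix(key, cdiKey+"/")
}

// filterCDI (hypothetical helper) returns a copy of annotations without
// CDI keys, plus the sorted list of keys that were denied.
func filterCDI(annotations map[string]string) (map[string]string, []string) {
	kept := make(map[string]string, len(annotations))
	var denied []string
	for k, v := range annotations {
		if isCDIAnnotation(k) {
			denied = append(denied, k)
			continue
		}
		kept[k] = v
	}
	sort.Strings(denied)
	return kept, denied
}

func main() {
	kept, denied := filterCDI(map[string]string{
		"cdi.k8s.io/gpu":        "vendor.example/gpu=0",
		"cdi.k8s.io":            "x",
		"cdi.k8s.io.example/x":  "kept: different domain",
		"io.kubernetes.example": "kept",
	})
	for _, k := range denied {
		// Stand-in for the warning log mentioned in the commit message.
		fmt.Printf("warning: dropping annotation %q on restore\n", k)
	}
	fmt.Println(len(kept), "annotations kept")
}
```

Comparing against the prefix with its trailing slash is what keeps
look-alike domains from being filtered by accident.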

Tested by:
* Unit tests for exact matching, prefix boundaries, and metadata merging
* Complete CRI integration and checkpoint restore suite

Assisted-by: Antigravity
Signed-off-by: Samuel Karp <samuelkarp@google.com>
2026-06-09 16:56:45 -07:00
Samuel Karp
0c0918fa8f cri: do not re-tag restored checkpoints
Google-Bug-Id: 508657842
Signed-off-by: Samuel Karp <samuelkarp@google.com>
2026-06-03 10:49:45 -07:00
Davanum Srinivas
c30f23452c cri: use upstream Kubernetes modules
Switch the CRI integration layer from containerd's forked Kubernetes helpers
and clients to the upstream Kubernetes modules, and finalize the dependency
update to Kubernetes v0.36.0.

Replace the remaining internal helper copies with upstream packages:
- internal/cri/clock -> k8s.io/utils/clock
- internal/cri/executil -> upstream CRI exec helpers
- internal/cri/resourcequantity -> k8s.io/apimachinery/pkg/api/resource
- internal/cri/setutils -> k8s.io/apimachinery/pkg/util/sets
- internal/cri/types/labels.go -> internal/cri/labels
- integration/cri-api/pkg/apis/services.go -> k8s.io/cri-api/pkg/apis/services.go

Adopt the upstream CRI clients directly:
- add k8s.io/cri-client v0.36.0, k8s.io/cri-streaming v0.36.0, and
  k8s.io/streaming v0.36.0 as direct dependencies
- promote k8s.io/utils to a direct dependency and pull in
  k8s.io/component-base v0.36.0 indirectly
- keep integration/remote as a thin containerd adapter around cri-client,
  because the integration tests still need the stream-shaped
  GetContainerEvents RPC

Finalize the Kubernetes dependency update from v0.36.0-rc.0 to v0.36.0,
refresh vendor/, and drop the obsolete internal utility copies.

Also fix the protobuf MessageState mutex-copy vet failures exposed by the new
APIs and close the temporary integration CRI clients explicitly.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2026-04-23 12:59:58 +02:00
Fu Wei
704a2ff7eb Merge pull request #12763 from ningmingxiao/chekpoint_2
cri: fix create container panic if originalAnnotations is nil
2026-01-13 19:42:05 +00:00
ningmingxiao
9018c75d5d cri: fix create container panic if originalAnnotations is nil when restore container
Signed-off-by: ningmingxiao <ning.mingxiao@zte.com.cn>
2026-01-12 15:13:06 +08:00
ningmingxiao
0dc9582295 cri: fix checkpoint failed with short id
Signed-off-by: ningmingxiao <ning.mingxiao@zte.com.cn>
2026-01-09 10:47:37 -06:00
Radostin Stoyanov
cf7f4f5cc2 restore: skip pull for existing base image
This patch avoids an image pull if the base image is already
cached locally by containerd.
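
The idea can be sketched as below. The imageStore interface and fakeStore
type are hypothetical stand-ins for containerd's local image store, used
only to make the example self-contained.

```go
package main

import "fmt"

// imageStore is a hypothetical stand-in for the local image store.
type imageStore interface {
	Has(ref string) bool
	Pull(ref string) error
}

// fakeStore records how many pulls were issued.
type fakeStore struct {
	cached map[string]bool
	pulls  int
}

func (s *fakeStore) Has(ref string) bool { return s.cached[ref] }

func (s *fakeStore) Pull(ref string) error {
	s.pulls++
	s.cached[ref] = true
	return nil
}

// ensureBaseImage pulls ref only when it is not already cached locally.
func ensureBaseImage(s imageStore, ref string) error {
	if s.Has(ref) {
		return nil
	}
	return s.Pull(ref)
}

func main() {
	store := &fakeStore{cached: map[string]bool{"NAME@DIGEST": true}}
	if err := ensureBaseImage(store, "NAME@DIGEST"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("pulls issued:", store.pulls)
}
```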

Fixes: #11901

Signed-off-by: Radostin Stoyanov <rstoyano@redhat.com>
2025-06-24 17:46:29 +01:00
Mike Brown
8a08aebe1d removing/cloning vendor of kubelet pod label definitions
Signed-off-by: Mike Brown <brownwm@us.ibm.com>
2025-05-01 16:59:31 +00:00
Adrian Reber
9e6beafd53 Support container restore through CRI/Kubernetes
This implements container restore as described in:

https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/#restore-checkpointed-container-standalone

For detailed step-by-step instructions, see also contrib/checkpoint/checkpoint-restore-cri-test.sh

The code changes are based on changes I have done in Podman around 2018
and CRI-O around 2020.

The history behind restoring containers via CRI/Kubernetes probably
requires some explanation. The initial proposal to bring
checkpoint/restore to Kubernetes looked at checkpointing and
restoring pods and at the corresponding CRI changes.

https://github.com/kubernetes-sigs/cri-tools/pull/662
https://github.com/kubernetes/kubernetes/pull/97194

After discussing this topic for about two years another approach was
implemented as described in KEP-2008:

https://github.com/kubernetes/enhancements/issues/2008

"Forensic Container Checkpointing" allowed us to separate checkpointing
from restoring. For the "Forensic Container Checkpointing" it is enough
to create a checkpoint of the container. Restoring is not necessary as
the analysis of the checkpoint archive can happen without restoring the
container.

While thinking about a way to restore a container, we started, more or
less by coincidence, to look into restoring containers in Kubernetes via
Create and Start. CRI-O does this by figuring out during Create whether
the container image is a checkpoint image and, if so, taking a different
code path. This change implements the same approach in containerd.

With this change it is possible to restore the container from a
checkpoint tar archive that is created during checkpointing via CRI.

To restore a container via Kubernetes we convert the tar archive to an
OCI image as described in the kubernetes.io blog post from above. Using
this OCI image it is possible to restore a container in Kubernetes.

At this point I think it should be doable to restore containers in
CRI-O and containerd no matter if they have been created by containerd or
CRI-O. The biggest difference is the container metadata and that can
be adapted during restore.

Open items:

 * It is not clear to me why restoring a container in containerd goes
   through task/Create(). But as the restore code already exists this
   change extended the existing code path to restore a container in
   task/Create() to also restore a container through the CRI via
   Create and Start.
 * Automatic image pulling. containerd does not pull images
   automatically if created via the CRI. There is an option in
   crictl to pull images before starting, but that uses the CRI
   image pull interface. It is still a separate pull and create
   operation. Restoring containers from an OCI image is a bit
   different. The checkpoint OCI image does not include the base
   image, but just a reference to the image (NAME@DIGEST).
   Using crictl with pulling will enable the pulling of the
   checkpoint image, but not of the base image the checkpoint is
   based on. So during preparation of the checkpoint containerd
   will automatically pull the base image, but I could not find a
   way to pull an image in a blocking manner in containerd. So there
   is a for loop waiting for the container image to appear in the
   internal store. I think this can probably be implemented better.
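
A polling loop of the kind described in the second open item could be
sketched as follows. This is not the code from this change: waitForImage,
demoWait and the exists callback are hypothetical, and the context
deadline is an assumption added so the loop cannot wait forever.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitForImage polls exists until it reports true or ctx is done.
func waitForImage(ctx context.Context, ref string, exists func(string) bool, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if exists(ref) {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("waiting for image %q: %w", ref, ctx.Err())
		case <-ticker.C:
			// Next poll.
		}
	}
}

// demoWait simulates an image that appears after appearAfter lookups.
func demoWait(appearAfter, timeoutMillis int) error {
	calls := 0
	exists := func(string) bool {
		calls++
		return calls > appearAfter
	}
	timeout := time.Duration(timeoutMillis) * time.Millisecond
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return waitForImage(ctx, "NAME@DIGEST", exists, 5*time.Millisecond)
}

func main() {
	fmt.Println("image appears:", demoWait(2, 1000))
	fmt.Println("image never appears:", demoWait(1<<30, 50))
}
```

Bounding the wait with a context is one way to address the "can probably
be implemented better" remark without changing the polling approach.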

Anyway, this is a first step towards restoring containers in Kubernetes
when using containerd.

Signed-off-by: Adrian Reber <areber@redhat.com>
2025-03-11 12:55:13 +01:00
Akhil Mohan
ebc47359ea use format string when using printf like commands
As per https://github.com/golang/go/issues/60529, printf-like calls with
non-constant format strings and no args are reported as errors by go vet.
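
A minimal illustration of the pattern this kind of fix applies. The
function name toError is made up for the example.

```go
package main

import "fmt"

// toError wraps a message that is only known at run time.
func toError(msg string) error {
	// fmt.Errorf(msg) would use msg itself as the format string, so a
	// stray '%' in it would be misread as a formatting verb; go vet
	// reports such calls. Passing msg as an argument to a constant
	// format string avoids both problems.
	return fmt.Errorf("%s", msg)
}

func main() {
	fmt.Println(toError("restore is 100% done"))
}
```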

Signed-off-by: Akhil Mohan <akhilerm@gmail.com>
2024-08-14 17:04:53 +05:30
Phil Estes
04c7d6ccbf Merge pull request #9960 from adrianreber/2024-03-12-criu-not-found
Return correct error if CRIU binary is missing
2024-05-06 19:41:38 +00:00
Derek McGowan
2ac2b9c909 Make api a Go sub-module
Allow the api to stay at the same v1 go package name and keep using a
1.x version number. This indicates the API is still at 1.x and allows
sharing proto types with containerd 1.6 and 1.7 releases.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2024-05-02 11:03:00 -07:00
Derek McGowan
e1b94c0e7d Move protobuf package under pkg
Signed-off-by: Derek McGowan <derek@mcg.dev>
2024-05-02 10:52:03 -07:00
Derek McGowan
4a45507772 Move runc options to api directory
Signed-off-by: Derek McGowan <derek@mcg.dev>
2024-05-02 10:52:00 -07:00
Adrian Reber
218e2cf7cd Return correct error if CRIU binary is missing
For the first version of containerd's "Forensic Container Checkpointing"
support, the error message returned when the CRIU binary is not found
was deliberately wrong so as not to break the Kubernetes e2e_node tests.

Now that the e2e_node tests have been adapted, containerd can return the
correct error message.

Signed-off-by: Adrian Reber <areber@redhat.com>
2024-03-12 08:29:30 +00:00
Adrian Reber
f25770e48d Wire through CRI ContainerCheckpoint RPC
This connects the new CRI ContainerCheckpoint RPC to the existing
internal checkpoint functions. With this commit it is possible
to checkpoint a container in Kubernetes using the Forensic Container
Checkpointing KEP (#2008):

 # curl -X POST "https://localhost:10250/checkpoint/namespace/podId/container"

Which will result in containerd creating a checkpoint in the location
specified by Kubernetes (usually /var/lib/kubelet/checkpoints).
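
The same request can be built from Go, as in this rough sketch. The helper
newCheckpointRequest is hypothetical; the sketch only constructs the
request and leaves out the TLS client configuration and credentials that
a real kubelet call would need.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newCheckpointRequest builds, but does not send, a POST request for the
// kubelet checkpoint endpoint shown in the curl example.
func newCheckpointRequest(kubeletURL, namespace, pod, container string) (*http.Request, error) {
	target := fmt.Sprintf("%s/checkpoint/%s/%s/%s", kubeletURL,
		url.PathEscape(namespace), url.PathEscape(pod), url.PathEscape(container))
	return http.NewRequest(http.MethodPost, target, nil)
}

func main() {
	req, err := newCheckpointRequest("https://localhost:10250", "default", "my-pod", "my-container")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
}
```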

This is a Linux-only feature because CRIU only exists on Linux.

Rewritten with the help of Phil Estes.

Signed-off-by: Phil Estes <estesp@gmail.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
2024-03-07 17:34:07 +00:00