Commit Graph

238 Commits

Author SHA1 Message Date
Fu Wei
ef7bdf18a1 Merge pull request #10177 from azr/azr/parallel-layer-fetch
Multipart layer fetch
2025-04-24 20:35:00 +00:00
Phil Estes
25d1d30e6c Merge pull request #11744 from dmcgowan/unpack-rootfs-types
Add support for unpacking custom media types
2025-04-24 14:11:00 +00:00
Adrien Delorme
72c8c7708c only keep one setting: concurrent_layer_fetch_buffer
Signed-off-by: Adrien Delorme <azr@users.noreply.github.com>
2025-04-24 11:41:33 +02:00
Adrien Delorme
88116b1911 remove max_dl_operations setting
Signed-off-by: Adrien Delorme <azr@users.noreply.github.com>
2025-04-24 11:39:42 +02:00
Adrien Delorme
f9af08820b perf(pull): multipart layer fetch
Signed-off-by: Adrien Delorme <azr@users.noreply.github.com>
Co-Authored-By: Corentin REGAL <143578+co42@users.noreply.github.com>
2025-04-24 11:39:42 +02:00
Derek McGowan
cdd7ec40db Support configuring custom media types for unpack
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-04-23 21:19:07 -07:00
Maksym Pavlenko
d983c186f5 Merge pull request #11733 from erofs/erofs-layers
erofs-differ: support EROFS native image layers
2025-04-23 23:33:23 +00:00
Derek McGowan
116b98704b Merge pull request #8515 from fangn2/cri-image-transfer
Update CRI to use transfer service for image pull by default
2025-04-23 22:58:12 +00:00
Maksym Pavlenko
1f70f07480 Merge pull request #11729 from dmcgowan/erofsutils-internal
Move erofsutils to internal
2025-04-23 19:26:25 +00:00
Tony Fang
b694be29a0 Update CRI image service to pull using transfer service
- add a transfer service progress reporter to handle timeouts, along with other test fixes
- fall back to local image pull when the configuration conflicts

Signed-off-by: Tony Fang <nhfang@amazon.com>

Co-authored-by: Swagat Bora <sbora@amazon.com>
2025-04-23 18:18:27 +00:00
Gao Xiang
2f9734fa59 erofs-differ: support EROFS native image layers
If the layer media type identifies an EROFS native layer (that is, it
ends with `.erofs`), copy the content as the layer blob.
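
As an illustration only (this is not the code from the commit), the suffix rule above could be sketched in Go as follows. The helper name `isNativeErofsLayer` and the first media type string are assumptions made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// isNativeErofsLayer is a hypothetical helper: it reports whether a layer
// media type names an EROFS native layer, using only the ".erofs" suffix
// rule described in the commit message.
func isNativeErofsLayer(mediaType string) bool {
	return strings.HasSuffix(mediaType, ".erofs")
}

func main() {
	// The first media type is a made-up example; the second is the
	// standard OCI gzip-compressed tar layer type.
	for _, mt := range []string{
		"application/vnd.example.layer.erofs",
		"application/vnd.oci.image.layer.v1.tar+gzip",
	} {
		fmt.Printf("%s native=%v\n", mt, isNativeErofsLayer(mt))
	}
}
```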

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2025-04-24 00:26:37 +08:00
Maksym Pavlenko
e511a384ee Add warning message when using async mode
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2025-04-22 09:27:45 -07:00
Maksym Pavlenko
89a8cd2fb8 Introduce no_sync option
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2025-04-22 09:27:45 -07:00
Maksym Pavlenko
57c1cfa5ff Update godoc for Bolt options
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2025-04-22 09:27:45 -07:00
Maksym Pavlenko
c94a92f422 Expose boltdb configuration for metadata plugin
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2025-04-22 09:27:42 -07:00
Derek McGowan
98eded24b8 Move erofsutils to internal
Avoid introducing a utils package outside of internal. This package
should not be imported by other modules.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-04-22 09:03:49 -07:00
Sebastiaan van Stijn
568880ec3e erofsutils: MountsToLayer slight optimizations
follow-up to 09f34d18b7

- Use strings.Cut instead of trimming prefixes and strings.Split
  to reduce allocations
- Use a switch for mount-type for slightly better readability
  than if / else if / else.
- Fix GoDoc to start with the function name.
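
A minimal sketch of the two techniques listed above, assuming overlay-style `key=value` mount options. The helpers `optionValue` and `describe` are hypothetical and are not the functions changed by this commit.

```go
package main

import (
	"fmt"
	"strings"
)

// optionValue is a hypothetical helper: it returns the value of the first
// "key=value" option matching key. strings.Cut splits each option once,
// without building the intermediate slice that strings.Split would allocate.
func optionValue(options []string, key string) (string, bool) {
	for _, o := range options {
		k, v, found := strings.Cut(o, "=")
		if found && k == key {
			return v, true
		}
	}
	return "", false
}

// describe switches on the mount type instead of chaining if / else if / else.
func describe(mountType string) string {
	switch mountType {
	case "erofs":
		return "single EROFS layer"
	case "overlay":
		return "overlay of several layers"
	default:
		return "unsupported mount type"
	}
}

func main() {
	v, ok := optionValue([]string{"ro", "lowerdir=/a:/b"}, "lowerdir")
	fmt.Println(v, ok)
	fmt.Println(describe("overlay"))
}
```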

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-04-19 17:04:23 +02:00
Fu Wei
4a51d8c8f7 Merge pull request #11603 from erofs/erofs-snapshotter
erofs-differ: implement fast differ with DiffDirChanges()
2025-04-19 13:28:46 +00:00
Gao Xiang
09f34d18b7 erofs-differ: implement fast differ with DiffDirChanges()
Unlike the walking differ, which implements a generic method to
accommodate all kinds of snapshotters, the EROFS differ is implemented
only for EROFS and the EROFS snapshotter, so it can use the recent
DiffDirChanges() [1] to avoid traversing the entire rootfs directory
and thereby improve `nerdctl commit` performance.

Additionally, I think `baseDir` is unnecessary too (in principle,
only `upperdir` is useful for OCI format convention).  However,
addressing this requires more work, so it is left as is for now.

It's also useful to implement a customized Compare() method for the
EROFS differ so that we can dump the native EROFS-formatted blob
to the content store later.

[1] https://github.com/containerd/continuity/pull/145
Signed-off-by: Gao Xiang <xiang@kernel.org>
2025-04-19 11:30:48 +08:00
Akihiro Suda
d9c889568e Remove the support for Schema 1 images
Schema 1 (`application/vnd.docker.distribution.manifest.v1+prettyjws`) has been
officially deprecated since containerd v1.7 (PR 6884), and disabled since v2.0 (PR 9765).

Users who have been seeing warnings like `conversion from schema 1 images is deprecated`
now have to rebuild the image with Schema 2 or OCI.

Schema 2 was introduced in Docker 1.10 (Feb 2016), so most users should already be
using Schema 2 or OCI.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2025-04-11 09:03:26 +09:00
Tonis Tiigi
f87b2c1cd8 avoid import to testing pkg outside of tests
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2025-04-07 23:58:28 -07:00
Henry Wang
a083b669c9 Set default differ for the default unpack config of transfer service
Signed-off-by: Henry Wang <henwang@amazon.com>
2025-04-01 22:39:49 +00:00
Samuel Karp
7bce8dfca5 Merge pull request #11581 from fuweid/fix-image-deletion-sync
*: CRIImageService should delete image synchronously
2025-03-24 21:44:44 +00:00
Wei Fu
e7b4165ab2 *: CRIImageService should delete image synchronously
Use memory service instead of metadata store.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2025-03-21 22:30:45 -04:00
Jin Dong
42effa3b91 Mark NetworkPluginBinDir as DEPRECATED
To mark it as DEPRECATED, this PR does the following:

1. Changes the config default to use `NetworkPluginBinDirs`;
2. Marks `NetworkPluginBinDir` as deprecated (in config version 3);
3. Adds a config migration from 2 to 3, which migrates `bin_dir`
  in version 2 to `bin_dirs` in version 3.
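
The migration in step 3 might look roughly like the sketch below. It assumes the CNI section of the config has been decoded into a `map[string]any`; the function name `migrateBinDir` and that representation are assumptions for illustration, not the code from this PR.

```go
package main

import "fmt"

// migrateBinDir is a hypothetical migration step: it moves a version 2
// `bin_dir` string into a version 3 `bin_dirs` list. An existing
// `bin_dirs` entry is left untouched.
func migrateBinDir(cni map[string]any) {
	v, ok := cni["bin_dir"]
	if !ok {
		return
	}
	delete(cni, "bin_dir")
	dir, isString := v.(string)
	if !isString || dir == "" {
		return
	}
	if _, exists := cni["bin_dirs"]; !exists {
		cni["bin_dirs"] = []string{dir}
	}
}

func main() {
	cni := map[string]any{"bin_dir": "/opt/cni/bin"}
	migrateBinDir(cni)
	fmt.Println(cni)
}
```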

Signed-off-by: Jin Dong <djdongjin95@gmail.com>

[wip] add deprecation warning

Signed-off-by: Jin Dong <djdongjin95@gmail.com>
2025-03-21 16:59:32 +00:00
Adrian Reber
9e6beafd53 Support container restore through CRI/Kubernetes
This implements container restore as described in:

https://kubernetes.io/blog/2022/12/05/forensic-container-checkpointing-alpha/#restore-checkpointed-container-standalone

For detailed step by step instruction also see contrib/checkpoint/checkpoint-restore-cri-test.sh

The code changes are based on changes I made in Podman around 2018
and CRI-O around 2020.

The history behind restoring containers via CRI/Kubernetes probably
requires some explanation. The initial proposal to bring
checkpoint/restore to Kubernetes looked at pod checkpoint and
restore and the corresponding CRI changes.

https://github.com/kubernetes-sigs/cri-tools/pull/662
https://github.com/kubernetes/kubernetes/pull/97194

After discussing this topic for about two years another approach was
implemented as described in KEP-2008:

https://github.com/kubernetes/enhancements/issues/2008

"Forensic Container Checkpointing" allowed us to separate checkpointing
from restoring. For the "Forensic Container Checkpointing" it is enough
to create a checkpoint of the container. Restoring is not necessary as
the analysis of the checkpoint archive can happen without restoring the
container.

While thinking about a way to restore a container, we started, by
coincidence, to look into restoring containers in Kubernetes via
Create and Start. The way it was done in CRI-O is to figure out during
Create whether the container image is a checkpoint image and, if so,
to take a different code path. The same is now implemented in
containerd with this change.

With this change it is possible to restore the container from a
checkpoint tar archive that is created during checkpointing via CRI.

To restore a container via Kubernetes we convert the tar archive to an
OCI image as described in the kubernetes.io blog post from above. Using
this OCI image it is possible to restore a container in Kubernetes.

At this point I think it should be doable to restore containers in
CRI-O and containerd regardless of whether they were created by
containerd or CRI-O. The biggest difference is the container metadata,
and that can be adapted during restore.

Open items:

 * It is not clear to me why restoring a container in containerd goes
   through task/Create(). But as the restore code already exists this
   change extended the existing code path to restore a container in
   task/Create() to also restore a container through the CRI via
   Create and Start.
 * Automatic image pulling. containerd does not pull images
   automatically if created via the CRI. There is an option in
   crictl to pull images before starting, but that uses the CRI
   image pull interface. It is still a separate pull and create
   operation. Restoring containers from an OCI image is a bit
   different. The checkpoint OCI image does not include the base
   image, but just a reference to the image (NAME@DIGEST).
   Using crictl with pulling will enable the pulling of the
   checkpoint image, but not of the base image the checkpoint is
   based on. So during preparation of the checkpoint containerd
   will automatically pull the base image, but I could not find out how
   to pull an image in a blocking way in containerd. So there is a for
   loop waiting for the container image to appear in the internal
   store. I think this probably can be implemented better.

Anyway, this is a first step towards container restore in Kubernetes
when using containerd.

Signed-off-by: Adrian Reber <areber@redhat.com>
2025-03-11 12:55:13 +01:00
Maksym Pavlenko
e1e88115dd Merge pull request #11450 from austinvazquez/dependabot/go_modules/go.etcd.io/bbolt-1.4.0
build(deps): bump go.etcd.io/bbolt from 1.3.11 to 1.4.0
2025-03-06 00:43:50 +00:00
Phil Estes
f35b7dae5f Merge pull request #11330 from ningmingxiao/blkdiscard
device mapper:fix sometimes blkdiscard doesn't have --version flags
2025-03-05 21:49:43 +00:00
Gao Xiang
3a5de731c5 erofs-snapshotter: clear IMMUTABLE_FL only for committed snapshots
Otherwise, the following error prevents snapshot GC:
level=warning msg="snapshot garbage collection failed" error="failed
to clear IMMUTABLE_FL: failed to open: open /var/lib/containerd/io.
containerd.snapshotter.v1.erofs/snapshots/2/layer.erofs: no such file
or directory

Fixes: b477cf8e97 ("erofs-snapshotter: protect layer blobs with FS_IMMUTABLE_FL")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2025-03-05 10:04:07 +08:00
Gao Xiang
971915797a erofs-snapshotter: force the use of loop devices for single-layer images
Currently, containerd cannot dynamically select between EROFS block
or file-based mounting approaches based on the specific runtime (or
the Linux kernel version of the runtime) due to its static mount
structure.

For example, the EROFS snapshotter fails on Linux 5.4 (Ubuntu 20.04)
with `bin/nerdctl run --net=host --snapshotter=erofs busybox:latest`:

FATA[0005] failed to mount {Type:erofs Source:/var/lib/containerd/
io.containerd.snapshotter.v1.erofs/snapshots/1/layer.erofs Target:
Options:[ro]} on "/tmp/initialC1374142795": block device required

Temporarily fix this by appending `-oloop` for single-layer images.
The upcoming mount manager will make it better [1].

[1] https://github.com/containerd/containerd/issues/11303
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2025-03-04 17:07:01 +08:00
Gao Xiang
b477cf8e97 erofs-snapshotter: protect layer blobs with FS_IMMUTABLE_FL
As documented in ioctl_iflags(2):
```
 FS_IMMUTABLE_FL
  The file is immutable: no changes are permitted to the file contents
  or metadata (permissions, timestamps, ownership, link count, and so
  on).  (This restriction applies even to the superuser.)
```

For example, no user can delete or move layer blobs when
FS_IMMUTABLE_FL is set:
``` sh
 # cd /var/lib/containerd/io.containerd.snapshotter.v1.erofs/snapshots/4
 # mv layer{,1}.erofs
 mv: cannot move 'layer.erofs' to 'layer1.erofs': Operation not permitted
 # rm layer.erofs
 rm: cannot remove 'layer.erofs': Operation not permitted
```

Note that it's a best-effort approach for data loss prevention.  In other
words, only a warning is emitted if FS_IMMUTABLE_FL cannot be set (e.g.,
due to lack of support in the underlying filesystem).

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2025-03-03 20:11:48 +08:00
Austin Vazquez
00cb735039 Swap to go.etcd.io/bbolt/errors for bbolt errors
Signed-off-by: Austin Vazquez <macedonv@amazon.com>
2025-02-27 19:06:13 +00:00
Akihiro Suda
aeebc01e42 Merge pull request #11352 from ChengyuZhu6/fsverity
erofs-snapshotter: add fsverity support
2025-02-25 03:08:40 +00:00
ChengyuZhu6
f3b6078f90 erofs-snapshotter: add fsverity support
Add fsverity support to erofs snapshotter to enable data integrity
verification for erofs layers:

- Add a config option `EnableFsverity` for erofs snapshotter
- Add fsverity verification during mount operations
- Enable fsverity on erofs layers during commit
- Add documentation for fsverity support in erofs snapshotter.
- Add TestErofsFsverity to verify fsverity enablement and data protection

The feature can be enabled via config.toml, such as:
```toml
[plugins.'io.containerd.snapshotter.v1.erofs']
    root_path = ''
    ovl_mount_options = []
    enable_fsverity = true
```

Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>
2025-02-25 10:33:26 +08:00
Kirtana Ashok
6c02321f6e Merge pull request #11179 from ambarve/blocked_cim
Support for importing layers in the block CIM format.
2025-02-24 22:21:10 +00:00
ningmingxiao
44baada6aa device mapper:fix sometimes blkdiscard doesn't have --version flags
Signed-off-by: ningmingxiao <ning.mingxiao@zte.com.cn>
2025-02-14 22:45:08 +08:00
Amit Barve
a1c540085f Support for importing layers in the block CIM format.
Adds a new diff plugin that can import image layers in the block CIM format using the new
block CIM layer writer added in hcsshim repo.

This commit also makes another important change in the way a diff is applied when using
CimFS based layer writers. Currently, the diff plugins call archive.Apply to apply a diff
and pass a function (that can actually apply the diff) as an argument (via
archive.ApplyOptions). This lets callers invoke archive.Apply with a custom applier
function; if the caller doesn't pass one, archive.Apply uses the default naive diff
applier.
However, there is a drawback to this approach. The applier function passed to the
`archive.Apply` call needs to follow a specific signature, which expects all parent
layers to be represented as an array of strings. In cases like CimFS, we can't
easily represent a set of layers as strings (unless we encode extra data in those strings
in a hacky way). To get around this problem, the diff plugins for CimFS based layers skip
the archive.Apply call and directly call the layer writer instead.
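
The pattern described above (an apply function that takes an optional applier and otherwise falls back to a default) can be sketched as below. Every name here (`applyFunc`, `WithApplier`, `Apply`, `naiveApply`) is hypothetical and simplified; these are not the actual signatures of containerd's archive package.

```go
package main

import (
	"errors"
	"fmt"
)

// applyFunc is a hypothetical applier signature. Parent layers are plain
// strings, which is the constraint the commit message calls out.
type applyFunc func(root string, parents []string) error

type applyOptions struct {
	applier applyFunc
}

// ApplyOpt mutates applyOptions; WithApplier installs a custom applier.
type ApplyOpt func(*applyOptions)

func WithApplier(f applyFunc) ApplyOpt {
	return func(o *applyOptions) { o.applier = f }
}

// naiveApply stands in for the default diff applier.
func naiveApply(root string, parents []string) error {
	if root == "" {
		return errors.New("empty root")
	}
	fmt.Printf("naive apply to %s over %d parent(s)\n", root, len(parents))
	return nil
}

// Apply uses the custom applier when one is passed, else the default.
func Apply(root string, parents []string, opts ...ApplyOpt) error {
	o := applyOptions{applier: naiveApply}
	for _, opt := range opts {
		opt(&o)
	}
	return o.applier(root, parents)
}

func main() {
	_ = Apply("/rootfs", []string{"/layers/1"})
	_ = Apply("/rootfs", nil, WithApplier(func(root string, parents []string) error {
		fmt.Println("custom applier for", root)
		return nil
	}))
}
```

Because the applier only ever receives `[]string`, a layer set that needs richer metadata cannot be passed through it without encoding tricks, which is why calling the layer writer directly is the simpler route.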

Signed-off-by: Amit Barve <ambarve@microsoft.com>
2025-02-10 14:10:37 -05:00
zouyee
b983786381 move the device after the options when using mkfs.ext4
Signed-off-by: zouyee <zouyee1989@gmail.com>
2025-02-08 16:07:51 +08:00
Derek McGowan
59c8cf6ea5 Merge pull request #10705 from erofs/erofs-snapshotter
[Feat] erofs snapshotter and differ
2025-02-05 15:21:33 +00:00
Jin Dong
168c49e4dc Fix state/root bug in shim sandbox controller
Signed-off-by: Jin Dong <djdongjin95@gmail.com>
2025-01-31 20:37:44 +00:00
Fu Wei
306c47f6e1 Merge pull request #10033 from ambarve/cimfs_layer_refactor
Update cimfs snapshotter & differ for new hcsshim interface
2025-01-22 19:49:36 +00:00
Phil Estes
98af40b752 Merge pull request #10722 from henry118/uidmap2
Support multiple uid/gid mappings [2/2]
2025-01-17 18:34:40 +00:00
Fu Wei
c6edc62db4 Merge pull request #10955 from mbaynton/userns-ovl-followup
Make ovl idmap mounts read-only
2025-01-14 21:14:47 +00:00
Gao Xiang
2f15d6586b Add tests for EROFS snapshotter
Some basic tests for now.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2025-01-13 16:31:21 +08:00
Gao Xiang
2486d542a5 Introduce EROFS Snapshotter
It allows us to mount each EROFS blob layer (generated by the EROFS
differ) independently, or use the "unpacked" fs/ directories (if
some other differ is used.)

Currently, it's somewhat like the overlay snapshotter, but I prefer
to keep the new EROFS logic in a self-contained component rather
than entangling it from the very beginning.

Existing users who use the overlay snapshotter won't be impacted
at all but they have a chance to use this new snapshotter to
leverage the EROFS filesystem.

Signed-off-by: cardy.tang <zuniorone@gmail.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2025-01-13 16:31:11 +08:00
Gao Xiang
c73c8e5d52 Introduce EROFS differ
The EROFS differ only applies to EROFS layers which are marked by
a special file `.erofslayer` generated by the EROFS snapshotter.

Why is it needed?  Since we'd like to parse []mount.Mount directly
without actual mounting and convert OCI layers into EROFS blobs,
`.erofslayer` gives a hint that the active snapshotter supports
the output blob generated by the EROFS differ.

I'd suggest reading it together with the next commit.

Signed-off-by: cardy.tang <zuniorone@gmail.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2025-01-13 16:16:54 +08:00
Jin Dong
fb44e37ff2 Remove confusing warning in cri runtime config migration
Signed-off-by: Jin Dong <djdongjin95@gmail.com>
2025-01-12 02:38:14 +00:00
Amit Barve
b81ace8724 Update cimfs snapshotter & differ for new hcsshim interface
hcsshim recently [updated](microsoft/hcsshim@1d406d0) the interface of APIs that are used
for importing OCI layers. It now expects that the CimFS snapshotter mounts contain the
full cim paths for parent layers. This change updates the cimfs differ & snapshotter to
use that new interface.

Signed-off-by: Amit Barve <ambarve@microsoft.com>
2025-01-10 17:06:57 -05:00
Maksym Pavlenko
3871c2f265 Merge pull request #11165 from djdongjin/fix-cri-image-snapshotter-loading
Fix runtime platform loading in cri image plugin init
2025-01-10 20:37:22 +00:00
Kazuyoshi Kato
5ad6a150b6 Merge pull request #11189 from djdongjin/move-to-go-native-fuzz
Move fuzz tests to go native fuzz [part1]
2025-01-10 01:27:14 +00:00