90 Commits

Author SHA1 Message Date
Austin Vazquez
88af11e081 core/runtime/v2: fix race on Windows deferredPipeConnection.c in Read
Read short-circuited on `if dpc.c == nil` before calling
`dpc.wg.Wait()` which races with the dialer goroutine spawned in
openShimLog. The dialer assigns `dpc.c = c` (and may set `dpc.conerr`)
outside any lock; the only synchronization is the WaitGroup, and Read
skipped it on the fast path.

Signed-off-by: Austin Vazquez <austin.vazquez@docker.com>
2026-05-21 17:15:58 -05:00
Maksym Pavlenko
3e0ebf0f6d Deprecate shim.Command
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2026-04-29 10:51:55 -07:00
Samuel Karp
f7150e2215 Merge pull request #13126 from dmcgowan/handle-mount-already-exists
Fix mount manager activation error when already exists
2026-04-24 16:16:40 +00:00
Davanum Srinivas
c30f23452c cri: use upstream Kubernetes modules
Switch the CRI integration layer from containerd's forked Kubernetes helpers
and clients to the upstream Kubernetes modules, and finalize the dependency
update to Kubernetes v0.36.0.

Replace the remaining internal helper copies with upstream packages:
- internal/cri/clock -> k8s.io/utils/clock
- internal/cri/executil -> upstream CRI exec helpers
- internal/cri/resourcequantity -> k8s.io/apimachinery/pkg/api/resource
- internal/cri/setutils -> k8s.io/apimachinery/pkg/util/sets
- internal/cri/types/labels.go -> internal/cri/labels
- integration/cri-api/pkg/apis/services.go -> k8s.io/cri-api/pkg/apis/services.go

Adopt the upstream CRI clients directly:
- add k8s.io/cri-client v0.36.0, k8s.io/cri-streaming v0.36.0, and
  k8s.io/streaming v0.36.0 as direct dependencies
- promote k8s.io/utils to a direct dependency and pull in
  k8s.io/component-base v0.36.0 indirectly
- keep integration/remote as a thin containerd adapter around cri-client,
  because the integration tests still need the stream-shaped
  GetContainerEvents RPC

Finalize the Kubernetes dependency update from v0.36.0-rc.0 to v0.36.0,
refresh vendor/, and drop the obsolete internal utility copies.

Also fix the protobuf MessageState mutex-copy vet failures exposed by the new
APIs and close the temporary integration CRI clients explicitly.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2026-04-23 12:59:58 +02:00
Derek McGowan
80ec03fe7e core/mount: Fix mount manager activation error when already exists
Correctly handle cases where the mount activation still exists:
- If activation is fully activate, then just return already exists and
  allow the caller to return error or call Info to continue.
- If activation is stale or incomplete due to crash during activation,
  overwrite the identifier and cleanup the incomplete activation during
  activate.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2026-04-20 18:04:16 -07:00
Derek McGowan
e07a1aa491 Add configuration for socket directory to the shim manager
Allow the socket directory to be directly configured by the shim manager
with reasonable defaults when not set. The default for root users will
still be the same directory under the default state directory. For
non-root users a temp directory will be used as default if the state
directory is not owned by the user.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2026-04-15 00:21:29 -07:00
Derek McGowan
d806373feb Make shim socket directory use configured state
Send the socket directory from containerd to the shim. The shim still
decides where the socket goes but can use the environment variable
passed from containerd to ensure the socket is placed in the configured
directory with proper permission.

This is needed for some rootless cases which do not have permission to
the default state directory as currently set. The directory being
hardcoded by the shim means it is currently not possible to change the
location the shim will listen at.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2026-04-15 00:21:18 -07:00
Esteban Ginez
01e5fa616f fix: address review feedback on awaitPipeReady
- Use time.NewTimer + Stop() instead of time.After to avoid timer leaks
- Treat context.DeadlineExceeded as retryable (pipe busy, not just missing)
- Wrap last dial error instead of os.ErrNotExist for better diagnostics
- Update makeConnection godoc to reflect current BootstrapResult type

Signed-off-by: Esteban Ginez <esteban.ginez@docker.com>
2026-04-09 15:15:32 -07:00
Esteban Ginez
1e98ebaf0e fix(windows): verify pipe readiness before returning shim address
The shim "start" helper returns the named pipe address before the
daemon process has created the pipe via winio.ListenPipe(). On busy
Windows systems, containerd may try to connect before the pipe exists.

Add awaitPipeReady() — the start helper now polls the pipe address
(up to 5s, 10ms intervals) before writing the bootstrap result to
stdout. This follows hcsshim's readiness pattern where the shim
verifies its endpoint is ready before signaling the parent.

As a safety net, also parameterize makeConnection() with a dialer so
binary.Start() uses AnonDialer (retry) for new shims while loadShim()
keeps AnonReconnectDialer (fail-fast) for reconnects per #3659.

On Unix, awaitPipeReady() is a no-op: domain sockets appear atomically.

Signed-off-by: Esteban Ginez <esteban.ginez@docker.com>
2026-04-09 15:07:06 -07:00
Maksym Pavlenko
3c0e8a55b6 Update comments wording about when to deprecate and remove the old path
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2026-04-08 08:29:28 -05:00
Maksym Pavlenko
9dc864fd0f Switch to proto instead of json
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2026-04-08 08:29:28 -05:00
Derek McGowan
243cab594e Deprecate old pkg/shim interfaces
Signed-off-by: Derek McGowan <derek@mcg.dev>
2026-04-08 08:29:28 -05:00
Maksym Pavlenko
d957b1bf53 Use log level instead of debug flag
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2026-04-08 08:29:28 -05:00
Maksym Pavlenko
fa02acee20 Generate shim CLI flags under Command
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2026-04-08 08:29:28 -05:00
Maksym Pavlenko
fc8062f379 Rename CommandConfig field to better reflect their purpose
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2026-04-08 08:29:28 -05:00
Maksym Pavlenko
7f39b2d933 Update shim to support new bootstrap api
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2026-04-08 08:29:28 -05:00
Jintao Zhang
22e6e1541c Add binary-v2 logging readiness scheme
Expose log URI schemes in runtime v2 metadata

Signed-off-by: Jintao Zhang <zhangjintao9020@gmail.com>
2026-04-05 13:41:15 +00:00
Maksym Pavlenko
169e000389 Move runtime v2 docs to ./docs
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2026-04-04 14:10:29 -07:00
Fletcher Woodruff
b32cecd318 fix: handle nil spec for hostNetwork containers
When running hostNetwork containers with the shim sandboxer, the spec
passed to NewBundle is never initialized to a typed nil and is instead a
bare golang nil. Fix this by guarding the dereference to avoid crashing
when such a container is created.

Signed-off-by: Fletcher Woodruff <fwood@amazon.com>
2026-03-26 20:06:00 +00:00
Sebastiaan van Stijn
9bdcacc45f *: modernize: waitgroup
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2026-03-17 13:36:17 +01:00
Sebastiaan van Stijn
92b0b289e3 core: modernize: any
go install golang.org/x/tools/go/analysis/passes/modernize/cmd/modernize@latest
    modernize -any -fix ./...

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2026-03-15 18:24:14 +01:00
Fu Wei
591de24057 Merge pull request #12821 from haytok/propagate-deadline-exceeded-error
fix: propagate `context deadline exceeded` error properly
2026-02-21 21:58:31 +00:00
Hayato Kiwata
f078cebbd1 fix: propagate context deadline exceeded error properly
When a shim becomes unresponsive (e.g., stopped via SIGSTOP), ttrpc
communication times out with `context deadline exceeded`.

Currently, this error is not properly propagated, causing redundant API
calls and slow container listing by client sides.

Specifically, when executing the API to check the task state, it appears
that the `context deadline exceeded` error via ttrpc is not being handled
within `shimTask.State()` and `getProcessState()`.

As a result, when this error occurs, clients such as nerdctl cannot
recognize this error, and it is thought that the issue described below is
occurring:

- https://github.com/containerd/nerdctl/issues/4720

Therefore, this commit adds error handling to ensure timeouts are properly
handled by client sides.

Signed-off-by: Hayato Kiwata <dev@haytok.jp>
2026-01-26 22:58:28 +09:00
Maksym Pavlenko
992597bfe9 Fix TTRPC prefix
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2026-01-15 13:06:50 -08:00
Maksym Pavlenko
253fbe756a Cleanup unit tests
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2026-01-06 10:52:49 -08:00
Kal
5b184601d3 Fix typo in README.md for shim author section
Signed-off-by: Kal <119035751+Kaleab-Ayenew@users.noreply.github.com>
2025-12-16 15:20:25 +03:00
Sebastiaan van Stijn
459a95287b core/runtime/v2: remove uses of otelgrpc.UnaryClientInterceptor
The otelgrpc.UnaryClientInterceptor and otelgrpc.StreamClientInterceptor
options were deprecated and removed in favor of NewClientHandler.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-12-01 14:32:11 +01:00
Akihiro Suda
910171e90e Fix directory permissions
- Create /var/lib/containerd with 0o700 (was: 0o711).
- Create config.TempDir with 0o700 (was: 0o711).
- Create /run/containerd/io.containerd.grpc.v1.cri with 0o700 (was: 0o755).
- Create /run/containerd/io.containerd.sandbox.controller.v1.shim with 0o700 (was: 0o711).
- Leave /run/containerd and /run/containerd/io.containerd.runtime.v2.task created with 0o711,
  as required by userns-remapped containers.
  /run/containerd/io.containerd.runtime.v2.task/<NS>/<ID> is created with:
  - 0o700 for non-userns-remapped containers
  - 0o710 for userns-remapped containers with the remapped root group as the owner group.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2025-11-04 16:04:30 +09:00
Akihiro Suda
2e0e47c472 Deprecate cgroup v1
For issue 12443

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2025-10-31 19:34:16 +09:00
Derek McGowan
d9d3c82237 Cleanup load shim info
Avoid unnecessary new configuration and instead cache the info result
when first used.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-10-29 13:04:18 -07:00
Fu Wei
e95415b723 Merge pull request #12333 from dmcgowan/erofs-block-files
Update erofs snapshotter to use mount manager
2025-10-20 02:43:50 +00:00
Derek McGowan
92bc4fadc8 Update task manager to use mount transformers
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-10-14 22:53:43 -07:00
ningmingxiao
932b65a492 restart:use goroutine to speedup loadShims
Signed-off-by: ningmingxiao <ning.mingxiao@zte.com.cn>
2025-10-09 09:48:09 +08:00
Derek McGowan
f3af360f6d Update runtime to read allowed custom mounts from runtime
Let the runtime specify which custom mounts it will support

Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-09-30 13:37:11 -07:00
Derek McGowan
efc9950114 Add runtimes option to task manager
Allow task manager to fetch info on runtimes at startup.
Use this info to configure whether the runtime allows formatted mounts.
This info could also be used in the future to enforce policy such as
requiring a pre-known set of runtimes or specific runtime properties.

Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-09-29 17:08:38 -07:00
Derek McGowan
39f128b994 fix invalid deactivate error
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-09-29 17:08:35 -07:00
Derek McGowan
76a877bb5c Add mount activation support to task service
Signed-off-by: Derek McGowan <derek@mcg.dev>
2025-09-29 17:08:35 -07:00
Swagat Bora
7340a7a28f fix: create bootstrap.json with 0644 permission
Signed-off-by: Swagat Bora <sbora@amazon.com>
2025-08-06 22:29:37 +00:00
Derek McGowan
4ccee60e57 Merge pull request #12030 from thaJeztah/denoise_cleanupWorkDirs
core/runtime/v2: shimManager.cleanupWorkDirs ignore non-existing path
2025-07-14 23:45:29 +00:00
Sebastiaan van Stijn
bf1c47f5ed core/runtime/v2: shimManager.cleanupWorkDirs ignore non-existing path
While testing some things in moby, which involved removing storage paths,
I noticed some errors being printed for non-existing paths. As this function
is meant to delete the directories, it should probably be fine to consider it
idempotent (as further processing in the function also does).

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-06-26 21:33:34 +02:00
yashsingh74
b529973720 fix: ST1019: removed the duplicate imports
Signed-off-by: yashsingh74 <yashsingh1774@gmail.com>
2025-06-18 14:16:41 +05:30
Mike Brown
d123295973 Merge pull request #11886 from ningmingxiao/err_log
shim-v2:improve shim error message
2025-06-09 01:52:12 +00:00
Wei Fu
1ac97c2c13 *: properly shutdown non-groupable shims to prevent resource leaks
Previously, to address issue #11708, PR #11793 changed containerd to always
invoke the shim binary to establish shim connections, rather than reusing the
sandbox shim. However, this change did not ensure that the Shutdown API was
called to stop the shim process.

Starting with containerd v2.0.0, the Shutdown API is only invoked for sandbox
containers (when container.SandboxID is empty). This approach works for
groupable shims, where multiple containers share a single socket address and
only require a single Shutdown call. However, for non-groupable shims, each
container requires its own Shutdown call during cleanup to avoid leaking shim
processes.

Additionally, PR #11793 introduced a corner case during upgrades:
- T1: An old container-shim-runc-v2 (<=v1.7.X) is running for pod A.
- T2: containerd is upgraded to v2.X.Y.
- T3: A new container A-C1 is created in pod A using the new shim-runc-v2 binary.
- T4: bootstrap.json indicates version:3 protocol, but it is downgraded to version:2 in memory.
- T5: containerd is restarted.
- T6: containerd fails to connect to A-C1.
- T7: The A-C1 container is left in EXITED status in the CRI plugin.

To address this, ensure that loadShimTask downgrades to version:2 if necessary,
and always invoke the Shutdown API for each non-groupable shim during cleanup to
prevent resource leaks and handle upgrade scenarios correctly.

(Introduced by #11793)

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2025-06-02 00:01:08 -04:00
ningmingxiao
811d044229 shim-v2:improve shim error message
Signed-off-by: ningmingxiao <ning.mingxiao@zte.com.cn>
2025-05-22 23:20:44 +08:00
Phil Estes
315e0005e4 Merge pull request #11815 from thaJeztah/no_warn_logs
core/runtime/v2: cleanup shim-cleanup logs
2025-05-08 16:23:52 +00:00
Sebastiaan van Stijn
ada7bdf19f replace "cleanup.Background" for "context.WithoutCancel"
the `cleanup.Background` utility was introduced in f606c4eba7,
at which time the project used go1.19, and Go's stdlib context did not yet
have [`context.WithoutCancel`], which was introduced in go1.21.

This patch replaces `cleanup.Background` for `context.WithoutCancel`, which
is near-identical, and part of go stdlib;

`cleanup.Background`:

    type clearCancel struct {
        context.Context
    }

    func (cc clearCancel) Deadline() (deadline time.Time, ok bool) {
        return
    }

    func (cc clearCancel) Done() <-chan struct{} {
        return nil
    }

    func (cc clearCancel) Err() error {
        return nil
    }

    // Background creates a new context which clears out the parent errors
    func Background(ctx context.Context) context.Context {
        return clearCancel{ctx}
    }

`context.WithoutCancel`:

    // WithoutCancel returns a derived context that points to the parent context
    // and is not canceled when parent is canceled.
    // The returned context returns no Deadline or Err, and its Done channel is nil.
    // Calling [Cause] on the returned context returns nil.
    func WithoutCancel(parent Context) Context {
        if parent == nil {
            panic("cannot create context from nil parent")
        }
        return withoutCancelCtx{parent}
    }

    type withoutCancelCtx struct {
        c Context
    }

    func (withoutCancelCtx) Deadline() (deadline time.Time, ok bool) {
        return
    }

    func (withoutCancelCtx) Done() <-chan struct{} {
        return nil
    }

    func (withoutCancelCtx) Err() error {
        return nil
    }

    func (c withoutCancelCtx) Value(key any) any {
        return value(c, key)
    }

    func (c withoutCancelCtx) String() string {
        return contextName(c.c) + ".WithoutCancel"
    }

[`context.WithoutCancel`]: https://pkg.go.dev/context@go1.21.0#WithoutCancel

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-05-07 12:34:59 +02:00
Sebastiaan van Stijn
e58bc944f7 core/runtime/v2: cleanup shim-cleanup logs
I noticed that the "cleaning up after shim disconnected" was logged as a
warning, but I couldn't find a reason why it was considered a warning (
and didn't see an error attached to the log);

    INFO[2025-05-07T08:57:56.576773500Z] ignoring event                                container=589c33c88591a9936cb8fd64453e4b0c1357ae2f79d975c34645ff8f0aa10485 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
    INFO[2025-05-07T08:57:56.576938958Z] shim disconnected                             id=589c33c88591a9936cb8fd64453e4b0c1357ae2f79d975c34645ff8f0aa10485 namespace=moby
    WARN[2025-05-07T08:57:56.577356208Z] cleaning up after shim disconnected           id=589c33c88591a9936cb8fd64453e4b0c1357ae2f79d975c34645ff8f0aa10485 namespace=moby
    INFO[2025-05-07T08:57:56.577393417Z] cleaning up dead shim                         namespace=moby

I think this must've been set as a warning by acccident, so changing
this log to an "info" log as it seems part of the normal flow or operations.

While updating, also updating related logs, such as the "cleaning up dead shim"
log, to include the id, so that these logs can more easily be correlated.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2025-05-07 12:27:16 +02:00
yylt
5dc29f0e7e core/runtime: should invoke shim binary
Signed-off-by: yang yang <yang8518296@163.com>
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2025-05-06 13:20:42 -04:00
QiPing Wan
be9ca11a14 fix call fmt.Errorf with wrong error
Signed-off-by: QiPing Wan <alingse@foxmail.com>
2025-04-08 09:09:40 +08:00
Fu Wei
c07fb699a9 Merge pull request #11475 from yylt/ctx
fix: call checkCopyShimLogError(shimCtx) to avoid expected error log flood
2025-03-28 19:24:00 +00:00