1400 Commits

Author SHA1 Message Date
Michael Vogt
df12d58f8b hostname: add $ hostname substitution and petnames
This commit adds support to /etc/hostname for substitution
of $ wordlists from {/etc,/run,/usr/lib}/systemd/hostname-wordlist.
The first $ will lookup hostname-wordlist/1, the next
hostname-wordlist/2 and so on.

With that we can do a petname [1] style hostname in systemd, e.g.
below a possible expansion for a hostname template:

    $-$-$-????  ->  wildly-happy-octopus-92a9

The substitution of words is stable (based on machine-id) but
not persisted, it is picked on every boot via a stable file
offset so the operation is cheap. But this means that if the
wordlist changes the hostname would change. The next commit
will add the pattern to the firstboot.hostname credential which
is persisted with the resolved names to avoid this issue.
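
As a rough illustration of the stable-but-not-persisted expansion described above, the following is a minimal sketch and not the code from this commit. It assumes the word lists are already loaded into memory and uses a SHA-256 digest of the machine ID to pick an index, whereas the commit picks a stable file offset. The function name `expand_hostname` and the `wordlists` mapping are hypothetical.

```python
import hashlib

def expand_hostname(template, machine_id, wordlists):
    """Expand '$' with words and '?' with hex digits, deterministically.

    wordlists maps 1, 2, 3, ... to lists of words (hypothetical stand-in
    for hostname-wordlist/1, hostname-wordlist/2, ...).
    """
    out = []
    list_no = 0
    for pos, ch in enumerate(template):
        # Per-position digest: every placeholder gets its own stable value.
        digest = hashlib.sha256(f"{machine_id}:{pos}".encode()).digest()
        if ch == "$":
            list_no += 1  # first '$' uses list 1, the next one list 2, ...
            words = wordlists[list_no]
            out.append(words[int.from_bytes(digest[:8], "big") % len(words)])
        elif ch == "?":
            out.append(format(digest[0] & 0xF, "x"))  # one hex digit
        else:
            out.append(ch)
    return "".join(out)
```

The same machine ID and word lists always give the same name, while changing a word list can change the result, which is the instability the paragraph above mentions.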

This also includes a wordlist from the "petname" project
that can be optionally installed.

Thanks to Dustin Kirkland for this wonderful project.

[1] https://github.com/dustinkirkland/petname
2026-06-20 14:26:41 +02:00
Daan De Meyer
27c5a2f2d7 docs: Update AI usage policy
The previous policy was primarily written from a standpoint
that AI models are not very good and we didn't wanna waste any
time reviewing PRs generated by AI. Now that AI models have become
actually good and their output is just as good as regular contributions,
let's stop requiring the disclosure as it's pointless to still have it,
it doesn't really matter anymore whether a patch was written with or without
AI. It's up to the author to make sure they're not wasting our time by 
submitting unreviewed, untested code upstream, regardless of whether that
code is written by an AI or not.

The new policy is inspired by https://github.com/lxc/incus/pull/3506, with
various removals to be less averse to the usage of AI.
2026-06-17 09:59:34 +00:00
Jordan Petridis
84d10a9004 docs/INHIBITOR_LOCKS: Update sentence for the new mode added
804874d26a added a new mode but the
sentence wasn't updated and it was still stating that there are Two modes
instead.
2026-06-06 13:44:49 +01:00
Valentin David
856ab04a29 units: Run systemd-pcrnvdone in initrd
The measurement that systemd-pcrnvdone performs corresponds to
`src/pcrlock/pcrlock.d/770-nvpcr-separator.pcrlock`, and 770 is supposed to
happen in the initrd (which ends at 800).
2026-06-04 11:23:59 +01:00
Zbigniew Jędrzejewski-Szmek
84389ff656 docs: say that the github form is preferred for security vulnerabilities 2026-06-01 15:06:20 +01:00
Luca Boccassi
36896a27b1 docs: specifically mention that braces in if blocks do not need to be symmetric
The claude bot keeps getting this wrong again and again:

  Claude: nit: systemd coding style requires braces on both branches of
               an if/else when one branch uses them. Here the if branch
               is a single statement without braces but the else branch
               uses braces

Specifically mention this is not the case in the coding style doc
to hopefully make it stop hallucinating this rule
2026-06-01 10:55:40 +01:00
Zbigniew Jędrzejewski-Szmek
58e7959988 Add $SYSTEMD_INVOKED_AS
This allows multi-call binaries to be easily invoked with a different
name. After installation, the name is set by creating a symlink. But in
build directories, we don't create the symlinks. (There are also other
ways to achieve the same thing, e.g. zsh supports $ARGV0, and exec -a
can be used, but those are either non-portable or are more complicated
to use.)  The primary use-case for me is to test --help output for
multicall binaries.

Also reorder the help for env vars to group the more generic ones near
the top.

This was initially proposed in https://github.com/systemd/systemd/pull/24054,
but there were some comments about the implementation. I had a branch
with the patch, but I don't think I ever actually submitted it as a
pull request.
2026-05-25 12:37:36 +02:00
Lennart Poettering
b7ad539c7d docs: document new measurement 2026-05-22 12:15:15 +02:00
Lennart Poettering
c3d8c2d25c docs: document the new smbios measurements 2026-05-21 21:44:06 +01:00
Lennart Poettering
89c837a5f8 docs: document the .nvpcr file format superficially 2026-05-21 16:31:51 +02:00
Luca Boccassi
20615d2d47 core: add FileDescriptorStorePreserve=on-success option (#42160)
Currently with FileDescriptorStorePreserve=yes the FD store is kept around
regardless of what happens to a unit, which is useful in many cases. But in
some cases, for example when complex services crash horribly, it's hard to
reason about what was in the intermediate state, and it's better to start
fresh.

Add a new 'on-success' option for the FileDescriptorStorePreserve= setting
that keeps it around only for as long as the unit doesn't go to a persistently
failed state.

This is especially useful in combination with LUO, where we don't want to
keep around LUO sessions created by units that then proceeded to crash and
burn, and might be in a bad state afterwards.
2026-05-21 12:08:33 +01:00
Lennart Poettering
9c2d331f5b docs: extend credentials docs with notes about imds 2026-05-21 10:51:59 +01:00
Luca Boccassi
67dea1409b core: add FileDescriptorStorePreserve=on-success option
Currently with FileDescriptorStorePreserve=yes the FD store is kept around
regardless of what happens to a unit, which is useful in many cases. But in
some cases, for example when complex services crash horribly, it's hard to
reason about what was in the intermediate state, and it's better to start
fresh.

Add a new 'on-success' option for the FileDescriptorStorePreserve= setting
that keeps it around only for as long as the unit doesn't go to a persistently
failed state.

This is especially useful in combination with LUO, where we don't want to
keep around LUO sessions created by units that then proceeded to crash and
burn, and might be in a bad state afterwards.
2026-05-21 10:14:08 +01:00
Daan De Meyer
74d392ed1b tree-wide: standardize header names across src/fundamental, src/basic and src/shared
Drop the -fundamental suffix from src/fundamental/ headers in favor of names
that match their src/basic/ or src/shared/ counterparts (e.g.
macro-fundamental.h -> macro.h, assert-fundamental.h -> assert-util.h,
cleanup-fundamental.h -> cleanup-util.h). Rename src/basic/{btrfs,label}.{c,h}
to use the -util suffix to match the existing shared/btrfs-util and
shared/label-util siblings. Rename src/shared/mkdir-label.{c,h} to mkdir.{c,h}
and src/shared/tmpfile-util-label.{c,h} to tmpfile-util.{c,h} to match the
corresponding src/basic names.

This saves us from having to come up with separate names for files that do
the same thing across tiers, and it makes it easier to move stuff between
src/fundamental, src/basic and src/shared: consumers just #include "foo.h"
and pick up whichever tier their -I path resolves to first, so call sites
don't need to be updated when an API moves between layers.

Where a higher-tier wrapper exists (e.g. src/basic/macro.h wrapping
src/fundamental/macro.h), the wrapper uses an explicit "../fundamental/foo.h"
or "../basic/foo.h" relative include for the lower-tier header. We can't use
GCC's #include_next directive for this — when the wrapper is reachable both
via same-dir-as-source lookup and via -I (e.g. -Isrc/shared) for the
directory it lives in, #include_next advances by exactly one slot in libcpp's
internal directory chain and lands on the same physical directory it was
already in, never reaching the lower-tier sibling (see make_cpp_dir() in
gcc/libcpp/files.cc:1986).

To make sure the right headers are always picked up, the include directories
are reordered so that e.g. src/shared always takes priority over src/basic and
similar for the other directories.

Co-developed-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 10:33:03 +09:00
Luca Boccassi
4a82fc67c6 homed/fscrypt: add new xattr format hardening key sealing
The current key sealing format has some less-than-ideal weaknesses:

- PBKDF2 with only 65k iterations, where recommendations are ~200k
- AES with null IV, relying on salt for uniqueness
- lack of AES MAC/AEAD

However improbable, it is at least theoretically possible that with
a lot of resources an offline bruteforce could be attempted.

Add a v2 sealing format, keeping unsealing compatibility with
the current format:

 v2:<iterations>:<salt>:<IV>:<ciphertext>:<aes tag>

and use 600k iterations for the PBKDF2 sha512
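
To make the v2 layout concrete, here is a minimal sketch that splits a sealed string into its fields and derives a key with PBKDF2-HMAC-SHA512. It is not the homed implementation: the base64 field encoding, the 32-byte key length and the names `parse_v2_sealed` and `derive_key` are assumptions for illustration, and the AES decryption step is left out.

```python
import base64
import hashlib

def parse_v2_sealed(sealed):
    """Split 'v2:<iterations>:<salt>:<IV>:<ciphertext>:<aes tag>' into fields."""
    version, iterations, salt, iv, ciphertext, tag = sealed.split(":")
    if version != "v2":
        raise ValueError("not a v2 sealed key")
    return {
        "iterations": int(iterations),
        # Assumption: binary fields are base64-encoded.
        "salt": base64.b64decode(salt),
        "iv": base64.b64decode(iv),
        "ciphertext": base64.b64decode(ciphertext),
        "tag": base64.b64decode(tag),
    }

def derive_key(password, salt, iterations=600_000, length=32):
    """Derive a key with PBKDF2-HMAC-SHA512 (key length is an assumption)."""
    return hashlib.pbkdf2_hmac("sha512", password, salt, iterations, dklen=length)
```

Storing the iteration count in the string itself means the count can be raised later without breaking the unsealing of existing keys.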
2026-05-19 13:35:19 +01:00
Luca Boccassi
c10c2674f3 LUO: add support for preserving third party sessions
LUO sessions cannot be nested under other sessions. This means we need
to handle them explicitly, and hold them open in the shutdown binary
like we do with our own internal session, to allow services to create
their own.

The requirement to support third party sessions comes from VMMs that
wish to preserve VM(s) state(s) across kexec, as some file descriptors
(KVM's vmfd from the KVM_CREATE_VM ioctl) cannot be transferred between
processes via SCM_RIGHTS, so they cannot be stashed in the FD Store
directly. Also some file descriptors have to be handled all together or
not at all, again to do with KVM and devices that are all part of the
same vm.
2026-05-15 13:46:08 +01:00
Luca Boccassi
0854ac48ef docs: document LUO support 2026-05-15 13:46:08 +01:00
Luca Boccassi
5608575e01 core: propagate FDs from store from user to system manager
In order to allow FD Stores of user units to survive a user
session restart, propagate FDs received via the protocol up one
level from user to system manager via sd_notify.

And the other way around, propagate them down via LISTEN_FDS
tagging them with the unit name so that the child manager can
inject them in the appropriate unit.

Ensure units that are dead or not loaded can get FDs added to
their stores, and that they are correctly propagated once the
unit is started or loaded. When the unit is not loaded we don't
know what the FD max limit is, so simply increase it for each FD
injected, and then when the unit is realised prune it down to
match the unit's now available config in case the limit is lower
than the number of FDs in the store.

Each FD sent up or down is assigned a monotonic index, and the manager
also sends a JSON map that associates the index with the original
unit and FDNAME:

 {
   "unit-name.service": [
     { "name": "fdname1", "index": 1 },
     { "name": "fdname2", "index": 2 }
   ],
   ...
 }

This allows the manager to assign back the FDs to the appropriate
unit using the appropriate name, given the FDNAMEs are not unique.
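
A receiving manager could use that map roughly as follows. This is a minimal sketch, assuming the FDs have already been received and are available as a dictionary from index to file descriptor number; the function name `assign_fds` and the `received` argument are hypothetical.

```python
import json

def assign_fds(mapping_json, received):
    """Return {unit: [(fdname, fd), ...]} from the JSON map and received FDs.

    received maps the monotonic index to a file descriptor number.
    """
    mapping = json.loads(mapping_json)
    result = {}
    for unit, entries in mapping.items():
        for entry in entries:
            fd = received.get(entry["index"])
            if fd is None:
                continue  # index announced in the map but no FD arrived
            result.setdefault(unit, []).append((entry["name"], fd))
    return result
```

Keying on the index rather than on FDNAME is what lets two FDs with the same name end up in the right units.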
2026-05-15 13:46:08 +01:00
glemco
43b53679da cgroup: Add CPUSetPartition= setting
Add support for configuring cpuset partition type via the
CPUSetPartition= unit file setting. This controls the kernel's
cpuset.cpus.partition cgroup attribute.

The setting takes one of "member", "root", or "isolated". This is
useful for real-time workloads that require dedicated CPU resources
without interference from other processes.

When set, systemd will write the partition type to the
cpuset.cpus.partition cgroup file. If the kernel rejects the value
(e.g., due to partition hierarchy rules), a warning is logged and the
unit continues with the kernel's default partition type.

Co-developed-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2026-05-14 08:55:44 +02:00
Daan De Meyer
7b9d76cba7 boot,vconsole: Propagate UEFI HII keyboard layout to the OS
UEFI firmware can report the currently-active keyboard layout via
EFI_HII_DATABASE_PROTOCOL.GetKeyboardLayout(). The layout descriptor
includes an RFC 4646 / BCP 47 language tag (e.g. "en-US"). Query this
from sd-boot/sd-stub and write it to a new LoaderKeyboardLayout EFI
variable, advertised through a new EFI_LOADER_FEATURE_KEYBOARD_LAYOUT
feature bit.

On the OS side, systemd-vconsole-setup reads the variable as a
lowest-priority fallback for the console keymap. To map the BCP 47
tag to a vconsole keymap we extend /usr/share/systemd/kbd-model-map
with an optional sixth column listing the comma-separated BCP 47 tags
each row covers; a new find_vconsole_keymap_for_bcp47() helper walks
the file, preferring an exact tag match and otherwise falling back to
the row whose tag matches the input's primary subtag. Credentials,
/etc/vconsole.conf, and vconsole.keymap= on the kernel command line
continue to take precedence.
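
The lookup order can be sketched as below. This is an illustration only, not the find_vconsole_keymap_for_bcp47() helper itself: the rows are passed in as already-parsed (keymap, tags) pairs, the comparison is case-insensitive, and the fallback compares primary subtags on both sides, all of which are assumptions.

```python
def find_keymap_for_bcp47(rows, tag):
    """rows: list of (keymap, [bcp47 tags]); returns a keymap name or None."""
    wanted = tag.lower()
    primary = wanted.split("-")[0]
    fallback = None
    for keymap, tags in rows:
        for candidate in (t.lower() for t in tags):
            if candidate == wanted:
                return keymap  # an exact tag match always wins
            if fallback is None and candidate.split("-")[0] == primary:
                fallback = keymap  # remember the first primary-subtag match
    return fallback
```

With hypothetical rows for "en-US" and "en-GB", an input of "en-GB" resolves to the second row, while "en-AU" falls back to the first row that shares the "en" primary subtag.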

bootctl status surfaces the new variable, printing the language tag
or "n/a (not reported by firmware)" when sd-boot advertises the
feature but the firmware HII database didn't expose a layout (common
on QEMU without a USB keyboard, since EDK2's PS/2 driver does not
register an HII keyboard layout).
2026-05-11 21:10:11 +02:00
Daan De Meyer
012d87c1fc vmspawn,journal-remote: add journal forwarding disk usage options
Add options to vmspawn to configure journal-remote disk usage limits
when forwarding journal entries from the VM. These are passed through
as --max-use=, --keep-free=, --max-file-size=, and --max-files=
command-line arguments to systemd-journal-remote.

Add --max-use=, --keep-free=, --max-file-size=, and --max-files=
command-line options to systemd-journal-remote to allow overriding the
corresponding settings from the configuration file.

Add $SYSTEMD_JOURNAL_REMOTE_CONFIG_FILE environment variable support
to systemd-journal-remote. When set, the specified file is used
instead of the default configuration file and drop-in directories.
When set to the empty string or /dev/null, configuration file parsing
is skipped entirely. vmspawn sets this to /dev/null in the child
process to avoid inheriting the host's journal-remote configuration.
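
The three cases for the environment variable can be summarized in a short sketch. The function name `resolve_config` and its return shape are hypothetical, and the default path is supplied by the caller rather than assumed here.

```python
def resolve_config(environ, default_path):
    """Return (config_file, use_dropins) for the given environment.

    config_file is None when configuration parsing is skipped entirely.
    """
    value = environ.get("SYSTEMD_JOURNAL_REMOTE_CONFIG_FILE")
    if value is None:
        return default_path, True  # unset: default file plus drop-ins
    if value in ("", "/dev/null"):
        return None, False  # empty or /dev/null: skip parsing
    return value, False  # explicit file replaces default and drop-ins
```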

Make fork_notify() argv parameter optional. When NULL is passed,
fork_notify() returns 0 in the child (with $NOTIFY_SOCKET set) and
lets the caller run custom code before exec. Returns 1 in the parent.
This allows vmspawn to set environment variables in the child without
polluting the parent process.

Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-22 20:05:38 +02:00
roib
db1ca20591 docs: update footer to 2026 2026-04-14 17:32:37 +01:00
Ivan Kruglov
3dd09ccea2 docs: clarify when to use varlink enum types vs plain strings
Add guidance on when a field should use a proper varlink enum type
versus remaining a plain string: user-controlled/API fields should be
enums, engine-internal state fields may stay as strings.

Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-14 13:37:02 +02:00
rusty-snake
b40ed2067f docs: fix capability name, it's CAP_MKNOD not CAP_SYS_MKNOD (#41621) 2026-04-13 16:41:33 +01:00
Luca Boccassi
a1813a40ec docs: beef up SECURITY.md rules for reporting
With yeswehack.com suspended due to funding issues for triagers being
worked out, reports on GH are starting to pile up. Explicitly define
some ground rules to avoid noise and time wasting.
2026-04-10 17:34:03 +02:00
Daan De Meyer
5ade3f6a01 docs: Fix window in PRESSURE.md 2026-04-09 22:47:10 +02:00
Daan De Meyer
158f2d50bf docs: Update MEMORY_PRESSURE.md => PRESSURE.md
Make the doc more generic and mention all pressure types, not just
memory.
2026-04-09 22:47:10 +02:00
Kit Dallege
adc4757b9e docs: fix misleading VM/machined documentation
Fix two issues in WRITING_VM_AND_CONTAINER_MANAGERS.md:

1. The Host OS Integration section implied that -M switch and
   machinectl shell/login work for VMs, but they currently only
   work for containers. Add a note clarifying this limitation.

2. The Guest OS Integration section said "there's only one" VM
   integration API (SMBIOS Product UUID), but VM_INTERFACE.md
   documents five. Replace the outdated single-API description
   with a reference to VM_INTERFACE.md listing all five.

Fixes #40935

Co-developed-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-02 12:55:33 +02:00
Adrian Wannenmacher
f377be7081 fix list of inhibitor lock types
Markdown and HTML don't support mixing ordered and unordered items
within a single list. This means the previous syntax actually produced
three separate lists.

Also, markdown converters don't necessarily respect the first number in
an ordered list, and may just overwrite it to one. This is the case for
the one that generates the systemd.io page. And even if that wasn't the
case, the numbering of the second ordered list would be off by one.
2026-03-28 20:57:52 +00:00
Daan De Meyer
53d5f5c02f ci: Drop codeql workflow
After analyzing all 218 CodeQL alerts across the project's history, the
workflow has not justified its CI cost:

- The most impactful query (PotentiallyDangerousFunction) was a custom
  systemd-specific query that has already been replaced by clang-tidy's
  bugprone-unsafe-functions check (6fb5ec3dd1).

- Of the remaining C++ queries, 6 never triggered at all
  (bad-strncpy-size, unsafe-strcat, unsafe-strncat,
  suspicious-pointer-scaling, suspicious-pointer-scaling-void,
  inconsistent-null-check).

- Several high-value-sounding queries had extreme false positive rates:
  toctou-race-condition (95% FP), use-after-free (88% FP),
  cleartext-transmission (100% FP).

- Many queries that did trigger are already covered by compiler warnings
  (-Wshadow, -Wformat, -Wunused-variable, -Wreturn-type,
  -Wtautological-compare) or existing clang-tidy checks
  (bugprone-sizeof-expression).

- Across all alerts, only 3 genuinely useful C++ fixes can be
  attributed to CodeQL: 1 tainted-format-string, 2
  incorrectly-checked-scanf. The rest were either false positives or
  incidental fixes during refactoring that weren't prompted by CodeQL.

- The Python queries are largely superseded by ruff (already in CI) and
  had an 89% false positive rate on the security-focused checks.

The workflow consumed significant CI resources (40+ minutes per run) and
the ongoing maintenance burden of triaging false positives outweighs the
marginal value of the 2-3 real findings it produced across its entire
lifetime.
2026-03-24 17:55:26 +01:00
Dylan M. Taylor
7a858878a0 userdb: add birthDate field to JSON user records
Add a birthDate field to the JSON user record, stored internally as a
struct tm with INT_MIN/negative sentinels for unset fields. The field
is serialized as a YYYY-MM-DD string in JSON and validated via
parse_birth_date(), which shares its core logic with
parse_calendar_date() through a new parse_calendar_date_full()
function.

For birth dates, timegm() is called directly (rather than
mktime_or_timegm_usec) to support pre-epoch dates. The wday field is
used to distinguish timegm() failure from a valid (time_t) -1 return.

birthDate is excluded from user_record_self_modifiable_fields(), so
only administrators can set or change it via homectl. The field
remains in the regular (non-privileged) JSON section, keeping it
readable by the user and applications.
2026-03-18 17:33:22 -04:00
Luca Boccassi
3236700a67 docs: update security policy to suggest GH advisories 2026-03-18 10:36:43 +00:00
davidak
bbd707a882 docs: document AI use disclosure consistently
The example also adds the model version to have it for reference.
2026-03-16 10:55:02 +01:00
Rito Rhymes
e5a6cc3a6f docs: contain image sizing and prevent overflow on mobile
`max-width: 100%` keeps images from expanding beyond
their container and creating horizontal overflow scroll
on small screens.

`height: auto` ensures the image scales proportionally
when width is adjusted.
2026-03-12 09:38:03 +01:00
Rito Rhymes
f18df62e71 docs: wrap bare enum constants in inline code in JOURNAL_FILE_FORMAT 2026-03-12 09:37:24 +01:00
Rito Rhymes
f9d4dce604 docs: allow long inline code to wrap to prevent overflow on mobile 2026-03-12 00:22:19 -04:00
Rito Rhymes
4443626b16 docs: allow long links to wrap to prevent overflow on mobile 2026-03-12 00:22:19 -04:00
Zbigniew Jędrzejewski-Szmek
52abc9fe96 basic/stdio-util: introduce asprintf_safe
asprintf is nice to use, but the _documented_ error return convention is
unclear:
> If  memory  allocation  wasn't possible, or some other error occurs,
> these functions will return -1, and the contents of strp are undefined.

What exactly "undefined" means is up for debate: if it was really
undefined, the caller wouldn't be able to meaningfully clean up, because
they wouldn't know if strp is a valid pointer. So far we interpreted
"undefined" — in some parts of the code base — as "either NULL or a
valid pointer that needs to be freed", and — in other parts of the
codebase — as "always NULL". I checked glibc and musl, and they both
unconditionally set the output pointer to NULL on failure.

There is also no information _why_ asprintf failed. It could be an
allocation error or format string error. But we just don't have this
information.

Let's add a wrapper that either returns a good string or a NULL pointer.
Since there's just one failure result, we don't need a separate return
value and an output argument and can simplify callers.
2026-03-06 17:46:59 +01:00
Lennart Poettering
66a43b02e6 docs: document the "verity" NvPCR measurements
I forgot this when I posted 32f405074a,
let's add it now.
2026-03-02 23:26:21 +00:00
Daan De Meyer
838528104b nsresourced: Optionally map foreign UID range
Whenever delegating UID ranges to a user namespace, it can also be
useful to map the foreign UID range, so that the container running in
the user namespace with delegated UID ranges can download container
images and unpack them to the foreign UID range.

Let's add an option mapForeign to make this possible. Note that this option
gives unprivileged users full access to any foreign UID range owned directory
that they can access. Hence it is recommended (and already was recommended) to
store foreign UID range owned directories in a 0700 directory owned by the
owner of the tree to avoid access and modifications by other users.

This is already the case for the main users of the foreign UID range,
namely /var/lib/machines, /var/lib/portables and /home/<user> which all
use 0700 as their mode.

Users will also be able to create foreign UID range owned inodes in any
directories their own user can write to (on most systems this means /tmp,
/var/tmp and /home/<user>).
2026-02-24 18:29:37 +01:00
r-vdp
450e0dce02 systemd-boot: add a preferred setting that's similar to default but avoids booting known-bad entries
Motivation:
Currently, when setting the default boot pattern, boot assessment status
is not taken into account. This means that with boot assessment enabled,
when an explicit boot entry is configured as the default entry using an
EFI var, as is common for instance in A/B boot schemes, the configured
entry will be booted indefinitely, regardless of the entry's boot
assessment status.
In order to allow for this use case in combination with boot assessment,
we introduce a new `preferred` keyword, both in the config file and in the
bootctl CLI, that acts very similar to the existing `default` keyword but
takes boot assessment into account and never selects any entries that
have been marked as bad.
If the preferred pattern does not resolve to any bootable entry, and a
default pattern is also specified, then the default pattern will be
considered next, and we may then still select a known-bad entry to be
booted.
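
The selection order described above can be sketched as follows. This is a simplified model, not the sd-boot code: entries are (id, is_bad) pairs, patterns are matched with shell-style globs, and the function name `select_entry` is hypothetical. The real boot loader has further fallbacks when nothing matches; the sketch just returns None.

```python
from fnmatch import fnmatchcase

def select_entry(entries, preferred=None, default=None):
    """entries: list of (entry_id, is_bad) in menu order."""
    if preferred:
        for entry_id, is_bad in entries:
            # 'preferred' never selects an entry marked as bad.
            if not is_bad and fnmatchcase(entry_id, preferred):
                return entry_id
    if default:
        for entry_id, _is_bad in entries:
            # 'default' ignores boot assessment, so a bad entry may be chosen.
            if fnmatchcase(entry_id, default):
                return entry_id
    return None
```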

Fixes: https://github.com/systemd/systemd/issues/31215
Fixes: https://github.com/systemd/systemd/issues/40192
2026-02-18 03:28:12 +09:00
Yu Watanabe
618952f07e CODING_STYLE: fix typo
Follow-up for 83b4a5bb3d.
2026-02-16 14:35:21 +09:00
andre4ik3
4afa6cf07a locale-util: allow overriding locale directory via environment 2026-02-16 08:09:33 +09:00
Michael Vogt
b5f8725989 varlinkctl: add pluggable protocol support to sd-varlink
When sd_varlink_connect_url() gets an unknown URL we now
check if there is a `$LIBEXECDIR/varlink-bridges/$scheme`
binary and execute it (with the url as the first argument).

This makes varlink more flexible as it provides a way to
dynamically add "bridges" in LIBEXECDIR/varlink-bridges/. This is
conceptually similar to the libvarlink `varlink --bridge` command
and allows to e.g. call varlink over http{,s} via e.g. the new
varlink-http-bridge.
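
The scheme-to-binary lookup can be pictured with a minimal sketch. It only builds the command line; checking that the binary exists and executing it are left to the caller. The function name `bridge_argv` is hypothetical, and the value of LIBEXECDIR is passed in rather than assumed.

```python
import os
from urllib.parse import urlsplit

def bridge_argv(url, libexecdir):
    """Return [bridge binary path, url] for the URL's scheme."""
    scheme = urlsplit(url).scheme
    if not scheme:
        raise ValueError("URL has no scheme")
    # $LIBEXECDIR/varlink-bridges/$scheme, with the URL as first argument
    return [os.path.join(libexecdir, "varlink-bridges", scheme), url]
```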

With a running varlink-http-bridge [0] one can do:
```console
$ varlinkctl call http://localhost:8080/ws/sockets/io.systemd.Hostname \
    io.systemd.Hostname.Describe {}
{
        "Hostname" : "top",
...
```

Closes: https://github.com/systemd/systemd/issues/40640

[0] https://github.com/mvo5/varlink-http-bridge/pull/1
2026-02-13 19:07:20 +01:00
Tabis Kabis
b15b337f8e Switch back to 'http' in SVG files (#40661)
Firefox & Chrome don't render images because of 'https' being used in the SVG.
Switch back to 'http'.

Follow-up for 0922f62126
2026-02-12 17:49:19 +00:00
Lennart Poettering
83b4a5bb3d CODING_STYLE: add a brief log msg style guide 2026-02-12 18:00:37 +01:00
Luca Boccassi
e67b008fa3 journald: set a lower size limit for FDs from unpriv processes
Unprivileged processes can send 768M in a FD-based message to journald,
which will be malloc'ed in one go, likely causing memory issues.
Set the limit for unprivileged users to 24M.

Allow coredumps as an exception, since we always allowed storing
up to the 768M max core files in the journal.
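
The resulting policy can be summarized in a minimal sketch. It assumes the sizes are in MiB and that the caller already knows whether the sender is privileged and whether the message is a coredump; how journald determines either is not shown, and the names below are hypothetical.

```python
MIB = 1024 * 1024
ENTRY_SIZE_MAX = 768 * MIB         # existing upper bound
ENTRY_SIZE_UNPRIV_MAX = 24 * MIB   # new bound for unprivileged senders

def fd_message_size_limit(privileged, is_coredump=False):
    """Return the maximum accepted size for an FD-based message."""
    if privileged or is_coredump:
        return ENTRY_SIZE_MAX  # coredumps keep the old limit
    return ENTRY_SIZE_UNPRIV_MAX
```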

Reported on yeswehack.com as #YWH-PGM9780-48
2026-02-09 13:51:59 +01:00
Zbigniew Jędrzejewski-Szmek
df8747806b Two cleanups (#40587) 2026-02-09 11:02:41 +01:00
Zbigniew Jędrzejewski-Szmek
25860000b6 Fix wording in two places
Noticed this while going through the stable series…
Also update location after 97318131fd.
2026-02-09 11:01:15 +01:00
Luca Boccassi
a2e55fceee docs: note step to update obs workflow file on release 2026-02-09 09:36:53 +01:00