try out existing nix container made for gitea actions #405

Open
kiara wants to merge 20 commits from kiara/Fediversity:ci-in-nix-docker into main
Owner

this proposal is to try out the first idea from Fediversity/Fediversity#393 (comment), and may supersede #393.
this isn't to say we shouldn't want a more up-to-date container, altho it might help to validate the approach works.

part of #362.

EDIT: potentially superseded by #463

requested reviews from fricklerhandwerk, Niols 2025-06-24 15:05:26 +02:00
Owner

The CI didn't run; is that normal?

Author
Owner

@Niols i retried but no dice. something does seem up. maybe the runner node does need to have the container manually downloaded first?

kiara changed title from try out existing nix container made for gitea actions to WIP: try out existing nix container made for gitea actions 2025-06-25 19:17:34 +02:00
Owner

What you use in runs-on must correspond to a label on the runner, so this will just hang. If you have a label that runs docker, you can use that label and override the image with:

runs-on: <the-label>
container:
  image: <the-image>

https://forgejo.org/docs/v1.20/user/actions/#jobsjob_idcontainer
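Filled in, the suggestion above might look like the following workflow job. This is only a sketch: it assumes the runner registered a label named `docker` (the label proposed in review), and the image reference is a hypothetical placeholder, not the image used in this PR.

```yaml
# .forgejo/workflows/ci.yaml (sketch under the assumptions stated above)
on:
  pull_request:

jobs:
  check-pre-commit:
    # Must match a label registered on the runner, otherwise the job hangs.
    runs-on: docker
    container:
      # Hypothetical image reference; substitute the Nix-enabled image in use.
      image: registry.example.org/nix-actions:latest
    steps:
      - uses: actions/checkout@v4
      # Quick sanity check that nix is available inside the container.
      - run: nix --version
```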

kiara added a new dependency 2025-06-27 11:55:58 +02:00
kiara force-pushed ci-in-nix-docker from cf658afa09 to fba8719923 2025-07-01 10:41:28 +02:00 Compare
Niols reviewed 2025-07-01 13:09:56 +02:00
@@ -11,3 +11,3 @@
 jobs:
   check-pre-commit:
-    runs-on: native
+    runs-on: nix
Owner

Unless you've pushed the configuration to the runner, I think this should be docker?

kiara marked this conversation as resolved
kiara force-pushed ci-in-nix-docker from 9621aa56b1 to 3747bade28 2025-07-01 13:35:31 +02:00 Compare
Author
Owner

error: a 'x86_64-linux' with features {kvm, nixos-test} is required to build '/nix/store/mqqx68j6vw7k66qv79zbzh1jwx2pi02k-vm-test-run-peertube.drv', but I am a 'x86_64-linux' with features {benchmark, big-parallel, nixos-test, uid-range}

i.e. missing feature: kvm

this seems to come down to either:

  • launching docker with --device=/dev/kvm (e.g. containers.<name>.allowedDevices = [ { modifier = "rwm"; node = "/dev/kvm"; } ]; / virtualisation.oci-containers.containers.<name>.devices = ["/dev/kvm:/dev/kvm"];)
  • adding system-features attribute kvm to nix.conf in the image (implies rebuilding it and making sure the runner knows it):

system-features = nixos-test benchmark big-parallel kvm uid-range
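For the first option, one place the device flag could go is the runner's own configuration instead of a hand-written `docker run`. The fragment below is a sketch, assuming the runner's `config.yaml` accepts a `container.options` string of extra container flags and that `/dev/kvm` exists on the runner host. Nix's default `system-features` include `kvm` when `/dev/kvm` is accessible, but an explicit `system-features` line in the image's `nix.conf` would override that default, so the second option may still be needed.

```yaml
# Runner config.yaml fragment (sketch; the key name is an assumption to verify
# against the runner version in use).
container:
  # Extra flags applied when the runner starts each job container.
  # Assumes /dev/kvm exists on the host and the container runtime may access it.
  options: "--device=/dev/kvm"
```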
kiara force-pushed ci-in-nix-docker from 3747bade28 to 3e8c0c7738 2025-07-02 19:11:57 +02:00 Compare
Author
Owner

i made an attempt (on a branch) that failed, yielding:

Error: crun: cannot find `` in $PATH: No such file or directory: OCI runtime attempted to invoke a command that was not found

podman raises this when the OCI runtime cannot find the executable it is asked to run; in the nix context this apparently comes from dockerTools.buildImage's config.Cmd, which in the base image uses bash. while more info should be in oci-log in podman's run-time directory (/run/podman or in this case /run/user/1004/podman/), on forgejo-ci this unfortunately contains only podman.sock.

edit: this error does not occur when i override config.Cmd, so i believe the issue is that i try to run a one-off command (or even an empty one) as a daemon, while really i just want a way to ensure the image is built before CI looks for the container.

kiara force-pushed ci-in-nix-docker from 3e8c0c7738 to d87b1ebd63 2025-07-06 21:37:50 +02:00 Compare
kiara changed title from WIP: try out existing nix container made for gitea actions to try out existing nix container made for gitea actions 2025-07-06 21:40:22 +02:00
kiara force-pushed ci-in-nix-docker from 563a5863e9 to 9a273cada1 2025-07-07 09:51:22 +02:00 Compare
kiara force-pushed ci-in-nix-docker from e011ef6110 to 64aa8049a7 2025-07-07 12:15:26 +02:00 Compare
kiara changed title from try out existing nix container made for gitea actions to WIP: try out existing nix container made for gitea actions 2025-07-07 14:19:17 +02:00
Author
Owner

a while after deploying this, half the step runs would fail on their checkout steps.

a failing step would just output this.

::add-matcher::/var/lib/private/gitea-runner/nix0/.cache/act/d071e2ae85e4660a/act/actions/actions-checkout@v4/dist/problem-matcher.json
Syncing repository: Fediversity/Fediversity
Getting Git version info
Working directory is '/var/lib/gitea-runner/nix0/.cache/act/d071e2ae85e4660a/hostexecutor'
[command]/nix/store/lqx2rv26sdndpa2vyy2vxsahj03km69z-git-2.48.1/bin/git version
git version 2.48.1
Copying '/var/lib/gitea-runner/nix0/.gitconfig' to '/var/lib/gitea-runner/nix0/.cache/act/d071e2ae85e4660a/tmp/5b40f50f-9fff-45b8-a662-2c2d61c03d97/.gitconfig'

rather than this, for a successful checkout step.

::add-matcher::/var/run/act/actions/actions-checkout@v4/dist/problem-matcher.json
Syncing repository: Fediversity/Fediversity
Getting Git version info
Temporarily overriding HOME='/tmp/39b1d8ce-eeaa-474e-a0dc-8973b8c5de70' before making global git config changes
Adding repository directory to the temporary git global config as a safe directory
[command]/bin/git config --global --add safe.directory /workspace/Fediversity/Fediversity
Deleting the contents of '/workspace/Fediversity/Fediversity'
Initializing the repository
Disabling automatic garbage collection
Setting up auth
Fetching the repository
Determining the checkout info
[command]/bin/git sparse-checkout disable
[command]/bin/git config --local --unset-all extensions.worktreeConfig
Checking out the ref
[command]/bin/git log -1 --format=%H
64aa8049a7a37b640e4eba546ea3c694c9347e0d
::remove-matcher owner=checkout-git::

not quite knowing the fix so far, i've deployed the main branch's setup to the runner VM again.

Author
Owner

redeploying this reproduced the checkout problem.
while the parent directory of /var/lib/gitea-runner/nix0/.cache/act/d071e2ae85e4660a/tmp/5b40f50f-9fff-45b8-a662-2c2d61c03d97/.gitconfig does not yet exist at that point, /var/lib/gitea-runner/nix0/.cache/act does.
judging by the last message printed, the line it gets stuck on seems to be await io.cp(gitConfigPath, newGitConfigPath).

not understanding why, especially as earlier attempts appeared to work, my best bet to resolve this may be to bisect the code involved.

kiara changed title from WIP: try out existing nix container made for gitea actions to try out existing nix container made for gitea actions 2025-07-11 13:44:17 +02:00
kiara changed title from try out existing nix container made for gitea actions to WIP: try out existing nix container made for gitea actions 2025-07-11 13:46:07 +02:00
Author
Owner

switching back from podman to docker seemed to fix this.

that said, i'm not sure the parallelization is as we expect: i see multiple steps of the same task run in parallel, tho i'm not yet sure whether multiple tasks run in parallel as well.

edit: in fact it does, so this seems ready.

kiara changed title from WIP: try out existing nix container made for gitea actions to try out existing nix container made for gitea actions 2025-07-11 13:56:14 +02:00
kiara force-pushed ci-in-nix-docker from fa902637fa to 95873fd960 2025-07-11 16:19:42 +02:00 Compare
Author
Owner

CI checkout failures at #456 still seem concerning, i may need to figure out what went wrong there - after re-deploying main to make it go thru.

Author
Owner

update: i hadn't managed this yet, due to an SSH permission denied error, so this still affects CI now.

edit: i can ssh in by procolix@ (i think this was something on my end), so deployed main again for now.


CI checkout failures at #456 still seem concerning

Yes those are extremely weird

kiara force-pushed ci-in-nix-docker from 95873fd960 to 5acbf0300d 2025-07-17 16:05:13 +02:00 Compare
Author
Owner

tried a run with log level trace and the following superfluous litany of debug secrets:

  • github actions:
    • ACTIONS_RUNNER_DEBUG: true
  • act runner:
    • RUNNER_DEBUG: 1 (moved to non-sensitive variables to prevent the Forgejo runner from substituting every 1 occurrence in the logs with ***)
  • forgejo runner:
    • ACTIONS_STEP_DEBUG: true
  • checkout action:
    • GIT_CURL_VERBOSE: True
    • GIT_TRACE: True
    • GIT_SSH_COMMAND: ssh -vvv

.. as per that run, the checkout actions had yet to trigger, altho the peertube test somehow seemed to fail without logging output whatsoever.
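The Git-level variables from the list above could also be scoped to the checkout step alone rather than set forge-wide. The fragment below is a sketch: the `docker` label is an assumption, and it relies on step-level `env` being passed through to the `checkout` action, which is worth verifying on the runner in use.

```yaml
# Workflow fragment (sketch): verbose git output for the checkout step only.
jobs:
  check-mastodon:
    # Assumed label; use whichever label the runner actually registers.
    runs-on: docker
    steps:
      - uses: actions/checkout@v4
        env:
          # Same variables and values as tried in the run described above.
          GIT_TRACE: "True"
          GIT_CURL_VERBOSE: "True"
          GIT_SSH_COMMAND: ssh -vvv
```

Scoping the variables to one step keeps the extra output out of the test steps, where it would mostly be noise.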

on a rerun, the `checkout` issue triggered on at least the mastodon test, with the more verbose logs giving a better hint on the issue.
[procolix@forgejo-ci:~]$ systemctl status gitea-runner-nix0.service
● gitea-runner-nix0.service - Gitea Actions Runner
     Loaded: loaded (/etc/systemd/system/gitea-runner-nix0.service; enabled; preset: ignored)
     Active: active (running) since Thu 2025-07-17 16:09:36 CEST; 22min ago
 Invocation: ecaf13de78244b509082b65e5bf5d8ec
    Process: 2263 ExecStartPre=/nix/store/83lglqkhp1gbs4a0zja692v9zk6y50zl-gitea-register-runner-nix0 (code=exited, status=0/SUCCESS)
   Main PID: 2314 (act_runner)
         IP: 739.5K in, 313.7K out
         IO: 11.7M read, 0B written
      Tasks: 27 (limit: 38274)
     Memory: 16.9M (peak: 45.8M)
        CPU: 3.848s
     CGroup: /system.slice/gitea-runner-nix0.service
             └─2314 /nix/store/jqsfar1nmsggvcajdgphr8cpmgcyv24w-forgejo-runner-6.3.1/bin/act_runner daemon --config /nix/store/lmiaszzg3b07x9yysvg1wshx1h5rps4h-config.yaml

Jul 17 16:30:47 forgejo-ci act_runner[2314]: [/check-mastodon] ⭐ Run Post actions/checkout@v4
Jul 17 16:30:47 forgejo-ci act_runner[2314]: time="2025-07-17T16:30:47+02:00" level=trace msg="  ✅  Success - Post actions/checkout@v4" dryrun=false job=/check-mastodon jobID=check-mastodon matrix="map[]" stage=Post step=actions/checkou>
Jul 17 16:30:47 forgejo-ci act_runner[2314]: [/check-mastodon]   ✅  Success - Post actions/checkout@v4
Jul 17 16:30:47 forgejo-ci act_runner[2314]: time="2025-07-17T16:30:47+02:00" level=trace msg="Cleaning up container for job check-mastodon" dryrun=false job=/check-mastodon jobID=check-mastodon matrix="map[]"
Jul 17 16:30:47 forgejo-ci act_runner[2314]: [/check-mastodon] Cleaning up container for job check-mastodon
Jul 17 16:30:47 forgejo-ci act_runner[2314]: time="2025-07-17T16:30:47+02:00" level=trace msg="🏁  Job failed" dryrun=false job=/check-mastodon jobID=check-mastodon jobResult=failure matrix="map[]"
Jul 17 16:30:47 forgejo-ci act_runner[2314]: [/check-mastodon] 🏁  Job failed
Jul 17 16:31:00 forgejo-ci act_runner[2314]: time="2025-07-17T16:31:00+02:00" level=trace msg="deadline exceeded"
Jul 17 16:31:05 forgejo-ci act_runner[2314]: time="2025-07-17T16:31:05+02:00" level=trace msg="deadline exceeded"
Jul 17 16:31:10 forgejo-ci act_runner[2314]: time="2025-07-17T16:31:10+02:00" level=trace msg="deadline exceeded"

the few hints i can find about such a deadline exceeded seem to point in the general direction of https://github.com/moby/buildkit.

Author
Owner

trying to reproduce the checkout issue using 68b6e146f7160f9f99f2ea6bc3615c135d15a3f6; if that fails i can try and redeploy 3d8d6269122f2106411c9c9d6a93e4135825116e to try and reproduce with that.

kiara force-pushed ci-in-nix-docker from 3d8d626912 to df4c184a51 2025-07-17 18:59:52 +02:00 Compare
kiara force-pushed ci-in-nix-docker from df4c184a51 to 87486019ca 2025-07-17 19:02:44 +02:00 Compare
Author
Owner

i did in fact run into this on that commit again, so now deployed the whole branch again to try that.

Author
Owner

actually, that still triggers the checkout issue, so i've yet to really find a solution.

kiara added a new dependency 2025-08-05 15:31:11 +02:00
kiara removed a dependency 2025-08-07 15:12:26 +02:00
Some checks are pending
/ check-pre-commit (pull_request) Successful in 18s (required)
/ check-data-model (pull_request) Successful in 50s (required)
/ check-mastodon (pull_request) Successful in 23s (required)
/ check-peertube (pull_request) Successful in 43s (required)
/ check-panel (pull_request) Successful in 1m56s (required)
/ check-deployment-basic (pull_request) Successful in 49s (required)
/ check-deployment-cli (pull_request) Successful in 1m3s (required)
/ check-deployment-panel (pull_request) Successful in 2m4s (required)
/ check-resources (pull_request) (required)
This pull request has changes conflicting with the target branch.
  • .forgejo/workflows/ci.yaml
  • .forgejo/workflows/update.yaml
  • machines/dev/forgejo-ci/forgejo-actions-runner.nix
Depends on
#356 reproduce CI runner