Introduce test for deploying all services with nixops4 apply
#329
3 participants
Reference: fediversity/fediversity#329
Closes Fediversity/Fediversity#276
This PR adds a CLI deployment test. It builds on top of Fediversity/Fediversity#323. This test features a deployer node and four target nodes. The deployer node runs `nixops4 apply` on a deployment built with our actual code in `deployment/default.nix`, which pushes onto the four target machines combinations of Garage/Mastodon/Peertube/Pixelfed depending on a JSON payload. We check that the expected services are indeed deployed on the machines. Getting there involved reworking the existing basic test to extract common patterns, and adding support for ACME certificate negotiation inside the NixOS test.

What works: `nixops4 apply` with various payloads

What does not work: the services themselves depend a lot on DNS and that is not taken care of at all, so they are probably very broken. Still, this is a good milestone.
Future work:
Test it yourself by running `nix build .#checks.x86_64-linux.deployment-basic -vL` and `nix build .#checks.x86_64-linux.deployment-cli -vL`. On the very beefy machine that I am using, the basic test runs in ~4 minutes and the CLI test in ~17 minutes. We know from Fediversity/Fediversity#323 that the basic test runs in ~12 minutes on the CI runner, so maybe about an hour for the CLI test?

You can check the rendering of the documentation (in particular for Mermaid diagrams) here.
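The one-hour guess follows from scaling the local timings by the CI slowdown seen on the basic test. A minimal sketch of that arithmetic, assuming the CLI test slows down on the CI runner by the same factor as the basic test (an extrapolation, not a measurement; the variable names are only for illustration):

```python
# Rough CI runtime estimate for the CLI test.
# Assumption: the CLI test slows down on the CI runner by the same factor
# as the basic test. This is an extrapolation, not a measurement.

basic_local_min = 4   # basic test on the local machine
basic_ci_min = 12     # basic test on the CI runner (from #323)
cli_local_min = 17    # CLI test on the local machine

slowdown = basic_ci_min / basic_local_min        # CI is 3x slower on the basic test
cli_ci_estimate_min = cli_local_min * slowdown   # scale the CLI test accordingly

print(f"Estimated CLI test runtime on CI: ~{cli_ci_estimate_min:.0f} minutes")
```

Under that assumption the estimate comes out a bit under an hour.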
I believe the commits are good as they are and should be fast-forwarded only.
@ -7,0 +15,4 @@
$ nix build .#checks.<system>.deployment-<name> -vL
```
## Basic deployment check
### Basic deployment check
Done in cc0c5d0519.
@ -7,0 +14,4 @@
``` console
$ nix build .#checks.<system>.deployment-<name> -vL
```
Maybe worth mentioning here that since `nixops4 apply` operates on a flake, the tests take this repository's flake as a template, and this is also why there are some dummy files that will be overwritten inside the test. Although why are we even keeping the unused files? They could just as well all be created during the test, no?
Done in cc0c5d0519. I did remove most of the files, but some are still necessary when building the VMs initially. I would like to get rid of them, though. We could also imagine getting rid of flakes but creating a whole flake in the test.
@ -0,0 +236,4 @@
in
''
deployer.copy_from_host("${targetNetworkJSON}", "/root/target-network.json")
deployer.succeed("mv /root/target-network.json work/${pathFromRoot}/${tm}-network.json")
Why not copy it to the final location directly?
That's how it was done in nixops4-nixos; I assumed there was a reason and never checked. Done in cc0c5d0519.
@ -0,0 +35,4 @@
./minimalTarget.nix
(lib.modules.importJSON (pathToRoot + "/${pathFromRoot}/${tm}-network.json"))
]
++ optional enableAcme (makeAcmeClientModule {
It would be a lot simpler and much more idiomatic if you enabled acme right there in the module. Then we wouldn't need the extra file, which makes it harder to track down what's happening.
Done in b03973603e.
@ -0,0 +92,4 @@
extraTestScript = ''
with subtest("Run deployment with no services enabled"):
deployer.succeed("cd work && nixops4 apply check-deployment-cli-nothing --show-trace --no-interactive 1>&2")
Where does this `work` directory come from?
It's created in the `unpacking` subtest defined in the test module generator.
I think this shows a lack of documentation and that's a good point. Maybe we can even unpack the flake in the current working directory such that we don't need to `cd work`. I have to check whether there are things in the current working directory that would clash with that.
Done in cc0c5d0519.
TODOs from verbal sync:
`fake` node into a nested derivation (not a blocker)
`peertube` and `gixy` dependencies (nice to have)
0c90805339 to cc0c5d0519
Added two commits. The first one re-modularises the common parts of the test. This is a whole reorganisation. The second one gets rid of the fake node concept and makes a temporary NixOS system to achieve the same goal. I am not sure it makes sense to review these commits separately as they really touch everything.
TODOs for oral sync:
`common/acme/*` into `common/*`
`testCerts` an option with default `import ...`
@Niols wrote in Fediversity/Fediversity#329 (comment):
All done. What do we think of the current status of things?
With this last commit, CI should be a happy green. Is there anything that remains before we can merge? I only think of things that can be left out and added in another PR, except maybe for the name of this “CLI” deployment test - now would be a good time to find a better name.
@Niols I'm good, thanks for finishing this!
@fricklerhandwerk?
Title changed from "Introduce CLI deployment test" to "Introduce test for deploying all services with `nixops4 apply`"
@ -7,0 +56,4 @@
### CLI deployment check
The CLI deployment check factors out the panel by running a direct invokation of
invocation
Done in a328818f1d7c8a5e055b90eb372be82e3fccd252.
@ -7,0 +54,4 @@
deployer -->|deploys| target_machines
```
### CLI deployment check
How about this?
Sounds good. And for the name in the code, `deployment-cli` -> `deployment-nixops4`?
Done in a328818f1d7c8a5e055b90eb372be82e3fccd252.
@ -7,0 +124,4 @@
### [WIP] Panel deployment check
This is a full deployment check running the panel on the deployer machine,
deploying some services and checking that they are indeed on the target
deploying some services through the panel, and...
Done in a328818f1d7c8a5e055b90eb372be82e3fccd252.
This looks pretty good from afar, but I'd like to take another full pass and then merge.
a328818f1d to 0b3b252292
Rebased on recent `main`.
ci fail :(
Ah dayum :-(
I guess the culprit is the freshly merged Fediversity/Fediversity#330; we should have waited for this test, probably.
27634f2294 to 9dcd0e360e
Pretty good, awesome that it's all green now! Let's merge it like that, and keep cleaning it up. It's still a bit too hard to follow for my taste, because some unnecessary abstraction obscures the actual test cases. Ideally test cases would contain pretty much a (simplified) deployment as one would actually write it for production, and next to it a script that interacts with it to check invariants.
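As a rough illustration of the "script that checks invariants" idea, here is a minimal sketch in Python, the language of NixOS test scripts. The payload shape and the `expected_services` and `check_invariants` helpers are hypothetical and not taken from this PR; in a real test the observed set would come from commands run on the target machines.

```python
import json

# The four services named in the PR description.
SERVICES = ("garage", "mastodon", "peertube", "pixelfed")


def expected_services(payload_json: str) -> set[str]:
    """Return the services a payload asks for.

    Hypothetical payload shape: {"mastodon": {"enable": true}, ...}.
    """
    payload = json.loads(payload_json)
    return {
        name
        for name in SERVICES
        if payload.get(name, {}).get("enable", False)
    }


def check_invariants(payload_json: str, observed: set[str]) -> None:
    """Fail if the observed services differ from what the payload requests."""
    expected = expected_services(payload_json)
    missing = expected - observed
    unexpected = observed - expected
    assert not missing, f"services not deployed: {sorted(missing)}"
    assert not unexpected, f"services deployed but not requested: {sorted(unexpected)}"


# Usage with a made-up payload and a made-up observation.
payload = json.dumps({"mastodon": {"enable": True}, "peertube": {"enable": False}})
check_invariants(payload, observed={"mastodon"})
```

Keeping the check as a plain set comparison means the test case itself only has to state the payload and how the observed set is collected.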
`closureInfo` to avoid having to declare extra dependencies manually #338