Add 2024-11-21-roberth-comments-post-meeting.md

Robert Hensing 2024-11-21 12:19:24 +01:00 committed by Valentin Gagarin
parent a0839682a7
commit 8d823f69d6


@@ -0,0 +1,130 @@
# Architecture diagram
- Split out the central database and Netbox; they're separate databases
"Operator" accounts and high-level configuration => NixPanel database
- Netbox will be accessed as a resource; perhaps move down
- Netbox IP resource (IP allocation resource)
- Split up into nodes for each db so that it's clear which parts talk to which state.
- Nix configurator: NixOS is just a resource to NixOps, as Valentin says
- (PlantUML diagram) Is this a package diagram? Package diagrams are meant to describe application-level structure (e.g. within NixPanel) rather than system-level structure (e.g. between NixPanel and NixOps)
The arrows are supposed to be code dependencies, but many aren't. They seem to be some sort of interactions.
# Misc Notes
Valentin is on point about the NixOps decoupling
Ansible could be invoked by NixOps, over ssh remotely.
Only if there's a need. 100% on restricting to NixOS for now.
# Platform definitions
Running some "fixed" infrastructure on non-NixOS, like Proxmox, Garage, etc is fine. We do have an ambition to deploy everything from hardware using NixOps and NixOS, but this is not a blocker.
The way to think about this is to ask which platform a Fediversity is deployed onto:
- current platform: Proxmox + Garage + deSEC + netbox
- future alternative platform: bare metal
  - see [NixOps4-based-installation process] for an impression; out of scope for now iiuc
- other future alternative platform: Google Cloud, OVH, AWS, Scaleway, etc
  - no ambition to do this as part of Fediversity, but someone else may want to develop this
  - Nix language is capable of such abstractions
<details><summary>Example: Object Storage</summary>
Depending on the platform, the NixOps configuration will activate different resources. For example, object storage could be provided by
1. A provided Garage instance. The Fediversity NixOps configuration will need admin credentials to be entered.
This is our first and for now only target.
2. Bare metal, in which case the Fediversity NixOps configuration may allocate VMs that run Garage, and configure Garage to give NixOps admin credentials.
3. A cloud native S3 compatible service
</details>
<details><summary>Abstractions in Nix</summary>
In NixOps4 we use the module system to perform abstractions.
Possible pattern:
- Option `objectStorage.backend.s3`: submodule with options to access an S3 compatible service
  - configured by hand when deploying onto an existing S3 service, such as Garage on Debian, or a cloud provider
- Option `objectStorage.backend.garage.enable`: whether to deploy a Garage cluster
- Option `objectStorage.buckets.<name>`: submodule with information about the bucket
- Config `resources."bucket-${name}"`: the various bucket resources, derived from the `objectStorage.buckets` option set
- Config `resources.garage.resources."${...}"`, conditional on `objectStorage.backend.garage.enable`: the resources to deploy Garage onto Proxmox or bare metal
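
A minimal sketch of how the pattern above might look as a module. Only the option paths come from the list; the fields `endpoint` and `quotaGiB`, the resource input names, and the stand-in `resources` declaration are assumptions made so the fragment evaluates on its own. In a real deployment NixOps4 declares `resources`, and its shape may differ.

```nix
# object-storage.nix: hypothetical sketch, not the Fediversity implementation.
{ lib, config, ... }:
let
  inherit (lib) mkOption mkEnableOption mkIf mkMerge mapAttrs' nameValuePair types;
  cfg = config.objectStorage;
in
{
  options = {
    # Stand-in so the sketch evaluates standalone (assumption);
    # NixOps4 provides the real `resources` option.
    resources = mkOption {
      type = types.attrsOf types.anything;
      default = { };
    };

    objectStorage = {
      backend.s3 = mkOption {
        description = "Access to an S3 compatible service.";
        default = { };
        type = types.submodule {
          options.endpoint = mkOption {
            type = types.str;
            description = "URL of the S3 API (assumed field).";
          };
        };
      };

      backend.garage.enable = mkEnableOption "deploying a Garage cluster";

      buckets = mkOption {
        description = "Buckets to create, keyed by name.";
        default = { };
        type = types.attrsOf (types.submodule {
          options.quotaGiB = mkOption {
            type = types.nullOr types.ints.positive;
            default = null;
            description = "Optional size limit (assumed field).";
          };
        });
      };
    };
  };

  config = mkMerge [
    {
      # One resource per declared bucket, named "bucket-<name>".
      resources = mapAttrs' (name: bucket:
        nameValuePair "bucket-${name}" {
          # Assumed input names; a real resource provider defines its own.
          inherit (cfg.backend.s3) endpoint;
          inherit (bucket) quotaGiB;
        }
      ) cfg.buckets;
    }
    (mkIf cfg.backend.garage.enable {
      # The resources that deploy Garage onto Proxmox or bare metal
      # would be declared here; left empty in this sketch.
      resources.garage.resources = { };
    })
  ];
}
```

Evaluating this with `lib.evalModules` alongside a small module that sets `objectStorage.backend.s3.endpoint` and `objectStorage.buckets.media = { };` should yield a `resources."bucket-media"` entry. The point of the design is that the bucket declarations stay the same whether the S3 endpoint is configured by hand or produced by a Garage deployment.
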
The infra layer might be best modeled as a separate NixOps deployment. It does not need to run as part of the NixPanel user interface; instead it can be run by those who operate the service provider.
</details>
# Secrets
- Sources
  - NixOps user level (workstation, not copied anywhere)
    - user ssh private key is reused by NixOS resource
    - credentials to access the NixOps state (e.g. in bucket object or over ssh)
    - (anything that bootstraps access to the NixOps state)
  - Outputs of resources will contain credentials (stored in NixOps state)
    - e.g. the admin password for anything created through NixOps resources
  - Sensitive infrastructure parameters inserted into the NixOps state initially
    - e.g. using a `pinentry` resource
    - or dummy resource that fails, but whose output attributes can be overridden by the NixOps CLI
- Destinations
  - Files on NixOS machines
    - copied to very private directory over ssh (`root`-only)
    - copied to their final place by NixOS systemd units

      <details>

      - work with NixOS community to standardize this
        - e.g. each service specifies with options where it expects its secrets, and waits for secret-specific systemd units using some convention
        - `nixops4-nixos`, `agenix`, `sops-nix`, etc. can all implement this interface

      </details>

    - handled in such a way that they don't end up in the Nix store (use protective wrapper when passed around in the Nix language)
  - NixOps resources
    - Passing one resource's output to another resource's input: the purpose of NixOps
  - NixOps state
    - Encrypted at rest
- Process
  - Simplest is to pass credentials around through NixOps, but indirections are possible. Benefits:
    - Implement key rotation without redeployment => vault-style secret management with short-lived secrets so that credential leaks are less damaging and less valuable
    - Fewer places with secrets => brings more structure to the organization's credentials management
      - Only if all secrets from the NixOps state can be moved over to the secret management system.
      - This will require more complexity in NixOps; would not recommend for now
    - Single point of failure, _or_ "use one hinge and oil it well" -- paraphrasing [age] authors or reference, out of context
  - `vaultwarden` can be written to by NixOps as needed, so that it can grant access declaratively
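
The "Files on NixOS machines" destination above could be sketched roughly as follows. This is an illustration under stated assumptions, not an agreed convention: the service `myservice`, its user and group, the unit name, and both paths are hypothetical, and the source file is assumed to have been copied into the `root`-only directory over ssh beforehand.

```nix
# Hypothetical NixOS module; all names and paths are illustrative only.
{ ... }:
{
  systemd.services."myservice-secret-db-password" = {
    description = "Install the database password for myservice";

    # Run before the consumer, and make the consumer require this unit,
    # so the service waits for its secret to be in place.
    before = [ "myservice.service" ];
    requiredBy = [ "myservice.service" ];

    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
    };

    # Paths are written as plain strings, not Nix path literals,
    # so nothing here is copied into the Nix store.
    script = ''
      install -D -m 0400 -o myservice -g myservice \
        /var/lib/deploy-secrets/myservice/db-password \
        /run/myservice-secrets/db-password
    '';
  };
}
```

A standardized interface would presumably let each service declare the expected path and unit name through options, so that different secret tools can provide the unit.
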
Security: compromise of a NixOps user compromises the whole deployment, and same for the NixOps state.
This probably includes equivalents of aforementioned "NixOps user level" secrets
- State is likely compromised by compromising a NixOps user
- NixPanel credentials are probably in the state, and they will need to have similar capabilities to those of the NixOps user credentials
Configuration management is expected to be able to do anything. (No need to freak out, but do use caution.)
# Terminology
- **operator**: I'm not super happy about "operator" instead of e.g. "customer". You're ignoring the Ops in NixOps. I can accept that, but it's not great communication.
For NixOps it makes sense to use "operator" and "author" roles to describe its users. I don't see a better alternative for NixOps.
- **target platform**: Core services, infrastructure services, etc may be useful terms to distinguish the roles in terms of behavior and functionality.
I think the missing term is "target platform", which captures the boundary of the scope of the Fediversity project. As described, this scope can be variable.
For example, _infrastructure services_ like Garage or email can each individually be part of the _target platform_, as they are right now, or they can become part of Fediversity's infrastructure services.
Being managed by Fediversity code is a less intrinsic property than it being "infrastructure", although arguably that's still a bit fuzzy; S3 not so much, but email is.
# Meta notes
- My contract is to develop NixOps, not NixPanel. I give advice.
- Usually "foo is done by NixOps" does not mean that it is implemented in the NixOps project, but rather that some action is initiated by calling a NixOps command, and the details of the operation are declared in a combination of
- NixPanel output of some sort
- A NixOps resource provider implementation
- A set of Nix expressions developed by the Fediversity project
- A Nix expression that's specific to a Fediversity deployment, which refers to the generic Fediversity expressions
(kept to a minimum: things like whether Fediversity deploys Garage itself or Garage is provided by the underlying platform)
[NixOps4-based-installation process]: https://git.fediversity.eu/Fediversity/meta/src/branch/main/architecture-docs/NixOps4-based-installation-process.md
[age]: https://github.com/FiloSottile/age