# Evaluation of secret management schemes

2024-12-03 Robert, Nicolas, Valentin

## Requirements

- Store and manage secrets in a central place
- Must be able to rotate keys (some state management)
- Minimal state on contributors' end, ideally exactly one per-user credential or even SSO

## Non-requirements

- Don't need (or need only very basic) RBAC, all contributors are equal (maybe infra admins have special access)
- Which components require secrets doesn't have to be secret itself (this would be a requirement for personal setups, where we don't want to leak e.g. which accounts exist)
- No need to retrieve secrets for very old versions
- No need for forward secrecy (thoroughly destroying keys as required by e.g. secure messaging protocols)

## Design considerations

- Storing secrets

  Some secrets need to be persisted, and there are multiple formats and technologies to do that.

- Managing secrets

  Secrets need to be shared with contributors, and changed or rotated.
  Different systems have different degrees of comfort for these operations.

- Deploying secrets

  Secrets need to be made available to programs and services.

- Versioning

  For key rotation we need at least two versions at the same time: the old one to still access the machine, and the new one to rotate in.

- Setup complexity

  Different systems have different requirements to get going, and may require more or less manual intervention for new contributors. We distinguish:

  - complexity to set up for experts
  - complexity to contribute as a beginner

- Scalability, sustainability

  Questions to consider:
  - What if a contributor works on 100 such projects?
  - What if a project has 100 contributors?
  - What if a project runs over 10 years, how much effort does secret handling incur?
  - What if someone messes up the central server?
  - How fast can we set up a working system?
  - How hard is it to migrate from one scheme to another?

## Overview

|Name|management|deployment|storage|versioning|setup|scalability|
|-|-|-|-|-|-|-|
|[agenix]|yes (CLI)|yes (tempfiles)|repo ([age])|Git|[partially manual](#agenix-setup)|[details](#agenix-scalability)|
|[sops-nix]|yes (CLI)|yes (tempfiles)|repo ([SOPS])|Git|[partially manual](#sops-nix-setup)|[more moving parts than agenix](#sops-nix-scalability)|
|[Vaultwarden]|yes (web GUI)|no|database|yes, on demand|[details](#vaultwarden-setup)|[more up-front effort](#vaultwarden-scalability)|
|ssh/scp|yes (manual)|yes (manual)|per-user|manually|[details](#sshscp-setup)|[details](#sshscp-scalability)|

[agenix]: https://github.com/ryantm/agenix
[age]: https://github.com/FiloSottile/age
[sops-nix]: https://github.com/Mic92/sops-nix
[SOPS]: https://github.com/getsops/sops
[LoadCredential]: https://systemd.io/CREDENTIALS/
[Vaultwarden]: https://github.com/dani-garcia/vaultwarden

## Details on setup complexity

### agenix setup

- include the module in the configuration
- manage per-user ssh public keys
- each user needs to manage their public keys manually
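
As a rough illustration of the key management described above, agenix reads a `secrets.nix` file that maps each encrypted file to the public keys allowed to decrypt it. The following is a minimal sketch; the contributor names, the host name, the secret name, and the key strings are hypothetical placeholders, not values from this project.

```nix
# secrets.nix: consumed by the agenix CLI, not imported into the system configuration
let
  # Hypothetical contributor keys (placeholders, to be replaced with real SSH public keys)
  alice = "ssh-ed25519 <public key of alice>";
  bob = "ssh-ed25519 <public key of bob>";

  # Hypothetical SSH host key of the machine that needs the secret at runtime
  server = "ssh-ed25519 <host public key of server>";
in
{
  # Everyone listed here can decrypt and edit this (hypothetical) secret
  "database-password.age".publicKeys = [ alice bob server ];
}
```

With such a file in place, `agenix -e database-password.age` edits the secret and `agenix --rekey` re-encrypts existing secrets after the key list changes. On the machine, the module is pointed at the encrypted file via `age.secrets.<name>.file`, and services read the decrypted file from `config.age.secrets.<name>.path`.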

### sops-nix setup

- include the module in the configuration
- manage per-user ssh public keys
- each user needs to manage their public keys manually
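
For comparison, the machine-side part of a sops-nix setup might look roughly like the sketch below. It assumes the sops-nix NixOS module is already imported; the file path and the secret name are hypothetical.

```nix
{ config, ... }:
{
  # Hypothetical path to a SOPS-encrypted file kept in the repository
  sops.defaultSopsFile = ./secrets/example.yaml;

  # Derive the machine's age identity from its SSH host key
  sops.age.sshKeyPaths = [ "/etc/ssh/ssh_host_ed25519_key" ];

  # Hypothetical secret name; it has to match a key in the encrypted file
  sops.secrets."database-password" = { };

  # Services would then read the decrypted file from
  # config.sops.secrets."database-password".path
}
```

Which contributors can decrypt the file is configured separately on the SOPS side (typically in a `.sops.yaml` in the repository), which is part of the additional per-contributor setup compared to agenix.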

### Vaultwarden setup

- deploy Vaultwarden, set up backups
- manage per-user authentication with Vaultwarden
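
On NixOS, the deployment step could start from the upstream `services.vaultwarden` module. This is only a sketch under assumptions: the domain, port, and backup directory are hypothetical, and it presumes the default SQLite backend, which is what the module's `backupDir` option is meant for.

```nix
{
  services.vaultwarden = {
    enable = true;

    # Hypothetical backup location; still needs to be shipped off the machine separately
    backupDir = "/var/backup/vaultwarden";

    config = {
      DOMAIN = "https://vault.example.org"; # hypothetical domain
      SIGNUPS_ALLOWED = false; # accounts are created by invitation only
      ROCKET_ADDRESS = "127.0.0.1"; # listen locally, behind a reverse proxy
      ROCKET_PORT = 8222; # hypothetical port
    };
  };
}
```

A reverse proxy with TLS, off-site copies of the backups, and per-user account management are not covered by this fragment.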

### ssh/scp setup

- each contributor has to manage private keys and ssh config manually
- have to take care of distribution of secrets and deployment separately

## Details on scalability

### agenix scalability

- allows reusing ssh key workflows

### sops-nix scalability

- some extra complexity due to multiple encryption schemes
- allows reusing ssh key workflows
- some additional local setup for contributors

### Vaultwarden scalability

- allows reusing password handling workflows (typically better automation than for ssh keys)
- more up-front work for initial deployment
- disaster recovery needs special care, doesn't implicitly distribute copies to contributors
- less interaction for managing contributor access
- separate source of truth (workflows, audit log, etc.) as opposed to everything in the Git repo
- adds an extra security boundary; encrypted secrets are not world-readable

### ssh/scp scalability

- requires taking care of distributing keys
- per-user key management typically not automated, requires taking care of that separately

## Additional notes

- Managing the interface between public configuration and secrets is a concern of the code
    - For a scalable setup you want something like modules that take secrets as settings
- It is possible to split the git-stored secret schemes into private repositories
    - Then you have to handle synchronisation, e.g. by importing the public part from the secret part
    - This would incur extra overhead for managing access, but that would be the same workflow as managing access to the rest of the Git server
- With secrets stored in Git there's a potential for running into merge conflicts, which can be avoided but requires extra care
    - Probably you want a monorepo for the entire organisation
        - Separating public and private parts through git subtrees is possible but requires even more care and automation/tooling when managing outside contributions
        - The upfront effort may be similar (but different in nature) to deploying and maintaining a Vaultwarden server
- Maintaining a sophisticated Git repo and maintaining a live server require different experience and skills, and which is more appropriate will depend on who will be responsible for the setup long-term
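
To illustrate the note above about modules that take secrets as settings: one possible shape is a module that only accepts the path of a secret file and hands it to the service via systemd's [LoadCredential]. The service name `example` and its options are hypothetical, and the script is a stand-in for a real program.

```nix
{ config, lib, ... }:
let
  cfg = config.services.example;
in
{
  options.services.example = {
    enable = lib.mkEnableOption "hypothetical example service";

    passwordFile = lib.mkOption {
      # A string rather than a Nix path literal, so the file is never copied
      # into the world-readable Nix store by accident
      type = lib.types.str;
      description = "Path to a file containing the password, available at runtime.";
    };
  };

  config = lib.mkIf cfg.enable {
    systemd.services.example = {
      wantedBy = [ "multi-user.target" ];
      serviceConfig = {
        # systemd exposes the file to this service only, under $CREDENTIALS_DIRECTORY
        LoadCredential = "password:${cfg.passwordFile}";
        DynamicUser = true;
      };
      # Stand-in for the real program: only checks that the credential is readable
      script = ''
        test -r "$CREDENTIALS_DIRECTORY/password"
      '';
    };
  };
}
```

The module stays agnostic about where the secret comes from: the setting could be `config.age.secrets.<name>.path`, `config.sops.secrets.<name>.path`, or a file copied manually with scp, which also keeps migration between schemes cheap.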