Call for feedback - Aux Secrets

@coded, @minion and I (@dfh) have been brainstorming different ways to do secret management in Aux, given the rather bad state that Nix secrets management is currently in.

The problems we are trying to avoid with the new solution:

  • Secrets committed to git
  • Secrets can’t change without redeployment/being part of system configuration
  • The need for an unencrypted secret on disc, typically SSH key
  • Secret versioning, rotation, attestation and auditing are quite hard

A big aspect of the project is to use declarative methods for secret generation where possible. The gokey utility is an inspiration for this idea, but it does not resolve the issues for secrets that are pre-shared in nature (think API tokens, WiFi passwords, etc.).

We want to hear your thoughts on how you would like the solution to look and feel.

If you’re into this topic, please help us out. You can join the Matrix room or DM us on Discourse.


Some thoughts/questions about this:

  • Secrets committed to git

Encrypted and plaintext or just plaintext?

  • Secrets can’t change without redeployment/being part of system configuration

This makes it no longer reproducible, as the secrets are versioned separately. I don’t think it’s a problem for e.g. API tokens, as they are inherently not reproducible (you can’t roll back to an old configuration with a revoked API token), but if you mess up your password you might want to roll back to a previous config.

  • The need for an unencrypted secret on disc, typically SSH key

This is less of a problem in personal deployments (e.g. PC or home server, where physical access is not much of a threat) and it is much simpler, so I’d like to still have this as an option for simpler/lower security stuff.

  • Secret versioning, rotation, attestation and auditing are quite hard

Agree.

My conclusion is that you recommend versioning them separately which makes sense (you’d want to use newer passwords/API tokens where possible) and this would allow us to address most other problems (plain text secret on disk, secrets on git, etc.), but it might interfere with rollbacks.

I think agenix/sops is fine for home deployments and easier to manage, but for more professional deployments, having something like a TPM-backed password store would be nice.
What appears to fulfill your requirements would be something like pass that would use the system’s TPM instead (systemd-creds might work?), then deployment would go like:

  1. decrypt secrets on build host
  2. encrypt them with target’s ssh/age pubkey
  3. transfer them to target
  4. target re-encrypts them with the TPM key and stores them in a new store.
  5. on test/switch, the TPM decrypts the secrets to /run/secrets and the services can access them (or through systemd-creds).
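
As a rough sketch of steps 1 to 5, the snippet below only builds the shell commands and does not run them. It assumes the build-host store is `pass`, that the target's key is an SSH public key usable as an age recipient, and that secret names contain no slashes. `deployment_plan`, `store_dir` and the `/tmp` staging path are hypothetical names picked for illustration, not part of any existing tool.

```python
import shlex
from typing import List


def deployment_plan(secret: str, target: str, target_pubkey: str,
                    store_dir: str = "/var/lib/secret-store") -> List[str]:
    """Return the shell commands for one secret, in order (nothing is executed)."""
    q = shlex.quote
    blob = f"{secret}.age"               # ciphertext for the target's pubkey
    staged = f"/tmp/{blob}"              # hypothetical staging path on the target
    cred = f"{store_dir}/{secret}.cred"  # hypothetical TPM-sealed store entry

    # Runs on the target: decrypt with the host key, seal with the TPM.
    reseal = (
        f"age -d -i /etc/ssh/ssh_host_ed25519_key {q(staged)} | "
        f"systemd-creds encrypt --with-key=tpm2 --name={q(secret)} - {q(cred)}"
    )
    # Runs on the target at test/switch time.
    unseal = (
        f"systemd-creds decrypt --name={q(secret)} {q(cred)} "
        f"{q('/run/secrets/' + secret)}"
    )
    return [
        # Steps 1 and 2: decrypt on the build host, re-encrypt for the target.
        f"pass show {q(secret)} | age -r {q(target_pubkey)} -o {q(blob)}",
        # Step 3: transfer the ciphertext to the target.
        f"scp {q(blob)} {q(target + ':' + staged)}",
        # Step 4: the target re-encrypts the secret with its TPM key.
        f"ssh {q(target)} {q(reseal)}",
        # Step 5: on test/switch, unseal into /run/secrets.
        f"ssh {q(target)} {q(unseal)}",
    ]
```

Printing `deployment_plan("wifi", "root@host", "ssh-ed25519 AAAA...")` line by line gives something you could review before running by hand. A real implementation would also need to clean up the staged file and handle errors.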

The secret store on the build host would be managed/versioned separately, but it should support push to a remote host through e.g. ssh so that the secrets can be updated on the fly. It might need to run a script that updates the secrets/reloads the services running to apply the updates though.


I shared the above text last night in the Matrix server; I modified it slightly before posting it here to fix some mistakes and better represent my opinion.


After further discussions I believe we share a similar opinion and are leaning towards a specific design:

We would like to build a “vault” API where the secrets are stored so that they can be shared among many machines/versions of machines.
Each “vault” would provide its own script for retrieving secrets from the “vault” itself (we lean towards defaulting to a gokey wrapper with some extra functionality).
The secrets would be retrieved and stored in a local store (backed by, e.g., systemd-creds). The store would optionally (but not by default) use the system’s TPM device, and a local secret (e.g. the system’s host SSH key).
The local store would then provide the secrets to the system services.
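
One way to picture the split between vault and local store is the minimal Python sketch below. `Vault`, `DictVault`, `LocalStore` and their methods are made-up names, and the in-memory dict only stands in for whatever a real store (e.g. one backed by systemd-creds) would do.

```python
from typing import Dict, Iterable, Protocol


class Vault(Protocol):
    """Anything that can hand out a secret by name (hypothetical interface)."""

    def retrieve(self, name: str) -> bytes: ...


class DictVault:
    """Toy vault backed by a dict; a real one might wrap gokey instead."""

    def __init__(self, secrets: Dict[str, bytes]) -> None:
        self._secrets = dict(secrets)

    def retrieve(self, name: str) -> bytes:
        return self._secrets[name]


class LocalStore:
    """Toy local store: provisioned from a vault, then read by services."""

    def __init__(self) -> None:
        self._secrets: Dict[str, bytes] = {}

    def provision(self, vault: Vault, names: Iterable[str]) -> None:
        # Replace the whole store so stale secrets do not linger.
        self._secrets = {name: vault.retrieve(name) for name in names}

    def get(self, name: str) -> bytes:
        return self._secrets[name]


vault = DictVault({"wifi": b"hunter2", "api-token": b"abc123"})
store = LocalStore()
store.provision(vault, ["wifi"])
print(store.get("wifi"))
```

Services only ever talk to the local store, so swapping the vault implementation should not affect them.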

Example workflows

Initial deployment

  1. The vault is copied over alongside the system configuration
  2. The vault is used to provision the local store
  3. The local store is used to provide the secrets to the services
  4. The system is all set up

Configuration changes not affecting secrets

e.g. disabling an existing service that does not rely on secrets

  1. Redeploy the system
  2. Nothing else to do: neither the local store nor the vault changed

Configuration changes affecting secrets

e.g. enabling/disabling TPM support in the local store

  1. Copy over new configuration
  2. Re-provision the local store from the vault
  3. Restart affected services & switch to configuration

Vault changes

e.g. API key rollover, password change, etc.

  1. Copy over the new vault
  2. Re-provision local store
  3. Restart affected services
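
The four workflows above boil down to a small decision table. Here is a hedged Python sketch of it; `plan_actions` and its flag names are hypothetical, and the action strings are just labels, not real commands.

```python
from typing import List


def plan_actions(config_changed: bool = False,
                 store_settings_changed: bool = False,
                 vault_changed: bool = False) -> List[str]:
    """List the deployment actions needed for a given kind of change."""
    # Changing local store settings (e.g. TPM on/off) is itself a config change.
    config_changed = config_changed or store_settings_changed

    actions: List[str] = []
    if vault_changed:
        actions.append("copy vault")
    if config_changed:
        actions.append("copy configuration")
    if vault_changed or store_settings_changed:
        actions.append("re-provision local store")
        actions.append("restart affected services")
    if config_changed:
        actions.append("switch to configuration")
    return actions


print(plan_actions(config_changed=True))
print(plan_actions(store_settings_changed=True))
print(plan_actions(vault_changed=True))
```

The initial deployment is the case where everything counts as changed.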

Scripts/extensions

Work that needs to be done to support this solution.

Extend gokey to store fixed secrets

Similarly to pass, create a store folder that contains secrets that cannot be generated based on the gokey seed (e.g. API keys). Keeping it simple, we’ll use the filename to derive a symmetric encryption key and retrieve the secret inside the file.

This will mean the folder will need to be copied over along with the gokey seed file, but it can be versioned separately from the config (e.g. through git). Or along with the config if your setup doesn’t mind that.
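
To make the filename-derived key idea concrete, here is a minimal sketch. It assumes the key is derived from the gokey seed together with the filename (the post only mentions the filename) and uses HMAC-SHA256 as the derivation; this is not how gokey itself derives keys, and `derive_file_key` and the label prefix are hypothetical.

```python
import hashlib
import hmac


def derive_file_key(seed: bytes, filename: str) -> bytes:
    """Derive a 32-byte symmetric key for one secret file (illustrative only)."""
    # The label keeps these keys separate from any other use of the seed.
    message = b"fixed-secret-file:" + filename.encode("utf-8")
    return hmac.new(seed, message, hashlib.sha256).digest()


seed = b"example seed, not a real one"
key_a = derive_file_key(seed, "api/github-token")
key_b = derive_file_key(seed, "wifi/home")
print(len(key_a), key_a != key_b)
```

The resulting key would then be fed to an authenticated cipher to decrypt the file contents; that part is left out because Python's standard library does not ship one.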

Extend nixos-rebuild to copy over the vault/provision the local store

Each vault should define a copy script. This script should either copy the vault to the target system directly, or provide a list of (source, destination) path pairs that nixos-rebuild should copy.
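
For the second option, the consumer side could look roughly like this Python sketch. It copies on the local filesystem only, whereas nixos-rebuild would copy to the target host; `copy_pairs` is a hypothetical helper, not an existing nixos-rebuild feature.

```python
import shutil
from pathlib import Path
from typing import Iterable, Tuple


def copy_pairs(pairs: Iterable[Tuple[str, str]]) -> None:
    """Copy each (source, destination) pair reported by a vault's copy script."""
    for src, dst in pairs:
        source, dest = Path(src), Path(dst)
        # Make sure the destination's parent directory exists.
        dest.parent.mkdir(parents=True, exist_ok=True)
        if source.is_dir():
            # Merge into an existing directory instead of failing (Python 3.8+).
            shutil.copytree(source, dest, dirs_exist_ok=True)
        else:
            shutil.copy2(source, dest)
```

A vault could then report something like `[("./vault/store", "/var/lib/vault/store"), ("./vault/seed", "/var/lib/vault/seed")]`, where both paths are made up for the example.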

Extend the test/switch script to provision the local store

When running nixos-rebuild test/switch/install the local store needs to be provisioned based on the vault’s interface.

Extend modules to accept secrets from the local store

Finally, integrate the local store with the NixOS module system.


y’all might well know this already, but reading this i remembered that hashicorp made a tool literally named Vault. since their license change it was forked, but in other words, for what it’s worth, there is stuff out there that interfaces with various existing services for secrets.

to what extent that could be useful to interact with in this context i’m not sure. i think nix restricts network access at certain stages to reduce impurity, tho iirc Vault did in fact work in terms of unlock → use → relock. so maybe bridging with the likes of that could at least help offload logic for interacting with other systems, insofar as that might become desirable here.

We’ve talked about Vault, but the pain point is that it’s extremely involved to set up. I’ve never used it myself, but according to @jakehamilton:

vault’s systems for creating secrets, setting policies, and managing engines are far too tedious

I really like the idea of extensibility through an API/scripts. Storing the secrets in plain form on the machine will likely be compatible with most consumption patterns.

I’m having success using sops and sops-nix for secret management on NixOS and nix-darwin (with home-manager).

It lacks (or perhaps just my usage of it lacks) systemd-creds (and thus TPM support) at the moment, but is otherwise very robust.

Likely everyone involved has already audited this option; if so, ignore my post. Otherwise, if you’re curious about the full workflow, reply and I’ll go into more detail 🙂


The biggest issue with sops-nix/agenix is that they tie your secrets configuration to your system configuration. This means that if you roll back the configuration, you also roll back the secrets, reverting any changes to API keys, passwords, etc.

Our belief is that passwords and secrets should be stored separately from the system configuration.
