[Distant Future Problem] Shared Inputs Problem (for Distributed Pkgs)

Continuing the discussion from SIG Repos: How should they work?:

This is an important problem, and I think we can describe it as the “shared inputs” problem. It is the reason I think Aux can’t go fully distributed, at least right now. We have to have some form of a monorepo to coordinate shared inputs.

I wanted to use this thread as

  • a reference for what the problem is
  • why it’s hard
  • how we might one day avoid the need for a monorepo

.

I’ve been working on this problem for a while (I feel like I say this about everything, but I swear they’re all interconnected). I was talking with Graham Christensen (flakehub core guy) and told him “look for my post next week”. That was almost 6 months ago :grimacing:. I still haven’t sent this post to him because I keep refining it, but I think it’s worth posting here.

.

My Proposal to Flakehub (Solving Shared Inputs)

I know it doesn’t look like it, but I’ve tried to make this as short as possible.

I would like to start pulling tons of packages out of nixpkgs, making them standalone, generating a git-tagged version history for them, all before publishing them to flakehub. But I feel that I shouldn’t, because of a design issue.

The Problem: Shared Inputs

Let’s establish a little example for context.

  • Person1 publishes a numpy flakehub package
  • Person2 publishes a pytorch flakehub package
  • Person3 is a user trying to use python+numpy+pytorch

All three of the packages (theoretically) take glibc as an input, and they all need to share the same glibc. They can’t each have their own version, at least not reliably. Finding a version of glibc that meets the needs of all three derivations is what I call the shared input coordination problem. Glibc is just one example, as this problem happens with many upstream dependencies (including effectively all python libraries).

The Nixpkgs monorepo solves this problem manually. E.g. if we use one nixpkgs commit for both torch and numpy, then the maintainers already did the work of finding a glibc version that worked for both. The manual approach has problems: it burdens the maintainers a lot, and it is inflexible with package versions (there is one torch+numpy combo; you can’t necessarily update one without the other). However, it kinda works.

Flakehub is radically different. Right now the burden of the shared input problem falls squarely on the user. Flake locks don’t say what inputs COULD be, they only say what inputs default to. When it comes to finding a shared input, AFAIK users have basically one strategy: guess-and-check until it works. Once a project grows to 100 pip dependencies that all need various libpng, ffmpeg, cuda, and glibc versions, it becomes humanly unsolvable.
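
To make the guess-and-check workflow concrete, here is a rough sketch of what the user side looks like today, using the standard follows mechanism (the flakehub URLs and versions here are made up):

{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";
    numpy.url = "https://flakehub.com/f/someone/numpy/1.25.*.tar.gz";
    torch.url = "https://flakehub.com/f/someone/torch/2.*.tar.gz";

    # force both packages onto one nixpkgs (and therefore one glibc),
    # then guess-and-check revisions until everything builds
    numpy.inputs.nixpkgs.follows = "nixpkgs";
    torch.inputs.nixpkgs.follows = "nixpkgs";
  };

  outputs = { self, nixpkgs, numpy, torch, ... }: {
    # build the python environment from the (hopefully) compatible inputs
  };
}

Nothing tells the user which nixpkgs revision (if any) satisfies numpy, torch, and their own project at the same time; follows only lets them force a choice after they have already guessed one.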

Expectations

To keep things grounded and tangible, I want to start with a sneak preview of the conclusion. I will be proposing that flakes on flakehub include an additional output attribute.

  1. It will be a win for me if I can get feedback/iteration on the design, and (eventually) get a thumbs-up from the Determinate Systems team.
  2. Step one is a standalone goal. That said, the hope is the design becomes a requirement for publishing on flakehub.
  3. I’d like to say this third goal is “out of scope” because it is important to simply focus on the first two. However, we won’t get the full benefit without this third step: search tools. This could be an incredibly foundational and unique service for flakehub to offer. The tools are going to be really fun to design, like a leetcode optimization problem with tons of room for performance tricks with real-world benefit. While the “extra flake attribute” I’m proposing needs to be a good format for the search tools to ingest, beyond that aspect the search tools themselves are not the focus of this post. This article is about eating our veggies, making the problem solvable, so that later we can get to the meat and potatoes of the search tools that finish the job.

Now that you know what I am aiming for, let’s start eating our veggies.

“Best” Solution(s)

Nix’s shared input coordination problem is like trying to – simultaneously – jump-rope, juggle, and play hop-scotch. Nix is used for so many things, basically any change anywhere has negative implications for some task somewhere. I’ve been designing, testing, trashing, iterating and revising solutions for the last couple years, and I think I finally have something that

  1. prevents unfixable future problems
  2. is practical to implement and
  3. minimizes rough spots

Like a Rubik’s cube that’s been mislabeled, I think it’s provably impossible to maximize every side of this problem. There is no perfect solution but I do think there is one that stands above the rest.

Problem Components

Conceptually, there are two sides to shared-inputs

  1. A flake-publication needs a way to say “my output can satisfy X,Y requirements”. I’m going to call these certificates. For example, python2 and python3 might both “provide” the python certificate.
  2. On the other side, flake inputs need a way to say “my input needs to meet some combination of (‘X or Y’, ‘X and Y’, etc)”. I’m going to call these requirements.

Requirements and certificates are two sides of the same coin. For that reason I’d like to call this the req-cert system.

If we encode these correctly, we should be able to throw our best automated solvers at it (instead of the caveman guess-and-check approach). Getting this encoding correct is the top priority.

Note: I’m going to talk like every flake has only one derivation because it’s easier that way. We can address multi-derivation flakes with the same approach.

Step 1: Certificates as Strings

Similar to solving a Rubik’s cube for the first time, I’m going to approach one side/color at a time. Meaning, hold criticisms, because we’ll revisit “solved” steps and enhance or replace them. I think it’s easier to read this way.

Think about publishing a python flake. For python, a publisher might say

certificates = [
  "python"
  "v3.10"
  "CPython"
  "python-minimal"
  "python_full"
]

On the other side, something that needs python as an input, might want to say:

# "CPython" AND "python_full"
# OR
# "Jython" AND "python_full"

python_requirements = [
  [ "CPython" "python_full" ]
  [ "Jython" "python_full" ]
]

Each string is effectively a boolean check; either the input has “CPython” in its certificates list or it does not.
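
To make that check concrete, here is a minimal sketch (the satisfies helper is hypothetical, not existing nix tooling): a requirement list is an OR of AND-groups, and a group is met when every one of its strings appears in the input’s certificate list.

satisfies = certificates: requirements:
  # true if ANY group is met, where a group is met when ALL of its
  # strings appear in the certificates list
  builtins.any
    (group: builtins.all (cert: builtins.elem cert certificates) group)
    requirements;

# satisfies [ "python" "CPython" "python_full" ] python_requirements   => true
# satisfies [ "python" "python-minimal" ]        python_requirements   => false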

Already, this design means:

  1. Some kinds of requirements can’t be made
  2. There is a complexity (e.g. size complexity) concern
  3. There’s a usability/ergonomic concern, and
  4. The input requirements structure isn’t really defined yet

Let’s continue eating our veggies and address each of these.

Step 1 Problem 1: Versions

You might have noticed the system isn’t graceful at checking versions. E.g. what if we want to require the python input to be version 3, but not specifically python 3.5? For now, let’s try addressing that with as minimal a change as possible.

certificates = [
  "python"
  "v3"
  "v3.10"
  "v3.10.5"
]

It’s ugly, doesn’t allow true version-range checks, and we will come back to this. But for now, this pattern handles at least the common cases: We can now require something as broad as “v3” or something as specific as “v3.10.5”.
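
Reusing the hypothetical satisfies helper from above, the prefix trick looks like this in practice:

certificates = [ "python" "v3" "v3.10" "v3.10.5" ];

# satisfies certificates [ [ "python" "v3" ] ]       => true  (any 3.x is fine)
# satisfies certificates [ [ "python" "v3.10.5" ] ]  => true  (exact patch release)
# satisfies certificates [ [ "python" "v3.11" ] ]    => false (that cert was never listed)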

Problem 2: Arbitrary Strings :thinking:

Consider hundreds of devs, of all ages and backgrounds, who don’t know each other, who don’t have time to talk, but they need to coordinate as a group through an API. An API that they don’t control.

  • Unfortunately for me, I am already in the midst of this problem when working on syntax highlighting grammars.
  • Fortunately for you, being in that situation has caused me to spend excessive hours devising ways to avoid it.

Both our step 1 solution and the textmate syntax highlighters use an API of anything-goes-strings.

Sadly, the dark half of “anything is possible” is not only more common, but, specifically here, it increases in dominance over time.

We’ve got to keep eating our veggies, and get on the same page of why and how this problem happens.

Let’s go to exhibit A: VS Code. When VS Code looks at text, like the this keyword in javascript, each word is given a list of tags. For example this might have the tags [ “variable” “special” “object” ]. This is very similar to each flake having a list of certificates. Zooming out, that list of tags is created by … someone … somewhere. Let’s say tags are created by people-group 1 (people1). Everything is fine. Well… except the text in VS Code still has no color; there’s just a bunch of words and tags. That’s where people2 come in. People2 are the theme makers. Themes say “words with the variable tag get colored red, and words with the special tag get colored blue”, etc. These rules are very much like the input requirements of a flake; two sides of one interface.

Okay, so words get colors. Sounds good right? Maybe.

The thing is, people1 don’t coordinate with each other. Variables are NOT tagged with “variable”. They are tagged with “var” in one language, “variable-name” in another language, “varName” in another, and “identifier” in yet another. On the flip side, not only do people2 need to know every single one of those names (just to highlight a var!), but there is also no organized place for people2 to even find them! They have to just… look around in random codebases and kind of guess.

But it gets worse. The SAME tag can mean different things in different languages. Identifier in Prolog is very different than identifier in Javascript.

And it gets EVEN WORSE. New tags (from new languages) start with zero support. People2 (themes) already struggle with finding names, so how can they possibly know names of niche recently-released tags? People1 would need to make a PR on each of People2’s codebases, which is beyond impractical.

But devs aren’t the kind to say “I can’t do it the correct way … guess I’ll just … give up”. So what do niche languages (like nix) do? Easy. They intentionally mislabel everything. Themes don’t know about col1 and col2 tags for csv? Just call col1 a variable and col2 a keyword. Themes don’t know what a prolog predicate is? Just say it’s a regex. Themes don’t know what an atom is? Just tag it like an html attribute.

The truly worst part is; this can’t be fixed later.

Unfixable, intentionally wrong requirements/certificates would be a very, very bad outcome for flakes. How can we avoid this?

Step 2: Naming Hubs

We are trying to juggle and play hopscotch at the same time: we want to avoid this naming problem while keeping the decentralized nature of nix.

Let’s rework the python example:

certificates = [
  "nixos.org:name:python"
  "flakehub.com:version:3"
  "flakehub.com:version:3.10"
  "flakehub.com:version:3.10.5"
  "flakehub.com:feature:cpython"
  "flakehub.com:feature:python-minimal"
  "flakehub.com:feature:python-full"
  "unstandarized:blazingly-slow"
]

Looking unwieldy? Maybe. But we must first address can’t-be-fixed-later problems, then unwieldiness.

The example above is an outcome of the following string restrictions:

  • Strings need a namespace, a colon, and then text. Everything is lower case.
  • Normal namespaces need to follow URL syntax restrictions (the syntax after the “https://” or “file:///”). The namespace doesn’t need to be a real URL, it just needs to follow the syntax rules. The idea is that the namespace could correspond with a website that has exhaustive lists of terms, patterns, conventions, etc.
  • The text after the namespace could be very restricted, maybe something like words composed of lowercase letters and digits, with dashes in between characters and colons between words. Or, for regex people: /(:[a-z0-9][a-z0-9\-]*(?<=[a-z0-9]))+$/
  • Note: being overly-restrictive is fine because it’s a can-be-fixed-later problem. We can ALWAYS allow more chars in the future, but not vice-versa. These restrictions keep lots of room available for future special/official namespaces. For example, “optional” or “system” could be special namespaces. I’ll come back to these.
  • The “unstandardized” namespace (at the bottom of the example) is more of a “what’s possible” example than a “what’s recommended” one. It’s a special namespace (not following url syntax): a designated area for hacky/temporary/tinkering certificates, like from someone who made a package for the first time.

There is one big note needed here. Restrictions must be enforced ONLY at publication time, not at download time. For example, nix-env can’t download a flake and ASSUME that all requirement-strings contain a colon. This is because a flake from the future might (validly) contain colon-free strings as part of some future feature. If nix-env and other tools bake in assumptions about these strings, we will be unable to solve future problems.
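
As a rough illustration of such a publication-time check (the exact character set and the looksValid helper are assumptions, not a spec), the flakehub publish step could do something like:

let
  looksValid = s:
    # lower-case namespace, then one or more ":"-separated parts;
    # a deliberately loose approximation of the restrictions above
    builtins.match "[a-z][a-z0-9./-]*(:[a-z0-9][a-z0-9.-]*)+" s != null;
in
  builtins.all looksValid [
    "nixos.org:name:python"
    "flakehub.com:version:3.10.5"
    "unstandardized:blazingly-slow"
  ]  # => true; looksValid "Python" => false (upper case, no namespace)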

Back to namespace hubs. I think this design will prevent the no-standards duplicate naming and intentionally wrong naming issue. Moving on, we’ve got a lot to cover.

Problem 3: Data Structure Logistics

Where would these certificate/requirement lists go?

Our biggest design restriction is probably that requirements/certificates need to be serialized when publishing. Without that we can’t add them to a search engine, build an efficient index, and discover packages that meet certain requirements.

At first glance, this structure seems nice:

{
  inputs = {
    python = {
      requirements = [ "v3.10" ];
    };
  };
}

It’s serializable and has tight coupling (change an input, and it’s obvious the requirement also needs to change). However, it falls apart quickly:

  • Requirements can be in groups, e.g. (python310 and glibc68) OR (python311 and glibc99)
  • The unwieldiness problem can usually be solved with imported tools. But importing tools in the inputs area is itself unwieldy. So we can’t choose this design if we want to cut down on boilerplate later.
  • If there are different requirements for different operating systems, this approach just fails
  • If there are different fallback behaviors (e.g. torchWithCuda, torchWithoutCuda) there’s no way of handling it

Using outputs instead of inputs is not problem-free either.

  • Having requirements as outputs makes it look like requirements can be dynamically computed at install time. Which is not true; they need to be serialized/constant at publication time.
  • There’s also the risk that people accidentally require building the derivation in order to generate the certificates/requirements. This would make checking the requirements/certificates prohibitively expensive.

Info could be stored outside of the flake.nix, but that has obvious usability downsides.

So, all options considered, I think we can minimize the downsides of using outputs, and have it be the best approach. When publishing to flakehub, we simply evaluate the requirement-attribute of the output, try to serialize it, and give helpful messages when it’s not serializable. Ideally this would be a flake-lock step, but doing it at publication time should suffice.
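
As a rough sketch of that publish-time step (the flake path is a placeholder, this is not an existing flakehub feature, and I’m using requirements for whatever the attribute ends up being called): evaluate the attribute and force it through builtins.toJSON, which fails loudly on functions and anything else that can’t become plain data.

let
  # an unlocked local path like this needs an impure evaluation
  flake = builtins.getFlake "path:/path/to/the/flake";
  # throws if the attribute contains functions or otherwise can't be
  # turned into plain JSON, which is exactly the check we want
  serialized = builtins.toJSON flake.outputs.requirements;
in
  serialized  # upload this alongside the flake's metadata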

Step 3: Data Structure v1

The following structure is closer to viable.

{
  inputs.python.url = "http://somewhere.com";
  
  outputs = { python, ... }:
    {
      requirements = [
        {
          python = [ "python" "v3" ];
        }
        {
          python = [ "python" "v2" ];
        }
      ];
      certifications = {
        packages.x86_64-linux.default = [ "numpy" "v1" "v1.25" ];
        packages.i686-linux.default = [ "numpy" "v1" "v1.25" ];
        packages.aarch64-linux.default = [ "numpy" "v1" "v1.25" ];
      };
      
    }
  ;
}

That translates to “I prefer python v3 but can fall back on python v2”, and “I provide numpy 1.25 for linux”.

Yes, it’s clunky; let’s address functionality first though.

Problem 4: Optional Certificates

  1. Can requirements be a function of the OS?
  2. Can which-input-requirements-are-met impact which certifications are provided?

For # 1, yes, there are certainly times where the OS matters for requirements. Some OS’s lack some builtin tools, and they effectively need an additional package to compensate.

For # 2, it seems rare. However, I think there are two important scenarios. First is feature flags. If there is only one flag, such as “cuda”, then it’s not too big of a deal to have two packages like torchWithCuda and torchNoCuda. But, with more than one flag, it starts to become catastrophic. Given 5 boolean feature flags, we need 2^5 = 32 packages, e.g. openCvWithCudaVfpv3NoNeonVsx, openCvWithVfpv3NoCudaNeonVsx, openCvWithNeonCudaNoFpv3Vsx, etc. That’s just one scenario. The second is that packages often provide an extra feature for a specific system. So the answer is “yes”, which input requirements are met could affect which certificates are provided.

How can we make a data structure that supports both of these needs?

Step 4: Data Structure v2

The following structure is much closer to viable.

{
  inputs.os.url                = "https://os-requirements-placeholder.com";
  inputs.python.url            = "https://somewhere";
  inputs.optionalCuda.url      = "https://trickery";
  
  outputs = { os, python, optionalCuda, ... }:
    {
      requirementMapping = [
          {
            inputs = {
              os = [ "linux" "x86" "64bit" ];
              python = [ "python" "v3" ];
              optionalCuda = [ "v11" ];
            };
            outputs = {
                packages.default = [ "numpy" "v1" "v1.25" "cuda" ];
            };
          }

          {
            inputs = {
              os = [ "linux" ];
              python = [ "python" "v3" ];
            };
            outputs = {
                packages.default = [ "numpy" "v1" "v1.25" ];
            };
          }

          {
            inputs = {
              os = [ "mac" ];
              python = [ "python" "v2" ];
            };
            outputs = {
                packages.default = [ "numpy" "v1" "v1.25" ];
            };
          }
      ];
    }
  ;
}

Which translates to “if cuda is available, and I’m on linux, then I offer the cuda-enhanced certificate. Otherwise, for these other systems, I just offer the normal numpy stuff”.

To be clear:

  1. flakes keep their normal output (the packages.${system}.default).
  2. os is not magic. It’s a normal input that is usually null, and it is not the same thing as system. It’s not a hard-coded name, it is not a required input for building a derivation, and nix will treat it as any other input. os will effectively always be null, which I know sounds weird, but its purpose is metadata. Only the publish step will treat it differently, which I’ll explain some other time because it’s mostly irrelevant here.
  3. In this design, the requirementMapping is checked from top to bottom. If a set of requirements is met, then the remaining requirement mappings are not even checked (a minimal sketch of this follows the list). This can lead to situations like accidentally putting os = [ "linux" ] first, and then os = [ "linux" "64bit" ] second. The second case will never be evaluated, since anything that satisfies it also satisfies the first. Thankfully, this kind of accidental shadowing can often be detected automatically at publication time, so we could at least warn the user in obvious cases.
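
Here is a minimal sketch of that first-match behaviour (the firstMatch helper is hypothetical): walk requirementMapping top to bottom and return the outputs of the first entry whose input requirements are all met.

firstMatch = providedCertsPerInput: requirementMapping:
  let
    # an input meets a requirement list when every string is in its certificates
    inputMet = name: required:
      builtins.all (r: builtins.elem r (providedCertsPerInput.${name} or [ ])) required;
    entryMet = entry:
      builtins.all (name: inputMet name entry.inputs.${name})
        (builtins.attrNames entry.inputs);
    matching = builtins.filter entryMet requirementMapping;
  in
    if matching == [ ] then null else (builtins.head matching).outputs;

# firstMatch { os = [ "linux" ]; python = [ "python" "v3" ]; } requirementMapping
#   => { packages.default = [ "numpy" "v1" "v1.25" ]; }   (the second entry above)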

Problem 5: Feature Combinatorics

There’s still at least one functional problem. If something like ffmpeg or opencv has 30 boolean feature flags, then the data structure above will need 2^30 entries; one for every possible certificate-output combination. That’s a problem.

Step 5: Data Structure v3

We can get around this with Horn clauses. Or, for those who don’t read first-order logic textbooks: conditional statements with a single consequent. The structure should read like “you’ll get the cuda certificate IF you meet one of the following requirement sets”.

If a requirement set is empty, that means there are no requirements at all.

{
  inputs.os.url                = "https://os-requirements-placeholder.com";
  inputs.python.url            = "https://somewhere";
  inputs.optionalCuda.url      = "https://trickery";
  
  outputs = { os, python, optionalCuda, ... }:
    {
      certificateMapping = {
        "nix.org:feature:cuda" = [
          {
             os = [ "nix.org:os:linux" "nix.org:cpu:x86" "nix.org:bitness:64bit" ];
             optionalCuda = [ "nix.org:version:11" ];
           }
        ];
        "nix.org:pkg:numpy" = [ {} ];
        "nix.org:version:1" = [ {} ];
        "nix.org:version:1.25" = [ {} ];
        "nix.org:version:1.25.1" = [ {} ];
        "minimum" = [ # <- special name
            {
              os = [ "nix.org:os:linux" ];
              python = [ "nix.org:pkg:python" "nix.org:version:3" ];
            }
            {
              os = [ "nix.org:os:mac" ];
              python = [ "nix.org:pkg:python" "nix.org:version:3" ];
            }
            {
              os = [ "nix.org:os:linux" ];
              python = [ "nix.org:pkg:python" "nix.org:version:2" ];
            }
            {
              os = [ "nix.org:os:mac" ];
              python = [ "nix.org:pkg:python" "nix.org:version:2" ];
            }
        ];
      };
    }
  ;
}

This translates to “if you want cuda acceleration, then I need linux x86 and cuda 11; here are my normal certificates (numpy v1.25.1), they don’t require anything; btw you need at minimum linux/mac with python 3 or python 2 to build me”.

The special “minimum” certificate just means the package will not build at all without those requirements.

This structure does not suffer from the combination explosion.

If you are thinking “well that handles a certificate explosion, but what about a requirement explosion?” then don’t worry, that is not really a thing. Certificates are not symmetrical with requirements in that way. For certificates (outputs) we needed both a feature-flag=true case and feature-flag=false case to be in a list. But for requirements (inputs) we only need the true case. We don’t say “this (numpy) package does NOT need Gnome, and does NOT need Haskell, and does NOT need Lisp, etc” we only list what IS needed, which avoids an exponential explosion.
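To make the Horn-clause reading concrete, here is a sketch of how granted certificates could be computed (the grantedCertificates helper is hypothetical): a certificate is granted when any one of its requirement sets is fully met, and an empty set { } is met trivially.

grantedCertificates = providedCertsPerInput: certificateMapping:
  let
    inputMet = name: required:
      builtins.all (r: builtins.elem r (providedCertsPerInput.${name} or [ ])) required;
    # an empty requirement set has no attribute names, so builtins.all returns true
    setMet = reqSet:
      builtins.all (name: inputMet name reqSet.${name}) (builtins.attrNames reqSet);
  in
    builtins.filter
      (cert: builtins.any setMet certificateMapping.${cert})
      (builtins.attrNames certificateMapping);

# With linux + python 3 certificates provided (but no cuda input), the example
# above grants "minimum" and the numpy/version certificates, but not
# "nix.org:feature:cuda".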

Problem 6: Ergonomics

Finally, we can start simplifying.

{
  inputs.requirementTools.url  = "https://requirement_tools.com";
  inputs.os.url                = "https://os-requirements-placeholder.com";
  inputs.python.url            = "https://somewhere.com";
  inputs.optionalCuda.url      = "https://trickery.com";
  
  outputs = { requirementTools, os, python, optionalCuda, ... }:
    {
      certificateMapping = (
        let
          inherit (requirementTools) pkg oneOf allOf OS feature build optional;
        in
          requirementTools.build pkg {
            name = "numpy";
            version = "1.25.3";
            requiredInputs = oneOf [
                {
                    python = pkg {
                        version = oneOf [ 3 2 ];
                        features = allOf [ "CPython" ];
                    };
                    os = oneOf [
                        (allOf [
                            (OS allOf [ "linux" "64" ])
                            (OS oneOf [ "x86" "arm" ])
                        ])
                        (allOf [
                            (OS allOf [ "darwin" "64" ])
                            (OS oneOf [ "x86" "arm" ])
                        ])
                    ];
                }
            ];
            features = [
                (optional {
                    value = feature "cuda";
                    requiredInputs = {
                        optionalCuda = pkg { name = "cuda"; version = "11"; };
                        os = OS allOf [ "linux", "x86", "64" ];
                    };
                })
            ];
          }
      );
    }
  ;
}

I’m sure there are more ergonomic alternatives, but that’s just a rough example of how massively different the API can be, while still using the structure and name-hub system under the hood for maximum flexibility.

Big Picture

Let’s put everything together:

  1. Someone makes a flake
  2. They have a requirementTools input
  3. They define a certificateMapping attribute on their flake output
  4. They trigger the github action to publish to flakehub
  5. The github action evaluates the certificateMapping attribute, checks that naming standards are followed, then serializes it and pushes it to flakehub.org
  6. Flakehub.org probably does some kind of indexing/cache-building.
  7. Some time later, like a week later, someone tries to solve a shared-input problem.
  • They have a list of inputs (for example project dependencies), along with a set of requirements (like the project needs SQL 3.2.1), and an attribute summarizing their OS/machine
  • They serialize that info, send it to flakehub, and ask “what inputs would satisfy these requirements?” (e.g. a list of flake urls and versions); a rough sketch of such a request follows after this list
  • Flakehub solves this recursively, such as finding that input1 and input2 share a dependency, and finding a version of that dependency that meets input1’s requirements, input2’s requirements, and the project’s requirements. It then returns both a list of viable input versions/combinations, and also the overrides necessary to force input1 and input2 to share the same upstream dependency.
  8. The user receives the response from flakehub. They use the first answer as their project inputs and … the project still might not build. But! There is a list of viable inputs, and after trying a couple more the project actually builds.
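
Purely as an illustration of step 7 (none of these field names, URLs, or endpoints exist today), the serialized request could be as simple as:

builtins.toJSON {
  machine = [ "nix.org:os:linux" "nix.org:cpu:x86" "nix.org:bitness:64bit" ];
  projectRequirements = {
    sql = [ [ "nix.org:pkg:sqlite" "nix.org:version:3.2.1" ] ];
  };
  inputs = [
    "https://flakehub.com/f/someone/numpy"
    "https://flakehub.com/f/someone/torch"
  ];
}

# The hypothetical response would be a ranked list of concrete input versions,
# plus the follows/override entries needed to make them agree on their shared
# upstream dependencies.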

Final Design Aspect: Performance

Last bit of veggies we need to eat: I said I’d come back to version ranges, and really it’s one of many lingering things:

  • malicious packages that claim to have certificates for everything
  • some packages have constraints but don’t need to be shared (input1 needs python2 and input2 needs python3, they don’t need to share the same python)
  • paired dependencies. Ex: deno_core_123 and deno_networking_123
  • stacked dependencies. Ex: I need numpy built-using-clang-12-as-an-input, not numpy built using gcc 10 as an input (assuming that “built with clang 12” is not provided as a certificate by numpy)
  • The valid-ish need of dynamic certificates, and the complex consequences of trying to support them

There is one hidden aspect I’ve been designing for all along; performance. Going back to jumping rope, juggling, and playing hop-scotch, the final boss is doing all-that at 100mph on top of a train named “performance”. We/flakehub need to be able to solve shared inputs quickly, despite it being NP hard.

It was no accident that, at the beginning, we used strings like boolean variables.

The boolean SAT problem is also NP-hard (NP-complete, in fact). But people have spent a whole lot of time optimizing SAT solvers. Our shared-inputs problem is not the same, but it is close. Between boolean SAT and the graph coloring problem, there are a lot of clever tricks we can borrow to optimize the heck out of the shared inputs problem. Even just replacing the strings with bits in a bit array, to leverage cheap bitwise operations, can be a practical and major non-algorithmic speedup.
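
As a toy illustration of that bit-array idea (the bit assignments here are made up): give every known certificate string a bit position, and a requirement set is met exactly when bitAnd(required, provided) == required.

let
  certBit = {
    "nix.org:pkg:python"   = 1;  # bit 0
    "nix.org:version:3"    = 2;  # bit 1
    "nix.org:feature:cuda" = 4;  # bit 2
  };
  # fold a list of certificate strings into a single integer bitmask
  toMask = certs: builtins.foldl' (acc: c: builtins.bitOr acc certBit.${c}) 0 certs;

  provided = toMask [ "nix.org:pkg:python" "nix.org:version:3" ];
  required = toMask [ "nix.org:pkg:python" "nix.org:version:3" ];
in
  builtins.bitAnd required provided == required  # => true, requirement met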

What does any of that have to do with version ranges?

Well, the design already has room to support version ranges (and a bunch of other stuff), for example a special “nix.org:pkg:[name]:min:1” and “nix.org:pkg:[name]:max:3” requirement. But I think we would lose a ton of performance – possibly so much that it becomes impractical for flakehub to even offer the solver as a service.

There might be a way to support version ranges, but I think we need to get a performant solver first, then see if performance can be maintained while adding true version ranges.

As for stuff like malicious packages, I think there are practical solutions, solutions that belong in a different discussion. This post is already super long.

Conclusion / Actionable Steps

Going all the way back: I’m looking for feedback. That is the next actionable step. Maybe there are aspects I missed, or a completely different way of approaching the problem. Either way, I really want to start making flakehub packages, but I feel I shouldn’t until there is a structured way to express requirements on the inputs.

6 Likes

This is an incredible write-up. I’m impressed by the work you put into both coming up with the actual ideas and the text to convey them.

I do wonder though: How big is the overlap of “technical issues that need to be/are solved for this feature” between “How should SIG Repos work?”, “Multi-version Packages” and “Shared Inputs Problem”? That is to say: If the proposed solution here is applied to ‘derivation-publication’ instead of ‘flake-publication’, can it not only be used for selecting the right inputs, but also to select the correct dependency derivations?

Why do I ask this? I try to reuse a problem-solving approach if that can be done without sacrificing simplicity/features/maintainability/etc. And I do feel like the problem you describe as the motivation does not only apply to flakehub packaging, but also to our idea of a multi-repo, multi-version structure, since we also have the problem of selecting the right input flake/derivation to build each package.

I hope this is also a kind of feedback that you are looking for. I.e. feedback that, rather than proposing alternatives, points to other similar topics and compares them. If this is not the constructive feedback I intend it to be, but rather annoying/distracting from the actual discussion, please do tell me, so that I can stop posting it (and post actual alternatives instead, when I can come up with them).

2 Likes

If aux posts anything to flakehub I will be leaving the project.

6 Likes

Very nice overview of the generalised problem. I would imagine flakehub is quite interested in work like this :slight_smile:

DetSystems controversy aside, for the concrete problem of Aux I do feel like this is somewhat overkill and flake-specific. Aux controls all the moving parts and repositories, with submodules right there, easily updated by CI - for example on generated, hidden branches.

1 Like

I’m afraid to ask :grimacing:. So I’ll just simply say I know pretty much nothing about flakehub other than using it and giving them feedback about improving the installer and site experience.

1 Like

IMO this is a pretty good point. And because it’s not distributed, we could actually use a simpler version of it (for example we don’t need naming hubs if we control all the names).

And even without any solver tools, it could be a good help for maintainers when they are trying to bump packages.

We can even combine it nicely with the static list of dependencies that we need for the registry to sort everything.

Ex: at the top of each package-build function in SIG repos, we could have a JSON comment that is equivalent to the certificateMapping. It would mention all dependencies and also any constraints we know of for those dependencies.
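
A rough sketch of what that could look like (the comment marker, namespaces, field names, and version constraints here are all made up):

/* aux-certificates:
{
  "certificates": [ "aux:pkg:numpy", "aux:version:1", "aux:version:1.25.1" ],
  "requirements": {
    "python": [ [ "aux:pkg:python", "aux:version:3" ] ],
    "glibc":  [ [ "aux:version:2.38" ] ]
  }
}
*/
{ buildPythonPackage, ... }:
buildPythonPackage {
  pname = "numpy";
  version = "1.25.1";
  # src, dependencies, etc. as usual
}

Because the comment is plain JSON, CI and the registry could read it without evaluating any Nix.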

2 Likes