Visualizing nixpkgs dependencies

srtcd424 · May 7, 2024, 9:22pm

I was wondering if anyone had done any analysis of nixpkgs as a giant dependency graph, and of course it turns out the folks at Tweag have already been there. If nothing else, they have fantastic graphics! It also demonstrates a little of what we’re letting ourselves in for.

Sigma · May 7, 2024, 10:50pm

From Construction and analysis of the build and runtime dependency graph of nixpkgs

[…] some derivations have buildInputs or propagatedBuildInputs that contain themselves

How is this possible? This blows my mind.

Sigma · May 7, 2024, 10:53pm

From Mapping a Universe of Open Source Software

What looks remotely like the section of a mouse brain actually represents around 46000 software packages.

The post was published in 2019, so in the span of 5 years, we have tripled the count of packages, that’s insane :o

Jeff · May 9, 2024, 4:07am

I didn’t know about those two links! And I should have, because I’ve spent a ton of time trying to recurse across packages.

For determining whether or not to recurse, we can rely on recurseForDerivations and recurseForRelease attributes. There are important sets of derivations that are not derivations while their both recurse attributes are false, such as python3Packages

It’s weird that he says we can rely on it and then immediately gives an example of how it can’t be relied on. In my experience it is unreliable, and I don’t think his whitelist is exhaustive, because some packages appear for the first time 4 attributes deep. He did a good job, but nixpkgs, even in 2022, was messier than what he presented.

Something he didn’t mention is that it’s not always possible to detect cycles in nixpkgs, thanks to Nix’s decision that a==a is false when a is a function. Because of that, it is impossible to do a generic deep comparison of attributes, and so it’s impossible to check whether a loop happens in the tree without falling back on something like pname and version. However, not all packages have pname and version, and pname and version could also be the same between two different packages (rare, if it happens at all, but nothing enforces that it can’t).
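To make the pname/version fallback concrete, here is a minimal Python sketch (not Nix, and not the code I actually ran). It models attribute sets as nested dicts with a hypothetical "children" key. The names fallback_id and find_revisits are made up for illustration. It shows both weaknesses: nodes with no pname/version can hide a loop, and two distinct packages with the same pname/version get reported as a revisit.

```python
def fallback_id(pkg):
    """Return (pname, version) if both exist, else None (no usable identity)."""
    pname, version = pkg.get("pname"), pkg.get("version")
    if pname is None or version is None:
        return None
    return (pname, version)


def find_revisits(root, max_depth=10):
    """Depth-first walk over nested dicts.

    Returns (revisits, unidentified):
      revisits     - (path, first_path) pairs whose fallback ID was seen before
      unidentified - paths of nodes that have no pname/version at all
    """
    seen = {}
    revisits = []
    unidentified = []

    def walk(node, path, depth):
        if depth > max_depth:
            # Hard cap: loops through unidentified nodes cannot be detected.
            return
        ident = fallback_id(node)
        if ident is None:
            unidentified.append(path)
        elif ident in seen:
            revisits.append((path, seen[ident]))
            return
        else:
            seen[ident] = path
        for name, child in node.get("children", {}).items():
            walk(child, path + (name,), depth + 1)

    walk(root, (), 0)
    return revisits, unidentified


# Made-up data: "scope" has no pname/version and contains itself.
hello = {"pname": "hello", "version": "2.12", "children": {}}
scope = {"children": {"hello": hello}}
scope["children"]["self"] = scope
# "alias" is a *different* dict that happens to share pname and version.
alias = {"pname": "hello", "version": "2.12", "children": {}}
root = {"pname": "root", "version": "0", "children": {"scope": scope, "alias": alias}}

revisits, unidentified = find_revisits(root, max_depth=4)
```

Here the walk only stops descending into scope because of max_depth, and alias is flagged as a revisit of scope.hello even though it is a separate object.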

There are also many errors that tryEval cannot catch.

Since exploring the tree exhaustively is an ongoing issue (sadly), I had to programmatically control a nix repl instance as a truly reliable try-eval. And instead of pname and version as a node ID, we can now use the unsafe get path (roughly, where the code is defined) as a means of detecting loops more generically (but still not perfectly).

Doing that along with an iterative deepening algorithm, a heuristic to reduce the cost of undetectable tree-loops, and a concurrency enhancement, I was able to touch 4-attrs-deep but not fully explore it, on a machine with 256 GB of RAM running for about 18 hours (after 18 hours it ran out of RAM).
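For anyone unfamiliar with iterative deepening, a rough Python sketch of the idea follows. This is not the actual tool: it leaves out the nix repl control, the heuristic, and the concurrency, and the "pos" field is a hypothetical stand-in for a definition-position node ID. The tree is again modeled as nested dicts with a made-up "children" key.

```python
def depth_limited(root, limit, node_id):
    """Collect attribute paths up to `limit` levels deep, skipping repeated node IDs."""
    seen = set()
    found = []
    stack = [((), root)]
    while stack:
        path, node = stack.pop()
        ident = node_id(node)
        if ident is not None:
            if ident in seen:
                continue  # already visited: treat as a loop or an alias
            seen.add(ident)
        found.append(path)
        if len(path) < limit:
            for name, child in node.get("children", {}).items():
                stack.append((path + (name,), child))
    return found


def iterative_deepening(root, max_limit, node_id):
    """Yield (limit, paths) for limit = 0..max_limit, shallow results first."""
    for limit in range(max_limit + 1):
        yield limit, depth_limited(root, limit, node_id)


# Made-up tree in which "b.back" points back at the top level.
tree = {"pos": "top.nix:1", "children": {
    "a": {"pos": "a.nix:1", "children": {}},
    "b": {"pos": "b.nix:1", "children": {}},
}}
tree["children"]["b"]["children"]["back"] = tree


def by_pos(node):
    return node.get("pos")


results = dict(iterative_deepening(tree, 3, by_pos))
```

The point of re-running with a growing limit is that each pass finishes, so you always have a complete answer for the shallower levels even if the deepest pass runs out of memory.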

It’s pretty bad IMO that it takes a dedicated research project just to … list not quite all of the packages in a package repo. Also, btw, nix-env -qa --json does not list all of the packages either. Which is again pretty sad.

All of this is actually pretty relevant to the repo structure discussion.

Jeff · May 9, 2024, 4:15am

By coincidence I happened to write about this earlier today

SIG Repos: How should they work?

Dynamic recursive dependencies: Unfortunately I can confirm there are packages that are deeply, painfully, multi-base-case recursive with optional dependencies.

Let’s start with the easiest example. Let’s say registry.autoconf depends on perl. Well, registry.perl (ex: perl 6.x) might depend on perl & autoconf. And now we’ve got a multi-recursive problem: autoconf needs perl and perl needs autoconf (and perl!). It’s a dependency tree with loops.

Except in reality we start with core.perl, then build autoconf::(built with core.perl), then build registry.perl::(built with core.perl and autoconf::(built with core.perl)), and then build autoconf::(built with registry.perl::(built with core.perl and autoconf::(built with core.perl))). It quickly becomes a lot to mentally process … and that’s the simple case!

Nixpkgs does stuff exactly like that behind the scenes, at runtime. Thing is, we don’t have to do it at runtime. We can be way more clear about what is going on by adding stages.

registry.autoconf_stage1 statically depends on core.perl.

registry.perl_stage1 statically depends on registry.autoconf_stage1.

registry.autoconf_stage2 statically depends on registry.perl_stage1.

All other registry packages use registry.autoconf_stage2 instead of just “some version of autoconf”.

While still complicated, making these stages explicit is, I think, the only way to make this stuff even barely manageable. Just imagine the difference between “Error: autoconf_stage2 failed to build” compared to “Error: autoconf (one of multiple generated at runtime) failed to build”.
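As a small illustration of why explicit stages help, here is a Python sketch using the standard library's graphlib. The stage names come from the list above; the dict layout and the helper names build_order and is_buildable are just assumptions for the example, not a proposal for how a registry would actually store this.

```python
from graphlib import TopologicalSorter, CycleError

# Each key statically depends on the names in its set.
staged = {
    "core.perl": set(),
    "registry.autoconf_stage1": {"core.perl"},
    "registry.perl_stage1": {"registry.autoconf_stage1"},
    "registry.autoconf_stage2": {"registry.perl_stage1"},
}

# The unstaged view. Perl's dependency on itself is left out here;
# the mutual dependency alone already forms a cycle.
unstaged = {
    "autoconf": {"perl"},
    "perl": {"autoconf"},
}


def build_order(deps):
    """Return a build order in which every dependency comes first."""
    return list(TopologicalSorter(deps).static_order())


def is_buildable(deps):
    """True if the graph has no cycles, i.e. a static build order exists."""
    try:
        build_order(deps)
        return True
    except CycleError:
        return False


order = build_order(staged)
```

With stages, the graph is a plain chain and a build order can be computed up front; without them, the same tool just reports a cycle.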

While this does require skilled manual labor, there aren’t too many packages like this.

Well … except for one category. Cross compilation.

While I think we should have cross compilation in mind from the beginning, I don’t think we should immediately (or any time soon) jump into trying to handle cross compiled packages.

The normal (not-cross-compiled) version of a package is going to have fewer dependencies, and be higher up on the dependency tree. We should focus on those first since they’re the foundation.

That said, I want to recognize what will eventually need to be done for the true deepest, most nasty hairball of spaghetti code in all of nixpkgs: cross compiling of major tools like VS Code, using QEMU virtualization. Not only is it an explosion of dependencies, it’s possible to depend on the same version of the same package twice, once for the host architecture and again for the target architecture. If we can eventually tackle that, I don’t think it gets any worse.

I know it might feel unclean (give me a chance to talk about SIG sources), but in order to untangle cross compilation, some registry packages will need to have system postfix names like gcc_linux_x86_64, just FYI.

nat-418 · May 9, 2024, 9:26am

Something like guix graph would be neat: Invoking guix graph (GNU Guix Reference Manual)

Sigma · May 9, 2024, 12:12pm

Check nix-visualize!
It does not offer as many options as the guix graph you mentioned, but it can generate nice dependency graphs.

Sigma · May 9, 2024, 12:20pm

To get a dot graph, you can also do the following:

$ nix shell nixpkgs#graphviz
$ nix build nixpkgs#hello
$ nix-store -q result --graph | dot -Tpng > output.png
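If you would rather poke at the graph than render it, the DOT text can be loaded into a plain adjacency map. The Python sketch below assumes edges appear as quoted names joined by `->`, which is my assumption about the output rather than something I have checked against every Nix version. The sample text is made up, and parse_edges and self_references are hypothetical helper names.

```python
import re
from collections import defaultdict

# Matches lines such as:  "a" -> "b" [color = "red"];
EDGE = re.compile(r'"([^"]+)"\s*->\s*"([^"]+)"')


def parse_edges(dot_text):
    """Map each edge source to the set of its targets, as written in the DOT text."""
    graph = defaultdict(set)
    for src, dst in EDGE.findall(dot_text):
        graph[src].add(dst)
    return dict(graph)


def self_references(graph):
    """Return nodes that have an edge pointing at themselves."""
    return sorted(node for node, targets in graph.items() if node in targets)


# Made-up sample standing in for the saved output of the graph query.
sample = '''
digraph G {
"aaa-glibc" -> "bbb-hello" [color = "red"];
"aaa-glibc" -> "aaa-glibc" [color = "blue"];
"aaa-glibc" [label = "glibc", shape = box];
}
'''

graph = parse_edges(sample)
loops = self_references(graph)
```

To try it on real data, save the output of the nix-store query to a file instead of piping it into dot, and pass the file contents to parse_edges.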
