Paper on reproducible Docker images: "Docker Does Not Guarantee Reproducibility"

kpcyrd kpcyrd at archlinux.org
Sun Jan 25 19:48:30 UTC 2026


Dear list,

I found this in my news feed and wanted to share:

- https://arxiv.org/pdf/2601.12811
- https://dl.acm.org/doi/10.1145/3736731.3746146

For people reading along who are not super familiar with the topic, note there's 
a distinction between "Docker image" and "Dockerfile":

- the Docker image is the compiled artifact
- the Dockerfile is a file with build instructions

The Docker image is what you get out of `docker build`, but since it is 
essentially just a tar file, you could also use something like apko[0] to 
generate one. From what I understand, this is a fairly straightforward way to 
repack your binary without having to deal with namespaces, kernel 
capabilities and base images.

At that point you only need to worry about reproducible builds for your executable.

[0]: https://github.com/chainguard-dev/apko
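
To illustrate the "just a tar file" point, here is a minimal Python sketch of 
packing files into a tar archive with normalized metadata, so that the same 
inputs give the same bytes. This is not how apko or `docker build` work 
internally; the function names and file paths are made up for the example, and 
a real image also needs a config and a manifest (and layers are usually 
compressed).

```python
import hashlib
import io
import tarfile


def build_layer(files: dict[str, bytes]) -> bytes:
    """Pack files into an uncompressed tar with normalized metadata.

    Hypothetical helper for illustration only.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        # Sort names so the entry order does not depend on dict insertion order.
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = 0              # fixed timestamp instead of build time
            info.mode = 0o755
            info.uid = info.gid = 0     # no builder-specific ownership
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def layer_digest(layer: bytes) -> str:
    """Return a sha256 digest string over the exact layer bytes."""
    return "sha256:" + hashlib.sha256(layer).hexdigest()
```

Calling `build_layer` twice with the same file contents, in any insertion 
order, should give identical bytes and therefore the same digest.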

The Dockerfile is what most people use to build their containers. Notably, this 
technology doesn't have a dependency lockfile like you are used to with modern 
programming language package managers.

This is also what the paper mostly (but not exclusively) focuses on.
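
As a rough illustration of the missing lockfile, the Python sketch below lists 
`FROM` references in a Dockerfile that are not pinned with an `@sha256:` 
digest. The function name is hypothetical and the parsing is deliberately 
simplistic (no `ARG` substitution, no line continuations). Pinning the base 
image also does not pin packages installed later by `RUN` steps, so this 
covers only one source of drift.

```python
import re

# Matches "FROM [--platform=...] <ref> [AS <stage>]".
FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)(?:\s+AS\s+(\S+))?",
    re.IGNORECASE,
)


def unpinned_base_images(dockerfile_text: str) -> list[str]:
    """Return base image references that carry no sha256 digest.

    Hypothetical helper for illustration only.
    """
    stages = set()    # names of earlier build stages ("FROM ... AS name")
    unpinned = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue
        ref, alias = match.group(1), match.group(2)
        is_stage = ref.lower() in stages
        if alias:
            stages.add(alias.lower())
        # "scratch" and references to earlier stages are not registry images.
        if ref.lower() == "scratch" or is_stage:
            continue
        if "@sha256:" not in ref:
            unpinned.append(ref)
    return unpinned
```

For example, `FROM debian:bookworm` would be reported, while a reference of the 
form `debian@sha256:<digest>` would not.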

Lastly, there's also another problem[1] that I see very rarely talked about - if 
you can build your Docker image on two different computers with bit-for-bit 
identical outputs, this still does *not* mean you can independently authenticate 
the contents of a container registry.

The image is only fully "built" after it has been published to the registry, 
since the manifest file is being re-written by the registry (in an 
undefined/unspecified way). This is, in my opinion, the biggest problem in the 
Docker/container ecosystem; the other ones we can work around by switching from 
`docker build` to different tools if we have to.

[1]: https://github.com/sigstore/cosign/issues/2516 (2022)
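
To show why a re-written manifest matters, here is a small Python sketch. An 
image digest is a sha256 over the exact manifest bytes, so two serializations 
of the same JSON content give different digests. The manifest content and both 
serialization styles below are made up for the example; they are not what any 
particular registry does.

```python
import hashlib
import json


def digest(raw: bytes) -> str:
    """Return a sha256 digest string over the exact bytes given."""
    return "sha256:" + hashlib.sha256(raw).hexdigest()


# Hypothetical manifest; the layer digest is a placeholder, not a real image.
manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "layers": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar",
            "digest": "sha256:" + "0" * 64,
            "size": 1024,
        }
    ],
}

# What a local build tool might write (assumed formatting).
local_bytes = json.dumps(manifest, indent=3).encode()
# What a registry might store after re-serializing (assumed formatting).
registry_bytes = json.dumps(manifest, separators=(",", ":"), sort_keys=True).encode()

same_content = json.loads(local_bytes) == json.loads(registry_bytes)
same_digest = digest(local_bytes) == digest(registry_bytes)
```

Here `same_content` is true while `same_digest` is false: a verifier who 
rebuilt the image locally cannot match the registry's digest without knowing 
the exact bytes the registry produced.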

---

I would love to get some input on this, especially if I got anything wrong or if 
there has been progress on authenticating the content of e.g. hub.docker.com (or 
ghcr.io for that matter).

The authors of the paper are also most likely subscribed here (hi!).

Very interested,
kpcyrd

