Reproducible XFS Filesystem Builds for VMs

Luca DiMaio luca.dimaio at chainguard.dev
Mon Apr 14 16:49:23 UTC 2025


On Sat, Apr 12, 2025 at 2:37 PM Jelle van der Waa <jelle at vdwaa.nl> wrote:
>
> For XFS they seem to use protofiles to directly create a filesystem with
> data on it. [1] [2] [3]
> So maybe that helps? A common source of reproducibility issues is
> mounting the filesystem, which creates a bunch of irreproducible
> metadata that is hard to get rid of.

Thanks for this pointer!
Indeed, using prototype files works, when combined with LD_PRELOAD to
override gettime and getrandom:

```
~$ tar --sort=name --warning=no-timestamp --xattrs \
--xattrs-include='*' -xpf rootfs.tar.gz --numeric-owner -C rootfs/
~$ xfs_protofile rootfs > rootfs.protofile
~$ export LD_PRELOAD="./deterministic_rng.so /usr/lib/faketime/libfaketime.so.1"
~$ mkfs.xfs \
-b size=4096 \
-d agcount=4 \
-d noalign \
-i attr=2 \
-i projid32bit=1 \
-i size=512 \
-l size=67108864 \
-l su=4096 \
-l version=2 \
-m crc=1 \
-m finobt=1 \
-m uuid=$ROOTFS_UUID \
-n size=16384 \
-p rootfs.protofile \
-n version=2 disk1.img

~$ mkfs.xfs \
-b size=4096 \
-d agcount=4 \
-d noalign \
-i attr=2 \
-i projid32bit=1 \
-i size=512 \
-l size=67108864 \
-l su=4096 \
-l version=2 \
-m crc=1 \
-m finobt=1 \
-m uuid=$ROOTFS_UUID \
-n size=16384 \
-p rootfs.protofile \
-n version=2 disk2.img

~$ md5sum disk*
dd06b8c8fe79e979d961291a4f78b72e  disk1.img
dd06b8c8fe79e979d961291a4f78b72e  disk2.img
```
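
A minimal sketch of how the comparison step could be scripted, for anyone
who wants more than matching checksums. The function name compare_images is
made up for illustration; it relies only on cmp and sha256sum. When the
images differ, cmp also reports the offset of the first differing byte,
which is a useful starting point for debugging.

```sh
# compare_images A B: report whether two image files are bit-for-bit identical.
# (Hypothetical helper, not part of xfsprogs.)
compare_images() {
    if cmp -s "$1" "$2"; then
        # Identical content: print a single digest for the record.
        printf 'identical: %s\n' "$(sha256sum "$1" | cut -d' ' -f1)"
    else
        # cmp prints the byte offset and line of the first mismatch.
        cmp "$1" "$2" || true
        return 1
    fi
}
```

With the two images built above, this would be called as
`compare_images disk1.img disk2.img`.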

This is a huge step forward, but there still seems to be an underlying
bug or unsupported feature: right now we lose extended attributes, and
the timestamps of both files and directories are reset to the time at
which the command is launched. libfaketime makes those timestamps
deterministic, but the original ones are still not preserved.
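
One way to see exactly which metadata is lost is to dump a manifest of the
source tree and of the mounted image, then diff the two. The sketch below
assumes GNU find (for -printf); tree_manifest is a made-up helper name, and
it only covers mode, owner, and mtime, not extended attributes.

```sh
# tree_manifest DIR: print "path<TAB>mode<TAB>uid<TAB>gid<TAB>mtime" for every
# entry under DIR, sorted by path, so that two trees can be compared with diff.
# (Hypothetical helper; requires GNU find for -printf.)
tree_manifest() {
    ( cd "$1" && find . -printf '%P\t%m\t%U\t%G\t%T@\n' | LC_ALL=C sort )
}
```

For example, `tree_manifest rootfs > expected.txt`, then mount the image
read-only somewhere (say /mnt, which needs root) and run
`tree_manifest /mnt | diff expected.txt -`. Extended attributes could be
compared in a similar way with `getfattr -R -d -m - .`, if the attr tools
are installed.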

Thanks for the pointer again :)

On Sat, Apr 12, 2025 at 6:17 PM Simon Josefsson via rb-general
<rb-general at lists.reproducible-builds.org> wrote:
>
> Jelle van der Waa <jelle at vdwaa.nl> writes:
>
> > But funnily enough NO_COLOR sort of is. [2]
> > [2] https://no-color.org/
>
> That doesn't seem like a build variable, does it?  It seems more similar
> to the POSIXLY_CORRECT, GZIP, TAR_OPTIONS etc variables.  Compare with
> the (excellent!) specification for SOURCE_DATE_EPOCH that in my reading
> makes it clear that it is intended to be used during package builds:
>
> https://reproducible-builds.org/specs/source-date-epoch/
> https://reproducible-builds.org/docs/source-date-epoch/
>
> If SOURCE_DATE_EPOCH is to be a runtime exposed environment variable
> that end-users are supposed to use to influence behaviour of help2man,
> mkfs etc, I think the specification doesn't make sense and would need
> some additional discussion.
>
> > But I am not going to fall into the trap of discussion if this is
> > wanted or not :)
>
> :)
>
> /Simon

