Reproducible XFS Filesystems Builds for VMs
David A. Wheeler
dwheeler at dwheeler.com
Tue Apr 15 14:54:32 UTC 2025
May I suggest collecting this general advice for creating reproducible *images*
and posting it on the reproducible builds website?
Once there's a common spot for it, others may provide suggestions on how to
improve it further. It's fine if you want to collect ideas into the best first try,
but I'd hate it if this mailing list was the only place to find this out.
Few people will create reproducible images - or ask for them - if it's a mystery
how to create them.
Thanks!
--- David A. Wheeler
> On Apr 14, 2025, at 12:49 PM, Luca DiMaio via rb-general <rb-general at lists.reproducible-builds.org> wrote:
>
> On Sat, Apr 12, 2025 at 2:37 PM Jelle van der Waa <jelle at vdwaa.nl> wrote:
>>
>> For XFS they seem to use protofiles to directly create a filesystem with
>> data on it. [1] [2] [3]
>> So maybe that helps? A common source of reproducibility issues is
>> mounting the filesystem, which creates a bunch of irreproducible
>> metadata that is hard to get rid of.
>
> Thanks for this pointer!
> Indeed using prototype files works!
>
> Together with using LD_PRELOAD to override gettime and getrandom:
>
> ```
> ~$ tar --sort=name --warning=no-timestamp --xattrs \
> --xattrs-include='*' -xpf rootfs.tar.gz --numeric-owner -C rootfs/
> ~$ xfs_protofile rootfs > rootfs.protofile
> ~$ export LD_PRELOAD="./deterministic_rng.so /usr/lib/faketime/libfaketime.so.1"
> ~$ mkfs.xfs \
> -b size=4096 \
> -d agcount=4 \
> -d noalign \
> -i attr=2 \
> -i projid32bit=1 \
> -i size=512 \
> -l size=67108864 \
> -l su=4096 \
> -l version=2 \
> -m crc=1 \
> -m finobt=1 \
> -m uuid=$ROOTFS_UUID \
> -n size=16384 \
> -p rootfs.protofile \
> -n version=2 disk1.img
>
> ~$ mkfs.xfs \
> -b size=4096 \
> -d agcount=4 \
> -d noalign \
> -i attr=2 \
> -i projid32bit=1 \
> -i size=512 \
> -l size=67108864 \
> -l su=4096 \
> -l version=2 \
> -m crc=1 \
> -m finobt=1 \
> -m uuid=$ROOTFS_UUID \
> -n size=16384 \
> -p rootfs.protofile \
> -n version=2 disk2.img
>
> ~$ md5sum disk*
> dd06b8c8fe79e979d961291a4f78b72e disk1.img
> dd06b8c8fe79e979d961291a4f78b72e disk2.img
> ```
>
> This is a huge step ahead, but there seems to be an underlying bug or
> unsupported feature: right now we lose extended attributes, as well as
> file and directory timestamps, which are reset to the time at which the
> command is launched (libfaketime pins that time, but the original
> timestamps are still not preserved).
>
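One way to pin the clock that libfaketime reports is to set `FAKETIME` to an absolute time before running mkfs.xfs. The sketch below assumes GNU `date` and assumes you want to derive the value from `SOURCE_DATE_EPOCH`; `faketime_from_epoch` is a hypothetical helper name. It only fixes the time seen by the command and does not restore the per-file timestamps from the tarball.

```shell
#!/bin/sh
# Hypothetical helper: format an epoch value as "YYYY-MM-DD hh:mm:ss",
# the absolute-time form libfaketime reads from FAKETIME.
faketime_from_epoch() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'   # GNU date syntax
}

# Assumption: fall back to epoch 0 when SOURCE_DATE_EPOCH is unset.
FAKETIME=$(faketime_from_epoch "${SOURCE_DATE_EPOCH:-0}")
# The string is formatted in UTC, so have it interpreted in UTC as well.
TZ=UTC
export FAKETIME TZ
```

With these exported alongside the LD_PRELOAD setting, every run should see the same frozen time regardless of when it is launched.
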
> Thanks for the pointer again :)
>
> On Sat, Apr 12, 2025 at 6:17 PM Simon Josefsson via rb-general
> <rb-general at lists.reproducible-builds.org> wrote:
>>
>> Jelle van der Waa <jelle at vdwaa.nl> writes:
>>
>>> But funnily enough NO_COLOR sort of is. [2]
>>> [2] https://no-color.org/
>>
>> That doesn't seem like a build variable, does it? It seems more similar
>> to the POSIXLY_CORRECT, GZIP, TAR_OPTIONS etc variables. Compare with
>> the (excellent!) specification for SOURCE_DATE_EPOCH that in my reading
>> makes it clear that it is intended to be used during package builds:
>>
>> https://reproducible-builds.org/specs/source-date-epoch/
>> https://reproducible-builds.org/docs/source-date-epoch/
>>
>> If SOURCE_DATE_EPOCH is to be a runtime exposed environment variable
>> that end-users are supposed to use to influence behaviour of help2man,
>> mkfs etc, I think the specification doesn't make sense and would need
>> some additional discussion.
>>
>>> But I am not going to fall into the trap of discussing whether this is
>>> wanted or not :)
>>
>> :)
>>
>> /Simon