Reproducible Arch Linux (August 2023)

kpcyrd kpcyrd at archlinux.org
Fri Aug 25 04:56:36 UTC 2023


hello,

during my regular packaging work for Arch Linux I noticed the following
developments since I last wrote about this outside of IRC (more than a
year ago).

## python-* and PEP-518

The Python ecosystem deprecated `./setup.py` semi-recently and
pyproject.toml files have become more widely available. Before that,
Arch Linux packages ran `./setup.py build` during build() and
`./setup.py install` during package(). Between these two stages there's
check(), which is used for testing. If this stage imports Python files
from the build folder it may rewrite some of the .pyc files, causing the
package output to depend on whether you built it with or without
check().

With PEP-518 in Arch Linux, you would use `python -m build` to create a
wheel (essentially a .zip file) inside build() and then extract the
wheel during package() into $pkgdir with `python -m installer`. The
contents of the $pkgdir folder are then bundled up into the Arch Linux
package.

The code in check() may still rewrite some .pyc files inside the build 
folder, but we do not care about this anymore since the wheel is already 
generated at this point (and tests are very unlikely to rewrite the 
wheel by accident).
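
A minimal sketch of the three stages described above, assuming a
hypothetical package called `python-example` whose tests run under
pytest. The names, version and test runner are illustrative only and
are not taken from the commit linked below; `--wheel`, `--no-isolation`
and `--destdir` are options of the `build` and `installer` tools:

```sh
# Hypothetical PKGBUILD fragment; metadata arrays (depends, source, ...) omitted.
pkgname=python-example   # assumed name, for illustration only
_name=example            # assumed upstream project name
pkgver=1.0.0             # assumed version

build() {
	cd "$srcdir/$_name-$pkgver"
	# Create the wheel in dist/ before any tests run.
	python -m build --wheel --no-isolation
}

check() {
	cd "$srcdir/$_name-$pkgver"
	# Tests may import modules and rewrite .pyc files in the source tree,
	# but the wheel in dist/ has already been written at this point.
	python -m pytest
}

package() {
	cd "$srcdir/$_name-$pkgver"
	# Unpack the previously built wheel into $pkgdir.
	python -m installer --destdir="$pkgdir" dist/*.whl
}
```

In this layout package() only reads the wheel from dist/, so whatever
check() does to the build folder should not end up in the package.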

Changing to PEP-518 looks like this:

https://gitlab.archlinux.org/archlinux/packaging/packages/python-bcrypt/-/commit/162ef0f6cf46001c0a0b6632ab32a75d0f0a841b

The .pyc rewriting described above indeed seems to be the common
denominator for the python-* packages that are currently not
reproducible in Arch Linux (about 176).

It looks like this in diffoscope:

│ │ │ -   111         664 LOAD_CONST              55 (<code object <lambda>, file "/build/python-cligj/src/cligj-0.7.2/build/lib/cligj/__init__.py", line 111>)
│ │ │ +   111         664 LOAD_CONST              55 (<code object <lambda>, file "/usr/lib/python3.11/site-packages/cligj/__init__.py", line 111>)

## Deterministic debuglink modifications

This is a second, separate issue that affects ELF files in Arch Linux.
Technically it is a file-order issue (the order in which debug
information is removed from the executables).

This issue was reported by Nspace on IRC in #archlinux-reproducible, but
nobody has had time to fix it yet (I'm also not sure it was documented
anywhere outside of IRC and Signal DMs).

To summarize quickly, there's a function called `tidy_strip()` which is
executed if the `strip` option is enabled in the Arch Linux build system
(which it is by default). This function 1) removes any debug information
from the binary, 2) writes it into a separate file that is then
distributed in the -debug package, and 3) adds a debuglink entry to the
binary to inform a debugger that detached debug information is
available.

It seems the order in which the files are processed has an impact on the
resulting ELF binary.

```
find . -type f -perm -u+w -print0 2>/dev/null | while IFS= read -rd '' binary ; do
	[...]
	strip_file "$binary" ${strip_flags}
	(( STRIPLTO )) && strip_lto "$binary"
done
```

Usually the rebuilder manages to reproduce the package after 1-2 
attempts if the ordering happens to align with the original build.

Copying from irc logs:

2023-05-22 00:19:39 Nspace  The problem (I think) is that 
/usr/lib/getconf/POSIX_V7_LP64_OFF64 is a hard link to /usr/bin/getconf, 
and when makepkg strips the binaries and writes the debuglink section 
(https://gitlab.archlinux.org/pacman/pacman/-/blob/master/scripts/libmakepkg/tidy/strip.sh.in#L96) 
it processes the files in the order that find returns them. So if find 
returns POSIX_V7_LP64_OFF64 first, it will write 
POSIX_V7_LP64_OFF64.debug to .gnu_debuglink, and vice
2023-05-22 00:19:39 Nspace   versa. And since the order in which find 
traverses the filesystem is unpredictable, it can give different results 
on each run.
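
To illustrate the ordering aspect, here is a rough sketch of one
conceivable mitigation: sorting the `find` output before the strip loop
consumes it. This is only an illustration and not taken from pacman;
`sorted_strip_candidates` is a hypothetical helper name, and the scratch
directory merely imitates the getconf hard link mentioned in the log:

```sh
# Hypothetical helper (not part of makepkg): emit the same NUL-delimited
# file list as the find call in the loop above, but byte-wise sorted so
# the order does not depend on how the filesystem returns entries.
sorted_strip_candidates() {
	find "$1" -type f -perm -u+w -print0 2>/dev/null | LC_ALL=C sort -z
}

# Demo in a scratch directory: two names for one hard-linked file.
demo=$(mktemp -d)
mkdir -p "$demo/usr/bin" "$demo/usr/lib/getconf"
echo placeholder > "$demo/usr/bin/getconf"
ln "$demo/usr/bin/getconf" "$demo/usr/lib/getconf/POSIX_V7_LP64_OFF64"

# Print the order in which a strip loop would visit the files.
order=$(cd "$demo" && sorted_strip_candidates . | tr '\0' '\n')
printf '%s\n' "$order"
rm -rf "$demo"
```

The sorted stream could be piped into the `while IFS= read -rd ''
binary` loop in place of the bare `find`. Whether a stable order alone
is enough to make the debuglink deterministic for hard-linked binaries
is not something this sketch establishes.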

---

I've signed up for the Reproducible Builds Summit in Hamburg 2023 and 
would be interested in an attempt to get the 2nd issue resolved (before 
or during the summit).

The PEP-518 approach is more labour-intensive and gives an estimated
improvement of one percentage point, from 86% to 87% reproducible. For
the second issue I'm not sure and can't give an estimate.

cheers,
kpcyrd


More information about the rb-general mailing list