Arguing about source inputs
Andrius Štikonas
andrius at stikonas.eu
Mon Mar 31 01:05:57 UTC 2025
Hi,
In general that is good advice, but it comes at a huge packaging and
maintenance cost.
We try to do this in the live-bootstrap project, and we even have various
scripts to help us find pre-generated files:
https://github.com/fosslinux/problematic-source
But this often results in huge cleanup functions like:
https://github.com/fosslinux/live-bootstrap/blob/2057d551e0d072f85dd3c8b046e90e6be81a3604/steps/gcc-13.3.0/pass1.sh#L5
Or Python suddenly needs a long chain in order to be built:
python 2.0.1 -> python 2.3.7 -> python 2.5.6 -> python 3.1.5 (2 passes) ->
python 3.3.7 -> python 3.4.10 -> python 3.8.16 -> python 3.11.1.
Compare this with most distros, which just go straight for the latest Python.
There is a similar but even worse chain for Autotools itself; see some entries
here:
https://github.com/fosslinux/live-bootstrap/blob/master/parts.rst#62autoconf-252
Or even worse in the case of GNU Autogen
(https://github.com/fosslinux/live-bootstrap/blob/master/steps/autogen-5.18.16/pass1.sh),
where we had to write some extra code
(https://github.com/schierlm/gnu-autogen-bootstrapping/).
There is a very similar story with GNU Guile and its pp-syntax bootstrapping,
or Bison bootstrapping... And the maintainers of these projects are not
interested in integrating these bootstrapping solutions (often they don't even
understand that there is a problem).
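
To give a rough idea of what hunting for pre-generated files looks like, here
is a minimal sketch. It is not how problematic-source works; the function name
find_pregenerated is made up for illustration, and it only looks for the banner
strings that Autoconf, Automake and Bison write into their output. It assumes a
grep that supports -r (GNU, BSD and BusyBox grep do). Real trees need many more
heuristics than this.

```shell
#!/bin/sh
# Hypothetical helper (not part of live-bootstrap or problematic-source):
# list files under a source tree that carry a well-known generator banner.
find_pregenerated() {
    tree="${1:-.}"
    # -r recurses into the tree, -l prints only the names of matching files.
    # Each -e adds one banner string to search for.
    grep -rl \
        -e 'Generated by GNU Autoconf' \
        -e 'generated by automake' \
        -e 'made by GNU Bison' \
        "$tree" | sort
}

# Usage (path is an example): find_pregenerated ./gcc-13.3.0
```

Every file this prints then has to be either deleted and regenerated during the
build, or justified, which is where the long cleanup functions come from.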
Kind regards,
Andrius
On Sunday, 30 March 2025 at 22:51:23 British Summer Time, Simon Josefsson
via rb-general wrote:
> "David A. Wheeler via rb-general"
> <rb-general at lists.reproducible-builds.org> writes:
> > Based on the xz experience, the OpenSSF
> > "Concise Guide for Developing More Secure Software"
> > <https://best.openssf.org/Concise-Guide-for-Developing-More-Secure-Software>
> > added the following point:
> > "If a source code (unbuilt) package is released, it should only
> > include content from the version control system (VCS), and source
> > package users should rebuild, if needed, to create production (built)
> > package(s). E.g., if autotools is used, if a source package is
> > released it should not include a generated configure file, while
> > recipients should ignore pre-generated files like configure and
> > instead rebuild from source (e.g., with autoreconf). This eliminates a
> > malware-hiding mechanism, as illustrated by an attack on xz utils."
>
> That is great advice! I wish that this was more widely adopted.
>
> I have suggested that maintainers essentially do the following when they
> release an autotools-based project:
>
> git archive --prefix=libntlm-v1.8/ -o libntlm-1.8-src.tar.gz HEAD
> gpg -b libntlm-1.8-src.tar.gz
>
> With the git repository only holding configure.ac and not configure etc.
>
> Many distributions, Debian included, work with source tarballs that were
> generated by "make dist" and contain generated vendored files like
> ./configure, ./Makefile.in etc. This used to be acceptable because
> there were only a few such files, and they were easy to audit. However, xz
> showed us that this is a bad idea. With wide use of Gnulib and other vendor
> libraries, this no longer scales.
>
> Distributions should stop building from "make dist" tarballs!
>
> Most normal users are helped by the vendored files included in "make
> dist" tarballs, and they are useful for bootstrapping purposes. However
> I believe they are a bad idea for most distributions. Over the years,
> people slowly realized this and started to introduce broken workarounds
> like running 'autoreconf -fi' within the tarballs, something that was
> never intended for this use and that doesn't even do what people expect
> it to do.
>
> I've recently released GNU libtasn1, libidn, libidn2, inetutils, and
> gsasl, all of which were done using the recipe above. When converting the
> Debian packaging for these packages from "make dist" to "git-archive"
> style tarballs, several forgotten build dependencies were discovered
> (including gperf and gengetopt), indicating that the packages never really
> built things from source before. I think this is a widespread problem.
>
> I've written/talked a bit about this:
>
> https://blog.josefsson.org/2024/04/01/towards-reproducible-minimal-source-code-tarballs-please-welcome-src-tar-gz/
>
> https://blog.josefsson.org/2024/04/13/reproducible-and-minimal-source-only-tarballs/
>
> https://debconf24.debconf.org/talks/126-de-vendor-origtargz-gnulib-and-more/
>
> One advantage with "git archive" releases is that they are easy for
> anyone to reproduce, which avoids the boring work involved in making "make
> dist" tarballs reproducible:
>
> https://blog.josefsson.org/2025/03/24/reproducible-software-releases/
>
> What could possibly be missing is guidelines on what to store in the source
> control repository: if you put some generated vendored file in git (or
> some non-free firmware blob), we are back where we started.
>
> /Simon
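
As an illustration of the "git archive" recipe quoted above, the following is a
minimal sketch of how a recipient might check that a published tarball can be
regenerated from the repository. The function name verify_git_archive and its
argument order are made up for this example. Byte-for-byte equality is an
assumption that only holds when both sides use compatible git and gzip
versions, so a mismatch does not by itself prove tampering.

```shell
#!/bin/sh
# Hypothetical helper: rebuild a "git archive" tarball from a ref and compare
# it byte for byte with a published tarball.
# Returns 0 on match, 1 on mismatch, 2 if the archive could not be created.
verify_git_archive() {
    repo="$1"       # path to the git checkout
    ref="$2"        # tag or commit the release was cut from
    prefix="$3"     # directory prefix used in the tarball, e.g. libntlm-v1.8/
    published="$4"  # path to the published .tar.gz

    rebuilt=$(mktemp) || return 2
    # --format is given explicitly because the temp file has no .tar.gz suffix.
    if ! git -C "$repo" archive --format=tar.gz --prefix="$prefix" \
            -o "$rebuilt" "$ref"; then
        rm -f "$rebuilt"
        return 2
    fi

    # cmp -s is silent and only reports through its exit status.
    if cmp -s "$rebuilt" "$published"; then
        status=0
    else
        status=1
    fi
    rm -f "$rebuilt"
    return "$status"
}

# Usage (names taken from the quoted recipe):
#   verify_git_archive . HEAD libntlm-v1.8/ libntlm-1.8-src.tar.gz
```

The detached signature from "gpg -b" would be checked separately with
"gpg --verify".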