Arguing about source inputs

Andrius Štikonas andrius at stikonas.eu
Mon Mar 31 01:05:57 UTC 2025


Hi,

In general that is good advice, but it comes at a huge packaging and
maintenance cost.

We try to do this in the live-bootstrap project, and we even have various
scripts to help us find pre-generated files:
https://github.com/fosslinux/problematic-source
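
As a rough illustration of what such a scan can look like, here is a minimal
Python sketch that flags files whose first bytes carry a well-known
"generated by" banner. It is not the problematic-source implementation: the
marker list, the function name find_pregenerated and the head_bytes parameter
are assumptions made for this example, and real detection needs many more
heuristics.

```python
import os
import tempfile

# Illustrative banners only (assumption): not the heuristics used by
# the problematic-source scripts.
GENERATED_MARKERS = (
    b"Generated by GNU Autoconf",
    b"generated by automake",
    b"A Bison parser, made by GNU Bison",
)


def find_pregenerated(root, head_bytes=4096):
    """Return sorted relative paths whose first bytes contain a known banner."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version control metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    head = fh.read(head_bytes)
            except OSError:
                continue  # unreadable file or dangling symlink
            if any(marker in head for marker in GENERATED_MARKERS):
                hits.append(os.path.relpath(path, root))
    return sorted(hits)


def demo_find_pregenerated():
    """Build a tiny throwaway tree and scan it."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "configure.ac"), "w") as fh:
            fh.write("AC_INIT([demo], [1.0])\n")
        with open(os.path.join(root, "configure"), "w") as fh:
            fh.write("#! /bin/sh\n# Generated by GNU Autoconf 2.71 for demo 1.0.\n")
        return find_pregenerated(root)


if __name__ == "__main__":
    print(demo_find_pregenerated())
```

A banner scan like this only catches generators that announce themselves, so
it complements manual review and does not replace it.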

But this often results in huge cleanup functions like:
https://github.com/fosslinux/live-bootstrap/blob/2057d551e0d072f85dd3c8b046e90e6be81a3604/steps/gcc-13.3.0/pass1.sh#L5
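
To give a feel for what a cleanup step does, here is a small Python sketch
that deletes files matching a list of glob patterns before the build
regenerates them. It only illustrates the idea and is not a port of the
pass1.sh linked above: the pattern list and the name remove_pregenerated are
hypothetical, and every real package needs its own audited list.

```python
import tempfile
from pathlib import Path

# Hypothetical pattern list for illustration; not taken from live-bootstrap.
PREGENERATED_PATTERNS = (
    "**/configure",
    "**/Makefile.in",
    "**/aclocal.m4",
    "**/*.info",
)


def remove_pregenerated(root, patterns=PREGENERATED_PATTERNS):
    """Delete files matching the patterns; return their sorted relative paths."""
    root = Path(root)
    removed = set()
    for pattern in patterns:
        # Materialize the matches first so we never delete while iterating.
        for path in list(root.glob(pattern)):
            if path.is_file():
                path.unlink()
                removed.add(path.relative_to(root).as_posix())
    return sorted(removed)


def demo_remove_pregenerated():
    """Create a throwaway tree, clean it, and report what was removed and kept."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "doc").mkdir()
        for rel in ("configure.ac", "configure", "Makefile.am", "Makefile.in",
                    "doc/demo.texi", "doc/demo.info"):
            (root / rel).write_text("placeholder\n")
        removed = remove_pregenerated(root)
        remaining = sorted(p.relative_to(root).as_posix()
                           for p in root.rglob("*") if p.is_file())
        return removed, remaining


if __name__ == "__main__":
    print(demo_remove_pregenerated())
```

The hard part in practice is not the deletion itself but knowing the complete
list of files to delete and having the tools available to regenerate them.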

Or Python suddenly needs a long chain in order to be built:
python 2.0.1 -> python 2.3.7 -> python 2.5.6 -> python 3.1.5 (2 passes) ->
python 3.3.7 -> python 3.4.10 -> python 3.8.16 -> python 3.11.1. Compare this
with most distros, which just go straight for the latest Python. There is a
similar but even worse chain for Autotools itself; see some entries here:
https://github.com/fosslinux/live-bootstrap/blob/master/parts.rst#62autoconf-252

Or, even worse, the case of GNU Autogen
https://github.com/fosslinux/live-bootstrap/blob/master/steps/autogen-5.18.16/pass1.sh
where we had to write some extra code:
https://github.com/schierlm/gnu-autogen-bootstrapping/
(there is a very similar story with GNU Guile and its pp-syntax bootstrapping,
or Bison bootstrapping...). And the maintainers of these projects are not
interested in integrating these bootstrapping solutions (often they don't even
understand that there is a problem).

Kind regards,
Andrius

On Sunday, 30 March 2025 at 22:51:23 British Summer Time, Simon Josefsson via
rb-general wrote:
> "David A. Wheeler via rb-general"
> <rb-general at lists.reproducible-builds.org> writes:
> > Based on the xz experience, the OpenSSF
> > "Concise Guide for Developing More Secure Software"
> > <https://best.openssf.org/Concise-Guide-for-Developing-More-Secure-Software>
> > added the following point:
> > "If a source code (unbuilt) package is released, it should only
> > include content from the version control system (VCS), and source
> > package users should rebuild, if needed, to create production (built)
> > package(s). E.g., if autotools is used, if a source package is
> > released it should not include a generated configure file, while
> > recipients should ignore pre-generated files like configure and
> > instead rebuild from source (e.g., with autoreconf). This eliminates a
> > malware-hiding mechanism, as illustrated by an attack on xz utils."
> 
> That is great advice!  I wish that this was more widely adopted.
> 
> I have suggested that maintainers essentially do the following when they
> release an autotools-based project:
> 
> git archive --prefix=libntlm-v1.8/ -o libntlm-1.8-src.tar.gz HEAD
> gpg -b libntlm-1.8-src.tar.gz
> 
> With the git repository only holding configure.ac and not configure etc.
> 
> Many distributions, Debian included, work with source tarballs that were
> generated by "make dist" and contain generated vendored files like
> ./configure, ./Makefile.in etc.  This used to be acceptable because
> there were only a few such files and they were easy to audit.  However,
> xz showed us that this is a bad idea.  With wide use of Gnulib and other
> vendor libraries, this no longer scales.
> 
> Distributions should stop building from "make dist" tarballs!
> 
> Most normal users are helped by the vendored files included in "make
> dist" tarballs, and they are useful for bootstrapping purposes.  However
> I believe they are a bad idea for most distributions.  Over the years,
> people slowly realized this and started to introduce broken workarounds
> like running 'autoreconf -fi' within the tarballs, something that was
> never intended for what it is being used for, and that doesn't even do
> what people expect it to do.
> 
> I've recently released GNU libtasn1, libidn, libidn2, inetutils, and
> gsasl which all were done using the recipe above.  When converting the
> Debian packaging for these packages from "make dist" to "git-archive"
> style tarballs, several forgotten build dependencies were discovered
> (including gperf, gengetopt) indicating that the packages never really
> built things from source before.  I think this is a widespread problem.
> 
> I've written/talked a bit about this:
> 
> https://blog.josefsson.org/2024/04/01/towards-reproducible-minimal-source-code-tarballs-please-welcome-src-tar-gz/
> 
> https://blog.josefsson.org/2024/04/13/reproducible-and-minimal-source-only-tarballs/
> 
> https://debconf24.debconf.org/talks/126-de-vendor-origtargz-gnulib-and-more/
> 
> One advantage of "git archive" releases is that they are easy for
> anyone to reproduce, which avoids the boring work involved in making
> "make dist" tarballs reproducible:
> 
> https://blog.josefsson.org/2025/03/24/reproducible-software-releases/
> 
> What could possibly be missing are guidelines on what to store in the
> source control repository: if you put some generated vendored file in
> git (or some non-free firmware blob), we are back where we started.
> 
> /Simon
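
The rule that a source tarball should contain only content from version
control can be checked mechanically. The Python sketch below lists tarball
members that are absent from a given set of tracked paths. It is only an
illustration: the names untracked_members and make_tarball and the demo file
names are made up for this example, and in practice the tracked list would
come from something like `git ls-files` at the release tag.

```python
import io
import tarfile


def untracked_members(tar_bytes, prefix, tracked):
    """Return regular-file members (relative to prefix) not in the tracked set."""
    tracked = set(tracked)
    extra = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # ignore directories, symlinks, etc.
            name = member.name
            if name.startswith(prefix):
                name = name[len(prefix):]
            if name not in tracked:
                extra.append(name)
    return sorted(extra)


def make_tarball(files, prefix):
    """Build an in-memory .tar.gz from a {relative path: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(prefix + name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def demo_untracked_members():
    """Simulate a "make dist"-style tarball that ships a generated configure."""
    tracked = ["configure.ac", "Makefile.am", "src/main.c"]
    files = {name: b"placeholder\n" for name in tracked}
    files["configure"] = b"#! /bin/sh\n"  # not in version control
    tarball = make_tarball(files, "demo-1.0/")
    return untracked_members(tarball, "demo-1.0/", tracked)


if __name__ == "__main__":
    print(demo_untracked_members())
```

This only checks one direction (extra files in the tarball). It does not
compare file contents, so a tracked file that was modified in the tarball
would still pass.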
