tests.reproducible-builds.org: build logs sent as gzip but no content-encoding, but only sometimes?

наб nabijaczleweli at nabijaczleweli.xyz
Wed Nov 15 21:42:25 UTC 2023


Hi!

On Wed, Nov 15, 2023 at 03:38:29PM +0000, Holger Levsen wrote:
> On Wed, Nov 15, 2023 at 03:23:59PM +0100, наб wrote:
> > So I'm assuming if you made it not do that then you'd both
> > save on load and have a wider compatibility range?
> seems reasonable, though I have to admit this won't get on my
> todo list anytime soon, so if you could provide a patch against
> https://salsa.debian.org/qa/jenkins.debian.net/-/blob/master/hosts/jenkins/etc/apache2/sites-available/jenkins.debian.net.conf
> that would be very much appreciated.

Okay, I can reproduce this behaviour
(though, in apache's usual hateful manner, not on small files,
 and I started with a compressed "hello")
in a sid chroot (with
http://snapshot.debian.org/archive/debian/20230414T144855Z/pool/main/a/apache2/apache2_2.4.57-2_amd64.deb
http://snapshot.debian.org/archive/debian/20230414T144855Z/pool/main/a/apache2/apache2-bin_2.4.57-2_amd64.deb
http://snapshot.debian.org/archive/debian-ports/20230414T140955Z/pool/main/a/apache2/apache2-data_2.4.57-2_all.deb
http://snapshot.debian.org/archive/debian/20230414T144855Z/pool/main/a/apache2/apache2-utils_2.4.57-2_amd64.deb
and /etc/apache2/mods-enabled# ln -s ../mods-available/*.load .)
having reduced the config you linked to something that ought to be equivalent:
  <Macro common-directives $name>
  	ServerName $name
  	ServerAdmin root
  
  	<Directory />
  		Options Indexes FollowSymLinks MultiViews
  		AllowOverride None
  		Require all granted
  		AddType text/plain .log
  	</Directory>
  
  	<FilesMatch "\.gz$">
  		Header append Content-Encoding gzip
  		# this causes errors 406 to clients connecting without Accept-Encoding=gzip.
  		#AddEncoding gzip .gz
  		ForceType text/plain
  	</FilesMatch>
  
  	RewriteEngine on
  	ProxyRequests Off
  
  	ErrorLog ${APACHE_LOG_DIR}/error.log
  	# Possible values include: debug, info, notice, warn, error, crit,
  	# alert, emerg.
  	LogLevel warn
  	CustomLog ${APACHE_LOG_DIR}/access.log combined
  </Macro>
  
  <Macro r-b-artifacts $base>
  	<Directory /srv/$base>
  		HeaderName .HEADER.html
  	</Directory>
  </Macro>
  
  <VirtualHost *:8000>
  	Use common-directives tests.reproducible-builds.org
  
  	DocumentRoot /srv
  	AddDefaultCharset utf-8
  
  	Use r-b-artifacts debian
  	Use r-b-artifacts debian/live_build
  
  	# for watching service logfiles
  	ScriptAlias /cgi-bin /srv/jenkins/bin/cgi-bin
  	<Directory "/srv/jenkins/bin/cgi-bin">
  	    AllowOverride None
  	    Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
  	    Require all granted
  	</Directory>
  
  	<Proxy *>
  		Require all granted
  	</Proxy>
  </VirtualHost>
then
  /srv/debian# wget https://tests.reproducible-builds.org/debian/rbuild/unstable/amd64/systemd-cron_2.3.0-1.rbuild.log.gz
and
  $ curl -v http://tarta:8000/debian/systemd-cron_2.3.0-1.rbuild.log.gz
  < ETag: "7a1-60a37723b3eb9"
  < Content-Encoding: gzip
  $ curl -v http://tarta:8000/debian/systemd-cron_2.3.0-1.rbuild.log.gz --compressed
  < ETag: "7a1-60a37723b3eb9-gzip"
  < Content-Encoding: gzip, gzip
&c.
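
To spell out what a client has to do with that second response: Content-Encoding is a comma-separated list of codings in the order they were applied, so the body has to be unwrapped right to left. A minimal Python sketch (mine, not anything from the r-b setup; decode_body and naive_decode are hypothetical names, and only gzip is handled):

```python
import gzip


def decode_body(body: bytes, content_encoding: str) -> bytes:
    """Undo every coding listed in Content-Encoding, last-applied first."""
    codings = [t.strip().lower() for t in content_encoding.split(",") if t.strip()]
    for coding in reversed(codings):
        if coding in ("gzip", "x-gzip"):
            body = gzip.decompress(body)
        elif coding == "identity":
            continue
        else:
            # This sketch deliberately knows nothing but gzip.
            raise ValueError(f"unsupported content coding: {coding!r}")
    return body


def naive_decode(body: bytes, content_encoding: str) -> bytes:
    """What a client does if it compares the whole header against 'gzip'."""
    if content_encoding == "gzip":
        return gzip.decompress(body)
    return body
```

With the "gzip, gzip" response, decode_body gets back to the plain-text log, while naive_decode hands the doubly-compressed bytes through untouched.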

However, I cannot reproduce the
  # this causes errors 406 to clients connecting without Accept-Encoding=gzip.
bit. Reducing that section to just
  <FilesMatch "\.gz$">
    AddEncoding gzip .gz
    ForceType text/plain
  </FilesMatch>
yields
  $ curl -sv http://tarta:8000/debian/systemd-cron_2.3.0-1.rbuild.log.gz  | gzip -d > /dev/null
  *   Trying 127.0.1.1:8000...
  * Connected to tarta (127.0.1.1) port 8000 (#0)
  > GET /debian/systemd-cron_2.3.0-1.rbuild.log.gz HTTP/1.1
  > Host: tarta:8000
  > User-Agent: curl/7.88.1
  > Accept: */*
  >
  < HTTP/1.1 200 OK
  < Date: Wed, 15 Nov 2023 21:24:51 GMT
  < Server: Apache/2.4.57 (Debian)
  < Last-Modified: Wed, 15 Nov 2023 21:18:36 GMT
  < ETag: "7a1-60a37723b3eb9"
  < Accept-Ranges: bytes
  < Content-Length: 1953
  < Content-Type: text/plain; charset=utf-8
  < Content-Encoding: gzip
  <
  { [1953 bytes data]
  * Connection #0 to host tarta left intact
and
  $ curl -H 'accept-encoding: gzip' -sv http://tarta:8000/debian/systemd-cron_2.3.0-1.rbuild.log.gz  | gzip -d > /dev/null
  *   Trying 127.0.1.1:8000...
  * Connected to tarta (127.0.1.1) port 8000 (#0)
  > GET /debian/systemd-cron_2.3.0-1.rbuild.log.gz HTTP/1.1
  > Host: tarta:8000
  > User-Agent: curl/7.88.1
  > Accept: */*
  > accept-encoding: gzip
  >
  < HTTP/1.1 200 OK
  < Date: Wed, 15 Nov 2023 21:25:06 GMT
  < Server: Apache/2.4.57 (Debian)
  < Last-Modified: Wed, 15 Nov 2023 21:18:36 GMT
  < ETag: "7a1-60a37723b3eb9"
  < Accept-Ranges: bytes
  < Content-Length: 1953
  < Content-Type: text/plain; charset=utf-8
  < Content-Encoding: gzip
  <
  { [1953 bytes data]
  * Connection #0 to host tarta left intact
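
For scripting the same two requests without curl, a rough Python sketch follows. One trap: http.client's request() adds "Accept-Encoding: identity" on its own, so this goes through putrequest(..., skip_accept_encoding=True) to really send no such header in the first case. probe is a hypothetical helper; host, port and path are whatever you're testing against.

```python
import http.client


def probe(host, port, path, accept_encoding=None):
    """GET path and return (status, Content-Encoding header, body length).

    With accept_encoding=None no Accept-Encoding header is sent at all,
    matching plain curl; otherwise the given value is sent verbatim.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        # skip_accept_encoding stops http.client adding "identity" itself.
        conn.putrequest("GET", path, skip_accept_encoding=True)
        if accept_encoding is not None:
            conn.putheader("Accept-Encoding", accept_encoding)
        conn.endheaders()
        resp = conn.getresponse()
        body = resp.read()  # raw bytes, never transparently decompressed
        return resp.status, resp.getheader("Content-Encoding"), len(body)
    finally:
        conn.close()
```

Against the setup above, probe("tarta", 8000, path) and probe("tarta", 8000, path, "gzip") would be expected to return the same tuple once AddEncoding is in place; a 406 would show up as the status.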
Was it just an issue with some old apache? git blame points to
  $ git show 8319ffcadc
  commit 8319ffcadcc17084fda7a6318f9f92a3c53414da
  Author: Mattia Rizzolo <mattia at debian.org>
  Date:   Tue May 29 15:20:03 2018 +0200
  
      apache: re-add Content-Encoding=gzip to .gz files
  
      this will make them displayed in browsers, etc
      Also, drop the auto-inflate thing, that causes apt and clients that
      actually expects compressed content to blow up (and we really shouldn't
      decompress silently anyway: if we say we are encoding in gzip we should
      serve stuff encoded with gz.
  
      AddEncoding seems to do something more than just add the
      Content-Encoding header, as client not specifing Accept-Encoding=gzip
      would fail with 406.  Let's just add the single header manually for now.
  
      Signed-off-by: Mattia Rizzolo <mattia at debian.org>
  
  diff --git a/hosts/jenkins/etc/apache2/sites-available/jenkins.debian.net.conf b/hosts/jenkins/etc/apache2/sites-available/jenkins.debian.net.conf
  index b66e91661..6853d30f3 100644
  --- a/hosts/jenkins/etc/apache2/sites-available/jenkins.debian.net.conf
  +++ b/hosts/jenkins/etc/apache2/sites-available/jenkins.debian.net.conf
  @@ -41,13 +41,12 @@
                  Require all granted
          </Directory>
  
  -       #<FilesMatch "(?!Packages.gz)\.gz$">
  -       #       AddEncoding gzip .gz
  -       #       ForceType text/plain
  -       #       FilterDeclare gzipInflate CONTENT_SET
  -       #       FilterProvider gzipInflate inflate "%{req:Accept-Encoding} !~ /gzip/"
  -       #       FilterChain +gzipInflate
  -       #</FilesMatch>
  +       <FilesMatch "\.gz$">
  +               Header append Content-Encoding gzip
  +               # this causes errors 406 to client connecting without Accept-Encoding=gzip.
  +               #AddEncoding gzip .gz
  +               ForceType text/plain
  +       </FilesMatch>
  
          RewriteEngine on
          ProxyRequests Off
2018 is squarely in the stretch era,
so maybe apache's improved in the meantime.
Pigs flying, &c., &c.

Best,
-- >8 --
Subject: [PATCH] jenkins.d.n: AddEncoding gzip .gz, such that the files aren't
 re-compressed as gzip, which some clients can't deal with, and it's a waste
 of time

The current approach, under bookworm apache 2.4.57-2, returns:
  (no accept-encoding)    -> content-encoding: gzip       + raw gzipped file
  (accept-encoding: gzip) -> content-encoding: gzip, gzip + gzip(raw gzipped file)

This is valid, but on systems with ESET, or under lynx, opening
  https://tests.reproducible-builds.org/debian/rbuild/unstable/amd64/systemd-cron_2.3.0-1.rbuild.log.gz
would just return a garbled result: they match content-encoding
directly, instead of tokenising it.

The scary comment doesn't seem to apply to apache 2.4.57-2.
---
 .../etc/apache2/sites-available/jenkins.debian.net.conf       | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hosts/jenkins/etc/apache2/sites-available/jenkins.debian.net.conf b/hosts/jenkins/etc/apache2/sites-available/jenkins.debian.net.conf
index 289c6240c..5e40b9aaa 100644
--- a/hosts/jenkins/etc/apache2/sites-available/jenkins.debian.net.conf
+++ b/hosts/jenkins/etc/apache2/sites-available/jenkins.debian.net.conf
@@ -87,9 +87,7 @@
 	</Directory>
 
 	<FilesMatch "\.gz$">
-		Header append Content-Encoding gzip
-		# this causes errors 406 to clients connecting without Accept-Encoding=gzip.
-		#AddEncoding gzip .gz
+		AddEncoding gzip .gz
 		ForceType text/plain
 	</FilesMatch>
 
-- 
2.39.2