[diffoscope] Large layers in diffoscope docker image

Greg Chabala greg.chabala at gmail.com
Sun May 24 15:28:17 UTC 2020


On Sun, May 24, 2020 at 5:09 AM Mattia Rizzolo <mattia at mapreri.org> wrote:

> On Fri, May 22, 2020 at 11:35:51AM -0500, Greg Chabala wrote:
> > Just discovered diffoscope and was giving it a try, but I was surprised to
> > see how large the docker image is:
> >
> > $ docker run --rm -t -w $(pwd) -v $(pwd):$(pwd):ro \
> >    ]  210.7MB/1.212GB
> >
> > I considered opening an issue for this, but I thought I'd ask about it
> > first. I don't think 1GB+ image layers are a good idea. Was this necessary
> > for some reason?
>
> "Unfortunately" that's not a bug.
> Diffoscope uses **many** external programs, many of which are quite
> huge.  They are all optional, but if they aren't installed diffoscope
> would only be able to output a hex dump, which is rarely useful.
>

I considered this. But in the land of Linux tools, one gigabyte is room for
many, many tools. I suspect more is being installed than is truly needed.


>
> To give you a list of used tools (from `diffoscope --list-tools`):
>     Rscript, abootimg, apksigner, apktool, bsdtar, bzip2, cbfstool,
>     cd-iccdump, cmp, compare, convert, db_dump, diff, docx2txt, dumppdf,
>     dumpxsb, enjarify, fdtdump, ffprobe, getfacl, ghc, gifbuild, gpg,
>     gzip, h5dump, identify, img2txt, isoinfo, javap, js-beautify,
>     kbxutil, lipo, llvm-bcanalyzer, llvm-dis, lsattr, lz4, msgunfmt, nm,
>     objcopy, objdump, ocamlobjinfo, odt2txt, oggDump, openssl, otool,
>     pdftotext, pedump, pgpdump, ppudump, procyon, ps2ascii, readelf,
>     rpm2cpio, showttf, sng, sqlite3, ssconvert, ssh-keygen, stat,
>     tcpdump, unsquashfs, wasm2wat, xxd, xz, zipinfo, zipnote, zstd
>
>
Any idea about which tools might be the worst citizens with regard to disk
space?
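
One way to find out, as a rough sketch: since the image is built with
apt-get, dpkg should be available inside it, and dpkg-query can report each
package's installed size in KiB. The helper name largest_packages below is
my own, not something diffoscope ships, and the numbers only count each
package's own files, not the dependencies it pulls in.

```shell
# Hypothetical helper: keep the N largest entries (default 20).
# Expects "size<TAB>name" lines on stdin, largest size printed first.
largest_packages() {
    sort -rn | head -n "${1:-20}"
}

# Intended use inside the container (Installed-Size is reported in KiB):
#   dpkg-query -W -f='${Installed-Size}\t${Package}\n' | largest_packages 10
```

Running that in the image should at least show whether a handful of
packages account for most of the space.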

When I tried building the Dockerfile myself, I ran out of disk space; I had
9GB free prior to starting. That's likely because the 'apt-get update' and
the subsequent cleanup are in different layers, so the cleanup cannot shrink
the layer that was already written.
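
To illustrate the layering point, here is a minimal sketch, not the actual
diffoscope Dockerfile. The base image (debian:stable-slim) and the three
packages are placeholders I picked for the example, and emit_dockerfile is
a made-up helper that just prints the fragment. The idea is that update,
install, and cleanup share a single RUN instruction, so the apt lists are
removed before the layer is committed.

```shell
# Hypothetical helper: print an illustrative Dockerfile fragment.
# Base image and package names are placeholders, not diffoscope's real ones.
emit_dockerfile() {
    cat <<'EOF'
FROM debian:stable-slim
RUN apt-get update \
 && apt-get install -y --no-install-recommends bzip2 sqlite3 xz-utils \
 && rm -rf /var/lib/apt/lists/*
EOF
}

# Possible use (requires docker, so it is left as a comment here):
#   emit_dockerfile > Dockerfile.sketch && docker build -f Dockerfile.sketch .
```

The --no-install-recommends flag is a separate saving: it stops apt from
pulling in recommended packages that were never asked for.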

Following my initial email I found this issue:
https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/104

and I used the improved Dockerfile in the comments to build locally. But I
was still left with a 3.5GB image.

I hope that the changes from #104 get implemented, but there's likely more
to be done. A multi-stage build
<https://docs.docker.com/develop/develop-images/multistage-build/> might
help to separate building dependencies from runtime dependencies in the
final image.