[diffoscope] Diffoscope falls back to xxd for two (seemingly) text files with identical content

Chris Lamb chris at reproducible-builds.org
Fri Apr 11 17:58:44 UTC 2025


Hi Aman,

> > https://github.com/chains-project/reproducible-central/issues/20#issuecomment-2794657117
>
> […]
>
> Reference.java: HTML document, ASCII text, with very long lines (6135)
> Rebuild.java:   Java source, ASCII text, with very long lines (6135)
>
> First thing that is strange here is "HTML" document for Reference.java. 
> That seems like a bug.

Yes, that is indeed a bug, but it is a bug in file(1) rather than in
diffoscope. Could you file a separate bug upstream? To make your
report more useful to them: the cause is the "<title>" on line 30.
Removing this line results in file(1) correctly identifying
Reference.java as "Java source, ASCII text, with very long lines (6135)".
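
For the upstream report, the reduction described above could be
scripted roughly as follows. This is only a sketch: the helper names
(drop_lines_containing, describe, compare) and the "reduced-" filename
prefix are hypothetical, and it assumes file(1) is installed and on
the PATH.

```python
import subprocess
from pathlib import Path


def drop_lines_containing(text: str, needle: str) -> str:
    """Return text with every line that contains needle removed."""
    kept = [line for line in text.splitlines(keepends=True) if needle not in line]
    return "".join(kept)


def describe(path: Path) -> str:
    """Ask file(1) for its one-line description of path (assumes file(1) is on PATH)."""
    result = subprocess.run(
        ["file", "--brief", str(path)],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def compare(original: Path, needle: str = "<title>") -> tuple[str, str]:
    """Describe the original file and a copy with the needle lines removed."""
    # The "reduced-" prefix is a hypothetical name chosen for this sketch.
    reduced = original.with_name("reduced-" + original.name)
    reduced.write_text(drop_lines_containing(original.read_text(), needle))
    return describe(original), describe(reduced)
```

Calling compare(Path("Reference.java")) would return the two
descriptions side by side, which may be a convenient thing to paste
into the report.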

> But both files are ASCII this time so diffoscope should be able to
> use diff tool, right?

As before, unfortunately, diffoscope must rely on file(1) returning
the right file type, and the part that matters is the leading
"Java source" (or "HTML document") label.
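
To illustrate why the shared "ASCII text" part does not help, here is
a small sketch that pulls out the leading label from the two
descriptions quoted above. It is not diffoscope's actual code; the
leading_label helper is hypothetical and simply takes everything
before the first comma.

```python
def leading_label(description: str) -> str:
    """Return the part of a file(1) description before the first comma."""
    return description.split(",", 1)[0].strip()


# The two descriptions reported in the quoted message.
reference = "HTML document, ASCII text, with very long lines (6135)"
rebuild = "Java source, ASCII text, with very long lines (6135)"

print(leading_label(reference))  # HTML document
print(leading_label(rebuild))    # Java source

# Both files are ASCII text, yet their leading labels disagree.
print(leading_label(reference) == leading_label(rebuild))  # False
```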


Best wishes,

-- 
      o
    ⬋   ⬊      Chris Lamb
   o     o     reproducible-builds.org 💠
    ⬊   ⬋
      o

