[diffoscope] Diffoscope falls back to xxd for two (seemingly) text files with identical content

Chris Lamb chris at reproducible-builds.org
Fri Feb 21 12:21:28 UTC 2025


Hey Aman,

> This is quite strange as the diff is not clear at all. The content 
> seems to be identical.

From a quick glance, it seems that the difference is that one of them
has 0d0a (CRLF) line endings and the other has bare 0a (LF) line
endings.
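
To confirm this kind of difference without reading a hex dump, something
like the following rough Python sketch may help. The function names are
made up for illustration and are not part of diffoscope; "ref" and "reb"
are the two files under discussion.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF (0d0a) and bare LF (0a) line endings in a byte string."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract those to get bare LFs.
    lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lf}


def same_modulo_line_endings(a: bytes, b: bytes) -> bool:
    """Return True if a and b differ only in CRLF versus LF line endings."""
    return a.replace(b"\r\n", b"\n") == b.replace(b"\r\n", b"\n")


if __name__ == "__main__":
    import os

    # "ref" and "reb" are the file names from this thread; skip if absent.
    if os.path.exists("ref") and os.path.exists("reb"):
        with open("ref", "rb") as f, open("reb", "rb") as g:
            ref, reb = f.read(), g.read()
        print("ref:", count_line_endings(ref))
        print("reb:", count_line_endings(reb))
        print("same apart from line endings:",
              same_modulo_line_endings(ref, reb))
```

If the last line prints True, the two files differ only in their line
endings.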

However, the question still remains why diffoscope is not detecting
these as text. The immediate cause of this is that file(1) is not
recognising them as text:

  $ file ref reb
  ref: data
  reb: data

diffoscope must essentially rely on file(1) getting it right — could
you therefore file a bug against file about this?

From some quick experimenting, the cause appears to be the unusual
control character 0x1e (the ASCII record separator) on line 1185.
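
For locating such bytes, here is a minimal Python sketch. It is only an
approximation: the set of "allowed" control bytes below is my own
assumption and not the classification table that file(1) actually uses,
and the function name is hypothetical.

```python
# Control bytes treated as ordinary in text here: tab, LF and CR.
# This is an illustrative assumption, not file(1)'s real table.
ALLOWED_CONTROLS = {0x09, 0x0A, 0x0D}


def find_control_bytes(data: bytes):
    """Return (line, column, byte) for each unexpected control byte.

    Lines and columns are 1-based; lines are split on LF (0x0a).
    """
    hits = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            is_control = byte < 0x20 or byte == 0x7F
            if is_control and byte not in ALLOWED_CONTROLS:
                hits.append((lineno, col, byte))
    return hits


if __name__ == "__main__":
    import os

    # "ref" is one of the file names from this thread; skip if absent.
    if os.path.exists("ref"):
        with open("ref", "rb") as f:
            for lineno, col, byte in find_control_bytes(f.read()):
                print(f"line {lineno}, column {col}: 0x{byte:02x}")
```

Removing the reported byte from a copy of the file and re-running
file(1) on that copy is one way to check whether it really is what
triggers the "data" result.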


Best wishes,

-- 
      o
    ⬋   ⬊      Chris Lamb
   o     o     reproducible-builds.org 💠
    ⬊   ⬋
      o

