[diffoscope] Diffoscope falls back to xxd for two (seemingly) text files with identical content

Aman Sharma amansha at kth.se
Mon Feb 24 14:07:21 UTC 2025


Hi Chris,


I have reported it to file(1). I assume the conversation about it will be posted here: <https://mailman.astron.com/pipermail/file/2025-February/thread.html>. It is not there yet, but I will send another mail to this list once it is.


Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)
<http://www.kth.se>
<https://www.kth.se/profile/amansha>
<https://algomaster99.github.io/>
________________________________
From: Chris Lamb <chris at reproducible-builds.org>
Sent: Friday, February 21, 2025 1:21:28 PM
To: diffoscope
Cc: Aman Sharma
Subject: Re: [diffoscope] Diffoscope falls back to xxd for two (seemingly) text files with identical content

Hey Aman,

> This is quite strange as the diff is not clear at all. The content
> seems to be identical.

From a quick glance, it seems that the difference is that one of them
has 0d0a (CRLF) line-endings and one of them has single 0a line
endings.
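
To make the line-ending observation concrete, here is a minimal Python sketch that counts CRLF and bare LF endings and checks whether two byte strings are equal once CRLF is normalised to LF. The function names are hypothetical and chosen only for illustration; this is not how diffoscope itself compares files.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF (0d0a) endings and bare LF (0a) endings in a byte string."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract those to get bare LFs.
    lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lf}


def same_after_normalising(a: bytes, b: bytes) -> bool:
    """Return True if a and b are identical once CRLF is rewritten to LF."""
    return a.replace(b"\r\n", b"\n") == b.replace(b"\r\n", b"\n")
```

Assuming the two files are named ref and reb as in this thread, they could be read with open("ref", "rb").read() and open("reb", "rb").read() and passed to both functions. If same_after_normalising returns True, the line endings are the only difference.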

However, the question still remains why diffoscope is not detecting
these as text. The immediate cause of this is that file(1) is not
recognising them as text:

  $ file ref reb
  ref: data
  reb: data

diffoscope must essentially rely on file(1) getting it right — could
you therefore file a bug against file about this?

Quickly experimenting, it appears the cause of this is the strange
character (0x1e) on line 1185.
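
For anyone wanting to locate such a byte themselves, the following is a rough Python sketch that reports control bytes below 0x20 together with their line and column. The set of bytes treated as harmless here (tab, LF, CR) is an assumption made for illustration and is not the table that file(1) uses internally; the function name is likewise hypothetical.

```python
# Control bytes assumed to be ordinary in text: tab, LF, CR.
# This is an illustrative choice, not file(1)'s own classification.
ALLOWED_CONTROL = {0x09, 0x0A, 0x0D}


def find_control_bytes(data: bytes) -> list:
    """Return (line, column, byte) for each unexpected control byte, 1-indexed."""
    hits = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte < 0x20 and byte not in ALLOWED_CONTROL:
                hits.append((lineno, col, byte))
    return hits
```

Running find_control_bytes(open("ref", "rb").read()) on one of the files from this thread should list any byte such as 0x1e along with the line it appears on, which helps when putting together a small reproducer for the file(1) bug report.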


Best wishes,

--
      o
    ⬋   ⬊      Chris Lamb
   o     o     reproducible-builds.org 💠
    ⬊   ⬋
      o