<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
</head>
<body dir="ltr">
<div id="divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Garamond,Georgia,serif;" dir="ltr">
<div id="divtagdefaultwrapper" dir="ltr" style="font-size: 12pt; color: rgb(0, 0, 0); font-family: Garamond, Georgia, serif, 'EmojiFont', 'Apple Color Emoji', 'Segoe UI Emoji', NotoColorEmoji, 'Segoe UI Symbol', 'Android Emoji', EmojiSymbols;">
<p>Hi Chris,</p>
<p><br>
</p>
<p>I got a response about file(1) on its mailing list: <a href="https://mailman.astron.com/pipermail/file/2025-March/001476.html" class="OWAAutoLink">
https://mailman.astron.com/pipermail/file/2025-March/001476.html</a>. Both ref and reb contain the UTF-8 characters <span>\xc3\xa9\xc3\xa9</span> (éé), and that is why file reports them as data. I have asked whether it should report UTF-8 text instead of data in this case.<br>
</p>
<p><br>
</p>
<div id="Signature">
<div id="divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:rgb(0,0,0); font-family:Calibri,Helvetica,sans-serif,'EmojiFont','Apple Color Emoji','Segoe UI Emoji',NotoColorEmoji,'Segoe UI Symbol','Android Emoji',EmojiSymbols">
<div id="m_4935352394101912768Signature">
<div name="divtagdefaultwrapper"><font size="2" color="#808080"><span style="font-family:Arial,'Helvetica Neue',helvetica,sans-serif; background-color:rgb(255,255,255)"><span id="divtagdefaultwrapper" style="font-size:12pt">
<div style="margin-top:0; margin-bottom:0"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif">Regards,</span></div>
<span style="font-family:Garamond,Georgia,serif"></span><span style="font-family:Garamond,Georgia,serif"></span><span style="color:rgb(0,0,0)"></span><span style="font-family:Garamond,Georgia,serif"></span><span style="font-family:Garamond,Georgia,serif"></span>
<div style="margin-top:0; margin-bottom:0"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif">Aman Sharma</span></div>
</span><br>
</span></font></div>
<div name="divtagdefaultwrapper"><font size="2" color="#808080"><span style="font-family:Arial,'Helvetica Neue',helvetica,sans-serif; background-color:rgb(255,255,255)"></span><span class="im">PhD Student<br style="font-family:Arial,'Helvetica Neue',helvetica,sans-serif">
<span style="font-family:Arial,'Helvetica Neue',helvetica,sans-serif; background-color:rgb(255,255,255)">KTH Royal Institute of Technology</span><br style="font-family:Arial,'Helvetica Neue',helvetica,sans-serif">
</span><span style="font-family:Arial,'Helvetica Neue',helvetica,sans-serif; background-color:rgb(255,255,255)">School of Electrical Engineering and Computer Science (EECS)</span><br style="font-family:Arial,'Helvetica Neue',helvetica,sans-serif">
<span style="font-family:Arial,'Helvetica Neue',helvetica,sans-serif; background-color:rgb(255,255,255)">Department of Theoretical Computer Science (TCS)</span><br style="font-family:Arial,'Helvetica Neue',helvetica,sans-serif">
<span style="font-family:Arial,'Helvetica Neue',helvetica,sans-serif; background-color:rgb(255,255,255)"><a href="http://www.kth.se" target="_blank" id="LPNoLP"></a><a href="https://www.kth.se/profile/amansha" class="OWAAutoLink" id="LPNoLP"></a><a href="https://www.kth.se/profile/amansha" class="OWAAutoLink" id="LPNoLP"></a></span></font></div>
</div>
<a href="https://www.kth.se/profile/amansha" class="OWAAutoLink" id="LPNoLP"><span style="font-size:10pt"></span></a><a href="https://algomaster99.github.io/" class="OWAAutoLink" id="LPNoLP">https://algomaster99.github.io/</a><br>
</div>
</div>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Aman Sharma<br>
<b>Sent:</b> Monday, February 24, 2025 3:07:21 PM<br>
<b>To:</b> Chris Lamb; diffoscope<br>
<b>Subject:</b> Re: [diffoscope] Diffoscope falls back to xxd for two (seemingly) text files with identical content</font>
<div> </div>
</div>
<div>
<div dir="ltr">
<div id="x_divtagdefaultwrapper" dir="ltr" style="font-size: 12pt; color: rgb(0, 0, 0); font-family: Garamond, Georgia, serif, 'EmojiFont', 'Apple Color Emoji', 'Segoe UI Emoji', NotoColorEmoji, 'Segoe UI Symbol', 'Android Emoji', EmojiSymbols;">
<p>Hi Chris,</p>
<p><br>
</p>
<p>I have reported it to file(1). I assume the conversation about it will be posted
<a href="https://mailman.astron.com/pipermail/file/2025-February/thread.html" class="x_OWAAutoLink">
here</a>. It is not there yet, but I will send another mail to this list once it is.<br>
</p>
<p><br>
</p>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="x_divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Chris Lamb &lt;chris@reproducible-builds.org&gt;<br>
<b>Sent:</b> Friday, February 21, 2025 1:21:28 PM<br>
<b>To:</b> diffoscope<br>
<b>Cc:</b> Aman Sharma<br>
<b>Subject:</b> Re: [diffoscope] Diffoscope falls back to xxd for two (seemingly) text files with identical content</font>
<div> </div>
</div>
</div>
<font size="2"><span style="font-size:10pt">
<div class="PlainText">Hey Aman,<br>
<br>
> This is quite strange as the diff is not clear at all. The content <br>
> seems to be identical.<br>
<br>
From a quick glance, it seems that the difference is that one of them<br>
has 0d0a (CRLF) line-endings and one of them has single 0a line<br>
endings.<br>
<br>
However, the question still remains why diffoscope is not detecting<br>
these as text. The immediate cause of this is that file(1) is not<br>
recognising them as text:<br>
<br>
  $ file ref reb<br>
  ref: data<br>
  reb: data<br>
<br>
diffoscope must essentially rely on file(1) getting it right — could<br>
you therefore file a bug against file about this?<br>
<br>
Quickly experimenting, it appears the cause of this is the strange<br>
character (0x1e) on line 1185.<br>
<br>
<br>
Best wishes,<br>
<br>
-- <br>
      o<br>
    ⬋   ⬊      Chris Lamb<br>
   o     o     reproducible-builds.org 💠<br>
    ⬊   ⬋<br>
      o<br>
</div>
</span></font></div>
</div>
</body>
</html>