<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"><!-- P {margin-top:0;margin-bottom:0;} --></style>
</head>
<body dir="ltr">
<div id="divtagdefaultwrapper" style="font-size:12pt;color:#000000;font-family:Garamond,Georgia,serif;" dir="ltr">
<meta content="text/html; charset=UTF-8">
<div dir="ltr">
<div id="x_divtagdefaultwrapper" dir="ltr" style="font-size: 12pt; color: rgb(0, 0, 0); font-family: Garamond, Georgia, serif, "EmojiFont", "Apple Color Emoji", "Segoe UI Emoji", NotoColorEmoji, "Segoe UI Symbol", "Android Emoji", EmojiSymbols;">
<p>Hi Chris,</p>
<p><br>
</p>
<p>Thanks for you answer and the pointers to the codebase! I agree it would not make sense to diff only specific files given there are so many ways to archive files.</p>
<p><br>
</p>
<div id="x_Signature">
<div id="x_divtagdefaultwrapper" dir="ltr" style="font-size:12pt; color:rgb(0,0,0); font-family:Calibri,Helvetica,sans-serif,"EmojiFont","Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols">
<div id="x_m_4935352394101912768Signature">
<div name="x_divtagdefaultwrapper"><font size="2" color="#808080"><span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)"><span id="x_divtagdefaultwrapper" style="font-size:12pt">
<div style="margin-top:0; margin-bottom:0"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif">Regards,</span></div>
<span style="font-family:Garamond,Georgia,serif"></span><span style="font-family:Garamond,Georgia,serif"></span><span style="color:rgb(0,0,0)"></span><span style="font-family:Garamond,Georgia,serif"></span><span style="font-family:Garamond,Georgia,serif"></span>
<div style="margin-top:0; margin-bottom:0"><span style="color:rgb(0,0,0); font-family:Garamond,Georgia,serif">Aman Sharma</span></div>
</span><br>
</span></font></div>
<div name="x_divtagdefaultwrapper"><font size="2" color="#808080"><span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)"></span><span class="x_im">PhD Student<br style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif">
<span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)">KTH Royal Institute of Technology</span><br style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif">
</span><span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)">School of Electrical Engineering and Computer Science (EECS)</span><br style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif">
<span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)">Department of Theoretical Computer Science (TCS)</span><br style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif">
<span style="font-family:Arial,"Helvetica Neue",helvetica,sans-serif; background-color:rgb(255,255,255)"><a href="http://www.kth.se" target="_blank" id="LPNoLP"></a><a href="https://www.kth.se/profile/amansha" class="x_OWAAutoLink" id="LPNoLP"></a><a href="https://www.kth.se/profile/amansha" class="x_OWAAutoLink" id="LPNoLP"></a></span></font></div>
</div>
<a href="https://www.kth.se/profile/amansha" class="x_OWAAutoLink" id="LPNoLP"><span style="font-size:10pt"></span></a><a href="https://algomaster99.github.io/" class="x_OWAAutoLink" id="LPNoLP">https://algomaster99.github.io/</a><br>
</div>
</div>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="x_divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Chris Lamb <chris@reproducible-builds.org><br>
<b>Sent:</b> Tuesday, December 31, 2024 2:00:17 PM<br>
<b>To:</b> Aman Sharma<br>
<b>Cc:</b> diffoscope<br>
<b>Subject:</b> Re: [diffoscope] What file to diff on after root paths are compared?</font>
<div> </div>
</div>
</div>
<font size="2"><span style="font-size:10pt">
<div class="PlainText">Hello Aman,<br>
<br>
> My question is how does it know which files to diff over? For<br>
> example, consider the diff output below between two jars. It first<br>
> runs some zip archive related tools and then shows the diff between<br>
> the files (MANIFEST and properties file) that was actually causing<br>
> the changes in the archive. Does diffoscope know about these files<br>
> from zipinfo/zipnote/zipdetails output or does it simply diff over<br>
> all files in the archive?<br>
<br>
The latter; it diffs all files within the archive, regardless of what<br>
zipinfo, zipnote and zipdetails might say — the output of these tools<br>
is not parsed.<br>
<br>
The mechanism for this is not entirely clear within diffoscope's<br>
codebase, however, This is partly because of the ways diffoscope<br>
itself provides an abstraction layer to support many different file<br>
formats, but also because, for the majority of archive types, the<br>
actual extraction process is delegated to the libarchive library.<br>
<br>
You can see this in action within diffoscope/comparators/utils/libarchive.py,<br>
where a call to get_member_names() calls ensure_unpacked() and thereby<br>
extracting the archive's contents to a temporary directory.<br>
<br>
<br>
Best wishes,<br>
<br>
-- <br>
o<br>
⬋ ⬊ Chris Lamb<br>
o o reproducible-builds.org 💠<br>
⬊ ⬋<br>
o<br>
</div>
</span></font></div>
</body>
</html>