<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:monospace,monospace;font-size:small">Hi vagrant,</div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small"><br></div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small"><span style="color:rgb(14,16,26);background:transparent;margin-top:0pt;margin-bottom:0pt">Thank you for the reply! Because we are still learning about the ideas/practices of reproducible builds, our current practices may not align with the best or typical practices of more experienced people. That is probably why what I described earlier was confusing. At the end of your reply, you asked what I was trying to accomplish. My primary goal is </span><strong style="color:rgb(14,16,26);background:transparent;margin-top:0pt;margin-bottom:0pt"><span style="background:transparent;margin-top:0pt;margin-bottom:0pt">learning</span></strong><span style="color:rgb(14,16,26);background:transparent;margin-top:0pt;margin-bottom:0pt">: to learn whether we are managing the .dsc files and using Reprepro in the appropriate way so I can help my company do it better. By "appropriate," I mean "what other code builders and Debian repository maintainers typically do," and your reply showed me that the way we do things may be causing our current problems.</span><br></div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small"><br></div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small">I'll try to clarify what I said and answer your questions right below your comments. I hope they help. 
Thanks again for your time!<br></div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small"><br></div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small">Cheers!</div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small">Yaobin</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, May 26, 2022 at 12:23 PM Vagrant Cascadian <<a href="mailto:vagrant@reproducible-builds.org">vagrant@reproducible-builds.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 2022-05-26, Yaobin Wen wrote:<br>
> In my company, we use *Ubuntu (18.04)* and are practicing reproducible<br>
> builds. Our code is built into a lot of .*deb* packages using *debuild* (and<br>
> related tools). We have made a lot of effort to make our builds<br>
> reproducible by following the Achieve deterministic builds<br>
> <<a href="https://reproducible-builds.org/docs/" rel="noreferrer" target="_blank">https://reproducible-builds.org/docs/</a>> and Known issues related to<br>
> reproducible builds<br>
> <<a href="https://tests.reproducible-builds.org/debian/index_issues.html" rel="noreferrer" target="_blank">https://tests.reproducible-builds.org/debian/index_issues.html</a>>. We have<br>
> made a lot of progress and are still working on it.<br>
<br>
It is great to hear of your work on reproducible builds!<br>
<br>
<br>
> We set up a company-wise *Reprepro server to serve the Debian packages<br>
> that we build regularly*. We publish both *.deb* files and the *GPG-signed<br>
> .dsc* files to the Reprepro server. BTW, our build system is designed to do<br>
> an "*change-only build*": if a package is not changed since the last build,<br>
> i.e., its *changelog* is not changed, the package is not built again this<br>
> time. *Their .deb and .dsc files <span class="gmail_default" style="font-family:monospace,monospace;font-size:small"></span>are still added to Reprepro but because<br>
> these files remain unchanged, Reprepro can successfully "add" them (but in<br>
> fact they are skipped)*. I figured this point may be important to<br>
> understand why I have my questions below.<br>
<br>
I don't follow what you mean by "the package is not built again this<br>
time" and "Their .deb and .dsc files are still added to Reprepro". How<br>
can a package be added if it is not built?<br>
<br></blockquote><div><span style="font-family:monospace,monospace">I forgot to mention that <b><span class="gmail_default" style="font-family:monospace,monospace;font-size:small"></span>we cache the packages we have already built</b>. For example, when we build version 1 (v1) of a package, we first put the results (including at least the .deb file and the .dsc file) into a "build cache" and then publish them to our Reprepro instance. The next time we kick off a build, if this package remains unchanged (i.e., still v1), we directly use the cached results and do not build it again. However, if the package has been updated to a newer version, e.g., v1.1 or v2, our build system will build the package (and cache the new results).</span><br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<span class="gmail_default" style="font-family:monospace,monospace;font-size:small"></span>What's unclear to me is why you're uploading the same .deb and .dsc<br>
files to the same reprepro repository multiple times... ?</blockquote><div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small"><br></div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small">We upload the same .deb and .dsc files to the same Reprepro repository probably <b>because we didn't know that</b>, as you pointed out at the end of your reply, "with a Debian-style repository, it's not expected that you will ever upload the same version of any given object more than once."</div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small"><br></div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small">But just to clarify with more details: In my original email, when I said "are still added to Reprepro", I meant that we issue the command "reprepro -b <base-dir> -T deb/dsc -C <component-name> includedeb/includedsc <distro-name> <path-of-deb/dsc-file>" on the same files again. Because Reprepro already has them, it skips the inclusion (with the message "Skipping inclusion of '<package-name>' '<version>' ... as it has already '<version>'."). <b>None of the files in Reprepro are changed. We didn't forcibly upload/overwrite (e.g., using `cp` or `rsync`) any file in the `pool` or any other directory.</b></div></div><div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small"></div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> Although we have solved many reproducibility issues in the .*deb* files, *I<br>
> found the .dsc files were changed when* I rebuilt the packages (by deleting<br>
> the previously built *.deb* and *.dsc* files) so Reprepro refuses to<br>
> include them and reports the following error:<br>
><br>
> ERROR: '<path-to-dsc-on-build-machine>' cannot be included as<br>
>> 'pool/<path-to-dsc-in-reprepro>'.<br>
>> Already existing files can only be included again, if they are the same,<br>
>> but:<br>
>> md5 expected: <md5-1>, got: <md5-2><br>
>> sha1 expected: <sha1-1>, got: <sha1-2><br>
>> sha256 expected: <sha256-1>, got: <sha256-2><br>
><br>
><br>
> *diffoscope* told me the `.*dsc*` files *only differ in their GPG<br>
> signatures* - the related source tarball (<filename>.orig.tar.gz) and<br>
> debian tarball (<filename>.debian.tar.xz) *have not changed between<br>
> builds.*<br>
<br>
That's to be expected...<br>
<br>
<br>
> I understand that, because as this SO answer says<br>
> <<a href="https://security.stackexchange.com/a/78958/80050" rel="noreferrer" target="_blank">https://security.stackexchange.com/a/78958/80050</a>>, the GPG signature is<br>
> generated using the creation time as an input. I found the issue<br>
> cryptographic_signature<br>
> <<a href="https://tests.reproducible-builds.org/debian/issues/unstable/cryptographic_signature_issue.html" rel="noreferrer" target="_blank">https://tests.reproducible-builds.org/debian/issues/unstable/cryptographic_signature_issue.html</a>><br>
> that<br>
> made me think we should not have signed our .*dsc* files, but the Debian<br>
> Admin's Handbook<br>
> <<a href="https://www.debian.org/doc/manuals/debian-handbook/sect.source-package-structure.en.html" rel="noreferrer" target="_blank">https://www.debian.org/doc/manuals/debian-handbook/sect.source-package-structure.en.html</a>><br>
> shows that the .*dsc* files are supposed to be signed by the maintainers.<br>
> In addition, in the Known Issues list<br>
> <<a href="https://tests.reproducible-builds.org/debian/index_issues.html" rel="noreferrer" target="_blank">https://tests.reproducible-builds.org/debian/index_issues.html</a>>, I didn't<br>
> seem to find any issue that's related with the .*dsc* files.<br>
<br>
If you want to know which party claims to have built a given .dsc file,<br>
you need the signatures on them. If you track that information some<br>
other way, you *could* use unsigned .dsc files...<br>
<br>
Or you could re-use the original .dsc files, if all the contents they<br>
reference are bit-for-bit identical. If you want to store the new ones<br>
somewhere else as a "proof of having rebuilt it again" you could do<br>
that, but obviously not in the exact same repository.<br>
<br>
<br>
> *After reading around, I'm guessing my understanding about<br>
> reproducible builds may not be totally correct, so I want to ask here:*<br>
><br>
> 1. *Should the .dsc files be reproducible, too?* Because Reprepro can<br>
> manage .*dsc* files, I've been thinking that .*dsc* files should be<br>
> reproducible, but now it seems not?<br>
<br>
If you build them in the same build environment, with the same source<br>
code, they should be reproducible *minus the signatures*, as you've<br>
noted...<br>
<br>
Generally, from a reproducible builds perspective, the .dsc file is<br>
considered an input to the build process rather than a result of a build<br>
process. Though admittedly, .dsc files are themselves artifacts of a<br>
source-only build, so it is a bit of a grey area.<br></blockquote><div><br></div><div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small">It makes total sense to me that the .dsc files should be used as an input. I should have realized this earlier. I have glanced at these files before but never studied them. I'm also aware of the .buildinfo files, and I think both .buildinfo and .dsc should be used as inputs for rebuilding the code. Am I right? But when browsing the Reproducible Builds website (the "Tools" page, specifically), I couldn't find any tools that use these two kinds of files. <b>Are there such tools somewhere?</b></div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> 2. In my case, since my company maintains both .*deb* files and .*dsc* files<br>
> in Reprepro, if one day we need to build the code of an earlier version, we<br>
> would inevitably generate different .*dsc* files because of the GPG<br>
> signatures. *Am I supposed to publish the .dsc files to the same<br>
> Reprepro server that we maintain our regular build?* Because I've been<br>
> thinking .*dsc* files should also be reproducible, I've been thinking we<br>
> should keep using the same Reprepro server. *But now it looks like we<br>
> need to prepare a second Reprepro server to hold the packages of the<br>
> earlier version.*<br>
<br>
So you're looking to be able to recreate your whole repository from<br>
scratch (maybe from git repositories or some other VCS?) at some future<br>
date, reproducibly?<br></blockquote><div class="gmail_default" style="font-family:monospace,monospace;font-size:small"><b>Yes, I'm looking to be able to recreate the whole Reprepro repository from scratch, using the code in git. (Or, more accurately, I want to learn how to achieve this.)</b> As I said earlier, we have a company-wide Reprepro instance. We regularly build the code for day-to-day development (incrementally, with the cache avoiding unnecessary rebuilds of code that hasn't changed) and publish the resulting artifacts to Reprepro, where developers/testers can access them. Recently, we wanted to rebuild an earlier revision of the codebase for testing. I checked out a new copy of the code and kicked off the build process without using the cache. Because I didn't use any cache, packages that are normally not rebuilt in our day-to-day environment were rebuilt this time. <b>When I published the rebuilt packages of the earlier revision to the same Reprepro instance, I found the .dsc files were rejected because of the differences in the GPG signature. This was my primary motivation for asking these questions.</b></div><div><br></div><div><div class="gmail_default" style="font-family:monospace,monospace;font-size:small">But your reply made me realize that we may not be using Reprepro or practicing reproducible builds correctly. <b>In general, I'm trying to figure out the appropriate practices for two major scenarios: day-to-day development and the occasions when we need to rebuild an earlier code revision.</b> Right now, we are trying to use a single Reprepro instance and publish everything to it. But it looks like we should use at least two Reprepro instances: one for daily development and one for rebuilt historical revisions.</div></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
> 3. *How does everyone else maintain their Reprepro server?* Do they keep<br>
> publishing the build artifacts to the same server after a build? Or do they<br>
> delete the previously published artifacts before publishing the new build?<br>
> Or do they even recreate the Reprepro server every time they make a new<br>
> build?<br>
<br>
Typically with a <span class="gmail_default" style="font-family:monospace,monospace;font-size:small"></span>Debian-style repository, it's not expected that you<br>
will ever upload the same version of any given object more than<br>
once. This is partly because apt and related tools are designed with<br>
that assumption in mind, and will behave poorly if you in fact feed a<br>
repository packages with the same version but different content and then<br>
try to use those packages in the real world.</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
So, still a bit unsure what you're actually trying to accomplish; if you<br>
spell that out a little more clearly it might be easier to make good<br>
suggestions.<br>
<br>
<br>
live well,<br>
vagrant<br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div>Software Engineer</div><a href="https://www.minevisionsystems.com/" title="http://www.minevisionsystems.com" style="color:rgb(17,85,204)" target="_blank">Mine Vision Systems</a><br>5877 Commerce St. <br>Suite 118<br>Pittsburgh PA, USA, 15206<br></div></div></div>