[diffoscope] [Reproducible-builds] Support for --ignore-profile flag in diffoscope

Ximin Luo infinity0 at debian.org
Fri May 13 12:44:11 CEST 2016


Jérémy Bobbio:
> Ximin Luo:
>> This is quite an open-ended problem and there is no single "correct"
>> answer. I don't even know myself what would be best, at this stage.
> 
> I think what we need to come up with now is a list of use cases. Then we
> can decide which one we want to support and how easy it should be.
> 
> Is anyone willing to share examples where being able to ignore stuff
> would have made their life easier?
> 
> The last one I spotted that could go on the list, ignoring irrelevant
> differences in two Android App packages:
> https://github.com/WhisperSystems/Signal-Android/blob/master/apkdiff/apkdiff.py
> 

For a start, there's the list of already-known issues at https://tests.reproducible-builds.org/index_issues.html. I'd imagine people analysing diffs would want an easy way to distinguish "issues that someone else has already solved" from "issues nobody has seen before".

(This is why I suggested looking through the existing data: if this mailing list discussion only produces 2 or 3 use-cases, that is not immensely helpful for building a lasting tool. But we already have a lot of data to go through as inspiration for use-cases.)

On a side note, the terminology should be more precise. I know that you know this, but in a public context it's a bit dangerous to say "irrelevant", since it gives the impression (to an uncritical reader) that the difference actually is 100% irrelevant. But it's not; see my previous email. The purpose of --ignore-profiles is to make it easier to achieve bitwise reproducibility, and anything less than that is still unsafe. I'm worried about the scenario where (e.g.) someone might market reproducibility as "do this build then run apkdiff.py, you can see it's the same (ignoring "irrelevant" differences)".

Concretely I have some suggestions:

1. Instead of calling this "ignore" we call it "hide", and instead of "irrelevant" we say "common"/"minor"/"known".

2. diffoscope --ignore-* (or --hide-*) MUST NOT return 0 or otherwise give the impression that two non-identical files are the same, even if all differences are "hidden". It should report "n differences hidden".
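
To make point 2 concrete, here is a rough sketch in Python of the behaviour I have in mind. All of the names here (Difference, split_hidden, report, the regex-based patterns) are made up for illustration and are not diffoscope's actual API; the only point is that hiding a difference changes what is printed, never the exit status.

```python
import re
from dataclasses import dataclass


@dataclass
class Difference:
    # Hypothetical record type, not a diffoscope class.
    path: str     # where in the compared files the difference was found
    summary: str  # short human-readable description


def split_hidden(differences, hide_patterns):
    """Partition differences into (shown, hidden) by matching regexes on the summary."""
    shown, hidden = [], []
    for d in differences:
        if any(re.search(p, d.summary) for p in hide_patterns):
            hidden.append(d)
        else:
            shown.append(d)
    return shown, hidden


def report(differences, hide_patterns):
    """Return (exit_code, lines).

    The exit code is 0 only when there are no differences at all;
    hidden differences still make it non-zero and are counted in the output.
    """
    shown, hidden = split_hidden(differences, hide_patterns)
    lines = [f"{d.path}: {d.summary}" for d in shown]
    if hidden:
        lines.append(f"{len(hidden)} difference(s) hidden")
    exit_code = 1 if differences else 0
    return exit_code, lines


if __name__ == "__main__":
    diffs = [
        Difference("META-INF/MANIFEST.MF", "timestamp differs"),
        Difference("classes.dex", "content differs"),
    ]
    code, lines = report(diffs, [r"timestamp"])
    print("\n".join(lines))
    print("exit code:", code)
```

Even if every difference matched a hide pattern, report() would still return a non-zero exit code plus the "hidden" count, so nobody can mistake two non-identical files for identical ones.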

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
git://github.com/infinity0/pubkeys.git
