reproducible .pyc files (& python-for-android)

Bernhard M. Wiedemann bernhardout at lsmod.de
Mon Jan 4 11:48:33 UTC 2021


On 04.01.21 at 11:23, Chris Lamb wrote:
> Hi Felix,
> 
>> p4a compiles those with "hostpython -OO -m compileall -b -f" (where
>> hostpython is the cross-compiled Python for the target -- arm64-v8a or
>> armeabi-v7a -- which is thus definitely the same version on both
>> machines).
> 
> As I understand it, recent versions of Python can use SOURCE_DATE_EPOCH
> in its internal py_compile routine to ensure that .pyc files are
> reproducible.

This is not a timestamp issue, though. If timestamps were varying, the
difference would be in the header of the .pyc (the first 12 bytes, or
the first 16 since Python 3.7), not at offset 0x100 as in this diff:
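
To rule out the header, one can compare only the bytes after it. This is
a rough sketch that assumes the 16-byte header layout of Python 3.7 and
later (PEP 552); the names split_pyc and same_body are made up for
illustration, and HEADER_SIZE would need to be 12 for Python 3.3 to 3.6.

```python
HEADER_SIZE = 16  # assumption: Python 3.7+ layout (magic, flags, 8 more bytes)


def split_pyc(data, header_size=HEADER_SIZE):
    """Split raw .pyc bytes into (header, marshalled body)."""
    if len(data) < header_size:
        raise ValueError("data is shorter than a .pyc header")
    return data[:header_size], data[header_size:]


def same_body(pyc_a, pyc_b, header_size=HEADER_SIZE):
    """True if two .pyc blobs differ at most in their headers."""
    return split_pyc(pyc_a, header_size)[1] == split_pyc(pyc_b, header_size)[1]
```

If same_body() returns False for the two builds, the difference is in
the marshalled code object and SOURCE_DATE_EPOCH will not help.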


│  000000f0: 6d5a 0d62 6469 7374 5f77 696e 696e 7374  mZ.bdist_wininst
│ -00000100: 5a05 6368 6563 6b5a 0675 706c 6f61 644e  Z.checkZ.uploadN
│ +00000100: da05 6368 6563 6b5a 0675 706c 6f61 644e  ..checkZ.uploadN


I have seen this before and remember something about Python string
reference counts leaking into these marshal files (the body of a .pyc
is marshal data, not pickle), and that this varied with the order of
compilation, so that
py_compile py1.py py2.py
produced different results than
py_compile py2.py py1.py
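
The single differing byte in the hexdump above (5a versus da) fits that
memory. As far as I understand CPython's marshal.c, the high bit (0x80,
FLAG_REF) of a type byte is only set when the object has more than one
reference at the time it is written. A small sketch to decode the two
bytes; describe_type_byte is a made-up helper name:

```python
FLAG_REF = 0x80  # high bit of a marshal type byte (value taken from marshal.c)


def describe_type_byte(byte):
    """Split a marshal type byte into (type code, whether FLAG_REF is set)."""
    return chr(byte & 0x7F), bool(byte & FLAG_REF)


print(describe_type_byte(0x5A))  # ('Z', False)
print(describe_type_byte(0xDA))  # ('Z', True)
```

Both bytes encode the same type code 'Z' (a short interned ASCII
string); only the reference flag differs.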

One way to get reproducible results is to delete all .pyc files and
recreate them in a fixed, sorted order with
find . -type f -name "*.py" -print0 |
  LC_ALL=C sort -z |
  xargs -0 "$python_binary" -m py_compile
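
Roughly the same idea as the pipeline above, as a Python sketch for
build scripts that cannot rely on find/sort/xargs. The function names
are made up, and python_binary defaults to the running interpreter as
an assumption; for p4a it would have to point at hostpython.

```python
import os
import subprocess
import sys


def sorted_py_files(root):
    """Return all *.py paths under root, sorted bytewise for a stable order."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                found.append(os.path.join(dirpath, name))
    return sorted(found, key=os.fsencode)


def compile_sorted(root, python_binary=sys.executable):
    """Compile every *.py under root in one py_compile run, in sorted order."""
    files = sorted_py_files(root)
    if files:
        subprocess.run([python_binary, "-m", "py_compile", *files], check=True)
    return files
```

This only fixes the order of compilation. It does not delete stale .pyc
files first, and it does not address differences between architectures.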


Maybe related: creating .pyc files on i586 and x86_64 (with identical
toolchain) always produced different results for me.

