[rb-general] Addresses in (I)Python output

Rebecca N. Palmer rebecca_palmer at zoho.com
Thu Sep 26 07:14:04 UTC 2019


nbconvert does *not* work as an "all builds and no interactive uses go 
through here" point.  The dependency tree is:

nbsphinx*             jupyter-nbconvert*
(Sphinx extension)    (script)
           |          /
      nbconvert*     /       ipython-directive
      (Python library)    /  (Sphinx extension)
                  |      /
                 ipython
                    |
                  python

Standalone .ipynb files enter at one of the points marked *, so go 
through nbconvert, but ipython:: sections in Sphinx files enter at 
ipython-directive.  The only points common to both paths are ipython and 
python, which interactive sessions also use.  Hence, if we want the 
change in one place, and want it to apply to builds but not to 
interactive use, that place will need to check whether an 
intended-as-reproducible build is currently in progress, e.g. via 
SOURCE_DATE_EPOCH.
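
As a rough sketch of such a check (the function name and the optional 
environ parameter are made up for illustration, and treating an empty 
value as "not set" is an assumption):

```python
import os


def reproducible_build_in_progress(environ=None):
    """Guess whether a reproducible build is running.

    Hypothetical helper: it only looks at whether SOURCE_DATE_EPOCH is
    set to a non-empty value, and does not validate the timestamp.
    """
    if environ is None:
        environ = os.environ
    return bool(environ.get("SOURCE_DATE_EPOCH"))


# Example with explicit mappings instead of the real environment:
in_build = reproducible_build_in_progress({"SOURCE_DATE_EPOCH": "1569482044"})
interactive = reproducible_build_in_progress({})
```

Whichever layer ends up hosting the change could call something like 
this before altering any output, so interactive sessions stay as they are.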

Disabling the address output at the source would have to happen in 
Python itself (object.__repr__).  A post-processor that removes the 
addresses could instead live in ipython.
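
For reference, a small sketch of where the varying text comes from.  The 
class name is a placeholder, and the regular expression is only my 
assumption of what the default format looks like:

```python
import re


class Example:
    """Placeholder class that does not define its own __repr__."""


obj = Example()
text = repr(obj)  # falls back to object.__repr__

# The default repr ends in " at 0x<hex digits>>".  In CPython the digits
# are the object's memory address, which can differ from run to run.
ADDRESS = re.compile(r" at 0x[0-9a-fA-F]+>$")
match = ADDRESS.search(text)
```

Here match is expected to be non-None, and the matched part is what a 
post-processor would have to remove or replace.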

Santiago Torres Arias wrote:
> https://github.com/sphinx-doc/sphinx/pull/3897

That's the issue that not all address output is from object.__repr__: 
some is from objects with their own address-including __repr__ or 
__str__, and may be in a different format.

This may be a reason to prefer a post-processor over a direct 
object.__repr__ replacement, though as noted in the link, that still 
won't work on nonstandard-format address output without risking false 
positives.
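
A sketch of both failure modes.  The Handle class, its repr format and 
the looser BROAD pattern are invented for illustration, not taken from 
any real package:

```python
import re

# Pattern for the standard object.__repr__ format.
STANDARD = re.compile(r" at 0x[0-9a-f]+>")
# Hypothetical looser pattern that matches any hex literal.
BROAD = re.compile(r"0x[0-9a-f]+")


class Handle:
    """Hypothetical class with its own address-including __repr__."""

    def __repr__(self):
        return "Handle(addr=0x%x)" % id(self)


custom = repr(Handle())
constant = "mask = 0xff00"  # a hex constant, not an address

missed_by_standard = STANDARD.search(custom) is None
caught_by_broad = BROAD.search(custom) is not None
false_positive = BROAD.search(constant) is not None
```

All three flags come out True: the narrow pattern misses the custom 
format, and the pattern wide enough to catch it also rewrites a value 
that was never an address.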

Daniel Shahaf wrote:
> it would invalidate documentation that relies on
> the hex addresses to demonstrate object identity.

If we want to keep this and get reproducibility, we could use a 
post-processor that replaces addresses by numbers assigned in the order 
they first appear in the document:

import re

def renumber_addresses(document):
    addresses_already_seen = {}
    def replace(match):
        address = match.group(1)
        if address not in addresses_already_seen:
            # first appearance: assign the next of 0x100, 0x200, ...
            addresses_already_seen[address] = 16**2 * (len(addresses_already_seen) + 1)
        return ' at 0x%x>' % addresses_already_seen[address]
    return re.sub(r' at 0x([0-9a-f]+)>', replace, document)

(This could in theory be an object.__repr__ replacement instead, but I 
suspect it's too much to be appropriate to do in such a low-level place.)
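
To give an idea of what the __repr__-level variant might look like, here 
is a sketch using a mixin class, since object.__repr__ itself cannot be 
reassigned from Python code.  StableRepr, _counter and _assigned are 
hypothetical names, and keying on id() is an assumption that breaks if 
an id is reused after an object is garbage collected:

```python
import itertools

_counter = itertools.count(1)
_assigned = {}  # id(obj) -> stand-in number, in order of first repr


class StableRepr:
    """Hypothetical mixin: repr shows a sequence number, not the address."""

    def __repr__(self):
        key = id(self)
        if key not in _assigned:
            # First time this object is shown: hand out 0x100, 0x200, ...
            _assigned[key] = next(_counter) * 16**2
        cls = type(self)
        return "<%s.%s object at 0x%x>" % (
            cls.__module__, cls.__qualname__, _assigned[key])


first = StableRepr()
second = StableRepr()
lines = [repr(first), repr(second), repr(first)]
```

The first and third entries are identical and differ from the second, so 
output that demonstrates object identity still does so.  The global 
state and the id-reuse problem are part of why this seems heavy for such 
a low-level place.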


