Reproducibility tool

Dan Shearer dan at shearer.org
Mon May 31 13:52:32 UTC 2021


Hello reproducibility people,

Reproducibility often requires assembling source code from multiple
upstreams, with multiple versions, source formats and mutually
incompatible maintenance cycles. There is now a tool called Not-Forking
that handles cases a VCS or patch/merge/quilt cannot address; see
https://lumosql.org/src/not-forking/doc/trunk/README.md .

The diagrams in the README illustrate common use cases, but the tool
caters for other needs as well, and I suspect from conversations with
various operating system maintainers that it could be well-suited for
use there.

Not-Forking can detect its own (very minimal) dependencies, upgrade
itself, and use a down-level version of itself if needed.  Not-Forking
maintains a local cache that can reliably tell when an upstream has
changed, or a branch within an upstream, regardless of the VCS or other
system the upstream uses.
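
To make the cache idea concrete, here is a minimal sketch of
VCS-independent change detection. It is not Not-Forking's actual
implementation or interface: the function names, the JSON cache file and
the choice of SHA-256 over file paths and contents are all assumptions
made for illustration.

```python
import hashlib
import json
import os


def tree_fingerprint(root):
    """Return a SHA-256 digest over relative paths and file contents under root.

    Hypothetical helper: it looks only at the fetched files, so it does not
    care which VCS (if any) produced them.
    """
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            digest.update(rel.encode("utf-8") + b"\0")
            with open(path, "rb") as fh:
                digest.update(fh.read())
            digest.update(b"\0")
    return digest.hexdigest()


def upstream_changed(name, root, cache_file):
    """Compare the current fingerprint with the cached one, then update the cache.

    Returns True when no fingerprint was cached for this upstream or when
    the cached fingerprint differs from the current one.
    """
    cache = {}
    if os.path.exists(cache_file):
        with open(cache_file, "r", encoding="utf-8") as fh:
            cache = json.load(fh)
    current = tree_fingerprint(root)
    changed = cache.get(name) != current
    cache[name] = current
    with open(cache_file, "w", encoding="utf-8") as fh:
        json.dump(cache, fh)
    return changed
```

In this sketch the first call for a given upstream always reports a
change, because nothing is cached yet. A real tool would probably also
want to skip VCS metadata directories such as .git before hashing.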

Not-Forking was developed for the LumoSQL project, where we faced an
extreme case: we are making changes to the world's most-used software
(SQLite). The SQLite project is understandably very conservative about
making changes that have compatibility implications, and perhaps
because of this the SQLite library has been vendored, forked,
relicensed and so on a great deal. We wanted to show how a lot of that
may not be necessary.

LumoSQL modifies SQLite to use multiple different key-value stores, and
adds per-row integrity checking, encryption and various other things.
If SQLite were not so carefully engineered and maintained it would be
one of the biggest reproducibility weaknesses around, and in some ways
it still is due to all the vendoring and forking - but this email is
about Not-Forking.

Best,

--
Dan Shearer
dan at shearer.org

