October 2019 in Reproducible Builds
Chris Lamb <chris@...>
Welcome to the October 2019 report from the Reproducible Builds
project. In our monthly reports we attempt to outline the most important
things that we have been up to recently. As a reminder of what our little
project is all about: whilst anyone can inspect the source code of
free software for malicious changes, most software is distributed to
end users or servers as precompiled binaries. Reproducible builds
tries to ensure that no changes have been made during these
compilation processes by promising that identical results are always
generated from a given source, allowing multiple third parties to
come to a consensus on whether a build was compromised.
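As a rough illustration of that consensus idea, the sketch below compares SHA-256 digests of the same artifact as produced by several independent rebuilders. The rebuilder names and the consensus() helper are hypothetical and are not part of any Reproducible Builds tool.

```python
import hashlib


def consensus(artifacts):
    """Return (agreed, digests) for a mapping of rebuilder name -> artifact bytes."""
    digests = {
        name: hashlib.sha256(data).hexdigest()
        for name, data in artifacts.items()
    }
    # Every rebuilder must arrive at exactly the same digest.
    agreed = len(set(digests.values())) == 1
    return agreed, digests


# Hypothetical rebuilders; the third one produced a different binary.
rebuilds = {
    "rebuilder-a": b"\x7fELF identical bytes",
    "rebuilder-b": b"\x7fELF identical bytes",
    "rebuilder-c": b"\x7fELF tampered bytes",
}
agreed, digests = consensus(rebuilds)
for name, digest in sorted(digests.items()):
    print(name, digest[:16])
print("consensus reached:", agreed)
```

A disagreement does not say which party is wrong, only that the build needs a closer look.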
In this month's report, we will cover:
* Media coverage & conferences — Reproducible builds in Belfast
* Reproducible Builds Summit 2019 — Registration & attendees, etc.
* Distribution work — The latest work in Debian, OpenWrt, openSUSE,
* Software development — More diffoscope development, etc.
* Getting in touch — How to contribute & get in touch
If you are interested in contributing to our venture, please visit our
*Contribute* page on our website.
Media coverage & conferences
Jonathan McDowell gave an introduction on Reproducible Builds in
Debian at the Belfast Linux User Group.
Whilst not strictly related to reproducible *builds*, Sean Gallagher
from Ars Technica wrote an article entitled *Researchers find bug in
Python script may have affected hundreds of studies*:
A programming error in a set of Python scripts commonly used for
Reproducible Builds Summit 2019
Registration for our fifth annual Reproducible Builds summit, which
will take place between the 1st and 8th December in Marrakesh,
Morocco, has opened and invitations have been sent out.
Similar to previous incarnations of the event, the heart of the
workshop will be three days of moderated sessions with surrounding
"hacking" days and will include a huge diversity of participants
from Arch Linux, coreboot, Debian, F-Droid, GNU Guix, Google,
Huawei, in-toto, MirageOS, NYU, openSUSE, OpenWrt, Tails, Tor
Project and many more. We are still seeking additional sponsorship
for the event. Sponsorship enables the attendance of
people who would not otherwise be able to attend. If you or your
company would be able to sponsor the event, please contact
If you would like to learn more about the event and how to register,
please visit our dedicated event page:
Distribution work
GNU Guix announced that they had significantly reduced the size
of their "bootstrap seed" by replacing binutils, GCC
and glibc with smaller alternatives, resulting in the package
manager "possessing a formal description of how to build all
underlying software" in a reproducible way from a mere 120MB seed.
OpenWrt is a Linux-based operating system targeting wireless
network routers and other embedded devices. This month Paul Spooren
(*aparcar*) posted a patch to their mailing list adding KCFLAGS to
the kernel build flags to make it easier to rebuild the
Bernhard M. Wiedemann posted his monthly Reproducible Builds status
update for the openSUSE distribution, which describes how
rpm was updated to run most builds with the -flto=auto
argument, saving mirror disk space/bandwidth. In addition,
maven-javadoc-plugin received a toolchain patch (originating
from Debian) in order to normalise a date.
In Debian this month, Didier Raboud (*OdyX*) started a discussion on
the debian-devel mailing list regarding building Debian source
packages in a reproducible manner (thread index at ). In
addition, Lukas Pühringer prepared an upload of in-toto, a
framework to protect supply chain integrity by the Secure Systems
Lab at New York University, which was uploaded by Holger
Holger Levsen started a new section on the Debian wiki to
centralise documentation of the progress made on various Debian-specific
reproducibility issues and noticed that the "essential" package
set in the *bullseye* distribution became unreproducible again,
likely due to a bug in Perl itself. Holger also restarted a
discussion on Debian bug #774415, which requests that the
devscripts collection of utilities that "make the life of a Debian
package maintainer easier" add a script/wrapper to enable easier
end-user testing of whether a package is reproducible.
Johannes Schauer (*josch*) explained that their mmdebstrap tool
can create bit-for-bit identical Debian chroots of the
*unstable* and *buster* distributions for both the essential and
minbase bootstrap "variants", and Bernhard M. Wiedemann
contributed to a discussion regarding adding a "global" build
switch to enable/disable Profile-Guided Optimisation (PGO) and
Link-Time Optimisation in the dpkg-buildflags tool, noting
that "overall it is still very hard to get reproducible builds with
64 reviews of Debian packages were added, 10 were updated and 35
were removed this month, adding to our knowledge about identified
issues. Three new types were added by Chris Lamb (*lamby*):
Lastly, there was a far-reaching discussion regarding the
correctness and suitability of setting the TZ environment variable
to UTC when it was noted that the value UTC0 was
"technically" more correct.
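For readers unfamiliar with the distinction: the POSIX format for TZ is a zone name followed by an offset, which is why UTC0 (name "UTC", offset 0) is the stricter spelling. Below is a minimal Python sketch of pinning the timezone before formatting a timestamp. It assumes a Unix-like system, where time.tzset() is available, and the format_timestamp() helper is purely illustrative.

```python
import os
import time

# POSIX-style TZ value: zone name "UTC" followed by an offset of 0 hours.
os.environ["TZ"] = "UTC0"
time.tzset()  # Unix only: re-read TZ from the environment


def format_timestamp(epoch_seconds):
    """Format a Unix timestamp using the (now pinned) local timezone."""
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(epoch_seconds))


print(format_timestamp(0))
```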
Software development
The Reproducible Builds project detects, dissects and attempts to
fix as many currently-unreproducible packages as possible. We
endeavour to send all of our patches upstream where appropriate.
This month, we wrote a large number of such patches, including:
* Bernhard M. Wiedemann:
* keeperrl (merged, date)
* sphinx-doc (nondeterminism from parallelism via
* vlc (sort tar)
* A number of expiring SSL testing certificates have been extended
to 2049 to fix future builds:
* python-M2Crypto
* python-aiosmtplib
* python-distlib
* python-geventhttpclient
* python-moto (has a remaining year 2038 bug)
* python-oslo.service
* python-thriftpy2
* Chris Lamb (*lamby*):
* #934698 filed against libchamplain (merged upstream).
* #941714 filed against bst-external.
* #941715 filed against checkinstall.
* #941716 filed against gobject-introspection.
* #942005 filed against elph.
* #942006 filed against squeak-plugins-scratch.
* #942009 filed against stgit (forwarded upstream).
* #942342 filed against traitlets (forwarded upstream).
* #942479 filed against frobby.
* #942767 filed against python-oslo.reports.
* #942847 filed against cloudkitty.
* #942848 filed against designate.
* #943471 filed against khard (forwarded upstream).
* #943674 filed against flask (forwarded upstream).
* #943694 filed against ros-genpy (forwarded upstream).
* #943829 filed against pmemkv.
* #943954 filed against tm-align.
* #943956 filed against snakemake (forwarded upstream).
* spirv-tools.
* #942867 & #942870: Filed against r-base (not
respecting nocheck and nodoc Debian build profiles).
* Mattias Ellert:
* #942671 filed against doxygen.
Lastly, a request from Steven Engler to sort fields in the
PKG-INFO files generated by the setuptools Python module
build utilities was resolved by Jason R. Coombs, and
Vagrant Cascadian added SOURCE_DATE_EPOCH support to LTSP's
manual page generation.
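To illustrate what SOURCE_DATE_EPOCH support generally involves, here is a minimal sketch of a generator that takes its timestamp from that variable when it is set and falls back to the current time otherwise. The build_timestamp() helper is hypothetical and does not reflect how LTSP implements this.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=None):
    """Return the date to embed in generated output, as an aware UTC datetime."""
    environ = os.environ if environ is None else environ
    value = environ.get("SOURCE_DATE_EPOCH")
    # The variable holds seconds since the Unix epoch; fall back to "now"
    # only when it is absent, which makes the output non-reproducible.
    epoch = int(value) if value is not None else int(time.time())
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


# Example with an explicit, fixed environment instead of the real one.
stamp = build_timestamp({"SOURCE_DATE_EPOCH": "86400"})
print(stamp.strftime("%Y-%m-%d"))
```

Formatting the result in UTC, rather than local time, avoids reintroducing a dependency on the build machine's timezone.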
strip-nondeterminism & reprotest
strip-nondeterminism is our tool to remove specific
non-deterministic results from successful builds. This month, Chris Lamb
made a number of changes, including uploading version 1.6.1-1 to
Debian unstable. This dropped a bug_803503.zip test fixture as
it is no longer compatible with the latest version of Perl's
Archive::Zip module (#940973).
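As an illustration of the kind of normalisation such a tool performs, the sketch below rewrites a ZIP archive so that every member carries the same fixed timestamp. It is written in Python for brevity and is not how strip-nondeterminism (a Perl tool) is implemented. The normalise_zip() helper and the chosen timestamp are assumptions.

```python
import io
import zipfile

# Arbitrary fixed timestamp (hypothetical choice); 1980 is the earliest
# year the ZIP format can represent.
FIXED_DATE = (1980, 1, 1, 0, 0, 0)


def normalise_zip(data):
    """Return a copy of the ZIP archive in `data` with member mtimes pinned."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w") as dst:
        for member in src.infolist():
            info = zipfile.ZipInfo(member.filename, date_time=FIXED_DATE)
            info.compress_type = member.compress_type
            dst.writestr(info, src.read(member.filename))
    return out.getvalue()


def make_archive(timestamp):
    """Build a tiny archive whose only member has the given mtime."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr(zipfile.ZipInfo("hello.txt", date_time=timestamp), b"hello")
    return buf.getvalue()


first = make_archive((2019, 10, 1, 12, 0, 0))
second = make_archive((2019, 10, 31, 8, 30, 0))
print(first == second, normalise_zip(first) == normalise_zip(second))
```

This sketch deliberately drops other per-member metadata such as permissions. A real tool has to decide case by case what to keep.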
reprotest is our end-user tool to build the same source code twice in
widely differing environments and then check the binaries produced
by each build for any differences. This month, Iñaki Malerba updated
our Salsa CI scripts as well as adding a --control-build
parameter. Holger Levsen uploaded the package as 0.7.10,
bumping the Debian "standards version" to 4.4.1.
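The underlying idea can be sketched in a few lines: run the same command under deliberately different environments and compare what comes out. This is a heavy simplification, since it hashes standard output rather than built binaries and varies only two environment variables. The helper names and the variations below are hypothetical and are not taken from reprotest itself.

```python
import hashlib
import os
import subprocess
import sys


def output_digest(command, env_overrides):
    """Run `command` with some environment variables overridden; hash stdout."""
    env = dict(os.environ)
    env.update(env_overrides)
    result = subprocess.run(command, env=env, capture_output=True, check=True)
    return hashlib.sha256(result.stdout).hexdigest()


def is_reproducible(command, environments):
    """True if the command's output is identical under every environment."""
    digests = {output_digest(command, env) for env in environments}
    return len(digests) == 1


# Two deliberately different environments (hypothetical variations).
ENVIRONMENTS = [
    {"TZ": "UTC0", "LANG": "C"},
    {"TZ": "Asia/Tokyo", "LANG": "fr_FR.UTF-8"},
]

# A stand-in "build" that leaks the timezone, and one that does not.
leaky = [sys.executable, "-c", "import os; print(os.environ['TZ'])"]
stable = [sys.executable, "-c", "print('hello')"]

print("leaky build reproducible:", is_reproducible(leaky, ENVIRONMENTS))
print("stable build reproducible:", is_reproducible(stable, ENVIRONMENTS))
```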
diffoscope is our in-depth and content-aware diff utility that
can locate and diagnose reproducibility issues. It is run countless
times a day on our testing infrastructure and is essential for
identifying fixes and causes of non-deterministic behaviour.
This month, Chris Lamb (*lamby*) made the following changes,
including uploading versions 126, 127, 128 and 129 to the Debian
* Disassembling and reporting on files related to the R (programming
* Expose an .rdb file's absolute paths in the semantic/human-readable
output, not hidden deep in a hexdump.
* Rework and refactor the handling of .rdb files with respect to
locating the parallel .rdx prior to inspecting the file to
ensure that we do not add files to the user's filesystem in the
case of directly comparing two .rdb files or, worse,
overwriting a file in its place.
* Query the container for the full path of the parallel .rdx file
to the .rdb file as well as looking in the same directory. This
ensures that comparing two Debian packages shows any varying
* Correct the matching of .rds files by also detecting newer
versions of this file format. 
* Don't read the site and user environment when comparing .rdx,
.rdb or .rds files by using Rscript's --vanilla option.
* Ensure all object names are displayed, including ones beginning
with a full stop (.), and sort package fields when
dumping data from .rdb files.
* Mask/hide standard error when processing .rdb files
and don't include useless/misleading NULL when dumping data
from them.
* Format package contents as foo = bar rather than using ugly and
misleading brackets, etc. and include the object's
* Don't pass our long script to parse .rdb files via the command
line; use standard input instead.
* Call the deparse function to ensure that we do not error out
and revert to a binary diff when processing .rdb files with
internal "vector" types; they do not automatically coerce to
* Other misc/cosmetic changes. 
* When printing an error from a command, format the command for the
* Truncate very long command lines when displaying them as an
external source of data. 
* When formatting command lines, ensure newlines and other
metacharacters appear escaped as \n, etc.
* When displaying the standard error from commands, ensure we use
the escaped version. 
* Use "exit code" over "return code" terminology when referring to
UNIX error codes in displayed differences. 
* Internal API:
* Add ability to pass bytestring input to external commands.
* Split out command-line formatting into a separate utility
* Add support for easily masking the standard error of commands.
* To match the libarchive container, raise a KeyError
exception if we request an invalid member from a directory.
* Correct string representation output in the traceback when we
cannot locate a specific item in a container.
* Move build-dependency on python-argcomplete to its Python 3
equivalent to facilitate Python 2.x removal. (#942967)
* Track and report on missing Python modules. (#72)
* Move from deprecated $ADTTMP to $AUTOPKGTEST_TMP in the
autopkgtests.
* Truncate the tcpdump expected diff to 8KB (from ~600KB).
* Try and ensure that new test data files are generated
dynamically, i.e. at least no new ones are added without "good"
* Drop unused BASE_DIR global in the tests. 
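The command-line formatting changes listed above (escaping metacharacters and truncating very long command lines) could look roughly like the following sketch. The format_cmdline() name, the length limit and the truncation marker are all assumptions. This is not diffoscope's actual code.

```python
import shlex


def format_cmdline(args, max_length=80):
    """Render an argument list as a single, safely displayable line."""
    # Quote each argument as a shell would need it...
    quoted = " ".join(shlex.quote(arg) for arg in args)
    # ...then make newlines, tabs, etc. visible as \n, \t, and so on.
    escaped = quoted.encode("unicode_escape").decode("ascii")
    if len(escaped) > max_length:
        escaped = escaped[:max_length] + " [...]"
    return escaped


print(format_cmdline(["echo", "first line\nsecond line"]))
print(format_cmdline(["cmp", "a" * 200, "b" * 200], max_length=40))
```

Keeping this in one utility function means that error messages and the "external source of data" labels are rendered the same way.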
In addition, Mattia Rizzolo updated our tests to run against all
supported Python versions and to exit with a UNIX exit status
of 2 instead of 1 in case of running out of disk space.
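A minimal sketch of that exit-status behaviour follows: an out-of-disk-space error is mapped to status 2, presumably so that it cannot be mistaken for an ordinary "differences found" result. The main() wrapper and the run_comparison callable are hypothetical.

```python
import errno
import sys


def main(run_comparison):
    """Run the comparison and translate ENOSPC into exit status 2."""
    try:
        run_comparison()
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            print("error: no space left on device", file=sys.stderr)
            return 2
        raise  # any other OS error is not handled here
    return 0


def out_of_space():
    # Simulate a write failing because the disk is full.
    raise OSError(errno.ENOSPC, "No space left on device")


print(main(out_of_space))
print(main(lambda: None))
```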
Lastly, Vagrant Cascadian updated diffoscope 126 and 129
in GNU Guix, and updated inputs for additional test suite
trydiffoscope is the web-based version of diffoscope, and
this month Chris Lamb migrated the tool to depend on the
python3-docutils package over python-docutils to allow for Python
2.x removal (#943293) as well as updating the packaging to the
latest Debian standards and conventions.
There was yet more effort put into our website this month,
including Chris Lamb improving the formatting of reports
and tidying the new "Testing framework"
links, etc.
In addition, Holger Levsen added the Tor Project's Reproducible Builds
Manager to our "Who is Involved?" page and Mattia Rizzolo
dropped a literal HTML element.
We operate a comprehensive Jenkins-based testing framework that
powers tests.reproducible-builds.org. This month, the following
changes were made:
* Holger Levsen:
* Debian-specific changes:
* Add a script to ease powercycling x86 and arm64 nodes.
* Don't create suite-based directories for
buildinfos.debian.net.
* Show all four suites being tested in a single row on
the performance page.
* OpenWrt changes:
* Only run jobs every third day. 
* Create jobs to run the reproducible_openwrt_rebuild.py
script today and in the future. 
* Mattia Rizzolo:
* Add some packages that were lost while updating to *buster*.
* Fix the auto-offline functionality by checking the content of the
permalinks file instead of following the lastSuccessfulBuild
that is no longer being updated.
* Paul Spooren (OpenWrt):
* Add a reproducible_common utilities file.
* Update the openwrt-rebuild script to use schroot.
* Use unbuffered Python output as well as fixing
The usual node maintenance was performed by Holger Levsen,
Mattia Rizzolo and Vagrant Cascadian.
Getting in touch
If you are interested in contributing to the Reproducible Builds project,
please visit our *Contribute* page on our website. Alternatively, you
can get in touch with us via:
* Mailing list: rb-general@... 
* IRC: #reproducible-builds on irc.oftc.net.
* Twitter: @ReproBuilds / https://twitter.com/ReproBuilds
This month's report was written by Bernhard M. Wiedemann, Chris Lamb,
Holger Levsen and Vagrant Cascadian. It was subsequently reviewed by a
bunch of Reproducible Builds folks on IRC and the mailing list.
Chris Lamb
reproducible-builds.org