[Git][reproducible-builds/reproducible-website][master] 5 commits: a script of a conversation with kees, very roughest of roughest of

Fri Aug 30 21:08:22 UTC 2024


Vagrant Cascadian pushed to branch master at Reproducible Builds / reproducible-website


Commits:
5824fc3f by Vagrant Cascadian at 2024-08-30T14:05:43-07:00
a script of a conversation with kees, very roughest of roughest of
pre-rough drafts.

- - - - -
14271dbc by Vagrant Cascadian at 2024-08-30T14:05:43-07:00
Interview with Kees: Typo corrections and cleanup.

- - - - -
535716d3 by Vagrant Cascadian at 2024-08-30T14:05:43-07:00
interview with kees. updates, links, etc.

- - - - -
f3e60f30 by Vagrant Cascadian at 2024-08-30T14:05:43-07:00
interview with kees: update.

- - - - -
e4895c3a by Vagrant Cascadian at 2024-08-30T14:05:43-07:00
interview with kees: updates

- - - - -


1 changed file:

- + _posts/2024-08-xx-kees.md


Changes:

=====================================
_posts/2024-08-xx-kees.md
=====================================
@@ -0,0 +1,213 @@
+---
+layout: post
+title: "Supporter spotlight: Kees Cook ..."
+date: 2024-08-14
+categories: org
+draft: true
+---
+
+
+**Vagrant Cascadian: Could you tell me a bit about yourself? What sort
+  of things do you work on?**
+
+Kees Cook: I'm a Free Software junkie living in Portland, Oregon, USA.
+I have been focusing on the upstream Linux kernel's
+protection of itself. The kernel gives a lot of support for protecting
+userspace, but when I first started focusing on this there was not
+as much work addressing entire classes of bugs in the kernel itself.
+As userspace gets more hardened, the kernel itself becomes a bigger
+target. Almost 9 years ago I formally announced the [Kernel
+Self-Protection Project](https://kspp.github.io/) because the work necessary was way more than
+my time and expertise could handle alone. So I just try to get people to
+help as much as possible: people who understand the Arm architecture,
+people who understand the memory management subsystem, people
+who understand how to make the kernel less buggy.
+
+**Vagrant: Could you describe the path that led you to working on this
+  sort of thing?**
+
+Kees: I have always been interested in security through the lens of
+exploitable flaws. I always thought it was like a magic trick to
+make a computer do something that it was very much not designed to do,
+and seeing how easy it is to subvert software through its bugs, I wanted to improve that fragility.
+In 2006, I started working at Canonical on Ubuntu and was mainly
+focusing on bringing Debian and Ubuntu up to what was the state of the
+art for Fedora and Gentoo's security hardening efforts. Both had really pioneered a lot of
+userspace hardening with compiler flags and ELF stuff and many
+other things for hardened binaries. On the whole, Debian had not really paid attention to
+it. Debian's package building process at the time was sort of a chaotic
+free-for-all, as there wasn't a centralized build methodology for
+defining things. Luckily that did slowly change over the years. In
+Ubuntu we had the opportunity to apply top-down build rules for hardening
+all the packages. In 2011 Chrome OS was following along and took advantage of
+a bunch of the security hardening work, as they were based on
+ebuild out of Gentoo, and when they looked for someone to help out
+they reached out to me. We recognized the Linux kernel was pretty much
+the weakest link in the Chrome OS security posture and I joined them to help solve that.
+Their userspace was pretty well handled but the kernel had a lot of
+weaknesses, so focusing on hardening was the next
+place to go. When I compared notes with other users of the Linux
+kernel within Google there were a number of common concerns and
+desires. Chrome OS already had an "upstream first" requirement, so I
+tried to consolidate the concerns and solve them
+upstream. It was challenging to land anything in other kernel team repos at Google, as they (correctly)
+wanted to minimize their delta from upstream, so I needed
+to work on any major improvements entirely in upstream and had a lot of support from Google
+to do that. As such, my focus shifted further from working directly on Chrome OS
+into being entirely upstream and being more of a consultant to internal teams, helping with integration or sometimes
+backporting. Since the volume of needed work was so gigantic I
+needed to find ways to inspire other developers (both inside and outside of Google) to help. Once I had a budget
+I tried to get folks paid (or hired) to work on these areas when it wasn't already their job.
+
+**Vagrant: So my understanding of some of your recent work is
+  basically defining undefined behavior in the language or compiler?**
+
+Kees: I've found the term "undefined behavior" to have a really strict meaning
+within the compiler community, so I have tried to redefine my goal as eliminating
+"unexpected behavior" or "ambiguous language constructs". At the end of the day
+ambiguity leads to bugs, and bugs lead to exploitable security flaws. I've been taking a four-pronged approach:
+supporting the work
+people are doing to get rid of ambiguity, identifying new areas where
+ambiguity needs to be removed, actually removing that ambiguity from the C language,
+and then dealing with any needed refactorings in the Linux kernel
+source to adapt to the new constraints.
+
+None of this is particularly novel; people have
+recognized how dangerous some of these language constructs are for
+decades and decades, but I think it is a combination of hard problems
+and a lot of refactoring that nobody
+has the interest/resources to do. So, we have been incrementally going after the
+lowest hanging fruit. One clear example in recent years was the elimination of C's "implicit fall-through" in "switch" statements.
+The language would just fall through between adjacent "case"s if a "break" (or other code flow directive) wasn't present.
+But this is ambiguous: is the code meant to fall-through, or did the author just forget a "break" statement? By [defining the "fallthrough" statement](https://en.cppreference.com/w/c/language/attributes/fallthrough), and [requiring its use in Linux](https://git.kernel.org/linus/dee2b702bcf067d7b6b62c18bdd060ff0810a800), all "switch" statements now have explicit code flow, and the entire class of bugs disappeared.
+During our refactoring we found that 1 in 10 of the places that needed annotating turned out to be missing "break" statements rather than intentional fall-throughs. This was an extraordinarily common bug!
+
+So getting rid of that ambiguity is where we
+have been. Another area I've been spending a bit of time on lately is
+looking at how defensive security work has challenges associated with metrics. How do you measure your
+defensive security impact? You can't say "because we installed locks on the doors, 20% fewer break-ins have happened."
+Much of our signal is always secondary or retrospective, which is frustrating: "This class of flaw was used $X much over the
+last decade, so if we have eliminated that class of flaw and will never see it again, what is the impact?" Is the impact infinity?
+Attackers will just move to the next easiest thing. But it means that exploitation gets incrementally more difficult. As attack surfaces are reduced,
+the expense of exploitation goes up.
+
+**Vagrant: So it is hard to identify how effective this is... How bad
+  would it be if people just gave up?**
+
+Kees: I think it would be pretty bad, because as we have seen, using
+secondary factors, the work we have done in the industry at
+large, not just the Linux kernel, has had an impact. What we, Microsoft, Apple, and everyone
+else are doing for our respective software ecosystems has pushed up
+the price of functional exploits on the black market, especially for really egregious stuff like zero-click remote code execution.
+If those were cheap then obviously we are not doing something
+right, and it becomes clear that it's trivial for anyone to attack the infrastructure that
+our lives depend on. But thankfully we have seen over the last two
+decades that prices for exploits keep going up and up into millions of
+dollars. I think it is important to keep working on that because, as a
+central piece of modern computer infrastructure, the Linux kernel has
+a giant target painted on it. If we give up, we have to accept that
+our computers are not doing what they were designed to do, which I
+can't accept. The safety of my grandparents shouldn't be any different
+from the safety of journalists, and political activists, and anyone else
+who might be the target of attacks. We need to be able to trust our
+devices; otherwise, why use them at all?
+
+**Vagrant: What has been your biggest success in recent years?**
+
+Kees: I think with all these things I am not the only actor. Almost
+everything that we have been successful at has been because of a lot
+of people's work, and one of the big ones that has been coordinated
+across the ecosystem and across compilers was [initializing stack
+variables to 0 by default](https://git.kernel.org/linus/f0fe00d4972a8cd4b98cc2c29758615e4d51cdfe).
+This feature was added in
+[Clang](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-ftrivial-auto-var-init),
+[GCC](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-ftrivial-auto-var-init),
+and [MSVC](https://msrc.microsoft.com/blog/2020/05/solving-uninitialized-stack-memory-on-windows/)
+across the board even though there were a lot of fears about forking the C language.
+
+The worry was that developers would come to depend on zero-initialized stack
+variables, but this hasn't been the case because we still warn about
+uninitialized variables when the compiler can figure that out. So you still get the warnings at compile
+time but now you can count on the contents of your stack at run-time and we drop
+an entire class of uninitialized variable flaws. While the exploitation of this class has mostly been around
+memory content exposure, it has also been [used for control flow
+attacks](https://outflux.net/slides/2011/defcon/kernel-exploitation.pdf). So that was politically and technically a large challenge:
+convincing people it was necessary, showing its utility, and
+implementing it in a way that everyone would be happy with,
+resulting in the elimination of a large and persistent class of flaws in C.
+
+
+**Vagrant: In a world where things are generally Reproducible do you
+  see ways in which that might affect your work?**
+
+Kees: One of the
+questions I frequently get is, "What version of the Linux kernel has feature $foo?"
+If I know how things are built, I can answer with just a
+version number. In a Reproducible Build scenario I can count on the
+compiler version, compiler flags, kernel configuration, etc.; all those
+things are known, so I can actually answer definitively that a certain
+feature exists. So that is an area where Reproducible Builds affects
+me most directly. Indirectly, just being able to trust that the
+binaries you are running are going to behave the same for the same build environment is critical for sane testing.
+
+**Vagrant: Have you used diffoscope?**
+
+Kees: I have! One subset of treewide refactoring that we
+do when getting rid of ambiguous language usage in the kernel is when
+we have to make source level changes to satisfy some new compiler
+requirement but where the binary output is not expected to change at all. It
+is mostly about getting the compiler to understand what is happening,
+what is intended in the cases where the old ambiguity does actually
+match the new unambiguous description of what is intended. The binary
+shouldn't change. We have used [diffoscope to compare](https://outflux.net/blog/archives/2022/06/24/finding-binary-differences/) the before and after binaries to
+confirm that "yep, there is no change in binary".
+
+**Vagrant: You cannot just use checksums for that?**
+
+Kees: For the most part, we need to only compare the text segments. We try to hold as much stable as we can, following the
+[Reproducible Builds documentation for the kernel](https://docs.kernel.org/kbuild/reproducible-builds.html), but there are
+macros in the kernel that are sensitive to source line numbers and as
+a result those will change the layout of the data segment (and sometimes the text segment too). With
+diffoscope there's flexibility where I can exclude or include different comparisons. Sometimes I just go look at what diffoscope is
+doing and do that manually, because I can tweak that a little harder,
+but diffoscope is definitely the default. Diffoscope is awesome!
+
+**Vagrant: Where has reproducible builds affected you?**
+
+Kees: One of the notable wins of reproducible builds lately was
+dealing with the fallout of the XZ backdoor and just being able to ask
+the question "is my build environment running the expected
+code?" and to be able to compare the output generated from one
+install that never had a vulnerable XZ and one that did have a
+vulnerable XZ and compare the results of what you get. That was
+important for kernel builds because the XZ threat actor was working to
+[expand their influence and capabilities to include Linux kernel](https://lore.kernel.org/lkml/27db456edeb6f72e7e229c2333c5d8449718c26e.camel@16bits.net/)
+builds, but they didn't finish their work before they were noticed. I
+think what happened with [Debian proving the build infrastructure was not affected](https://lists.reproducible-builds.org/pipermail/rb-general/2024-March/003321.html) is an
+important example of how people would have needed to verify the kernel
+builds too.
+
+**Vagrant: What do you want to see for the near or distant future in
+  security work?**
+
+Kees: For reproducible builds in the kernel, in the work that has been
+going on in the [ClangBuiltLinux project](https://clangbuiltlinux.github.io/), one of the driving forces of
+code and usability quality has been the
+continuous integration work. As soon as something breaks, on the
+kernel side, the Clang side, or something in between the two, we get a
+fast signal and can chase it and fix the bugs quickly. I would like to
+see someone with funding to maintain a reproducible kernel build
+CI. There have been places where there are certain
+architecture configurations or certain build configurations where we lose
+reproducibility and right now we have sort of a standard open source
+development feedback loop where those things get fixed but the time
+in between introduction and fix can be large. Getting a CI for
+reproducible kernels would give us the opportunity to shorten that
+time.
+
+**Vagrant: Well, thanks for that! Any last closing thoughts?**
+
+Kees: I am a big fan of reproducible builds, thank you for all your
+work. The world is a safer place because of it.
+
+**Vagrant: Likewise for your work!**
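
For readers following along, the implicit fall-through problem described in the interview above can be sketched in C. This is a minimal illustration, not kernel code: the `FALLTHROUGH` macro name, the `access` enum, the `PERM_*` values, and both functions are hypothetical. The macro only mirrors the general shape of the kernel's `fallthrough` pseudo-keyword by expanding to the GCC/Clang `__fallthrough__` statement attribute where the compiler reports support for it, and to an empty statement otherwise.

```c
/* Hypothetical macro: use the statement attribute when the compiler has it. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define FALLTHROUGH __attribute__((__fallthrough__))
# endif
#endif
#ifndef FALLTHROUGH
# define FALLTHROUGH do { } while (0) /* no-op on other compilers */
#endif

enum access { ACCESS_READ, ACCESS_WRITE, ACCESS_ADMIN };

#define PERM_READ  0x1
#define PERM_WRITE 0x2
#define PERM_ADMIN 0x4

/*
 * Intent A: each level also grants everything below it, so running into
 * the next case is deliberate and is spelled out with FALLTHROUGH.
 */
int perms_cumulative(enum access level)
{
    int perms = 0;

    switch (level) {
    case ACCESS_ADMIN:
        perms |= PERM_ADMIN;
        FALLTHROUGH;
    case ACCESS_WRITE:
        perms |= PERM_WRITE;
        FALLTHROUGH;
    case ACCESS_READ:
        perms |= PERM_READ;
        break;
    }
    return perms;
}

/*
 * Intent B: each level grants only its own bit, so every case ends in
 * "break". Dropping one of these breaks silently turns B into A, which
 * is the ambiguity the explicit annotation removes.
 */
int perms_exact(enum access level)
{
    int perms = 0;

    switch (level) {
    case ACCESS_ADMIN:
        perms |= PERM_ADMIN;
        break;
    case ACCESS_WRITE:
        perms |= PERM_WRITE;
        break;
    case ACCESS_READ:
        perms |= PERM_READ;
        break;
    }
    return perms;
}
```

Building with `-Wimplicit-fallthrough` (a flag both GCC and Clang accept) makes the compiler report any `case` that runs into the next one without one of these two explicit endings. Here `perms_cumulative(ACCESS_WRITE)` yields `PERM_WRITE | PERM_READ`, while `perms_exact(ACCESS_WRITE)` yields only `PERM_WRITE`.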

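The stack-initialization discussion in the interview above can be sketched in the same way. The example below is an assumption-laden illustration rather than anything from the kernel: `struct reply` and `build_reply()` are hypothetical names. It shows the manual pattern (an explicit `memset()` before filling in fields) that protects against leaking stale stack contents when a structure's raw bytes are copied out. As far as the compiler documentation linked in the interview describes it, building with `-ftrivial-auto-var-init=zero` has the compiler apply an equivalent zeroing to automatic variables that the source did not initialize.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical reply structure. On most ABIs the compiler inserts padding
 * bytes between "status" and "value"; those bytes are never written by the
 * field assignments below.
 */
struct reply {
    char status;
    long value;
};

/*
 * Fill a reply and copy its raw bytes into "out", similar in spirit to a
 * kernel copying a structure to userspace. Returns the number of bytes
 * written, or 0 if the output buffer is too small.
 */
size_t build_reply(unsigned char *out, size_t out_len, long value)
{
    struct reply r;

    if (out_len < sizeof(r))
        return 0;

    /*
     * Manual defense: clear the whole object, padding included, so no
     * stale stack bytes are copied out. Forgetting this line is the class
     * of bug that default zero-initialization is meant to neutralize.
     */
    memset(&r, 0, sizeof(r));
    r.status = 1;
    r.value = value;

    memcpy(out, &r, sizeof(r));
    return sizeof(r);
}
```

The design point Kees makes still applies to a sketch like this: the compiler keeps warning about variables it can prove are read before being written, so the flag acts as a run-time safety net and not as a replacement for explicit initialization.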


View it on GitLab: https://salsa.debian.org/reproducible-builds/reproducible-website/-/compare/7e1ebc6cfc3b4b4a72b306a6e43497e99a4e07e9...e4895c3a495a8b8d3b7187293e64020a150b7569


