DockerCon

Building the Software Supply Chain on Docker Official Images

James Carnegie, Principal Software Engineer, Docker

Ethan Heilman, CTO, BastionZero

Recorded on November 8th, 2023
Learn how Docker and BastionZero have leveraged open standards like TUF and SLSA, along with OpenPubkey, to help secure the software supply chain.

Transcript

Good morning and welcome. Thanks for coming. I’m James. I’m an engineer working at Docker, on Docker Scout and on Docker official images security. With me is Ethan from BastionZero.

Hi, I’m Ethan from BastionZero and I’ll be talking about the OpenPubkey part of this talk.

You may have noticed in the press and on social media, there’s been an announcement. Docker and BastionZero are working together with the Linux Foundation on a cool new signing solution. And that’s what we’re going to talk about today. But first I’m going to try and put this into a Docker official images context. Ethan will go into the details of OpenPubkey, and we’ll do a demo if the network’s working, and then hopefully we will have some time for questions at the end.

As a little disclaimer, we’re going to be showing some code. Some of it doesn’t exist. Some components don’t exist. The code exists. You can try it at home. It’s probably full of bugs. Don’t rely on it. It’s not production ready. So you’ve been warned.

    Docker official images

    What are Docker official images? Most of the people here should know what they are. You’re at DockerCon. But probably the most important thing to mention is that it’s one of the largest sources of open source packages in the world — if not the largest. I think that’s what Amy said at the beginning of the conference. There are 150 repos thereabouts across many, many, many platforms and architectures.

    Securing the supply chain of Docker official images is really important to Docker and really important to the community that helps us to maintain them. To build on that context, the Cybersecurity and Infrastructure Security Agency (CISA) identified two key areas to focus on when it comes to open source. The first one is vulnerabilities, and there’s been a lot of talk about vulnerabilities and remediation and developer workflows earlier in the conference. And in fact, Christian is doing another talk later on today. So look out for that.

    But we’ll be focusing on the second bit — on the supply chain associated with Docker official images. Docker is in a pretty unique position to be active in this space, not only because we produce them but also because, you know, we ship our client to 20 million laptops. And so we can start doing stuff with both of those sides of the equation.

    You can’t really talk about supply chain without mentioning SLSA. If you don’t know what SLSA is, it’s not your lunch. It’s a framework for modeling your supply chain. It has levels. You’re supposed to incrementally improve your security by going up those levels. We’re all at level zero. So congratulations. We did it. But seriously, for Docker official images, we’re going to be trying to get to level three. We’re not really going through this process. It doesn’t really make sense for us. We’ve got quite a lot going on in the hardened build area already. We’ve got a lot going on in provenance; some of our official images already have provenance and SBOMs. But today we’ll be focusing on the signing bit.

    Focus on signing

    So in going down this rabbit hole, we’ve kind of figured out that it’s not just about signing. It’s all about who’s signing, why are they signing, what are they signing? There are lots of examples of signing in the industry that are very successful in getting people to sign, for example, GPG. Everything in the Maven Central Repository is signed using GPG. But who’s verifying those signatures? What’s the value of those signatures? So unless you solve the verification problem, you’ve got to ask the question, what is the real value? As I said before, I think verification is the important bit. And for Docker official images, what we want is for verification to be the default, not an opt in.

    What that brings us to is a distribution problem, ultimately. How do you get the certificates, the trust policy, and any revoked artifacts and certificates to the clients in a safe way? And any code needed to do the verification also needs to move down to the client. As a producer of Docker official images, we have a policy. We want to say all of them should have a signed attestation, and all of them should have a provenance attestation. So how do we do that?

    TUF

    We’re looking at using TUF (The Update Framework) to do this. It’s one way to do it. I look at TUF as being like an app store: your iPhone probably has a TUF-like implementation. Certainly your browser has a TUF-like implementation for bringing down the CA store. So it’s a good pattern for ratcheting security forward, with a timestamping service so that your client always knows that everything is up to date. And there’s a spec out there. There’s code as well, and it’s being used in the world by PHP and Python and various other package managers.
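The "ratcheting forward" idea can be sketched in a few lines. This is only an illustration, not the TUF specification or a real TUF client: the `version` and `expires` fields mirror the kind of information TUF metadata carries, while the `accept_metadata` helper and the in-memory dictionaries are hypothetical.

```python
from datetime import datetime, timezone


def accept_metadata(trusted: dict, candidate: dict, now: datetime) -> bool:
    """Decide whether `candidate` metadata may replace `trusted` metadata.

    Hypothetical helper: a real client would also verify signatures and
    check the chain of roles starting from the embedded root.
    """
    # Freshness: metadata past its expiry time is rejected.
    if candidate["expires"] <= now:
        return False
    # Ratchet: never move back to an older version (rollback protection).
    if candidate["version"] < trusted["version"]:
        return False
    return True


now = datetime(2023, 11, 8, tzinfo=timezone.utc)
trusted = {"version": 7, "expires": datetime(2023, 11, 9, tzinfo=timezone.utc)}
newer = {"version": 8, "expires": datetime(2023, 11, 10, tzinfo=timezone.utc)}
rollback = {"version": 6, "expires": datetime(2023, 11, 10, tzinfo=timezone.utc)}
stale = {"version": 9, "expires": datetime(2023, 11, 1, tzinfo=timezone.utc)}
```

Here `newer` would be accepted, while `rollback` and `stale` would both be refused, which is the property that lets a client know its trust policy is current.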

    In the context of Docker, what does this look like? So it’s boxes and lines. Over on the left-hand side, we’ve got a GitHub Action and our TUF repository in Git. The idea is that it’s managed by Docker staff, and that sets which keys, what the policy is, and which attestations should be there. That will be automatically published into the registry. The registry is a nice native place for representing TUF. With TUF, it doesn’t matter where you store stuff. It’s all cryptographically secure. So the registry is a good place.

    On the right-hand side, we have the signer. This kind of framework with TUF could work with any signing technology. So we kind of see this as a good substrate for delivering signing technologies to Docker official images. So Docker’s clients… When you do “docker pull,” the idea is that the TUF root is embedded in that client. It will check that it has the latest version of the trust policy when you do “docker pull,” but it’s also going to check the signatures. So although we can put any old signing on there, we really do care about the signing technology.

    We want a signing technology that’s super simple, easy to explain, and easy to deploy. You shouldn’t need an extra trust point in order to verify Docker official images. You’re already going to be trusting our TUF root. So we don’t need another CA involved. We should just be able to trust the signatures from Docker. So it should be secure. It should use current crypto. We’re not going to invent any of our own crypto here. We learned that the hard way many years ago. And it should be open, and it should be free. And so with that, we’re going to hand over to Ethan. He’ll tell you the details of our solution.

    OpenPubkey

    Thanks, James. So I’m going to be talking about OpenPubkey, which is a protocol for signing objects and verifying them under identities. So before I begin, I want to explain the problem that OpenPubkey is trying to solve. So imagine you have some workload, say a GitHub Action. And we’re just going to name this workload Alice for the purposes of convenience to refer to the workload. But it’s not a user in this context; it’s just a workload. And it creates some image, and it signs this image, and it uploads that image along with Alice’s signature and Alice’s public key to a registry. Now Bob trusts Alice and downloads the image, the public key, and the signature, but Bob wants to check this signature and make sure it’s signed. And Bob has a question. Even if the signature verifies under the public key, how does he know that this public key is in fact Alice’s public key? It could be anyone’s public key. Anyone could put a public key here.

    So when we talk about workload identity with GitHub Actions, we should think about OpenID Connect. GitHub Actions has an IDP, an identity provider, built on OpenID Connect that provides identities to workloads and allows workloads to prove their own identity. So we’re going to build the solution to this problem of “how do I know this is really Alice’s public key” using OpenID Connect. And so OpenPubkey is a protocol for binding a public key to an identity using OpenID Connect.

    The advantage of OpenPubkey is that you don’t have the key management headaches, such as having a long-term signing key that you have to move from device to device, or having to create a new one if a user loses that signing key. Using OpenPubkey, you can generate signing keys as well as delete them when you don’t need them anymore. It works without any changes to your IDPs. So you can just use this for Google or GitHub or Microsoft today. And it’s secure. It doesn’t add any trusted parties. We’re not saying, here’s a new CA that you have to use. It’s just you and your identity provider as it was before. And now I’m really happy to announce that, with Docker, OpenPubkey is now a Linux Foundation project and is available open source so people can build on it.

    Before I explain the details of how OpenPubkey works, I need to provide a little background on OpenID Connect, and I’m going to use GitHub Actions as the example here. So we have our workload that we’ve named Alice for convenience here. This workload can authenticate to the GitHub Actions IDP, and when it does this, the IDP will create an ID token, which has a number of attestations about Alice’s identity. And the IDP will sign this ID token with the IDP’s signing key. Alice can then present this ID token to Bob, and Bob will be convinced that Alice is Alice. Bob can go and check that the ID token is signed under the GitHub Actions public key by downloading the public key at the GitHub JWKS URI, which is the OpenID Connect way of publishing public keys.
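To make the ID token more concrete: it is a JSON Web Token, three base64url-encoded parts (header, claims, signature) joined by dots. The sketch below builds a fake token and decodes its claims. The claim values, the `kid`, and the `decode_claims` helper are invented for illustration, and the signature check against the keys from the JWKS URI is deliberately left out, so this shows the structure only.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def decode_claims(id_token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    _header_b64, payload_b64, _signature_b64 = id_token.split(".")
    return json.loads(b64url_decode(payload_b64))


# Illustrative values only; a real token is issued and signed by the IDP.
header = {"alg": "RS256", "typ": "JWT", "kid": "example-key-id"}
claims = {
    "iss": "https://token.actions.githubusercontent.com",
    "sub": "repo:example-org/example-repo:ref:refs/heads/main",
    "aud": "example-audience",
}
fake_token = ".".join([
    b64url_encode(json.dumps(header).encode()),
    b64url_encode(json.dumps(claims).encode()),
    b64url_encode(b"not-a-real-signature"),
])
```

Calling `decode_claims(fake_token)` gives back the claims dictionary; a verifier like Bob would only trust those claims after checking the third part against the IDP's public key.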

    How it works

    How are we going to make this work and add signatures to this? Well, I’m going to provide a simplified version of OpenPubkey. It’s basically OpenID Connect. And then I’ll provide the little bit of complexity that OpenPubkey adds here. But basically OpenPubkey is the same as before. But now Alice generates a key pair: a public key and a signing key. And notice that there is this audience claim here, which is part of GitHub Actions. Alice can put whatever she wants in this aud or audience parameter. And so she puts her pub key in the audience parameter. And then when the IDP signs the ID token and returns it, the IDP will just put whatever value Alice supplied in here. So if Alice puts her pub key, there is now an ID token signed by the IDP that contains Alice’s pub key along with her identity.

    So Alice can now sign objects. And if she publishes her ID token along with the signed object, Bob can check that that object is signed under Alice’s identity because the IDP has attested to the pub key used by Alice. Notice that, in some sense, the ID token is functioning like a certificate issued by a certificate authority, binding identity to a public key. But we haven’t added a certificate authority here. It’s just the regular old IDP that you’ve used before. So this also works for user identity. I won’t go into the full details here, but you can do this with, say, Google’s OpenID provider. You just have to change the audience parameter to a nonce parameter, and then you’ll get the public key in here.

    At BastionZero, we use the user identity rather than the workload identity. And it’s a really powerful tool, because we can essentially get a signed statement that says Google says Harold’s public key is X. And so, if you think about SSH, what are SSH pub keys doing? They’re saying that this person is allowed to connect, because this public key is there, but we don’t need that anymore. We can do SSH without SSH keys, just trusting the IDP to identify users, and everyone is already trusting the IDP. We also use this to build TLS tunnels secured by user identity. And we can have identity be checked at both the network layer and the server or end host or K8s cluster by just checking that the identity has the pub key that we expect. And then you can bootstrap an authenticated or secure channel there.

    One of the things that this allows us to build is logging and policy enforcement without trust. The integrity of the message can be protected, so the logger or the policy enforcer can’t change messages, because we have a pub key protecting it. But it can still log and enforce policy on that traffic. I’m not going to go into any real detail about any of this, but ask me about this after the talk. I love talking about it. What we built is really exciting.

    Potential attacks

    Notice that there’s a tension here. OpenID Connect is treating these ID tokens as authentication secrets that have to be kept secret and then revealed to authenticate. The name for this pattern is bearer authentication (you bear a token to authenticate), whereas OpenPubkey is treating these ID tokens like public certificates. You publish them with the signature, potentially on a public registry or anywhere on the internet, and you use these to verify that the object has been signed by the identity. So these two uses are in tension.

    What if, for instance, someone takes an ID token that was published for OpenPubkey and replays it to an OpenID Connect authentication service that is misconfigured, so it doesn’t check that this is for OpenPubkey? It doesn’t check all the fields that it should. It says, hey, you know, GitHub or Google says this is Alice. How do we prevent those sorts of attacks?

    Let’s look at the attack a little more specifically. You have Alice, and she is sending her ID token and a signed object to Bob, same as before. And now we have evil Bob, and evil Bob takes the ID token and replays it to a misconfigured OpenID Connect service, and says, hey, I’m Alice. And the service says, hey, it’s signed by GitHub, you are Alice.

    GQ signatures

    Our plan of attack to solve this is that we want to preserve the security and cryptographic properties of ID tokens but make them valid for OpenPubkey but not valid when used for OpenID Connect authentication. To do this Alice is going to replace the IDP signature here with a proof that she knows the IDP signature — a cryptographic proof that she knows the IDP signature. So this will be as strong as the actual signature, but won’t be the signature anymore. And so the technique we use for this is GQ signatures. So she takes the IDP signature off and provides a proof of signature that she knows the signature of the ID token.

    The result is that, if this ID token with the proof of signature is replayed to a misconfigured service, the misconfigured service will attempt to verify it according to the rules of OpenID Connect, and it will fail, and they will reject it so it can’t be abused. But this proof of signature has the same security properties as the signature while allowing Alice to keep the actual signature secret. Alice could even delete the actual signature so that it could not leak out once she has her ID token.
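The idea of replacing a signature with a proof of knowing it can be tried out on a toy scale. The sketch below is a simplified, GQ-style Fiat-Shamir proof written for illustration; it is not the OpenPubkey implementation. It assumes a tiny RSA key built from two Mersenne primes, replaces real RSA padding with a plain hash, and every function name in it is hypothetical.

```python
import hashlib
import secrets

# Toy RSA key standing in for the IDP's signing key. Both primes are Mersenne
# primes chosen for brevity; a 92-bit modulus is NOT secure.
P, Q = 2**31 - 1, 2**61 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, known only to the IDP
ROUNDS = 8  # challenges come from [0, E), so each round adds about 16 bits


def hash_to_int(*parts: bytes) -> int:
    """Hash length-prefixed byte strings to a non-negative integer."""
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(4, "big") + part)
    return int.from_bytes(h.digest(), "big")


def message_rep(payload: bytes) -> int:
    """Toy stand-in for RSA padding: map the payload into [0, N)."""
    return hash_to_int(b"msg", payload) % N


def idp_sign(payload: bytes) -> int:
    return pow(message_rep(payload), D, N)


def rsa_verify(payload: bytes, signature: int) -> bool:
    """The check an ordinary RSA signature verifier would perform."""
    return pow(signature, E, N) == message_rep(payload)


def challenges(payload: bytes, commitments: list) -> list:
    """Fiat-Shamir: derive challenges from the payload and the commitments."""
    seed = b"".join(c.to_bytes(16, "big") for c in commitments)
    return [hash_to_int(b"chal", payload, seed, bytes([i])) % E
            for i in range(len(commitments))]


def gq_prove(payload: bytes, signature: int):
    """Prove knowledge of `signature` without putting it in the proof."""
    nonces = [secrets.randbelow(N - 1) + 1 for _ in range(ROUNDS)]
    commitments = [pow(r, E, N) for r in nonces]
    cs = challenges(payload, commitments)
    responses = [r * pow(signature, c, N) % N for r, c in zip(nonces, cs)]
    return commitments, responses


def gq_verify(payload: bytes, proof) -> bool:
    """Accept if response^E == commitment * m^challenge (mod N) every round."""
    commitments, responses = proof
    m = message_rep(payload)
    cs = challenges(payload, commitments)
    return all(pow(t, E, N) == T * pow(m, c, N) % N
               for T, t, c in zip(commitments, responses, cs))


payload = b'{"iss": "example-idp", "sub": "alice"}'
signature = idp_sign(payload)         # Alice keeps this secret
proof = gq_prove(payload, signature)  # Alice publishes this instead
```

In this sketch `gq_verify(payload, proof)` succeeds, but none of the published values passes `rsa_verify`, which mirrors the point above: a service that only knows how to check the original signature has nothing it can accept.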

    I won’t go into the details of how GQ signatures — the technique we use — works. I’ll point out that it was invented in 1988 in this pretty famous paper. And Newman just this year proposed using GQ signatures to solve exactly this problem and even addresses OpenPubkey and Sigstore as potential use cases for GQ signatures. And so a lot of what we’re doing is based on this paper when it comes to GQ signatures.

    Remember how I said that there was a simplification here, where we didn’t just put the public key in the audience claim? Well, now I’m going to explain how it’s a little bit more complex but not that much more complex. There’s some additional metadata that we want included along with the workload’s public key, such as the algorithm that the user will sign with, and maybe some additional claims that Alice wants to make, or rather that Alice’s client, the workload’s client, wants to make.

    What we’re going to do is take this metadata along with the public key. We’re going to hash it all together, and then we’re going to actually supply the hash to the audience claim, rather than all of this data. But notice that Alice’s public key is still in that hash. A result of this is that the ID token alone no longer allows someone to check whether the public key is attested to in the ID token. You need these other fields as well. So we want to package these all together as one object, and we use the fact that the ID token is a JSON web signature. And JSON web signatures support having more than one signature on them. So we add a second signature and header that contains the values that we’re going to hash into the audience including Alice’s public key, and we call this the CIC. Don’t worry about the name. It stands for client instance claims, but know that it contains the public key and then some other metadata.
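The hash-into-the-audience step can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 over sorted-key JSON is an arbitrary choice made here, and the field names `public_key` and `nonce` are hypothetical, so the real OpenPubkey encoding may well differ.

```python
import hashlib
import json


def cic_commitment(cic: dict) -> str:
    """Hash the client instance claims into a single audience value."""
    # Sorted keys and fixed separators make the encoding deterministic.
    canonical = json.dumps(cic, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def cic_matches(id_token_claims: dict, cic: dict) -> bool:
    """Check that the ID token's audience commits to exactly this CIC."""
    return id_token_claims["aud"] == cic_commitment(cic)


# Hypothetical CIC: the public key plus the extra metadata.
cic = {"alg": "ES256", "public_key": "alice-public-key", "nonce": "random-value"}

# What Alice asks the IDP to sign: her identity plus the commitment.
id_token_claims = {"iss": "example-idp", "sub": "alice", "aud": cic_commitment(cic)}

# A CIC carrying someone else's key does not match the same ID token.
other_cic = dict(cic, public_key="mallory-public-key")
```

The ID token alone now reveals only a hash, which is why the CIC has to travel with it for anyone to check the binding.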

    PK tokens

    We call this whole thing a PK token. It’s an ID token augmented with this ability to verify that the public key is attested to in the ID token.

    So let’s say Alice wants to sign an image using OpenPubkey. So the first thing she’s going to do is generate a key pair including her signing key. She’s going to create the CIC, which includes her public key and other metadata. And then she’s going to hash the CIC and request an ID token, supplying the hash as the audience parameter to the IDP. The IDP, in this case GitHub, will reply with a signed ID token. Using what we just talked about on the last slide, where she creates a new signature on the ID token, she will now create a PK token from the ID token. And you can see that it’s just an additional signature on the ID token. The ID token is part of the PK token. Now she’s ready to sign. So she takes the image that she wants to sign. She signs it with her signing key to get the signature A, and she uploads the image, the signature, and the PK token to her repository.

    In Docker’s case, Docker is doing single-use signatures, which is a feature of OpenPubkey. And this ensures that a PK token can only be used to sign a particular object or particular image. To enforce this, we use the client instance claim, and we provide an additional parameter sig, which is just the hash of the object to be signed. So that signing key can only be used for that object.
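A single-use restriction of this kind could look roughly like the sketch below. Only the `sig` parameter name comes from the talk; the other field names, the use of SHA-256, and both helper functions are assumptions made for illustration.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def make_single_use_cic(public_key: str, obj: bytes) -> dict:
    """Build a CIC that names the one object this key is meant to sign."""
    return {"alg": "ES256", "public_key": public_key, "sig": digest(obj)}


def key_may_sign(cic: dict, obj: bytes) -> bool:
    """A verifier rejects signatures over any object other than the named one."""
    return cic.get("sig") == digest(obj)


image = b"example image manifest"
cic = make_single_use_cic("alice-public-key", image)
```

Because the CIC is hashed into the ID token's audience, the object hash ends up covered by the IDP's signature as well, so the restriction cannot be edited out afterwards.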

    When Bob wants to verify a signature that’s been signed under OpenPubkey, Bob downloads the image, the signature, and the PK token. Then, Bob checks that the ID token inside the PK token is signed by, say, GitHub, checks that the CIC hashes to the audience claim, extracts Alice’s public key from the CIC, and then checks that the image is signed under that public key, which Bob now knows is Alice’s.
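Bob's checks can be walked through end to end with toy keys. The sketch below is an illustration under heavy assumptions, not OpenPubkey's real token format: it uses tiny RSA keys built from Mersenne primes, a plain hash instead of RSA padding, a dictionary in place of a real PK token, and hypothetical names throughout.

```python
import hashlib
import json

E = 65537  # public exponent shared by every toy key below


def make_key(p: int, q: int) -> dict:
    """Build a toy RSA key from two primes (illustration only)."""
    return {"n": p * q, "d": pow(E, -1, (p - 1) * (q - 1))}


def to_int(data: bytes, n: int) -> int:
    """Toy stand-in for RSA padding: hash the data into [0, n)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(key: dict, data: bytes) -> int:
    return pow(to_int(data, key["n"]), key["d"], key["n"])


def verify(n: int, data: bytes, signature: int) -> bool:
    return pow(signature, E, n) == to_int(data, n)


def canonical(obj: dict) -> bytes:
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def commitment(cic: dict) -> str:
    return hashlib.sha256(canonical(cic)).hexdigest()


def bob_verifies(image: bytes, image_sig: int, pk_token: dict, idp_n: int) -> bool:
    claims, cic = pk_token["claims"], pk_token["cic"]
    # 1. The ID token inside the PK token is signed by the IDP.
    if not verify(idp_n, canonical(claims), pk_token["idp_sig"]):
        return False
    # 2. The CIC hashes to the audience claim.
    if commitment(cic) != claims["aud"]:
        return False
    # 3. Extract Alice's public key from the CIC, and
    # 4. check the image signature under that key.
    return verify(cic["public_key"], image, image_sig)


# Alice's side: key pair, CIC, ID token from the IDP, PK token, signature.
idp_key = make_key(2**31 - 1, 2**61 - 1)
alice_key = make_key(2**89 - 1, 2**107 - 1)
cic = {"alg": "toy-rsa", "public_key": alice_key["n"]}
claims = {"iss": "example-idp", "sub": "alice-workload", "aud": commitment(cic)}
pk_token = {"claims": claims, "idp_sig": sign(idp_key, canonical(claims)), "cic": cic}

image = b"example image bytes"
image_sig = sign(alice_key, image)
```

With these values `bob_verifies(image, image_sig, pk_token, idp_key["n"])` passes, while a modified image, or a PK token whose CIC was swapped for another key, fails at step 4 or step 2 respectively.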

    In summary, OpenPubkey does not require any changes to the IDP. It just works using the OpenID Connect protocol, but it augments it with this ability to bind signing keys and public keys to identities. It’s very extensible: you can build lots of different things. We’re looking at one use case with Docker, but for example some MIT students read the paper and wrote an encrypted chat room using OpenPubkey and MIT’s OpenID Connect IDP.

    It’s secure. There’s no new parties that are being added to OpenID Connect, and with GQ signatures, we can ensure that even against misconfigured services these ID tokens can’t be replayed. And it’s convenient. There’s no key management; signing keys are ephemeral, and it uses the OpenID flow that’s already being used for users and workloads when someone signs in with Google or has a GitHub Action. For full details, see our OpenPubkey paper, although the paper’s focused a little bit more on user identity. Now I’m going to turn it back over to James.

    Demo

    Thank you. Amazing, that stuff is so neat. What I’m going to do is demo just that small flow. What I’m going to do is make an important change (I think that’s already made) to my Docker image, to my code, commit the change, sign it, and push to GitHub. Where is that? Over here on GitHub, we should see the commit coming in. There we go.

    While that’s building, I’ll show you what the GitHub Action looks like, and the changes. So right here we’ve just added another BuildKit image. That’s a nice way to extend BuildKit. That’s just a fork that’s living in the OpenPubkey organization on GitHub. In the future, we hope to try and negotiate with those guys to try and get that built in. So use the GitHub Action to automatically get signed. You don’t have to do anything: zero configuration. No management of keys, no secrets stored anywhere, nothing at all.

    Do head over to the OpenPubkey organization on GitHub. This is where the code is. This is the main library, contributed by Ethan. This is a little verify CLI plugin, which I’m just about to show you, or you probably saw my cast version. And here’s the BuildKit fork, so go over there and get involved, raise issues. Please get involved.

    Back here to my terminal. I should be able to “docker verify”. If you look carefully, we can see that all of the details in the ID token are being verified. We’ve checked that they’re signed by GitHub. We’ve checked the Docker organization. If I did this again with the wrong one, it should fail, to show that it’s not vaporware. Oh, yeah, it did fail. There we go.

    We’re still working on the policy bit. We need a policy language that we can bring down to clients. So that bit’s undefined. At the moment, the attestations are in in-toto format, and I’ll show you a bit of that. Hang on a minute.

    What’s coming

    Just to put that into a little bit of context, let me tell you which bits we’ve done and which bits we haven’t. Although Ethan didn’t touch on it, one important aspect is probably to add a transparency log here, so that all of the signatures that are generated on GitHub Actions are logged, certainly for the Docker official images use case. We’re keeping an eye on GitHub, and we’re putting it in the transparency log. So if anything does go wrong there, we can notice it happening, and we can attempt to fix it, and we can always revoke using TUF.

    Oh, the other thing I forgot to mention. And it’s an important step. As everybody knows who works with OIDC, those public keys expire. And they rotate them. They rotate them regularly. They can change their rotation frequency. This happens all the time. And so what we’re going to do, at least with the first drop, is for Docker to put those public keys into the TUF repository and distribute them down to the client. That works for Docker official images but might not work for everyone. Because if you’re trusting our TUF repo, you can trust us to put those public keys in there. Having said that, what are we going to sign our TUF root with? Should we be signing it with OpenPubkey? And if we do sign it with OpenPubkey, what does that mean? It probably means we can’t put the public keys in there. We need to keep them in some other log.

    But, OpenPubkey supports the idea of multi-factor signing. So what we could do is add another signature. So, hey, you know, I’ve also got a login with Google, and we’ll counter-sign that and add a third signature to that list. And that’s very, very cool.

    So, I know not everybody’s going to like this. Some people like it. Some people don’t. But this is what we’re doing in the beginning. We’re attaching the signatures to indexes as images, and those are indexed with an unknown architecture. We acknowledge that this kind of works with most tools. It means our signatures are going to move around. It’s not going to break the registry. And we understand it’s not the end game. We understand that there’s artifacts and reference types and all these really important initiatives going on, and we’re definitely going to embrace those. But this is just a starting point. And also a call to action to engage and help us make this work. We understand that if you’re tying to specific architectures and images, then there’s no way to find those signatures. We know this doesn’t solve everything, so please do get involved.

    Each of the attestations is stored as a layer in that image. Again, it’s a bit of a kludge, but it does work. I’m sure people who are in that world will understand why we’ve taken this approach. There you can see the SBOM and the SLSA provenance. So the provenance attestation is here. There’s a little bit more detail, but you can see right there we’ve called it OPK, which is OpenPubkey. And drilling down even further into the signature, here we’ve got the OIDC payload with the signature stripped. We’ve got the OpenPubkey signature, and we’ve got the GQ proof. So that should, if you remember, look very much like Ethan’s slide.

    So there are loads of open questions. I mentioned referrers and artifacts. Another big one is downgrade attacks. Right now we’re not resilient to that because no one’s verifying signatures. But actually Notary does protect you against downgrade attacks by nature of the fact that the tags themselves are signed. So that is not going to happen with this initial drop. Having said that, the impact is probably extremely low, and there are solutions to that. So we’re going to look at that.

    The other thing is, is it possible for the OIDC providers to help here? And I think there really are opportunities here. And it’s good for them, and it’s good for us. If this becomes a popular way of signing things, which I think it will, because it’s so easy and open, maybe they could sign their public keys in WebPKI. Maybe they could log their public keys and make them available. Maybe they could have their own transparency logs, which actually maybe they should anyway. So I think there’s a lot we can do there. And I mentioned earlier, should we actually sign our TUF root with OpenPubkey? And maybe we can with the multi-factor stuff.

    These are the things that are coming. We’re adding the transparency log. We’re going to add Docker’s TUF root. That needs to be added to a whole bunch of clients so that we can actually use it and start verifying those signatures. And we need to solve the public key logs problem. We’re going to be adding them to TUF. We can add a checker, a monitor to monitor that. And anyone can monitor it. It’s all in GitHub. It’s all open.

    So that’s the call to action. Come and get involved. We think it’s super exciting. And we think it works specifically in the open for Docker official images. But also, if you want to bring this kind of technology in-house, that’s good. You don’t need an external CA. You can build on top of your own OIDC provider.

    Q&A

    So, thank you, and if you have any questions, now’s a good time. Does anyone have any questions?

    Thank you for the presentation. So how do you compare OpenPubkey with Cosign?

    Sure. So the question is how to compare OpenPubkey with Sigstore’s Cosign? I would argue that OpenPubkey is a way of binding public keys to identities. And so it fits very well within Sigstore. You could put OpenPubkey in Fulcio, whereas Sigstore has a bunch of other components that are not in OpenPubkey. And there are other components that are about signing, transparency logs, and monitoring. And I think all of that stuff is super cool, and something that can work very well with OpenPubkey. I see OpenPubkey as a public key to identity binding mechanism but not as a complete signature system. And Docker is building more of a complete signature system, and they’re using OpenPubkey for this identity to public key binding.

    Learn more

    This article contains the YouTube transcript of a presentation from DockerCon 2023. “Building the Software Supply Chain on Docker Official Images” was presented by Ethan Heilman, CTO, BastionZero, and James Carnegie, Principal Software Engineer, Docker.
