Die Softwareherkunft (Software Provenance): An opera in two acts
How the modernization of identity and software provenance will change the way we think about building software
Overture
For the longest time, I believed that utilizing a single CI/CD system was the only way to hold teams accountable for meeting standards for building software. As supply chain security policies impact how we build software more each day, this has become a core tenet of how I think about build systems. Centralized CI/CD obviates the need to solve some very challenging problems. Of those, establishing the provenance of an artifact is perhaps the most challenging.
In order to establish provenance, we must establish identity. Identity and access management (IAM) is a hard problem to solve and remains largely unsolved in modern orchestration systems like Kubernetes. Unless you’re leaning on a cloud provider’s services or a SaaS solution with integrations to do it all for you, building IAM often yields a patchwork quilt of systems that each hold some partial representation of the truth required to authenticate and authorize an action.
Act 1: The Temple of the Ordeal
The curtain rises. A small group of people huddled together stare into the distance against a backdrop of a tall, glistening ivory tower.
Because identifying individual actors is so challenging, leaning on centralized CI/CD is naturally the easiest way to solve build provenance. A common workflow for a secure artifact pipeline might look something like this:
Build and test
Authenticate signatures / validate that a PR caused the change
Vulnerability scan and analyze results
Run software composition analysis (SCA)
Sign the artifact
Publish the artifact
Initiate deploy
Consumer validates signature
In this case, the signature simply means “this artifact is good to go.” It’s clean, easy to reason about, and the pretty diagrams look great for the auditors. All for the low, low cost of 5-10 full-time engineers.
But people want things. They always want CI/CD to do more, generally more than 5-10 engineers can handle. Your build team is presented with a choice: say no, or accrete features. Golden paths are all well and good, but a “my way or the highway” attitude only foments resentment. We’re all people pleasers, so the most common scenario is continuously adding new features to the CI/CD pipeline as it accumulates more customers with different needs. In some ways, this is a badge of honor. Saying no, however, leads to a world of hurt for everyone left out of the ivory tower.
All too often the details of these build processes are lost in a miasma of Bash and Make. If you’re lucky, you have release engineers who produce, at the very least, systems that can be maintained by people who know Jenkins, Make, Groovy, etc. That is not terribly difficult to hire for. It’s just as likely (if not more so), however, that your build processes have grown organically over the lifetime of the company. Build.sh started out simple enough, but over time has turned into a latticework of interdependent functions and global state.
Nobody knows what’s actually required, because what’s documented and what’s implemented aren’t in agreement. That’s if you’re lucky enough to have documented requirements and an implementation at all. Documentation gets stale. Security teams are often so far removed from product teams that they cannot articulate requirements for the systems running in production. They may be able to define policy, but providing guidance on implementation is beyond them.
So often, when you tell teams to go it alone, what they actually need to build is a big mystery. You try to articulate what your system does, but invariably more and more “requirements” for their implementation pop up over time. This game of whack-a-mole is already familiar to them from audit-finding remediation, but now we’ve added more fuel to the fire in their bellies: security == annoying and expensive.
If you haven’t read it yet, Kelly Shortridge has written a fun piece on security obstructionism; I recommend you go read it now. Requiring the use of centralized systems as the only mechanism for enforcing security policy could be seen as a form of security obstructionism, particularly if you’re obfuscating requirements behind an opaque system. Publish and encode all requirements so that an exacting implementation becomes less relevant.
We can do so much better. Once we can identify an actor (person or process) in a system, everything else becomes a lot easier. What we need to do is rub some modernization and open standards on it.
Act 2: The Temple of the Sun
Villagers gleefully chatter amongst themselves against a backdrop of a rising sun and a lush garden. Enter, stage left, amid great fanfare: a group of priests bearing signatures and attestations.
Signing things has been a hard problem for a long time. It has recently become much easier thanks to Sigstore. Sigstore and its component systems provide ephemeral signing material and a public transparency log, as well as integrations with identity providers and key management systems (like its Fulcio CA or cloud-based KMS).
A fundamental premise of Sigstore is that signing and key management are independent problems. Even an expired key allows third parties to verify the identity of an actor at a point in time. Signatures are all well and good, but that’s only the I of IAM. Now we couple our identity with an attestation of the action taken, and we have the means to authorize.
It’s unfortunate that the advancements in this area of software engineering come on the heels of security incidents on such a grand scale that they have household names internationally. That being said, this has been a long time coming as build and release engineering is typically funded at less than 1% of total R&D spend. That innovation in this area has been slow and has often come out of the needs of hyperscalers should be no surprise.
As an example, with in-toto attestations, we can show that an artifact was scanned for vulnerabilities. A vulnerability scan attestation can include salient details such as the identity of the scanner, its version, and when the scan was executed. Subsequently, at the point of enforcement, we can ensure that the action taken meets policy, yielding governance at the same time. In the old world, your central build system would sign an artifact, and that signature alone would indicate that standards have been met. While this allows for a more simplified implementation at the point of enforcement, it prevents developers from taking advantage of modern, distributed build systems like GitHub Actions.
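A sketch of what such a scan attestation's payload might look like, loosely following the in-toto Statement shape. The predicate type, field names, and values here are illustrative, not a published schema, and the digest is a placeholder:

```python
import json

# Sketch of an in-toto-style Statement for a vulnerability scan.
# The outer layout follows in-toto's Statement; the predicate fields
# (scanner, result) are illustrative, not a published schema.
statement = {
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [
        {"name": "registry.example.com/app",
         "digest": {"sha256": "c0ffee" + "0" * 58}},  # placeholder digest
    ],
    "predicateType": "https://example.com/vuln-scan/v1",
    "predicate": {
        "scanner": {"name": "trivy", "version": "0.50.0"},
        "scanFinishedOn": "2024-01-15T12:00:00+00:00",
        "result": {"critical": 0, "high": 2},
    },
}

# The serialized payload is what would be signed and published.
payload = json.dumps(statement, sort_keys=True)
```

The subject binds the claim to a specific artifact by digest, and the predicate carries exactly the salient details named above: the scanner's identity, its version, and when the scan ran.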
To give developers the freedom to choose how they build software, security organizations must publish the requirements behind the policies they seek to enforce. Common policies might include a recent vulnerability scan showing no critical vulnerabilities, or a requirement that every commit carry two signatures indicating review. Before attestations, this would have required engineers to go through a continuous cycle of security review and audit of their build systems, exploding the cost of the project.
In the new world, we get out of the way by publishing requirements along with documentation on how to attest to meeting them, allowing us to validate implementations with the consumers of attestations. This does require more than authenticating the developer who set up the GitHub Action; we’re now also responsible for authenticating systems as actors. Look at in-toto’s example of a provenance attestation and you’ll find everything we need to authenticate both the actor and the action.
Let's pretend that we're using an open source tool like Trivy to scan containers. We have two ways to go about attesting that we actually used Trivy. We could run a centralized Trivy service that scans containers and front it with a proxy that creates and signs attestations. Alternatively, Trivy could identify itself in the form of a software provenance attestation and bill of materials. This could be accomplished by returning, alongside the scan results, an attestation of the results via an open standard like SPDX or in-toto.
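Either way, the point of enforcement ends up checking the attestation's claims against the published policy, which can be as plain as a few field comparisons. A minimal sketch, with a hypothetical policy (the field names and thresholds are assumptions, not any standard):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical published policy: which scanner we trust, how fresh the
# scan must be, and how many critical findings we tolerate.
POLICY = {"scanner": "trivy", "max_age": timedelta(days=7), "max_critical": 0}

def meets_policy(predicate: dict, now: datetime) -> bool:
    """Check a scan attestation's predicate against the published policy."""
    scanned_at = datetime.fromisoformat(predicate["scanFinishedOn"])
    return (
        predicate["scanner"]["name"] == POLICY["scanner"]
        and now - scanned_at <= POLICY["max_age"]
        and predicate["result"]["critical"] <= POLICY["max_critical"]
    )

# Illustrative predicate, as a signed scan attestation might carry it.
predicate = {
    "scanner": {"name": "trivy", "version": "0.50.0"},
    "scanFinishedOn": "2024-01-15T12:00:00+00:00",
    "result": {"critical": 0, "high": 2},
}
now = datetime(2024, 1, 16, tzinfo=timezone.utc)
assert meets_policy(predicate, now)
```

Because the policy is data rather than a buried pipeline step, it can be published alongside the attestation requirements it enforces.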
A key to the success of this endeavor is clearly documented policies and requirements. Once we’ve published our policies and attestation requirements, developers have a precise set of guidelines they can follow as they implement their own custom CI/CD solutions where necessary. It’s not that centralized CI/CD systems go away; they will always be the path of least resistance. But not all developers have the same needs, and build and release teams do not have infinite capacity.
And now I’m really excited because ultimately we can attest to anything. Attestations don’t have to be about security and compliance alone. They’re an API that allows us to validate the authenticity and authorization of an action; they’re an API that can be used for governance. Rather than continuously shimming features into centralized systems, we can begin to pull things apart to allow customization and interoperability. Previous security success stories like Netflix’s Wall-E (admittedly a centralized reusable system) have shown us definitively that something can start out as necessary for security and go on to become a boon for developers.
Attestations are not only the key to unlocking developer freedom; they also allow us to begin iterating on the centralized CI/CD system in a consistent way. The steps in a build system often lack the abstractions necessary to change the underlying implementation: organic feature addition and treating the build system as “one big script” leave it without proper segmentation. Attestations would let us define clear inputs and outputs for each individual action in a build.
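What "clearly defined inputs and outputs" could mean in practice: each step declares its materials and products by digest, so the chain between steps is explicit instead of buried in one big script. A sketch with illustrative step names and payloads:

```python
import hashlib

# Sketch: each build step attests to its inputs (materials) and outputs
# (products) by digest, making the chain between steps checkable.
# Step names and payloads here are illustrative.
def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def attest_step(name: str, inputs: dict, outputs: dict) -> dict:
    return {
        "step": name,
        "materials": {k: digest(v) for k, v in inputs.items()},
        "products": {k: digest(v) for k, v in outputs.items()},
    }

compile_att = attest_step("compile", {"main.go": b"package main"}, {"app": b"binary"})
package_att = attest_step("package", {"app": b"binary"}, {"image": b"layers"})

# The chain is checkable: the compile step's product is exactly the
# package step's material, regardless of which system ran each step.
assert compile_att["products"]["app"] == package_att["materials"]["app"]
```

Once steps are bounded this way, swapping out the implementation of any one step no longer disturbs the others, as long as the digests still line up.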