You’ve likely been hearing about software bills of materials (SBOMs) over the last few years, along with the importance of software transparency for vulnerability management, licensing risk, and other use cases.
At some point, though, you may have started to explore generating and/or consuming SBOMs — and quickly realized things were a bit more complex than you had initially thought.
One of the biggest areas of confusion revolves around SBOM formats and specifications, such as CycloneDX and SPDX, that are used to create and share SBOM information.
In its simplest form, an SBOM needs to convey a set of basic metadata about the software it describes. These baseline data fields are frequently referred to as the “Minimum Elements” of an SBOM. (Organizations that sell into federal government agencies or are otherwise regulated by certain federal government agencies are required to produce SBOMs that include these elements. Other organizations may not be bound by the same requirements, but it’s still a best practice and industry standard to do so.)
The minimum data fields are as follows:
- Supplier Name
- Component Name
- Version of the Component
- Other Unique Identifiers
- Dependency Relationship
- Author of SBOM Data
- Timestamp
The reality is that unless you are concerned with regulatory compliance for your SBOM, such as government procurement or FDA requirements, you can still derive value from even a partial SBOM. It’s also noteworthy that these seven fields can be expanded with additional references, cryptographic hashes, and other information useful for supply chain use cases. The minimum elements are just a starting point.
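As a rough illustration of how a partial SBOM can be measured against the minimum elements, here is a minimal Python sketch. The flat field names (such as `supplier_name`) and the sample record are hypothetical and exist only for this example; real SPDX and CycloneDX documents name and nest these fields differently.

```python
from typing import Any

# Hypothetical flat field names used only for this illustration; real SPDX and
# CycloneDX documents name and nest these fields differently.
MINIMUM_ELEMENTS = (
    "supplier_name",
    "component_name",
    "version",
    "unique_identifiers",
    "dependency_relationships",
    "sbom_author",
    "timestamp",
)


def missing_minimum_elements(record: dict[str, Any]) -> list[str]:
    """Return the minimum elements that are absent or empty in a record."""
    return [field for field in MINIMUM_ELEMENTS if not record.get(field)]


# A made-up partial record: useful, but not complete.
partial_record = {
    "component_name": "example-lib",
    "version": "1.4.2",
    "unique_identifiers": ["pkg:pypi/example-lib@1.4.2"],
    "timestamp": "2024-01-15T09:30:00Z",
}

print(missing_minimum_elements(partial_record))
# ['supplier_name', 'dependency_relationships', 'sbom_author']
```

A check like this tells you what is missing without rejecting the SBOM outright, which fits the idea that a partial SBOM still has value.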
But does it matter whether you use SPDX, CycloneDX, or another specification? We will get to that in just a few moments, but let’s first explore what these specifications are.
SPDX
SPDX, or Software Package Data Exchange, is an open standard for communicating licensing information and the SBOM minimum elements described above. It’s the only SBOM specification recognized as an ISO standard: ISO/IEC 5962:2021, which captures SPDX as of version 2.2.1. That makes SPDX the only SBOM specification to have undergone this rigorous standards development process, but it’s also important to note that this space is evolving rapidly, far more quickly than standards development can keep pace with.
SPDX is also the oldest of these SBOM formats, and in its early days, it was largely intended for open source licensing use cases. In fact, SPDX has been around since at least 2011, far longer than many have been talking about SBOM for cybersecurity.
The latest published version is 2.3, which was released in 2022 after a very minor update in 2.2.2. Both of these are more recent than the version captured in the ISO standard. SPDX 3.0, which is in a prerelease state, makes major breaking changes to the data specification. These changes should dramatically improve the state of the industry: they introduce profiles across Licensing, Security, Build, Usage, AI, and Dataset use cases, along with more flexibility in relationship modeling, additional simplicity, and promotion of PURL (Package URL), which will help improve SBOM accuracy and data quality by standardizing component naming.
So, who uses SPDX? Well, lots of people do! In fact, even CycloneDX relies on SPDX for software licensing: its license definitions and identifiers come from SPDX. This is what SPDX was purpose-built for. As such, it should come as no surprise that many large software companies such as Microsoft and Google prefer SPDX due to the ease of working with software licensing information and the maturity of an ISO standard. This is also why the Linux Foundation has taken over governance of the project, and, as such, SPDX is well-respected by many in the open source ecosystem.
SPDX supports a wide range of data formats such as tag/value (.spdx), JSON (.spdx.json), YAML (.spdx.yml), RDF/XML (.spdx.rdf), and spreadsheets (.xlsx). The SPDX team also has an online tool to make it easy to work with the various formats and convert as needed for your use case.
CycloneDX
Another specification you have likely heard a lot about in recent years is CycloneDX. It, too, is an open format, though it has only recently moved into a formal standards adoption process through ECMA TC54, which is also adopting many sibling projects that are important for supply chain transparency use cases, such as PURL. CycloneDX also supports several data formats, such as XML, JSON, and Protocol Buffers, all of which can be easily transformed into CSV if you are accustomed to spreadsheets.
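To show what that CSV transformation might look like, here is a minimal Python sketch that flattens the top-level `components` array of a CycloneDX JSON document into CSV text. The sample document and the choice of columns are assumptions made for illustration; the sketch ignores nested components and every other part of the BOM.

```python
import csv
import io
import json

# A tiny hand-written CycloneDX-style JSON document, for illustration only.
SAMPLE_BOM = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "example-lib", "version": "1.4.2",
     "purl": "pkg:pypi/example-lib@1.4.2"},
    {"type": "library", "name": "other-lib", "version": "0.9.0"}
  ]
}
"""

# Columns chosen for this example; add or remove fields as needed.
COLUMNS = ["name", "version", "purl"]


def components_to_csv(bom_json: str) -> str:
    """Flatten the top-level components of a CycloneDX JSON BOM into CSV text."""
    bom = json.loads(bom_json)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for component in bom.get("components", []):
        # Missing fields become empty cells rather than errors.
        writer.writerow({column: component.get(column, "") for column in COLUMNS})
    return buffer.getvalue()


print(components_to_csv(SAMPLE_BOM))
```

Note that the second component has no PURL, so its cell is simply empty. A spreadsheet view like this is convenient, but it drops most of the structure the BOM carries.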
CycloneDX is currently on version 1.5, but new versions are released frequently, and the specification covers a much wider array of use cases and BOM (bill of materials) types than SPDX does. As an OWASP project, it should come as no surprise that CycloneDX is aligned much more closely with cybersecurity use cases, and as such, sees increased adoption with many cybersecurity tool providers and vendors.
The team behind CycloneDX is also responsible for Dependency Track, Software Component Verification Standard, the BOM Maturity Model, Common Lifecycle Enumeration, and of course the many open source and commercial tools that implement the CycloneDX specification.
As the BOM conversation evolves, CycloneDX has evolved as well to support VEX (which we will discuss more in a moment), HBOM for hardware components, Cryptographic BOM to describe algorithms and ciphers, SaaSBOM which describes APIs and services, OBOM for configuration data, and MLBOM for machine learning models. CycloneDX also has attestation support in its short-term roadmap.
SBOM Format Comparison
Now that we have an overview of the SBOM specifications and formats, how do we use them and what are the primary differences in the minimum elements and verbiage in each? We will focus on CycloneDX 1.5 and SPDX 2.3 for this comparison.
Minimum Element | SPDX 2.3 | CycloneDX 1.5 |
---|---|---|
Supplier Name | PackageSupplier | Supplier |
Component Name | PackageName | Name |
Version of the Component | PackageVersion | Version |
Other Unique Identifiers | DocumentNamespace, SPDXID | Purl, cpe, swid |
Dependency Relationship | Relationship | Dependencies |
Author of SBOM Data | Creator | Author |
Timestamp | Created | Timestamp |
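
The table above can also be expressed as a small lookup structure, which is roughly how a naive translation tool might begin. This Python sketch copies the labels from the table as written; `equivalent_field` is a hypothetical helper, and real converters work with the serialized field names and nesting of each specification rather than these display labels.

```python
from typing import Optional

# Rows copied from the comparison table above.
MINIMUM_ELEMENT_FIELDS = {
    "Supplier Name": {"SPDX 2.3": "PackageSupplier", "CycloneDX 1.5": "Supplier"},
    "Component Name": {"SPDX 2.3": "PackageName", "CycloneDX 1.5": "Name"},
    "Version of the Component": {"SPDX 2.3": "PackageVersion", "CycloneDX 1.5": "Version"},
    "Other Unique Identifiers": {
        "SPDX 2.3": "DocumentNamespace, SPDXID",
        "CycloneDX 1.5": "Purl, cpe, swid",
    },
    "Dependency Relationship": {"SPDX 2.3": "Relationship", "CycloneDX 1.5": "Dependencies"},
    "Author of SBOM Data": {"SPDX 2.3": "Creator", "CycloneDX 1.5": "Author"},
    "Timestamp": {"SPDX 2.3": "Created", "CycloneDX 1.5": "Timestamp"},
}


def equivalent_field(field: str, source: str, target: str) -> Optional[str]:
    """Return the target-spec label for a source-spec label, or None if unknown."""
    for names in MINIMUM_ELEMENT_FIELDS.values():
        if names[source] == field:
            return names[target]
    return None


print(equivalent_field("PackageSupplier", "SPDX 2.3", "CycloneDX 1.5"))  # Supplier
print(equivalent_field("Timestamp", "CycloneDX 1.5", "SPDX 2.3"))  # Created
```

The "Other Unique Identifiers" row is the one that does not map cleanly, since the two specifications identify components in different ways.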
Due in part to these differences, translating between formats can sometimes be “lossy.” This refers to cases where the source SBOM contains data that is not supported in the destination format, and, as such, this information can get lost in translation. This is also one reason why you might sometimes see a value of ‘NOASSERTION’, especially with SPDX: it tells you that the tool tried to fill in the field but could not determine an answer.
As an example, consider unique identifiers. If your source SBOM was in CycloneDX and had a PURL reference, it wouldn’t be supported in the corresponding section in SPDX 2.3. You’d have to use the ‘ExternalRef’ field in SPDX instead (PURL has been supported as an external reference in SPDX starting with 2.2). We have certainly seen ‘PackageDownloadLocation’ used as well, which is not what this field is supposed to be used for; it should hold the location from which the package can be downloaded directly. As you might imagine, this ambiguity creates situations where not all tools will handle translation in the same way. Consistency becomes extremely important, or you might miss a critical piece of information. In this case, the lack of a PURL might mean that you miss a vulnerable component.
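The PURL case can be sketched in a few lines of Python. This is an illustration, not a converter: `cyclonedx_component_to_spdx_package` is a hypothetical helper, the sample component is made up, and the output keys follow the SPDX 2.3 JSON serialization as we understand it (`externalRefs`, `referenceCategory`, `referenceType`, `referenceLocator`), so check them against the specification before relying on them.

```python
from typing import Any


def cyclonedx_component_to_spdx_package(
    component: dict[str, Any], spdx_id: str
) -> dict[str, Any]:
    """Map a CycloneDX-style component to a minimal SPDX 2.3-style package."""
    package: dict[str, Any] = {
        "SPDXID": spdx_id,
        "name": component["name"],
        # The real download URL is unknown here, so say so instead of
        # placing the PURL in this field.
        "downloadLocation": "NOASSERTION",
    }
    if "version" in component:
        package["versionInfo"] = component["version"]
    purl = component.get("purl")
    if purl:
        # Carry the PURL over as an external reference so it is not lost.
        package["externalRefs"] = [
            {
                "referenceCategory": "PACKAGE-MANAGER",
                "referenceType": "purl",
                "referenceLocator": purl,
            }
        ]
    return package


# Made-up component for illustration.
component = {
    "type": "library",
    "name": "example-lib",
    "version": "1.4.2",
    "purl": "pkg:pypi/example-lib@1.4.2",
}
package = cyclonedx_component_to_spdx_package(component, "SPDXRef-Package-example-lib")
print(package["externalRefs"][0]["referenceLocator"])
# pkg:pypi/example-lib@1.4.2
```

A tool that skips the external reference step, or that writes the PURL into the download location, produces an SPDX document that looks complete but is harder to match against vulnerability data.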
Converting between SPDX and CycloneDX (and guarding against data accuracy issues) is one area where certain SBOM management tools like FOSSA can help. We'll discuss more specifics later in this post.
Other SBOM-Related Formats
Although SPDX and CycloneDX are the two full-stack SBOM formats commonly used today, you may come across a few other SBOM-adjacent specifications. Here’s a brief overview of two of them: Software Identification (SWID) as well as a newer concept used in vulnerability use cases, called VEX.
VEX, as you will see, comes in a few different formats of its own. However, it's important to note that VEX documents can be, and often are, created and distributed independently of an SBOM. SWID tags can theoretically be distributed independently too, but this isn't as common.
SWID
In the early days of the U.S. government's work to promote SBOMs, SWID, or SWID Tag, was discussed as an alternative to the other SBOM specifications. U.S. Executive Order 14028, “Improving the Nation’s Cybersecurity,” was the landmark policy document that drove SBOM to prominence and recognized the National Telecommunications and Information Administration (NTIA) as the authoritative organization to define the minimum elements and other deliverables within the EO. All three formats (SPDX, CycloneDX, and SWID) were recognized as SBOM formats in these initial drafts.
SWID is seen today as less of an SBOM format and more of a software descriptor, and according to NIST, is seen as one possible successor to the beleaguered CPE naming standard. The reason why CPE has become so problematic is twofold. First, CPE values are manually assigned, and as such suffer from inconsistency. Second, CPE is focused on product-level naming, but very few software components are featured in the National Vulnerability Database, so relying on a CPE to describe a vulnerability in a software component will not be very reliable.
While we have not seen anyone use SWID as an SBOM format, it does have a role to play. Much of the industry’s current focus is on managing the risk of open-source dependencies, and approaches such as PURL are well-supported by package managers to do this work. But SWID solves the challenge around software naming when the software package is not publicly known or distributed, such as in the case of proprietary software.
The reason why SWID is not used for SBOM is that it is not really an SBOM format. It describes key characteristics of software such as the software name, version, suppliers and other metadata. It also provides some useful context for supply chain concepts such as pedigree in the form of patch metadata. But the data format itself is not conducive to the concept of nested layers of software components and their dependency relationships, otherwise known as transitive dependencies.
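To make the idea of nested, transitive dependencies concrete, here is a minimal Python sketch that walks a dependency graph shaped like the `dependencies` section of a CycloneDX document, where each entry has a `ref` and a `dependsOn` list. The component names are made up, and `transitive_dependencies` is a hypothetical helper written for this example.

```python
def transitive_dependencies(dependencies: list[dict], root: str) -> set[str]:
    """Collect every component reachable from root, directly or indirectly."""
    graph = {entry["ref"]: entry.get("dependsOn", []) for entry in dependencies}
    seen: set[str] = set()
    stack = list(graph.get(root, []))
    while stack:
        ref = stack.pop()
        if ref in seen:
            continue  # already visited; also protects against cycles
        seen.add(ref)
        stack.extend(graph.get(ref, []))
    return seen


# Made-up graph: app depends on lib-a, which pulls in lib-b and lib-c.
sample_dependencies = [
    {"ref": "app", "dependsOn": ["lib-a"]},
    {"ref": "lib-a", "dependsOn": ["lib-b", "lib-c"]},
    {"ref": "lib-b", "dependsOn": ["lib-c"]},
    {"ref": "lib-c", "dependsOn": []},
]

print(sorted(transitive_dependencies(sample_dependencies, "app")))
# ['lib-a', 'lib-b', 'lib-c']
```

Here `app` declares only `lib-a`, yet it ultimately depends on three components. This layered relationship is what an SBOM format needs to express and what a flat software descriptor does not capture.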
VEX
Any discussion on SBOM formats will naturally migrate to a conversation around VEX as well. VEX, or Vulnerability Exploitability eXchange, is a concept that sprang from the early NTIA meetings where SBOM was discussed for vulnerability management scenarios. The idea was that an SBOM can tell you where software might be vulnerable with a moderate degree of confidence, but a VEX document functions as a sort of reverse attestation, indicating why the software is not vulnerable, or rather, not affected by the vulnerability. This helps to prioritize your efforts so you are only working on the issues that create real risk for your organization or your customers.
VEX is also supported in many formats, including CycloneDX, CSAF, and OpenVEX. Some of these formats are quite old and predate the concept of VEX entirely, while others have entered our parlance in just the last couple of years. By and large, support for VEX formats is mostly focused on creating them as part of a manual vulnerability triage process, or consuming them to inform your vulnerability program. You can think of them as a companion document to your SBOM, enriching the information your SBOM provides.
What you should primarily focus on, though, is what a VEX document says. Contrary to the name, VEX is not a statement of exploitability or known exploitation. Rather, it refers to whether the software is affected by a corresponding vulnerability.
There may be many reasons why a vulnerability in an included component does not affect the software. The vulnerable code might not be reachable because the component is included but not used. It might be mitigated through configuration. It might even have been fixed through what is called a backport, where the supplier addressed the vulnerability but did not update the software component version. That matters because software versions are how we identify vulnerabilities from an SBOM. The VEX is a form of attestation, typically provided by the supplier, that records the outcome of their triage of the vulnerability and states whether they will fix the issue.
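As a sketch of how VEX statements can narrow down SBOM-derived findings, the Python example below drops any finding that a statement marks as not affected or fixed. The `VexStatement` type, the placeholder vulnerability IDs, and the component names are all hypothetical. The status labels are modeled on the four commonly used VEX statuses (not affected, affected, fixed, under investigation); real VEX formats carry more fields than shown here.

```python
from typing import NamedTuple


class VexStatement(NamedTuple):
    vulnerability: str
    component: str
    status: str  # "not_affected", "affected", "fixed", or "under_investigation"
    justification: str = ""


def needs_attention(
    findings: list[tuple[str, str]], statements: list[VexStatement]
) -> list[tuple[str, str]]:
    """Keep findings unless a VEX statement says not_affected or fixed."""
    resolved = {
        (statement.vulnerability, statement.component)
        for statement in statements
        if statement.status in {"not_affected", "fixed"}
    }
    return [finding for finding in findings if finding not in resolved]


# Findings matched from an SBOM by component version (placeholder IDs).
findings = [
    ("CVE-0000-0001", "example-lib"),
    ("CVE-0000-0002", "other-lib"),
    ("CVE-0000-0003", "third-lib"),
]

statements = [
    # Included but never called, so the vulnerable code is unreachable.
    VexStatement("CVE-0000-0001", "example-lib", "not_affected",
                 "vulnerable_code_not_in_execute_path"),
    # Backported fix: the version did not change, but the flaw is gone.
    VexStatement("CVE-0000-0002", "other-lib", "fixed"),
    VexStatement("CVE-0000-0003", "third-lib", "under_investigation"),
]

print(needs_attention(findings, statements))
# [('CVE-0000-0003', 'third-lib')]
```

Only the finding still under investigation remains, which is the prioritization benefit VEX is meant to provide.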
Another consideration when looking at VEX is that while an SBOM is a static document, a VEX should not be. In other words, you need an SBOM for every new release of the software, but until a new version of the software is released, it will not change. The SBOM states what was in the software when it was released. It does not track ongoing status until there is a new version.
A VEX document, on the other hand, is a statement about the status of a vulnerability, and that status can change frequently as new information about the vulnerability is discovered, or as the supplier confirms or disproves details about whether the software is affected. CVSS scores can change, impacts might change, and the specific conditions that must be in place for exploitation might change as we learn more about how the software is abused. Changing even more frequently are the exploitability indicators for a vulnerability, which directly reflect the ever-changing threat landscape. This is why the Exploit Prediction Scoring System (EPSS) data feeds change on a daily basis. All of this is to say that VEX is a living, breathing (and, yes, dynamic) piece of content.
Final Notes
As you may have surmised, we have several formats to contend with to operationalize SBOMs. It can be challenging enough when you are a software producer, but at least you can somewhat standardize on your approach. But when you are a consumer and are dealing with hundreds, or even thousands, of software products in your organization, you may have many different types of documents to consume in varying states of maturity. The format you use probably matters much less than whether your SBOM tools effectively support the formats you need to ingest and analyze, unless you plan on spending all day manually reviewing XML and JSON documents.
The good news is we are here to help simplify the process, no matter what your role is. If you need to convert from SPDX to CycloneDX, migrate versions, work with XML instead of JSON, or produce a specific format for a downstream consumer, having robust SBOM management tools can help you get back to business quickly. Feel free to get in touch with the FOSSA team for more information.