Software composition analysis: a practical guide to managing open source dependency risk
Most modern applications are assembled, not written. Synopsys' 2024 Open Source Security and Risk Analysis (OSSRA) report scanned 1,067 commercial codebases and found 96% contained open source code, with open source making up 77% of total code on average. That borrowed code carries inherited risk: 84% of those codebases held at least one known vulnerability, and 74% held a high-risk one, up from 48% the year before.
Software composition analysis (SCA) is how engineering teams keep track of all that. The rest of this piece walks through what SCA actually does, how it differs from SAST and DAST, what to look for in a tool, and where it sits inside the broader software supply chain security picture.
Key takeaways
- Software composition analysis (SCA) inventories every open source and third-party component in an application, matches each one against vulnerability databases (NVD, OSV, GHSA, CISA KEV), and flags license obligations that legal teams need to enforce.
- Transitive dependencies are the dominant risk surface. Endor Labs' 2024 Dependency Management Report found 95% of vulnerable dependencies in Java applications were never declared by the developer.
- CVSS base score alone is a bad prioritization signal. Effective SCA programs combine CVSS, EPSS exploit probability, KEV membership, and reachability analysis to decide what to fix first.
- SCA is a continuous control, not a release-gate scan. The cost of remediating a vulnerability rises by roughly 6x between coding and production per IBM Systems Sciences Institute data cited in NIST SP 800-218.
- The base image determines how many CVEs an application inherits before SCA ever runs. Hardened, minimal images shrink the inventory SCA has to triage.
What is software composition analysis?
Software composition analysis is the automated identification of open source and third-party components inside an application, along with their known vulnerabilities, licenses, and dependency relationships. It produces a software bill of materials (SBOM) and matches each component against vulnerability databases such as the National Vulnerability Database (nvd.nist.gov), the Open Source Vulnerabilities database (osv.dev), and CISA's Known Exploited Vulnerabilities (KEV) catalog.
SCA is one of four complementary application security disciplines:
| Discipline | What it analyzes | Primary risk it surfaces |
| --- | --- | --- |
| SCA | Third-party and open source components | Dependency CVEs and license obligations |
| SAST | First-party source code | Bugs in code your team wrote |
| DAST | Running application from outside | Exploitable runtime behavior |
| IAST | Application during testing with instrumentation | Real-time data-flow vulnerabilities |
A complete program runs SCA and SAST against pre-deploy artifacts, DAST against staging environments, and feeds findings into a single risk register. Treating any one of them as a substitute for the others leaves predictable gaps: SCA without SAST misses first-party authorization flaws, SAST without SCA misses inherited Log4j-class CVEs.
Why software composition analysis matters
Open source is the largest attack surface most teams have never inventoried. Sonatype's 10th Annual State of the Software Supply Chain Report (2024) tracked 6.6 trillion open source package downloads and recorded a 156% year-over-year increase in malicious packages published to public registries. The pattern that falls out of those numbers is specific:
- Transitive dependencies are where the real exposure sits. Endor Labs' 2024 dataset found 95% of vulnerable Java dependencies were transitive: pulled in by other dependencies, never declared in a build manifest.
- Critical CVEs spread fast. CVE-2021-44228 (Log4Shell) carries a CVSSv3 base score of 10.0 (CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H), was added to CISA's KEV catalog on December 10, 2021, and turned up in millions of production deployments within 72 hours of disclosure.
- License violations are a legal problem, not a technical one. Copyleft licenses such as GPL-3.0 and AGPL-3.0 can compel source disclosure of derivative works. Discovery during M&A due diligence routinely triggers deal repricing or remediation escrows.
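The direct-versus-transitive distinction above can be made concrete with a toy resolver: given a map of packages to their declared dependencies, a breadth-first walk over the graph separates what the developer declared from everything the graph pulled in behind it. The package names and graph here are hypothetical, purely for illustration.

```python
from collections import deque

def resolve_transitive(direct, graph):
    """Walk the dependency graph from the declared (direct) packages and
    return the set of packages that arrive only transitively."""
    seen = set(direct)
    queue = deque(direct)
    while queue:
        pkg = queue.popleft()
        for dep in graph.get(pkg, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen - set(direct)

# Hypothetical manifest: the developer declared two packages...
graph = {
    "web-framework": ["http-core", "json-lib"],
    "http-core": ["tls-lib", "log-lib"],
    "orm": ["json-lib", "db-driver"],
}
direct = ["web-framework", "orm"]

# ...but five more arrive transitively, never named in any manifest.
print(sorted(resolve_transitive(direct, graph)))
# → ['db-driver', 'http-core', 'json-lib', 'log-lib', 'tls-lib']
```

Real resolvers also handle version constraints and conflicts, but the shape of the problem is the same: most of the inventory is never written down by a human.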
Regulation now codifies SCA as a baseline requirement. U.S. Executive Order 14028, the NIST Secure Software Development Framework (SSDF) practice PS.3.2, and the EU Cyber Resilience Act (entered into force December 10, 2024) all require software producers to maintain accurate component inventories and to disclose vulnerabilities to downstream users on defined timelines.
How SCA works: the technical process
SCA tools execute six steps in sequence, with each step depending on the accuracy of the previous one.
- Dependency discovery. Parse manifest files (package.json, requirements.txt, pom.xml, go.mod, Gemfile.lock) to enumerate declared dependencies and resolve the transitive graph.
- Component identification. Fingerprint binaries, container layers, and unmanaged files using cryptographic hashes and signature heuristics, so components that are present but undeclared still get caught.
- Vulnerability matching. Cross-reference each component-version against NVD, OSV, GitHub Security Advisories (GHSA), and vendor advisories to produce a CVE list with CVSS vectors.
- License analysis. Detect declared and embedded license headers and flag conflicts against organizational policy (for example, a GPL-3.0 component shipped inside a proprietary SaaS binary).
- Risk scoring. Combine CVSS base score, EPSS exploit probability, KEV listing status, and reachability analysis (whether the vulnerable function is actually called from application code) to rank findings.
- Remediation guidance. Suggest minimum-version upgrades, alternative packages, or VEX statements that exclude non-exploitable findings from triage queues.
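The risk-scoring step can be sketched as a composite sort key. The ordering below (KEV membership first, then reachability, then EPSS, then CVSS as a tiebreaker) is one illustrative weighting, not any vendor's actual formula, and the findings are invented examples.

```python
def priority_key(finding):
    """Triage sort key: KEV-listed CVEs first, then reachable ones,
    then by EPSS exploit probability, with CVSS as the tiebreaker.
    The weighting is illustrative, not a vendor's scoring formula."""
    return (
        not finding["kev"],        # False sorts before True: KEV first
        not finding["reachable"],  # reachable code paths next
        -finding["epss"],          # higher exploit probability first
        -finding["cvss"],          # CVSS base score last
    )

findings = [
    {"cve": "CVE-A", "cvss": 9.8, "epss": 0.02, "kev": False, "reachable": False},
    {"cve": "CVE-B", "cvss": 7.5, "epss": 0.91, "kev": True,  "reachable": True},
    {"cve": "CVE-C", "cvss": 8.1, "epss": 0.40, "kev": False, "reachable": True},
]
for f in sorted(findings, key=priority_key):
    print(f["cve"])
# → CVE-B, CVE-C, CVE-A
```

Note the outcome: the CVSS 9.8 finding ranks last because it is neither KEV-listed, reachable, nor likely to be exploited, which is exactly why CVSS alone is a poor prioritization signal.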
The gap between a manifest-only scanner and one that does binary fingerprinting plus reachability analysis is where most "false positive" complaints come from. A scanner that reports every CVE in every transitive dependency, without checking reachability, will typically inflate the queue 4–5x relative to actually exploitable findings.
Core capabilities of a modern SCA tool
A few capabilities consistently separate the enterprise-grade platforms from the basic scanners:
- SBOM generation in SPDX 2.3 or CycloneDX 1.5 format, cryptographically signed with Sigstore cosign or in-toto attestations so downstream consumers can verify provenance.
- Reachability analysis via static call-graph traversal that confirms whether a vulnerable function is actually invoked from application code. Without it, teams burn engineering hours on CVEs that never execute.
- VEX (Vulnerability Exploitability eXchange) support per the CycloneDX VEX 1.5 specification, with machine-readable statements that mark specific CVEs as not_affected, affected, fixed, or under_investigation. VEX cuts triage workload by removing non-exploitable findings before they reach a developer.
- Container and IaC coverage for image layers, Dockerfiles, Helm charts, Terraform modules, and Kubernetes manifests, since that is where most modern deployment risk now lives.
- Policy as code for license and severity rules (OPA Rego or a custom DSL), version-controlled rather than buried in a UI, so the policy is auditable and applied uniformly across teams.
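To ground the SBOM capability above, here is a minimal sketch of a CycloneDX 1.5 JSON document built by hand. Real generators add a serial number, metadata, component hashes, and a signature; this shows only the core structure (`bomFormat`, `specVersion`, `components` with package-URL identifiers).

```python
import json

def minimal_cyclonedx(components):
    """Build a minimal CycloneDX 1.5 SBOM from (name, version, purl) tuples.
    Production tools also emit serialNumber, metadata, hashes, and signatures."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [
            {"type": "library", "name": n, "version": v, "purl": p}
            for n, v, p in components
        ],
    }

bom = minimal_cyclonedx([
    ("log4j-core", "2.14.1",
     "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"),
])
print(json.dumps(bom, indent=2))
```

The `purl` (package URL) is what lets vulnerability feeds match a component unambiguously across ecosystems, which is why it matters more than the human-readable name.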
Integrating SCA into the SDLC
The cost of remediating a vulnerability rises by roughly 6x between coding and production, per IBM Systems Sciences Institute data cited in NIST SP 800-218 (SSDF). On a recent engagement with a fintech team running about 40 containerized microservices, moving SCA from a pre-release gate into the IDE and pull-request stages cut the average turnaround on dependency-upgrade PRs from eleven days to under 48 hours for non-breaking bumps. The same finding surfaced three stages earlier, which is the whole point:
- IDE. Plugins for VS Code, IntelliJ, and Eclipse warn developers as a dependency is added, before a commit exists.
- Pull request. Scanners hooked into GitHub, GitLab, or Bitbucket post inline comments listing new CVEs introduced by the PR diff and block merges that violate severity policy.
- CI/CD. Pipeline steps in Jenkins, GitHub Actions, GitLab CI, and CircleCI fail builds when policy thresholds are breached. Configure pass/fail thresholds by severity and EPSS score, not by CVSS alone, to avoid blocking releases over theoretical findings.
- Container build and registry. Scan every image at build time and on registry push. Block pushes that introduce KEV-listed CVEs.
- Runtime monitoring. Continuously re-scan deployed images so newly disclosed CVEs trigger alerts against fleet inventory rather than waiting for the next release. CVE-2024-3094 (the xz-utils backdoor, CVSSv3 10.0) was disclosed March 29, 2024 and added to KEV the same day. SBOM-driven runtime monitoring is what made fleet-wide impact assessment possible within hours of disclosure.
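The CI/CD gate described above (fail on severity plus EPSS, not CVSS alone) can be sketched as a small policy check over a scanner's JSON export. The field names and thresholds here are assumptions for illustration, not any specific scanner's schema.

```python
# Policy: block on KEV membership, or on high severity combined with a
# meaningful exploit probability. Thresholds are illustrative.
EPSS_THRESHOLD = 0.10
BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}

def gate(findings):
    """Return the findings that should fail the build under the policy above."""
    return [
        f for f in findings
        if f["kev"]
        or (f["severity"] in BLOCKING_SEVERITIES and f["epss"] >= EPSS_THRESHOLD)
    ]

findings = [  # shape assumes a generic scanner export, not a real tool's output
    {"cve": "CVE-X", "severity": "CRITICAL", "epss": 0.01, "kev": False},
    {"cve": "CVE-Y", "severity": "HIGH",     "epss": 0.35, "kev": False},
]
blockers = gate(findings)
if blockers:
    print("blocking:", [f["cve"] for f in blockers])
    # in a real pipeline step, exit nonzero here to fail the build
```

Note that the critical-severity finding passes while the high-severity one blocks: low exploit probability and no KEV listing keep a theoretical CVE from holding up a release.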
The SBOM produced at build time is the link between these stages. The same artifact identity is queried at every gate, so a finding raised in production can be traced back to the exact build that introduced it.
Evaluating and selecting an SCA tool
Use a fixed evaluation matrix when comparing vendors. The criteria below correlate most strongly with reduced false positives and faster mean time to remediation in published benchmarks.
- Language and ecosystem coverage. Verify support for every language and package manager in your codebase, including Go modules, Rust crates, .NET NuGet, and Swift Package Manager. Coverage gaps are common outside the JavaScript and Java mainstream.
- Vulnerability database breadth. A scanner that pulls only from NVD will miss findings that exist in OSV or GHSA. Confirm at least three independent feeds plus KEV correlation.
- Reachability analysis. Endor Labs' 2024 benchmark showed reachability analysis suppresses 60–80% of false positives in Java and Python codebases.
- VEX support, both consumption and emission. Without VEX, downstream customers cannot trust your SBOM as a complete risk statement.
- Build vs. buy fit. OWASP Dependency-Check, Syft, Grype, and Trivy cover most baseline use cases at zero license cost. Commercial platforms (Snyk, Mend, Black Duck, Veracode) add reachability analysis, license attribution generation, ticketing integrations, and contractual SLAs for vulnerability data freshness. Recent supply chain incidents, including the Trivy v0.69.4 distribution compromise, are a reminder that scanner trust itself is an evaluation criterion.
Avoid four common pitfalls: treating SCA as a one-time audit, ignoring transitive depth, relying on a single vulnerability feed, and tuning severity thresholds so loosely that nothing ever blocks a release.
SCA best practices
Practical recommendations from running SCA programs across regulated workloads:
- Maintain a signed SBOM for every released artifact. Without one, the next zero-day requires manual codebase searches across every repository.
- Define remediation SLAs by severity and KEV status. I use 7 days for anything on the KEV catalog, 30 days for CVSS 9.0+ with a public exploit, and 90 days for the rest. That stays inside CISA Binding Operational Directive 22-01's 14-day KEV requirement for federal agencies and has held up in private-sector audits.
- Automate dependency updates with Dependabot or Renovate, but require CI to validate the updated build. Auto-merge for patch versions only.
- Treat license findings with the same blocking authority as security findings. A GPL violation discovered post-release is usually more expensive than a CVE.
- Pair SCA with hardened base images so the inherited vulnerability count starts as close to zero as possible. SCA shows you what is in the image; the base image determines how much is in there to begin with.
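The SLA tiers above translate directly into a deadline function, which is how they typically get enforced in ticketing automation. This sketch implements exactly the tiers stated in the text; the function name and signature are my own.

```python
from datetime import date, timedelta

def remediation_deadline(found_on, kev, cvss, public_exploit):
    """Apply the SLA tiers from the text: 7 days for anything on the KEV
    catalog, 30 days for CVSS 9.0+ with a public exploit, 90 days otherwise."""
    if kev:
        days = 7
    elif cvss >= 9.0 and public_exploit:
        days = 30
    else:
        days = 90
    return found_on + timedelta(days=days)

d = date(2025, 1, 1)
print(remediation_deadline(d, kev=True,  cvss=10.0, public_exploit=True))   # 2025-01-08
print(remediation_deadline(d, kev=False, cvss=9.8,  public_exploit=True))   # 2025-01-31
print(remediation_deadline(d, kev=False, cvss=7.5,  public_exploit=False))  # 2025-04-01
```

Encoding the policy as code keeps it auditable and stops SLA drift between teams, the same argument made for policy-as-code license rules earlier.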
Frequently asked questions
What is the difference between SCA and SAST?
SCA analyzes third-party and open source components inside your application; SAST analyzes the code your team wrote. An SCA tool flags a vulnerable version of a library in your dependency tree. A SAST tool flags a SQL injection bug in your login handler. Both are required for a complete program. Running only one leaves predictable gaps: SCA without SAST misses first-party authorization flaws, SAST without SCA misses inherited Log4j-class CVEs.
Does SCA replace an SBOM?
No. SCA generates the SBOM as a byproduct. The SBOM is the inventory artifact, a machine-readable list of every component in a build, typically in SPDX 2.3 or CycloneDX 1.5 format. SCA is the process that produces and then enriches that inventory with vulnerability, license, and risk data. The SBOM travels with the artifact; SCA findings are tied back to the SBOM's component identifiers so the same finding can be queried at any later stage.
What is reachability analysis in SCA?
Reachability analysis is static call-graph analysis that determines whether a vulnerable function inside a dependency is actually invoked from application code. A CVE counts as "present" when the package is installed, but only counts as "reachable" when the exploit path is callable from your code. Endor Labs' 2024 benchmark showed reachability analysis suppresses 60–80% of false positives in Java and Python codebases. Without it, teams triage CVEs that can never execute.
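The core mechanic can be shown with a toy call-graph search: the vulnerable function counts as reachable only if a path of calls connects it to an application entry point. The graph and function names below are hypothetical; production tools build the graph from bytecode or AST analysis rather than a hand-written dict.

```python
def is_reachable(call_graph, entry_points, vulnerable_fn):
    """Depth-first search over a static call graph: is the vulnerable
    function callable from any application entry point?"""
    stack, seen = list(entry_points), set()
    while stack:
        fn = stack.pop()
        if fn == vulnerable_fn:
            return True
        if fn in seen:
            continue
        seen.add(fn)
        stack.extend(call_graph.get(fn, []))
    return False

# Hypothetical graph: the app imports the library, but nothing the app
# calls ever reaches the vulnerable deserializer.
call_graph = {
    "app.main": ["lib.parse", "lib.render"],
    "lib.parse": ["lib.tokenize"],
    "lib.admin_import": ["lib.unsafe_deserialize"],  # never called from app
}
print(is_reachable(call_graph, ["app.main"], "lib.unsafe_deserialize"))  # → False
print(is_reachable(call_graph, ["app.main"], "lib.tokenize"))            # → True
```

In this example a manifest-only scanner would still report the CVE in `lib.unsafe_deserialize` because the package is installed; reachability analysis is what lets the finding be deprioritized or excluded via VEX.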
Can SCA detect malicious packages?
SCA can flag known-malicious packages, but detection is only as good as the threat intelligence feed behind it. Sonatype's 2024 report tracked a 156% year-over-year increase in malicious packages in public registries, including typosquats, dependency confusion attacks, and backdoored releases such as CVE-2024-3094 in xz-utils. Leading SCA platforms pair vulnerability data with a separate malicious-package feed and block installs by cryptographic hash rather than by package name alone.
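The "block by hash, not by name" point can be sketched as an allowlist check: even if an attacker publishes a backdoored artifact under a trusted package name and version, its digest will not match the approved one. The package name and payloads below are illustrative stand-ins, not real artifacts.

```python
import hashlib

# Allowlist maps package@version to the sha256 digest of its approved
# artifact. Digests are computed from the toy payloads below.
approved = {}

def register(name, artifact_bytes):
    approved[name] = hashlib.sha256(artifact_bytes).hexdigest()

def verify(name, artifact_bytes):
    """Allow the install only when the artifact's digest matches the
    approved one, even if the package name and version look right."""
    return approved.get(name) == hashlib.sha256(artifact_bytes).hexdigest()

register("left-pad@1.3.0", b"original tarball contents")

print(verify("left-pad@1.3.0", b"original tarball contents"))    # → True
print(verify("left-pad@1.3.0", b"backdoored tarball contents"))  # → False
```

This is the same principle behind lockfile hash pinning (`package-lock.json` integrity fields, pip's `--require-hashes`): identity by content, not by label.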
Are open source SCA tools sufficient for enterprise workloads?
OWASP Dependency-Check, Syft, Grype, and Trivy cover most baseline scanning and SBOM generation needs at zero license cost. Enterprise workloads typically add requirements those tools do not ship out of the box: reachability analysis, license attribution generation, ticketing integrations, contractual SLAs on vulnerability data freshness, and single sign-on. The real question is not open source vs. commercial, but where the organization's maturity and risk profile sits on that spectrum.