Extraction of license text from files. #33 #193

Open
wants to merge 7 commits into
base: main
1 change: 1 addition & 0 deletions README.md
@@ -77,6 +77,7 @@ $ yarn cyclonedx
This causes information loss in trade-off shorter PURLs, which might improve ingesting these strings.
--output-reproducible Whether to go the extra mile and make the output reproducible.
This might result in loss of time- and random-based values.
--gather-license-texts Search for license files in components and include them as license evidence.
--verbose,-v Increase the verbosity of messages.
Use multiple times to increase the verbosity even more.

46 changes: 34 additions & 12 deletions src/builders.ts
@@ -21,7 +21,7 @@ Copyright (c) OWASP Foundation. All Rights Reserved.
import type { FromNodePackageJson as PJB } from '@cyclonedx/cyclonedx-library/Builders'
import { ComponentType, ExternalReferenceType, LicenseAcknowledgement } from '@cyclonedx/cyclonedx-library/Enums'
import type { FromNodePackageJson as PJF } from '@cyclonedx/cyclonedx-library/Factories'
import { Bom, Component, ExternalReference, type License, Property, Tool } from '@cyclonedx/cyclonedx-library/Models'
import { Bom, Component, ComponentEvidence, ExternalReference, type License, Property, Tool } from '@cyclonedx/cyclonedx-library/Models'
import { BomUtility } from '@cyclonedx/cyclonedx-library/Utils'
import { Cache, type FetchOptions, type Locator, type LocatorHash, type Package, type Project, structUtils, ThrowReport, type Workspace, YarnVersion } from '@yarnpkg/core'
import { ppath } from '@yarnpkg/fslib'
@@ -32,15 +32,23 @@ import type { PackageURL } from 'packageurl-js'
import { getBuildtimeInfo } from './_buildtimeInfo'
import { isString, tryRemoveSecretsFromUrl, trySanitizeGitUrl } from './_helpers'
import { wsAnchoredPackage } from './_yarnCompat'
import { makeLicenseEvidence } from './evidence'
import { PropertyNames, PropertyValueBool } from './properties'

type ManifestFetcher = (pkg: Package) => Promise<any>
interface PackageInfo {
/** Content of package.json, as is. */
manifest: any
licenseEvidence?: ComponentEvidence
}

type ManifestFetcher = (pkg: Package) => Promise<PackageInfo>

interface BomBuilderOptions {
omitDevDependencies?: BomBuilder['omitDevDependencies']
metaComponentType?: BomBuilder['metaComponentType']
reproducible?: BomBuilder['reproducible']
shortPURLs?: BomBuilder['shortPURLs']
gatherLicenseTexts: BomBuilder['gatherLicenseTexts']
}

export class BomBuilder {
@@ -52,6 +60,7 @@ export class BomBuilder {
metaComponentType: ComponentType
reproducible: boolean
shortPURLs: boolean
gatherLicenseTexts: boolean

console: Console

@@ -70,6 +79,7 @@ export class BomBuilder {
this.metaComponentType = options.metaComponentType ?? ComponentType.Application
this.reproducible = options.reproducible ?? false
this.shortPURLs = options.shortPURLs ?? false
this.gatherLicenseTexts = options.gatherLicenseTexts

this.console = console_
}
@@ -137,7 +147,7 @@ export class BomBuilder {
}

private makeComponentFromWorkspace (workspace: Workspace, type?: ComponentType | undefined): Component | false | undefined {
return this.makeComponent(workspace.anchoredLocator, workspace.manifest.raw, type)
return this.makeComponent(workspace.anchoredLocator, { manifest: workspace.manifest.raw }, type)
}

private async makeManifestFetcher (project: Project): Promise<ManifestFetcher> {
@@ -150,11 +160,18 @@ export class BomBuilder {
report: new ThrowReport(),
cacheOptions: { skipIntegrityCheck: true }
}
return async function (pkg: Package): Promise<any> {
const gatherLicenseTexts = this.gatherLicenseTexts
return async function (pkg: Package): Promise<PackageInfo> {
const { packageFs, prefixPath, releaseFs } = await fetcher.fetch(pkg, fetcherOptions)
try {
const manifestPath = ppath.join(prefixPath, 'package.json')
return JSON.parse(await packageFs.readFilePromise(manifestPath, 'utf8'))
const packageInfo: PackageInfo = {
manifest: JSON.parse(await packageFs.readFilePromise(manifestPath, 'utf8'))
}
if (gatherLicenseTexts) {
packageInfo.licenseEvidence = makeLicenseEvidence(prefixPath, packageFs)
Review comment (Member):

This has nothing to do with the function it is placed in.
The evidence is not package info at all.

I opened a pull request with my idea here: AugustusKling#1

}
return packageInfo
} finally {
if (releaseFs !== undefined) {
releaseFs()
@@ -171,30 +188,35 @@ export class BomBuilder {
const data = await fetchManifest(pkg)
// the data in the manifest might be incomplete, so lets set the properties that yarn discovered and fixed
/* eslint-disable-next-line @typescript-eslint/strict-boolean-expressions */
data.name = pkg.scope ? `@${pkg.scope}/${pkg.name}` : pkg.name
data.version = pkg.version
data.manifest.name = pkg.scope ? `@${pkg.scope}/${pkg.name}` : pkg.name
data.manifest.version = pkg.version
return this.makeComponent(pkg, data, type)
}

private makeComponent (locator: Locator, data: any, type?: ComponentType | undefined): Component | false | undefined {
private makeComponent (locator: Locator, data: PackageInfo, type?: ComponentType | undefined): Component | false | undefined {
// work with a deep copy, because `normalizePackageData()` might modify the data
const dataC = structuredClonePolyfill(data)
normalizePackageData(dataC as normalizePackageData.Input)
normalizePackageData(dataC.manifest as normalizePackageData.Input)
// region fix normalizations
if (isString(data.version)) {
if (isString(data.manifest.version)) {
// allow non-SemVer strings
dataC.version = data.version.trim()
dataC.manifest.version = data.manifest.version.trim()
}
// endregion fix normalizations

// work with a deep copy, because `normalizePackageData()` might modify the data
const component = this.componentBuilder.makeComponent(
dataC as normalizePackageData.Package, type)
dataC.manifest as normalizePackageData.Package, type)
if (component === undefined) {
this.console.debug('DEBUG | skip broken component: %j', locator)
return undefined
}

if (data.licenseEvidence instanceof ComponentEvidence) {
this.console.debug('DEBUG | Adding license evidence for: %j', locator)
component.evidence = data.licenseEvidence
}

switch (true) {
case locator.reference.startsWith('workspace:'): {
// @TODO: add CDX-Property for it - cdx:yarn:reference:workspace = $workspaceName
8 changes: 7 additions & 1 deletion src/commands.ts
@@ -112,6 +112,10 @@ export class MakeSbomCommand extends Command<CommandContext> {
'This might result in loss of time- and random-based values.'
})

gatherLicenseTexts = Option.Boolean('--gather-license-texts', false, {
description: 'Search for license files in components and include them as license evidence.'
})

verbosity = Option.Counter('--verbose,-v', 1, {
description: 'Increase the verbosity of messages.\n' +
'Use multiple times to increase the verbosity even more.'
@@ -142,6 +146,7 @@ export class MakeSbomCommand extends Command<CommandContext> {
mcType: this.mcType,
shortPURLs: this.shortPURLs,
outputReproducible: this.outputReproducible,
gatherLicenseTexts: this.gatherLicenseTexts,
verbosity: this.verbosity,
projectDir
})
@@ -171,7 +176,8 @@ export class MakeSbomCommand extends Command<CommandContext> {
omitDevDependencies: this.production,
metaComponentType: this.mcType,
reproducible: this.outputReproducible,
shortPURLs: this.shortPURLs
shortPURLs: this.shortPURLs,
gatherLicenseTexts: this.gatherLicenseTexts
},
myConsole
)).buildFromWorkspace(workspace)
67 changes: 67 additions & 0 deletions src/evidence.ts
@@ -0,0 +1,67 @@
/*!
This file is part of CycloneDX SBOM plugin for yarn.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

SPDX-License-Identifier: Apache-2.0
Copyright (c) OWASP Foundation. All Rights Reserved.
*/
import { extname } from 'node:path'

import * as CDX from '@cyclonedx/cyclonedx-library'
import { type ComponentEvidence } from '@cyclonedx/cyclonedx-library/Models'
import { type FakeFS, type PortablePath, ppath } from '@yarnpkg/fslib'

export function makeLicenseEvidence (packageRoot: PortablePath,
packageFs: FakeFS<PortablePath>): ComponentEvidence | undefined {
return new CDX.Models.ComponentEvidence({
licenses: new CDX.Models.LicenseRepository(readLicenseFiles(packageRoot, packageFs))
})
}

// the alternatives are grouped so that `^` anchors both of them, not only the first
const LICENSE_FILENAME_PATTERN = /^(?:(?:UN)?LICEN[CS]E|NOTICE)/i
// common file endings that are used for notice/license files
const MAP_TEXT_EXTENSION_MIME: Readonly<Record<string, string>> = {
'': 'text/plain',
'.md': 'text/markdown',
'.rst': 'text/prs.fallenstein.rst',
'.txt': 'text/plain',
'.xml': 'text/xml'
} as const

function * readLicenseFiles (
packageRoot: PortablePath,
packageFs: FakeFS<PortablePath>
): Generator<CDX.Models.License> {
const files = packageFs.readdirSync(packageRoot).filter((f) => {
return LICENSE_FILENAME_PATTERN.test(f)
})
for (const licenseFile of files) {
const path = ppath.join(packageRoot, licenseFile)
if (packageFs.existsSync(path)) {
const contentType = MAP_TEXT_EXTENSION_MIME[extname(licenseFile)]
const attachment = new CDX.Models.Attachment(
packageFs.readFileSync(path).toString('base64'),
{
contentType,
encoding: CDX.Enums.AttachmentEncoding.Base64
}
)
yield new CDX.Models.NamedLicense(
`file: ${licenseFile}`,
{
text: attachment
})
}
}
}
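
Note on the filename pattern: in a regular expression, alternation binds more loosely than the `^` anchor, so `^A|B` anchors only `A`. The TypeScript sketch below compares an ungrouped and a grouped form of a license-file pattern and shows how an extension-to-MIME lookup behaves for unknown or upper-case extensions. The names `UNGROUPED`, `GROUPED`, and `guessContentType` are hypothetical and exist only for this illustration; lower-casing the extension is an assumption, not something the diff does.

```typescript
import { extname } from 'node:path'

// Hypothetical names, for illustration only; they are not part of the diff.
const UNGROUPED = /^(?:UN)?LICEN[CS]E|NOTICE/i // `^` applies to the first alternative only
const GROUPED = /^(?:(?:UN)?LICEN[CS]E|NOTICE)/i // `^` applies to both alternatives

// Same extension table as in src/evidence.ts.
const MAP_TEXT_EXTENSION_MIME: Readonly<Record<string, string>> = {
  '': 'text/plain',
  '.md': 'text/markdown',
  '.rst': 'text/prs.fallenstein.rst',
  '.txt': 'text/plain',
  '.xml': 'text/xml'
}

// Assumption: extensions are compared case-insensitively; unknown ones yield undefined.
function guessContentType (fileName: string): string | undefined {
  return MAP_TEXT_EXTENSION_MIME[extname(fileName).toLowerCase()]
}

// Print how each sample name is classified by both patterns and by the lookup.
for (const name of ['LICENSE', 'UNLICENSE.md', 'NOTICE.txt', 'release-notice.js']) {
  console.log(name, UNGROUPED.test(name), GROUPED.test(name), guessContentType(name))
}
```

In this sketch, `release-notice.js` matches only the ungrouped form, because `NOTICE` may then occur anywhere in the file name.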
10 changes: 8 additions & 2 deletions tests/README.md
@@ -15,14 +15,20 @@ Test files must follow the pattern `**.{spec,test}.[cm]?js`, to be picked up.
Test runner is `mocha`, configured in [mocharc file](../.mocharc.js).

```shell
npm test
yarn run test
```

To run specific tests only:
```shell
yarn run test:node --grep "testname"
```

### Snapshots

Some tests check against snapshots.
To update these, set the env var `CYARN_TEST_UPDATE_SNAPSHOTS` to a non-falsy value.

like so:
```shell
CYARN_TEST_UPDATE_SNAPSHOTS=1 npm test
CYARN_TEST_UPDATE_SNAPSHOTS=1 yarn run test
```
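
What counts as "non-falsy" depends on how the test suite reads the variable, which this README does not spell out. A minimal TypeScript sketch, assuming plain JavaScript truthiness is applied to the raw string; the helper name `shouldUpdateSnapshots` is hypothetical and not taken from the test code:

```typescript
// Hypothetical helper; the real test suite may interpret the variable differently.
function shouldUpdateSnapshots (env: Record<string, string | undefined>): boolean {
  // Assumption: any non-empty string counts, so '0' and 'false' would also enable updates.
  return Boolean(env.CYARN_TEST_UPDATE_SNAPSHOTS)
}

console.log(shouldUpdateSnapshots({ CYARN_TEST_UPDATE_SNAPSHOTS: '1' }))
console.log(shouldUpdateSnapshots({}))
```

Under that assumption, leaving the variable unset or empty is the way to keep snapshots untouched.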