Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Dependency finder for smarter builds #52471

Closed
wants to merge 10 commits into from
Closed

Conversation

noahtallen
Copy link
Contributor

Changes proposed in this Pull Request

This is the initial implementation of the dependency finder package. The ultimate goal is to be able to take a certain CI/CD build x, and then determine when it should run. Builds should run when files which would change the build output are modified.

This package attempts to find all files which could possibly affect the build output of a given package. Here are some items it finds:

  • All local project imports (e.g. "found files")
  • All imported npm packages (including monorepo ones)

It also lists files which don't fit into the above definitions:

  • Files which exist in the filesystem, but do not appear to be imported. (Like a bash script or configuration file.)
  • Files which were imported, but could not be located on the filesystem.

To Do

  • Solve all reported "missing files"
    • Dependencies imported via dependency extraction plugin (a8c-fse-common-data-stores)
    • Some bug with the whats-new imports in ETK

Testing instructions

Run this in your shell to test just for ETK. Leave out the apps/editing-toolkit portion if you want to see for all packages/apps.

yarn workspace @automattic/dependency-finder build
node ./packages/dependency-finder/dist/esm/index.js apps/editing-toolkit

You should see all the files it detects in the different categories.

@noahtallen noahtallen requested a review from a team May 1, 2021 03:41
@noahtallen noahtallen self-assigned this May 1, 2021
@matticbot matticbot added [Status] Needs Review The PR is ready for review. This also triggers e2e canary tests and wp-desktop tests automatically. labels May 1, 2021
@github-actions
Copy link

github-actions bot commented May 1, 2021

@matticbot
Copy link
Contributor

This PR does not affect the size of JS and CSS bundles shipped to the user's browser.

Generated by performance advisor bot at iscalypsofastyet.com.

'packages/wpcom-proxy-request',
'packages/wpcom.js',
].map( ( pkg ) => path.resolve( pkg ) );
function getMonorepoPackages() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use ./build-tools/lib/monorepo.js

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect! I'm not sure how to include it, though. I keep getting the error Could not find a declaration file for module '../../../build-tools/lib/monorepo'. when I add import { packagesInMonorepo } from '../../../build-tools/lib/monorepo';

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I experimented with an index.d.ts file (declaring a module for monorepo), but it didn't seem to help


// Add the first variation which exists to the accumulator.
for ( const possibleFile of fileNameVariations ) {
const packagePath = resolve( 'node_modules', dirName, possibleFile );
Copy link
Contributor

@scinos scinos May 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would need to keep walking the tree structure up for more node_modules (eg: ./client/node_modules, then ./node_modules, then ../node_modules..., right?

Maybe use webpack's https://github.com/webpack/enhanced-resolve instead of building our own.

Or even directly use sass-loader? Looks like all their importing logic is exported in https://github.com/webpack-contrib/sass-loader/blob/master/src/utils.js

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly, yeah. This approach does work for all imports in ETK, but it's likely missing stuff. Will look into those libraries!

@noahtallen noahtallen force-pushed the try/dependency-finder branch from f6ce5d1 to a7c144f Compare May 24, 2021 21:52
@noahtallen
Copy link
Contributor Author

Alright, this is in a pretty solid state. I've been working through all the edges just parsing through one package's files (ETK), and it's performing well.

  • Correctly detecting all node_modules and monorepo dependencies
  • Correctly finding all internal files based on imports

The only missing file is actually an entry related to the webpack dependency extraction plugin. a8c-fse-common-data-stores represents a dependency on a shared script. (Defined in webpack here). I'm thinking that we can probably just ignore that, at least for now. The actual implementation of that script is detected under "known" files.

The unknown files right now are mostly a combination of the following:

  • Bin scripts
  • Configs (like eslint, gitignore, etc)
  • Non-compiled JavaScript. E.g. ES5 JS which is directly enqueued by the plugin PHP.

I'm not exactly sure how to tie those into the results. 🤔

Package:
  apps/editing-toolkit
Missing files:
  a8c-fse-common-data-stores
Packages:
  @automattic/[email protected]
  @automattic/[email protected]
  @automattic/[email protected]
  @automattic/[email protected]
  @automattic/[email protected]
  @automattic/[email protected]
  @automattic/[email protected]
  @automattic/[email protected]
  @automattic/[email protected]
  @automattic/[email protected]
  @automattic/[email protected]
  @types/[email protected]
  @types/[email protected]
  @types/[email protected]
  @types/[email protected]
  @types/[email protected]
  @types/[email protected]
  @types/[email protected]
  @types/[email protected]
  @types/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  @wordpress/[email protected]
  [email protected]
  [email protected]
  [email protected]
  [email protected]
  [email protected]
  [email protected]
  [email protected]
  [email protected]
  [email protected]
Found files:
  /home/nt/dev/wp-calypso/apps/editing-toolkit/editing-toolkit-plugin/block-helpers/block-helpers.test.ts
  /** A long list of found files; basically everything in the ETK directory **/
Unkown files:
  /home/nt/dev/wp-calypso/apps/editing-toolkit/editing-toolkit-plugin/wpcom-block-editor-nav-sidebar/src/wp-edit-post-types.d.ts
  /home/nt/dev/wp-calypso/apps/editing-toolkit/editing-toolkit-plugin/newspack-blocks/synced-newspack-blocks/blocks/homepage-articles/editor.js
  /home/nt/dev/wp-calypso/apps/editing-toolkit/editing-toolkit-plugin/newspack-blocks/synced-newspack-blocks/blocks/carousel/editor.js
  /home/nt/dev/wp-calypso/apps/editing-toolkit/editing-toolkit-plugin/newspack-blocks/synced-newspack-blocks/amp/homepage-articles/view.js
  /home/nt/dev/wp-calypso/apps/editing-toolkit/editing-toolkit-plugin/newspack-blocks/synced-newspack-blocks/shared/js/newspack-icon.js
  /home/nt/dev/wp-calypso/apps/editing-toolkit/editing-toolkit-plugin/event-countdown-block/event-countdown.js
  /home/nt/dev/wp-calypso/apps/editing-toolkit/editing-toolkit-plugin/mailerlite/subscriber-popup.js
  /home/nt/dev/wp-calypso/apps/editing-toolkit/editing-toolkit-plugin/phpunit.xml.dist
  /home/nt/dev/wp-calypso/apps/editing-toolkit/editing-toolkit-plugin/editor-site-launch/src/wp-components-types.d.ts
  /home/nt/dev/wp-calypso/apps/editing-toolkit/editing-toolkit-plugin/global-styles/static/style.css
  /home/nt/dev/wp-calypso/apps/editing-toolkit/bin/setup-env.sh
  /home/nt/dev/wp-calypso/apps/editing-toolkit/bin/wpcom-watch-and-sync.sh
  /home/nt/dev/wp-calypso/apps/editing-toolkit/bin/sync-newspack-blocks.sh
  /home/nt/dev/wp-calypso/apps/editing-toolkit/bin/npm-run-build.js
  /home/nt/dev/wp-calypso/apps/editing-toolkit/bin/js-unit-setup.js
  /home/nt/dev/wp-calypso/apps/editing-toolkit/bin/babel-transform.js
  /home/nt/dev/wp-calypso/apps/editing-toolkit/.eslintrc.js
  /home/nt/dev/wp-calypso/apps/editing-toolkit/.wp-env.json
  /home/nt/dev/wp-calypso/apps/editing-toolkit/.gitignore
  /home/nt/dev/wp-calypso/apps/editing-toolkit/typings/index.d.ts

@scinos
Copy link
Contributor

scinos commented May 25, 2021

Correctly detecting all node_modules and monorepo dependencies

Eventually we would need to make the tool smarter: it should traverse node_modules dependencies, so if a transitive dependency changes it knows it has to re-build ETK. For now maybe we can just assume that if package.json or yarn.lock changes, ETK has changed.

I'm not exactly sure how to tie those into the results. 🤔

The idea is that if any of those files can alter the compilation result (maybe ./bin files?) they should be added as additionalEntryPoints, so their deps can be scanned.

@noahtallen
Copy link
Contributor Author

The idea is that if any of those files can alter the compilation result (maybe ./bin files?) they should be added as additionalEntryPoints, so their deps can be scanned.

Hmm, yeah. At this point, most of the "unknown" files can change the compilation result, but they are not compiled by webpack. I'm thinking of them as static assets. The bin scripts are a mixed bag, some of them can alter, some can't.

I'm wondering what the most flexible approach would be. I think it would be hard to maintain a list of all files which can alter the build but are not processed by webpack. (We'd have to pick through the bin scripts of every app/package.) Is it reasonable to assume they can all affect the result, unless we know it's safe?

For example, we could exclude all .gitignore and eslint config files. But we could assume that modifications to a .sh script under $app_path will modify that $app.

so if a transitive dependency changes it knows it has to re-build ETK

Yeah.. I was thinking that we could do something like that following:

  • file A is in calypso/packages/calypso-analytics
  • $foo has a dependency on calypso-analytics
  • if file A is modified, we know that we are changing the calypso-analytics package
  • therefore, we can also trigger the build for $foo because we know it has the dependency

Do we need to inspect the files further than just knowing that we have a dependency, and then trigger the build for $foo in two scenarios?

  1. The dependency is internal, so whenever any files change there.
  2. The dependency is external, so whenever the version number updates.

@noahtallen noahtallen force-pushed the try/dependency-finder branch from 949c625 to a3b61dc Compare May 25, 2021 23:58
@scinos
Copy link
Contributor

scinos commented May 26, 2021

Is it reasonable to assume they can all affect the result, unless we know it's safe?

That's a good point, we can do that. We won't need mosts of the dependency walking code, we can just define when to build ETK based on include/exclude globs. Eg:

include: ['apps/editing-toolkit/editing-toolkit-plugin'],
exclude: ['apps/editing-toolkit/editing-toolkit-plugin/.eslintrc.js', 'apps/editing-toolkit/editing-toolkit-plugin/typings/*' ] //etc

The dependency is internal, so whenever any files change there.

Yes! We should apply the same logic that we apply for $foo to detect when $foo has actually changed: Either traverse $foo/package.json and find all related files, or use the include/exclude globs.

The dependency is external, so whenever the version number updates.

Yep, but we need to remember to do it for any package in the full dependency tree, not only the direct dependencies. Probably the best way to generate that tree is look at yarn.lock, all info is there.

@noahtallen
Copy link
Contributor Author

We won't need mosts of the dependency walking code, we can just define when to build ETK based on include/exclude globs.

One of my worries is having to create this lists for every single thing we want to process. I could see that being hard to maintain, especially if we want to expand this tool across the organization 🤔

We won't need most of the dependency walking code

true! I guess the main problem with using just "include" with no dependency traversal would be if something in ETK imported something from elsewhere in the monorepo without declaring it in package.json. And I guess that probably isn't a huge problem either as we enforce dependency lint rules more strictly.

To correctly parse JS dependencies, we need to:
1. Pass a long enough directory path so that
   enhanced-resolve can find the root node_modules
2. Pass the webpack config
@noahtallen noahtallen force-pushed the try/dependency-finder branch from a3b61dc to 362bf2e Compare May 27, 2021 20:11
@noahtallen
Copy link
Contributor Author

@noahtallen
Copy link
Contributor Author

Here's the suggestion from the sass dependency finder tool for how to find node_modules: dependents/node-dependency-tree#132 (comment)

@noahtallen
Copy link
Contributor Author

For next steps, I was exploring the connection between TeamCity and the changed files. Essentially, the TeamCity build calls the script in dependency finder. The dependency finder reads a list of changed files from TeamCity. It also reads a list of different projects which include TeamCity project IDs. (For now, I put that in package-map.json. There might be a better spot.)

The next step is comparing the changed files to the list of jobs, and then calling the TeamCity API to trigger that Job ID.

I started writing a "find matching jobs" function, but I don't think it quite works yet. (https://github.com/Automattic/wp-calypso/pull/52471/files#diff-b71cfeff41c95e0ccfbf40d4f246e9b798c762d0644362cc9b3573529e3f1a64R104-R114) Everything should be pretty well typed, though.

@noahtallen
Copy link
Contributor Author

We probably won't be moving forward with this.

@noahtallen noahtallen closed this Mar 14, 2023
@github-actions github-actions bot removed the [Status] Needs Review The PR is ready for review. This also triggers e2e canary tests and wp-desktop tests automatically. label Mar 14, 2023
@sirreal sirreal deleted the try/dependency-finder branch January 14, 2025 17:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants