file based splits #108

Open
wd60622 opened this issue Dec 29, 2024 · 5 comments

Comments

@wd60622

wd60622 commented Dec 29, 2024

I'm interested in file-based splits, so that the tests of a module stay together in one group.

For example:

a.py
b.py
sub_module/c.py
sub_module/d.py
another_sub_module/e.py

A split might then be three groups: a.py and b.py together, sub_module, and another_sub_module.
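A minimal sketch of the grouping described above, assuming the split key is simply the first directory in each path. The function name `group_by_top_level` and the `<root>` label for top-level files are hypothetical and not part of pytest-split.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_top_level(paths):
    """Group file paths by their first directory; top-level files share one group."""
    groups = defaultdict(list)
    for path in paths:
        parts = PurePosixPath(path).parts
        # Files without a parent directory go into a shared "<root>" group
        # (a hypothetical label chosen for this sketch).
        key = parts[0] if len(parts) > 1 else "<root>"
        groups[key].append(path)
    return dict(groups)


files = [
    "a.py",
    "b.py",
    "sub_module/c.py",
    "sub_module/d.py",
    "another_sub_module/e.py",
]
splits = group_by_top_level(files)
```

Here `splits` has three keys, one per intended group. Deeper nesting is ignored on purpose: everything under `sub_module/` lands in the same group regardless of depth.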

@wd60622
Author

wd60622 commented Dec 29, 2024

A cherry on top would be the ability to recover these groups by name so I can label in other CI/CD

@jerry-git
Owner

Let's first align on the naming:

  • module is a .py file
  • package is a directory which contains .py files, including __init__.py

I think the pragmatic solution would be to ensure that tests within a single module are run in the same group. If the algorithm also took packages into account, it would easily get complex, as packages can be nested to basically any depth.

Considering implementation, I think there are two options:

  1. A new flag, e.g. --no-module-splits. This would bring the feature to both of the splitting algorithms (duration_based_chunks and least_duration, see README and sources in algorithms.py).
  2. A new splitting algorithm

I'd favour option 1: there is also a need for "no splits for classes" (e.g. #82), so option 1 would be the better choice considering future development.

I believe there'd be many use cases for this so happy to take a PR if someone wants to give it a shot.
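As a rough illustration of what option 1 could mean for a greedy, least-duration style assignment: treat each module as one indivisible unit and always place the next-longest module into the currently lightest group. This is a sketch under stated assumptions, not the code in algorithms.py; the function `assign_modules` and its input shape (module path mapped to total seconds) are hypothetical.

```python
import heapq


def assign_modules(module_durations, n_groups):
    """Greedily place each module, longest first, into the currently lightest group."""
    # Min-heap of (accumulated seconds, group index).
    heap = [(0.0, index) for index in range(n_groups)]
    heapq.heapify(heap)
    groups = [[] for _ in range(n_groups)]
    # Sort by descending duration; the name breaks ties so the result is deterministic.
    for module in sorted(module_durations, key=lambda m: (-module_durations[m], m)):
        load, index = heapq.heappop(heap)
        groups[index].append(module)
        heapq.heappush(heap, (load + module_durations[module], index))
    return groups


example = assign_modules(
    {"a.py": 3.0, "b.py": 2.5, "sub_module/c.py": 0.5},
    n_groups=2,
)
```

Because whole modules are the unit of assignment, no module can end up in two groups. The cost is coarser balancing when a single module dominates the total duration.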

@jerry-git
Owner

recover these groups by name so I can label in other CI/CD

@wd60622 what do you mean by this?

@wd60622
Author

wd60622 commented Dec 30, 2024

@wd60622 what do you mean by this?

I would like to be able to know which files / modules are in each group. The motivation is the body of pymc-labs/pymc-marketing#1158, where I display the groups in a GitHub issue.
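To make the request concrete, here is a small sketch of the kind of summary meant here. It assumes the test node IDs of each group are already available as lists (how they are obtained is left open), and the helper name `files_per_group` is hypothetical.

```python
def files_per_group(groups):
    """Map a 1-based group number to the sorted, de-duplicated test files in it."""
    return {
        # A pytest node ID has the form "path/to/file.py::Class::test",
        # so everything before the first "::" is the file.
        number: sorted({node_id.split("::", 1)[0] for node_id in node_ids})
        for number, node_ids in enumerate(groups, start=1)
    }


summary = files_per_group(
    [
        ["a.py::test_one", "a.py::test_two", "b.py::test_three"],
        ["sub_module/c.py::TestC::test_four", "sub_module/d.py::test_five"],
    ]
)
```

A mapping like `summary` could then be rendered as a table or list in an issue body or a CI job label.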

@wd60622
Author

wd60622 commented Dec 30, 2024

Thanks for clarifying the problem a bit @jerry-git

Are you imagining that the --splitting-algorithm flag works the same way as before?

For instance,

# Same behavior as before
pytest --splitting-algorithm=duration_based_chunks

# New behavior
pytest --splitting-algorithm=duration_based_chunks --no-module-splits

Here --no-module-splits would act as a preprocessing step on the durations before the chosen algorithm is run.
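A sketch of what such a preprocessing step might look like, assuming the durations are a mapping from pytest node ID to seconds. The function name `collapse_to_modules` is hypothetical, and how the plugin would wire this in is not decided in this thread.

```python
from collections import defaultdict


def collapse_to_modules(durations):
    """Sum per-test durations into one duration per module (the .py file)."""
    per_module = defaultdict(float)
    for node_id, seconds in durations.items():
        # Everything before the first "::" in a node ID is the module path.
        per_module[node_id.split("::", 1)[0]] += seconds
    return dict(per_module)


module_durations = collapse_to_modules(
    {
        "a.py::test_one": 1.0,
        "a.py::test_two": 2.0,
        "sub_module/c.py::TestC::test_three": 0.5,
    }
)
```

Either splitting algorithm could then operate on `module_durations` instead of per-test durations, with each module expanded back to its tests once the groups are chosen.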


2 participants