
Fix for numpy2.0 compatibility #1341

Merged
PGijsbers merged 1 commit into fix/sklearn-test-compatibility from fix/numpy2compatible on Jul 3, 2024

Conversation

PGijsbers
Collaborator

NumPy 2.0 cleaned up its namespace, so we have to alias the removed names for the new version.

The problem is that this is not covered in CI (the tested versions do not support NumPy 2.0), and in general the tests need to be updated to support more recent scikit-learn versions that support NumPy 2.0 (1.5, maybe 1.4?). I don't know if I really have the time for that right now. Options:

  • Put a pin in the dependencies
  • Accept this as a best-effort, and hope that the tests that don't fail due to scikit-learn incompatibility sufficiently cover the code to raise any numpy1/2 issues
  • Wait until I find time to make tests support scikit-learn 1.5
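
The aliasing the fix describes can be sketched roughly as below. This is a minimal sketch, not the actual patch: `NUMPY_FLOAT` is a hypothetical name I introduce for illustration, and I assume the typical case of a top-level alias such as `np.float_` that NumPy 2.0 removed.

```python
import numpy as np

# Minimal sketch of a numpy 1.x / 2.x compatibility alias.
# NumPy 2.0 removed several top-level aliases (e.g. np.float_),
# so code that still uses them needs a version-dependent shim.
if np.lib.NumpyVersion(np.__version__) >= "2.0.0":
    NUMPY_FLOAT = np.float64  # np.float_ was removed in numpy 2.0
else:
    NUMPY_FLOAT = np.float_   # still available on numpy 1.x
```

Because `np.float_` was merely an alias for `np.float64` on 1.x, code using the shim behaves identically on both major versions.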

NumPy 2.0 cleaned up its namespace.
@PGijsbers PGijsbers added the testing and dependencies labels Jul 3, 2024
@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 28.20%. Comparing base (923b49d) to head (124e96f).

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1341      +/-   ##
===========================================
- Coverage    32.94%   28.20%   -4.75%     
===========================================
  Files           38       38              
  Lines         5254     5255       +1     
===========================================
- Hits          1731     1482     -249     
- Misses        3523     3773     +250     


@PGijsbers PGijsbers changed the base branch from develop to fix/sklearn-test-compatibility July 3, 2024 14:08
@PGijsbers PGijsbers marked this pull request as ready for review July 3, 2024 14:08
@PGijsbers PGijsbers merged commit 9af2f62 into fix/sklearn-test-compatibility Jul 3, 2024
9 of 10 checks passed
@PGijsbers PGijsbers deleted the fix/numpy2compatible branch July 3, 2024 14:08
PGijsbers added a commit that referenced this pull request Jul 5, 2024
* Update 'sparse' parameter for OHE for sklearn >= 1.4
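
The rename behind this commit can be handled with a small version dispatch; a sketch under the usual assumption that `OneHotEncoder`'s `'sparse'` parameter was renamed to `'sparse_output'` in scikit-learn 1.2 and removed in 1.4 (the helper name is mine, not from the patch):

```python
def ohe_sparse_kwargs(sklearn_version: str) -> dict:
    """Return the OneHotEncoder keyword requesting dense output.

    'sparse' was renamed to 'sparse_output' in scikit-learn 1.2 and
    removed in 1.4, so pick the keyword based on the installed version.
    """
    major, minor = (int(part) for part in sklearn_version.split(".")[:2])
    if (major, minor) >= (1, 2):
        return {"sparse_output": False}
    return {"sparse": False}
```

Usage would then be `OneHotEncoder(**ohe_sparse_kwargs(sklearn.__version__))`.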

* Add compatibility or skips for sklearn >= 1.4

* Change 'auto' to 'sqrt' for sklearn>1.3 as 'auto' is deprecated

* Skip flaky test

It is unclear how to create the conditions under which the test is supposed to pass. Even after running the test suite 2-3 times, it does not seem to pass.

* Fix typo

* Ignore description comparison for newer scikit-learn

There are some minor changes to the docstrings. I am not sure it is useful to keep testing it this way, so for now I will disable the test on newer versions.

* Adjust for scikit-learn 1.3

The loss has been renamed.
The performance of the model also seems to have changed slightly
for the same seed. So I decided to compare with the lower fidelity
that was already used on Windows systems.

* Remove timeout and reruns to better investigate CI failures

* Fix typo in parameter name

* Add jobs for more recent scikit-learns

* Expand the matrix with all scikit-learn 1.x versions

* Fix for numpy2.0 compatibility (#1341)

NumPy 2.0 cleaned up its namespace.

* Rewrite matrix and update numpy compatibility

* Move comment in-line

* Stringify name of new step to see if that prevented the action

* Fix unspecified os for included jobs

* Fix typo in version pinning for numpy

* Fix version specification for sklearn skips

* Output final list of installed packages for debugging purposes

* Cap scipy version for older versions of scikit-learn

There is a breaking change to the way 'mode' works that breaks scikit-learn internals.

* Update parameter base_estimator to estimator for sklearn>=1.4
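
One way to support both parameter names without hard-coding version numbers is to inspect the constructor signature; a sketch (the helper name is mine, and this is only one possible approach, not necessarily the one the commit took):

```python
import inspect

def estimator_kwarg(cls, estimator) -> dict:
    """Build the keyword argument for passing a sub-estimator.

    Uses whichever parameter name the class accepts: 'estimator'
    (newer scikit-learn) or the older 'base_estimator'.
    """
    params = inspect.signature(cls.__init__).parameters
    name = "estimator" if "estimator" in params else "base_estimator"
    return {name: estimator}
```

A call site would then read `AdaBoostClassifier(**estimator_kwarg(AdaBoostClassifier, tree))` regardless of the installed scikit-learn.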

* Account for changes to sklearn interface in 1.4 and 1.5

* Non-strict reinstantiation requires different scikit-learn version

* Parameters were already changed in 1.4

* Fix race condition (I think)

It seems to me that run.evaluations is set only when the run is
fetched. Whether it has evaluations depends on server state.
So if the server has resolved the traces between the initial
fetch and the trace-check, you could be checking
len(run.evaluations) where evaluations is None.

* Use latest patch version of each minor release

* Convert numpy types back to builtin types

Scikit-learn or numpy changed the typing of the parameters
(seen in a masked array, not sure if also outside of that).
Convert these values back to Python builtins.
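
The conversion described here can be sketched with `np.generic.item()` (the function name is mine, for illustration):

```python
import numpy as np

def to_builtin(value):
    """Convert numpy scalar types back to Python builtins.

    np.generic is the common base class of all numpy scalars
    (np.int64, np.float64, np.str_, ...), and .item() returns
    the closest builtin equivalent. Non-numpy values pass through.
    """
    if isinstance(value, np.generic):
        return value.item()
    return value
```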

* Specify versions with * instead to allow for specific patches

* Flow_exists does not return None but False if the flow does not exist

* Update new version definitions also in the installation step

* Fix bug introduced in refactoring for np.generic support

We don't want to serialize the value as np.nan; we want to include the nan directly. It is an indication that the parameter was left unset.

* Add back the single-test timeout of 600s

* [skip ci] Add note to changelog

* Check that evaluations are present with None-check instead

The default behavior when no evaluations are present is for it to be None, so it makes sense to check for that instead. As far as I can tell, run.evaluations should always contain some items when it is not None, but I added an assert just in case.
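
The None-check plus assert described above can be sketched like this (the function name is mine; I only assume a `run` object with an `evaluations` attribute, as described in the commit message):

```python
def has_evaluations(run) -> bool:
    """True if the server has attached evaluations to this run.

    run.evaluations is None until the server has computed them, so
    calling len() on it directly can raise in that window. When it
    is not None it should contain items; assert that just in case.
    """
    if run.evaluations is None:
        return False
    assert len(run.evaluations) > 0, "non-None evaluations should not be empty"
    return True
```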

* Remove timeouts again

I suspect they "crash" workers. This of course introduces the risk of hanging processes... But I cannot reproduce the issue locally.
Labels
dependencies, testing
2 participants