v0.19.0
🚨 Breaking Changes
- Use the new RF backend by default for classification (#3686) @hcho3
- Deprecating quantile-per-tree and removing three previously deprecated Random Forest parameters (#3667) @vinaydes
- Update predict() / predict_proba() of RF to match sklearn (#3609) @hcho3 (see the sketch after this list)
- Upgrade FAISS to 1.7.x (#3509) @viclafargue
- cuML's estimator Base class for preprocessing models (#3270) @viclafargue
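
The sklearn-alignment change in #3609 affects how RF classifier predictions are consumed. Below is a minimal, illustrative sketch (not taken from the release notes); it assumes the standard `cuml.ensemble.RandomForestClassifier` API, and the data and hyperparameters are placeholders:

```python
# Illustrative sketch: after #3609, predict() returns class labels and
# predict_proba() returns per-class probabilities, matching the sklearn convention.
import cupy as cp
from cuml.ensemble import RandomForestClassifier

X = cp.random.rand(200, 4).astype(cp.float32)
y = (X[:, 0] > 0.5).astype(cp.int32)

clf = RandomForestClassifier(n_estimators=10).fit(X, y)
labels = clf.predict(X)        # integer class labels, shape (200,)
proba = clf.predict_proba(X)   # per-class probabilities, shape (200, 2)
```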
🐛 Bug Fixes
- Fix brute force KNN distance metric issue (#3755) @viclafargue
- Fix min_max_axis (#3735) @viclafargue
- Fix NaN errors observed with ARIMA in CUDA 11.2 builds (#3730) @Nyrio
- Fix random state generator (#3716) @viclafargue
- Fixes the out of memory access issue for computeSplit kernels (#3715) @vinaydes
- Fixing UMAP gtest failure under CUDA 11.2 (#3696) @cjnolet
- Fix irreproducibility issue in RF classification (#3693) @vinaydes
- BUG fix BatchedLevelAlgo DtClsTest & DtRegTest failing tests (#3690) @venkywonka
- Restore the functionality of RF score() (#3685) @hcho3
- Use main build.sh to build docs in docs CI (#3681) @dantegd
- Revert "Update conda recipes pinning of repo dependencies" (#3680) @raydouglass
- Skip tests that fail on CUDA 11.2 (#3679) @dantegd
- Dask KNN Cl&Re 1D labels (#3668) @viclafargue
- Update conda recipes pinning of repo dependencies (#3666) @mike-wendt
- OOB access in GLM SoftMax (#3642) @divyegala
- SilhouetteScore C++ tests seed (#3640) @divyegala
- SimpleImputer fix (#3624) @viclafargue
- Silhouette Score `make_monotonic` for non-monotonic label set (#3619) @divyegala
- Fixing support for empty rows in sparse Jaccard / Cosine (#3612) @cjnolet
- Fix train_test_split with stratify option (#3611) @Nanthini10
- Update predict() / predict_proba() of RF to match sklearn (#3609) @hcho3
- Change dask and distributed branch to main (#3593) @dantegd
- Fixes memory allocation for experimental backend and improves quantile computations (#3586) @vinaydes
- Add ucx-proc package back that got lost during an auto merge conflict (#3550) @dantegd
- Fix failing Hellinger gtest (#3549) @cjnolet
- Directly invoke make for non-CMake docs target (#3534) @wphicks
- Fix Codecov.io Coverage Upload for Branch Builds (#3524) @mdemoret-nv
- Ensure global_output_type is thread-safe (#3497) @wphicks (see the sketch after this list)
- List as input for SimpleImputer (#3489) @viclafargue
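
As a companion to the global_output_type fix (#3497), here is a hedged sketch of how the output-type setting is typically used; the estimator and data are placeholders, and the concurrency guarantees themselves are described in the PR, not shown here:

```python
# Assumed usage: the output type set via using_output_type() controls the array
# type returned by estimators; #3497 makes this setting safe under concurrent use.
import numpy as np
import cuml
from cuml.cluster import KMeans

X = np.random.rand(100, 4).astype(np.float32)

with cuml.using_output_type("numpy"):
    labels = KMeans(n_clusters=3).fit_predict(X)  # returned as a NumPy array

print(type(labels))  # <class 'numpy.ndarray'>
```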
📖 Documentation
- Add sparse docstring comments (#3712) @JohnZed
- FIL and Dask demo (#3698) @miroenev
- Deprecating quantile-per-tree and removing three previously deprecated Random Forest parameters (#3667) @vinaydes
- Fixing Indentation for Docstring Generators (#3650) @mdemoret-nv
- Update doc to indicate ExtraTree support (#3635) @hcho3
- Update doc, now that FIL supports multi-class classification (#3634) @hcho3
- Document model_type='xgboost_json' in FIL (#3633) @hcho3
- Including log loss metric to the documentation website (#3617) @lowener
- Update the build doc regarding the use of GCC 7.5 (#3605) @hcho3
- Update One-Hot Encoder doc (#3600) @lowener
- Fix documentation of KMeans (#3595) @lowener
🚀 New Features
- Reduce the size of the cuml libraries (#3702) @robertmaynard
- Use ninja as default CMake generator (#3664) @wphicks
- Single-Linkage Hierarchical Clustering Python Wrapper (#3631) @cjnolet
- Support for precomputed distance matrix in DBSCAN (#3585) @Nyrio
- Adding haversine to brute force knn (#3579) @cjnolet
- Support for sample_weight parameter in LogisticRegression (#3572) @viclafargue (see the sketch after this list)
- Provide "--ccache" flag for build.sh (#3566) @wphicks
- Eliminate unnecessary includes discovered by cppclean (#3564) @wphicks
- Single-linkage Hierarchical Clustering C++ (#3545) @cjnolet
- Expose sparse distances via semiring to Python API (#3516) @lowener
- Use cmake --build in build.sh to facilitate switching build tools (#3487) @wphicks
- Add cython hinge_loss (#3409) @Nanthini10
- Adding CodeCov Info for Dask Tests (#3338) @mdemoret-nv
- Add predict_proba() to XGBoost-style models in FIL C++ (#2894) @levsnv
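
To illustrate the sample_weight support added in #3572, here is a short sketch assuming the usual sklearn-style `fit()` signature; the data and weighting scheme are invented for illustration:

```python
# Illustrative sketch: #3572 lets LogisticRegression.fit() accept a per-sample
# weight vector alongside the features and labels.
import cupy as cp
from cuml.linear_model import LogisticRegression

X = cp.random.rand(100, 3).astype(cp.float32)
y = (X[:, 0] + X[:, 1] > 1.0).astype(cp.float32)
w = cp.where(y == 1, 2.0, 1.0).astype(cp.float32)  # up-weight the positive class

model = LogisticRegression().fit(X, y, sample_weight=w)
preds = model.predict(X)
```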
🛠️ Improvements
- Updating docs, readme, and umap param tests for 0.19 (#3731) @cjnolet
- Locking RAFT hash for 0.19 (#3721) @cjnolet
- Upgrade to Treelite 1.1.0 (#3708) @hcho3
- Update to XGBoost 1.4.0rc1 (#3699) @hcho3
- Use the new RF backend by default for classification (#3686) @hcho3
- Update LogisticRegression documentation (#3677) @viclafargue
- Preprocessing out of experimental (#3676) @viclafargue
- ENH Decision Tree new backend `computeSplit*Kernel` histogram calculation optimization (#3674) @venkywonka
- Remove `check_cupy8` (#3669) @viclafargue
- Use custom conda build directory for ccache integration (#3658) @dillon-cullinan
- Disable three flaky tests (#3657) @hcho3
- CUDA 11.2 developer environment (#3648) @dantegd
- Store data frequencies in tree nodes of RF (#3647) @hcho3
- Row major Gram matrices (#3639) @tfeher
- Converting all Estimator Constructors to Keyword Arguments (#3636) @mdemoret-nv (see the sketch after this list)
- Adding make_pipeline + test score with pipeline (#3632) @viclafargue
- ENH Decision Tree new backend `computeSplitClassificationKernel` histogram calculation and occupancy optimization (#3616) @venkywonka
- Revert "ENH Fix stale GHA and prevent duplicates" (#3614) @mike-wendt
- ENH Fix stale GHA and prevent duplicates (#3613) @mike-wendt
- KNN from RAFT (#3603) @viclafargue
- Update Changelog Link (#3601) @ajschmidt8
- Move SHAP explainers out of experimental (#3596) @dantegd
- Fixing compatibility issue with CUDA array interface (#3594) @lowener
- Remove cutlass usage in row major input for euclidean exp/unexp, cosine and L1 distance matrix (#3589) @mdoijade
- Test FIL probabilities with absolute error thresholds in python (#3582) @levsnv
- Removing sparse prims and fused l2 nn prim from cuml (#3578) @cjnolet
- Prepare Changelog for Automation (#3570) @ajschmidt8
- Print debug message if SVM convergence is poor (#3562) @tfeher
- Fix merge conflicts in 3552 (#3557) @ajschmidt8
- Additional distance metrics for ANN (#3533) @viclafargue
- Improve warning message when QN solver reaches max_iter (#3515) @tfeher
- Fix merge conflicts in 3502 (#3513) @ajschmidt8
- Upgrade FAISS to 1.7.x (#3509) @viclafargue
- ENH Pass ccache variables to conda recipe & use Ninja in CI (#3508) @Ethyling
- Fix forward-merger conflicts in #3502 (#3506) @dantegd
- Sklearn meta-estimators into namespace (#3493) @viclafargue
- Add flexibility to copyright checker (#3466) @lowener
- Update sparse KNN to use rmm device buffer (#3460) @lowener
- Fix forward-merger conflicts in #3444 (#3455) @ajschmidt8
- Replace ML::MetricType with raft::distance::DistanceType (#3389) @lowener
- RF param initialization cython and C++ layer cleanup (#3358) @venkywonka
- MNMG RF broadcast feature (#3349) @viclafargue
- cuML's estimator Base class for preprocessing models (#3270) @viclafargue
- Make `_get_tags` a class/static method (#3257) @dantegd
- NVTX Markers for RF and RF-backend (#3014) @venkywonka
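
A short sketch of the constructor convention introduced in #3636; the estimator and parameter names below (standard KMeans hyperparameters) are chosen only for illustration:

```python
# Hyperparameters are passed to estimator constructors by keyword rather than
# positionally.
from cuml.cluster import KMeans

km = KMeans(n_clusters=8, max_iter=300, random_state=0)  # keyword arguments
# Positional hyperparameters, e.g. KMeans(8, 300), are no longer the supported style.
```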