- Features
- df.evaluate_iterator for efficient parallel chunked evaluation #515
- Performance
- IncrementalPredictor uses parallel chunked support (2x speedup possible) #515
- Fix
- Performance
- Dataframes are always true (implements __bool__) to avoid calling __len__ #496
- Fixes
- Do not duplicate column when joining DataFrames on a column with the same name #480
- Better error messages and stack traces, and better behavior under a debugger. #488
- Accept numpy scalars in expressions. #462
- Expression.astype can create datetime64 columns out of (arrow) string arrays. #440
- Invalid mask access was triggered for strings when the file was memory-mapped read-only. #459
- Features
- IncrementalPredictor for scikit-learn models that support the .partial_fit method #497
- Fixes
- Adding unique function names to dataframes to enable adding a predictor twice #492
- Compatibility with vaex-core 1.4.0
- Performance
- Parallel df.evaluate #474
- Avoid calling df.get_column_names (1000x for 1 billion rows per column use) #473
- Slicing, e.g. df[1:-1], is much faster for filtered dataframes #471
- Dataframe copying and expression rewriting was slow #470
- Double indices columns were not using index cache since empty dict is falsy #439
- Features
- Requires vaex-core >=1.3,<2 for parallel evaluate
- Fixes
- bqplot 0.12 revealed a bug/inconsistency with heatmap #465
- Fixes
- Support for Apache Arrow >= 0.15
- Fixes
- Docstrings and minor improvements
- initial release 0.1
- feature: auto upcasting for sum #435
- fix: selection/filtering fix when using masked values #431
- fix: masked string array fixes #434
- fix: memory usage fix for joins #439
- fix: support for Apache Arrow >= 0.15
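
To illustrate why the `__bool__` change (#496) helps, here is a minimal sketch in plain Python. It does not use vaex: `LazyTable` and `AlwaysTrueTable` are hypothetical stand-ins for an object whose length is expensive to compute, for example because a filter must be evaluated over every row. Python's truth test falls back to `__len__` only when `__bool__` is not defined, so defining `__bool__` skips the costly count.

```python
class LazyTable:
    """Hypothetical stand-in for a dataframe whose length is costly to compute."""

    def __init__(self, values, keep):
        self._values = list(values)
        self._keep = keep      # filter predicate applied to every row
        self.len_calls = 0     # counts how often the expensive path runs

    def __len__(self):
        # Counting the filtered rows needs a full pass over the data.
        self.len_calls += 1
        return sum(1 for v in self._values if self._keep(v))


class AlwaysTrueTable(LazyTable):
    def __bool__(self):
        # Truth testing stops here; Python never falls back to __len__.
        return True


def is_even(v):
    return v % 2 == 0


plain = LazyTable(range(10), is_even)
fast = AlwaysTrueTable(range(10), is_even)

if plain:   # no __bool__ defined, so this calls __len__
    pass
if fast:    # __bool__ answers directly
    pass

print(plain.len_calls, fast.len_calls)  # prints: 1 0
```

One consequence of this design is that an empty object is also truthy, so code that wants to check for emptiness has to call `len()` explicitly.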