Releases · pola-rs/polars

01 May 13:02

github-actions

py-0.17.11

27cf122

Python Polars 0.17.11

🚀 Performance improvements

improve nested grouptuples related code (#8618)
buffer spill partitions in ooc sort. ~10/20% (#8616)
avoid potentially redundant casts on Series init (#8613)

✨ Enhancements

add Expr.meta namespace eq and ne methods (#8599)
avoid potentially redundant casts on Series init (#8613)
use temp dir for ooc spills (#8614)
add strict dtype equality comparison methods (is_ and is_not) (#8600)
automatically convert series <op> expr to pl.lit(series) <op> expr (#8549)

🐞 Bug fixes

maintain sorted info on top-k and empty sort (#8615)
fix ooc sort regression; don't take IO-thread before init (#8607)
maintain sortedness in date -> datetime cast (#8606)

🛠️ Other improvements

document sortedness of return value of upsample (#8612)
Set up functions module in Rust bindings (#8598)
Split PyExpr impl block into modules (#8596)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @dependabot, @dependabot[bot], @mcrumiller, @ritchie46 and @stinodego

30 Apr 13:03

github-actions

py-0.17.10

94b2bcd

Python Polars 0.17.10

🏆 Highlights

Out-of-core unique (#8573)

🚀 Performance improvements

improve OOC sort performance during partition phase (#8590)
significant speedup for python iteration over Series data (#8501)

✨ Enhancements

make ooc-sort resilient against chunk_size (#8588)
Out-of-core unique (#8573)
Add to_date, to_datetime, to_time to String namespace (#8579)
enhance parametric strategy retrieval, enable List strategy by default (#8571)
Add default value for round (#8566)
don't trigger unreachable code if no dtype is set (#8532)
Ergonomic inputs for all, any, sum, and cumsum (#8541)
accept expressions in groupby_dynamic/rolling (#8528)
add is_nested property to dtypes (#8514)

🐞 Bug fixes

fix determining of supertype for tz-aware and tz-naive datetimes (#8585)
correct for nested offsets in json serialization (#8584)
fix wrong dtype init in streaming groupby (#8574)
fix edge-case with NamedTuple input that contains unhashable field data (#8578)
temporarily disable List dtype in parametric tests (#8581)
fix categorical/string_cache fill_null panic (#8562)
fix testing asserts for NaN values in Struct data (#8557)
fix window function contention in binary expression (#8544)
fix struct pyarrow ffi (#8543)
don't trigger unreachable code if no dtype is set (#8532)
fix testing asserts for NaN values in List data (#8537)
keep sorted info on agg_first and simple singleton… (#8526)
don't downcast Decimal to Float64 in truediv (#8523)
unset fast_unique coming from arrow (#8521)
correct sign-reversed scale on DecimalChunked to Python Decimal conversion (fixes #8423) (#8508)
Clarify and fix behaviour in pl.min/max (#8509)
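
The Decimal conversion fix listed above (#8508) turns on the sign of the scale. As a rough sketch using only Python's standard `decimal` module (this is not Polars's internal code, and the helper name `unscaled_to_decimal` is hypothetical), a fixed-point value stored as an integer with `scale` fractional digits represents `unscaled * 10**(-scale)`:

```python
from decimal import Decimal


def unscaled_to_decimal(unscaled: int, scale: int) -> Decimal:
    """Interpret `unscaled` as a fixed-point number with `scale` fractional digits.

    Hypothetical helper for illustration only.
    """
    # scaleb(n) multiplies by 10**n, so a positive scale needs a negative exponent.
    # Passing +scale here instead would be the "sign-reversed" mistake.
    return Decimal(unscaled).scaleb(-scale)


value = unscaled_to_decimal(12345, 2)
reversed_sign = Decimal(12345).scaleb(2)  # what a flipped sign would produce
```

With the correct sign, 12345 at scale 2 reads as 123.45; with the sign flipped it reads as a value in the millions.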

🛠️ Other improvements

warn about changing date_range default from lazy=False to eager=False (#8593)
Rename internals module to _reexport (#8554)
change partition strategy (#8561)
fix testing asserts for NaN values in Struct data (#8557)
note sortedness of results from groupby ops (#8540)
better type signature for set_sorted (#8529)
add test for categorical input that is not fast_unique (#8527)
Improvements to the Python release workflow (#8121)
Update docs requirements (#8200)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @cgevans, @ritchie46, @stinodego and @uchiiii

25 Apr 13:58

github-actions

py-0.17.9

201fded

Python Polars 0.17.9

Migration guide.

Operations that require columns to be sorted now emit a warning if those columns are neither explicitly sorted nor tagged as sorted.

# 1. inform polars that a column is sorted on the DataFrame / LazyFrame.
(
    df.set_sorted("foo")
    .groupby_dynamic(..)
)

# 2. inform polars inline via the `set_sorted` expression
df.join_asof(df2, on=pl.col("foo").set_sorted())

# 3. explicitly sort first 
# this is expensive if the data is already sorted
df.sort("foo")
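
Why sortedness matters for operations such as as-of joins can be illustrated without Polars at all. The following is a minimal standard-library sketch, not how Polars implements `join_asof`; the helpers `is_sorted` and `asof_lookup` are hypothetical names. A binary search finds the last key that is less than or equal to the probe, which is only meaningful when the keys are sorted:

```python
from bisect import bisect_right


def is_sorted(values):
    """Return True if `values` is in non-decreasing order."""
    return all(a <= b for a, b in zip(values, values[1:]))


def asof_lookup(keys, values, probe):
    """Return the value for the last key <= probe, or None if there is none.

    Assumes `keys` is sorted ascending; the result is unreliable otherwise.
    """
    idx = bisect_right(keys, probe)
    return values[idx - 1] if idx else None


sorted_keys = [1, 3, 5, 7]
sorted_vals = ["a", "b", "c", "d"]
ok = asof_lookup(sorted_keys, sorted_vals, 4)  # last key <= 4 is 3

# Same key/value pairs, but in unsorted order: binary search is no longer valid.
unsorted_keys = [5, 1, 7, 3]
unsorted_vals = ["c", "a", "d", "b"]
unreliable = asof_lookup(unsorted_keys, unsorted_vals, 4)
```

Checking `is_sorted` up front costs a full pass over the data, which is presumably why the guide lets you assert sortedness via a flag instead of re-sorting.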

✨ Enhancements

expose quantile/mean for duration (#8491)
require explicitly sorted flag for upsample (#8488)
allow for _saturating suffix in duration strings (#8479)
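
For the `_saturating` suffix (#8479), a plausible reading is that a month offset landing on a non-existent day is clamped to the last valid day of the target month. The sketch below shows that idea with the standard library only; it assumes this interpretation and does not use Polars's own duration parser. `add_months_saturating` is a hypothetical helper name.

```python
import calendar
from datetime import date


def add_months_saturating(d: date, months: int) -> date:
    """Shift `d` by `months`, clamping the day to the end of the target month."""
    # Work with a zero-based month index so divmod handles year rollover,
    # including negative offsets.
    year, month0 = divmod(d.year * 12 + (d.month - 1) + months, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


jan_31_plus_one = add_months_saturating(date(2023, 1, 31), 1)
leap_year_case = add_months_saturating(date(2024, 1, 31), 1)
```

A non-saturating offset would have to raise or skip in these cases, because 31 February does not exist.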

🐞 Bug fixes

don't error on cast if column is not projected (#8495)
ensure window function succeeds on empty frame (#8492)
don't set verbose on union (#8487)
check literal/group length before claiming agg sta… (#8486)

🛠️ Other improvements

Remove unneeded operation in strptime (#8496)
additional parametric testing docs/examples (#8485)
improve sorted warning/ fix tests (#8484)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @ritchie46 and @stinodego

24 Apr 15:07

github-actions

py-0.17.8

3009a25

Python Polars 0.17.8

🚀 Performance improvements

less naive count (#8473)
parallelise dataframe describe method (#8465)
parallelize almost all flattens (#8468)
optimize horizontal min/max (#8463)
reinstate old behavior in numeric group-tuples (#8445)

✨ Enhancements

apply thousand-separators to "shape" html output, consi… (#8472)
let duration string accept "1mo_saturating" (#8469)
add dt.month_start and dt.month_end (#8435)
add SQL support for cumulative functions (#8457)
improve utility of dtype groups (#8453)
improved parametric Decimal strategy (#8444)
improved hypothesis/parametric testing profile registration (#8433)

🐞 Bug fixes

fix error message of offset_by if offsetting by negative number of months (#8464)
fix sorted warning (#8462)
improve utility of dtype groups (#8453)

🛠️ Other improvements

bubble up time_iter errors (#8467)
additional test coverage for dtype groups (#8458)
integrate live refresh/reload facility while writing docs (#8452)
add a series of parametric/hypothesis example tests to the main testing docs page (#8454)
parametric testing docs improvements (#8447)
improved parametric Decimal strategy (#8444)
improved hypothesis/parametric testing profile registration (#8433)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @ritchie46, @universalmind303 and @utkarshgupta137

22 Apr 20:08

github-actions

py-0.17.7

39c2287

Python Polars 0.17.7

🚀 Performance improvements

remove false sharing in perfect hash table >2x (#8432)
further optimised conversions to python date/datetime (#8417)

✨ Enhancements

initial parametric/hypothesis Decimal dtype testing strategy (note: disabled by default) (#8430)
add Series support to pl.from_repr (#8429)
Allow %f in strptime format strings (#8404)
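
For reference, this is how the `%f` directive behaves in Python's standard library `datetime.strptime`. The example only shows the stdlib semantics; whether Polars's `strptime` matches them in every detail is not stated in these notes and should not be assumed.

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S.%f"

# %f parses one to six fractional-second digits into microseconds.
full = datetime.strptime("2023-04-22 12:34:56.123456", fmt)

# Fewer digits are zero-padded on the right: ".5" means 500000 microseconds.
short = datetime.strptime("2023-04-22 12:34:56.5", fmt)
```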

🐞 Bug fixes

raise upon invalid use of zero_copy_only (#8418)
respect dtype in anonymous list builder in case of… (#8428)
str.strptime error message: utf -> utc (#8422)

🛠️ Other improvements

initial parametric/hypothesis Decimal dtype testing strategy (note: disabled by default) (#8430)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @ayemjay, @jonashaag, @mzjp2, @pgimalac, @ritchie46 and @stinodego

21 Apr 14:00

github-actions

py-0.17.6

5c61983

Python Polars 0.17.6

🚀 Performance improvements

optimize join inner materialization of single keys (#8405)
parallelize sorted group tuple materialization (#8387)
improve materialization of huge cardinality group tuples (#8382)
improve group_tuples materialization (#8375)
conversion speedups from polars int64 timestamps to python temporal types:
- ~35% faster → python date (#8339)
- ~15% faster → python time (#8352)
- ~10% faster → python datetime (#8339)

✨ Enhancements

allow existing item method to optionally take row/col indices (#8412)
allow negative 'arange' expression (#8413)
warn if argument is not explicitly sorted (#8409)
.to_numpy(use_pyarrow=False) for Object and Boolean (#8397)
new hypothesis strategy that can generate data for List dtypes (#8400)
offer cleaner usage pattern for Config object in context-manager context (#8394)
add support for SQL "IN" expr (#8396)
add a "signed" param to Series.is_integer (#8383)
add is_integer (#8373)
raise error on invalid dict aggregation (#8371)
cli output mode & sql read_json (#8336)
more informative keyerror on invalid getitem (#8320)

🐞 Bug fixes

infer supertype in json serde (#8411)
duration on empty df (#8403)
don't inadvertently set Series initialised with nested tuple data as Object dtype (#8401)
use physical in streaming unique global table (#8390)
recursively bubble up all dtypes in list cast (#8386)
is_in struct logical types (#8378)
fix nested null parquet read (#8372)
fix logical type in ListChunked::new_from_index (#8367)
fix unintentional loading of hypothesis profile (#8362)
bubble up logical type in recursive list cast (#8356)
ensure that iter_rows doesn't return nested Timestamp values (#8359)
implement clone_inner for all series (#8357)
add missing __hash__ support to Field, include "time_zone" in Datetime hash, fix Struct hash (#8354)
fix fill_null for categorical (#8353)
time.cast(str) as strftime (#8351)
fix logical dtypes in parallel list collection (#8349)
improve logical types of explode operation (#8348)
logical type in anonymous list builders (#8346)
address potential error caused by float division on time_unit scaling (#8337)
escape csv header names if they contain special chars (#8331)
nested struct/list/categorical logical/physical (#8334)
fix struct schema argument (#8327)
fix precision issue when converting pl.Datetime("ms") to Python datetime (#8332)
fix deserialize empty list (#8326)
List<Null> consistency (#8325)
fix coalesce schema (#8324)
don't do null propagation (#8322)
validate window_size user input in rolling_expr (#8318)
ensure invalid list eval raises (#8317)
fix typing overloads of read_excel (#8300)

🛠️ Other improvements

new hypothesis strategy that can generate data for List dtypes (#8400)
update duration docstring/example (#8392)
Upgrade ruff (#8380)
enhanced parametric testing for temporal dtypes (#8347)
Minor update to strptime (#8345)
adjust pytest config so as not to inadvertently prevent test debugging in IPython consoles (#8308)
add newline in pl.DataFrame.pivot docs (#8307)

Thank you to all our contributors for making this release possible!
@JoonHong-Kim, @MarcoGorelli, @StefanBRas, @alexander-beedie, @avimallu, @grantmcdermott, @jonashaag, @rben01, @ritchie46, @stinodego and @universalmind303

29 Oct 06:14

stinodego

py-0.17.5

c361d87

Python Polars 0.17.5

🚀 Performance improvements

use online variance kernel for aggregation (#8306)
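
An "online" variance kernel computes the variance in a single pass, updating running statistics per element instead of first computing the mean and then revisiting the data. The notes do not say which kernel Polars uses; the sketch below shows Welford's algorithm, one well-known online method, in plain Python. The function name `online_variance` is hypothetical.

```python
import statistics


def online_variance(values, ddof=1):
    """Single-pass variance via Welford's algorithm.

    Returns None when there are too few values for the chosen ddof.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        # Uses the deviation from the old mean and from the updated mean.
        m2 += delta * (x - mean)
    if count <= ddof:
        return None
    return m2 / (count - ddof)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
result = online_variance(data)
reference = statistics.variance(data)  # two-pass reference from the stdlib
```

Besides needing only one pass, this formulation avoids the cancellation problems of the naive "sum of squares minus square of sum" approach.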

Thank you to all our contributors for making this release possible!
@ritchie46

17 Apr 11:09

github-actions

py-0.17.4

7f553da

Python Polars 0.17.4

🚀 Performance improvements

add specialized boolean aggregation for min/max (#8294)

✨ Enhancements

preserve time zone in combine (#8263)

🐞 Bug fixes

pass name to struct construction in aggregation (#8299)
improve nested list construction (#8278)
Truncate long column name in glimpse (#8281)
Fix DataFrame.sum returning empty column names (#8283)
always sort in top_k fast path (#8275)
don't use fast paths for sorted join if there are … (#8272)

🛠️ Other improvements

use concat_owned_array_unchecked when possible (#8274)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @ritchie46, @stinodego, @zaynetro and @zundertj

15 Apr 18:45

github-actions

py-0.17.3

4ffcb74

Python Polars 0.17.3

🏆 Highlights

support DataFrame init from pydantic model data (#8178)

🚀 Performance improvements

fail fast on non-inferable strings in strptime if no fmt is provided (#8111)
make chunks search more resilient (#8229)
SIMD accelerated arg_min/arg_max (via argminmax) (#8074)
speed up csv parsing for slower datetimes formats (#8213)
improve datetime interpret perf (#8209)
arr.eval run on groupby expression engine when possible (#8199)
~2-3x speedup for DataFrame init from pydantic models (#8181)

✨ Enhancements

add use_earliest argument to replace_time_zone for dealing with ambiguous datetimes (#8087)
fail loudly on .%f directive, as it differs from the Python standard library (#8237)
SQL CTEs (#8208)
automatically convert series OP expr -> pl.lit(series) OP expr where OP is arithmetic (#8225)
add pickle support for LazyFrame (#8220)
add duration cumsum and remainder (#8219)
support DataFrame init from nested dataclass, pydantic, and NamedTuple objects (#8185)
better algorithm for streaming unique (#8003)
Add approx distinct count via approx_unique() (#7937)
add percentiles to describe methods (#8169)
support DataFrame init from pydantic model data (#8178)
display skipped row if same number of rows (#8170)

🐞 Bug fixes

add special numpy float branch in anyvalue conversion (#8259)
fix boolean par materialization (#8257)
improve null/empty list construction (#8255)
fix offsets in parallel utf8 materialization (#8254)
nested struct logical type consistency (#8249)
keep literal state if elementwise function is applied (#8195)
decimal ensure backed arrow arrays have correct dtype (#8193)

🛠️ Other improvements

parametric/hypothesis testing code cleanups (#8253)
Rename strptime/strftime args (#8221)
change sampling ratio for groupby strategy (#8223)
Rename Expr.list to implode (#8165)
don't panic on err in offset_by (#8210)
re-enable test parallelization for Windows tests (#8214)
Fix small typo: "im memory" -> "in memory" (#8187)
remove unused dtype_to_arrow_type (#8177)

Thank you to all our contributors for making this release possible!
@MarcoGorelli, @alexander-beedie, @avimallu, @borchero, @chitralverma, @clickingbuttons, @ghuls, @josh, @jvdd, @rben01, @ritchie46, @stinodego and @universalmind303

11 Apr 14:46

github-actions

py-0.17.2

be4a856

Python Polars 0.17.2

✨ Enhancements

make unique expr serde and cmp (#8153)
Enhanced parametric testing DataFrame generation (#8149)
support negative index in pct_change (#8137)
add log1p to list of mathematical functions (#8102)

🐞 Bug fixes

object conversion in anyvalue (#8155)
Address a ~15% regression in import polars speed (#8151)
validate map lengths (#8147)
fix row-wise init of UInt64 values that exceed Int64 upper bound (#8146)
implement list<null> constructor (#8143)
add all primitives to av_buffer builder (#8140)
struct is_in (#8139)
fix wrong display name of binary expressions (#8131)

🛠️ Other improvements

Enhanced parametric testing DataFrame generation (#8149)

Thank you to all our contributors for making this release possible!
@alexander-beedie, @borchero, @dependabot, @dependabot[bot], @jonashaag, @ritchie46 and @stinodego
