Releases: Eventual-Inc/Daft

v0.1.14

24 Aug 23:35
7fa9e64

Changes

✨ New Features

  • [FEAT] add flag to use multithreaded io for parquet_read_table @samster25 (#1298)
  • [FEAT] Add Retry Mode, connection timeout, and read timeout to S3Config @samster25 (#1293)
  • [FEAT] [New Query Planner] Add optimization framework and PushDownFilter rule. @clarkzinzow (#1284)

👾 Bug Fixes

🧰 Maintenance

v0.1.13

18 Aug 22:25
857162c

Changes

✨ New Features

  • [FEAT] Add Flag to_arrow to convert large string arrays @samster25 (#1283)

👾 Bug Fixes

  • [BUG] try release profile rather than dev-bench for daft profiling @samster25 (#1280)

🧰 Maintenance

  • [CHORE] reduce severity of region reroute logs to debug @samster25 (#1279)

v0.1.12

16 Aug 17:01
2b82afc

Changes

✨ New Features

👾 Bug Fixes

  • [BUG] Use manylinux_2_24 for aarch64 linux to be able to publish manylinux2014 @samster25 (#1275)

📖 Documentation

  • [FEAT] [New Query Planner] Support for Ray runner in new query planner. @clarkzinzow (#1265)

🧰 Maintenance

  • [CHORE] Refactor arrays to share a FromArrow constructor trait @jaychia (#1276)
  • [CHORE] Bump rust nightly channel date @jaychia (#1255)

⬆️ Dependencies

4 changes

v0.1.11

11 Aug 19:36
48b46b3

Changes

✨ New Features

  • [FEAT] [New Query Plan] Add support for Projection and Coalesce, enable many tests @clarkzinzow (#1256)
  • [FEAT] [New Query Planner] Add support for Concat. @clarkzinzow (#1254)
  • [FEAT] [New Query Planner] Add support for tabular writes. @clarkzinzow (#1252)
  • [FEAT] Multi-partition aggregate; Coalesce @xcharleslin (#1249)
  • [FEAT] [New Query Planner] Add support for Sort, Repartition, and Distinct in new query planner. @clarkzinzow (#1248)
  • [FEAT] Add Azure Support for Native Downloader @samster25 (#1250)
  • [FEAT] Locally unique semantic IDs for Expressions @xcharleslin (#1243)
  • [FEAT] Read parquet tables with int96 coercion option @jaychia (#1231)
  • [FEAT] [New Query Plan] Add support for CSV scans, JSON scans, in-memory scans and caching materialized results. @clarkzinzow (#1246)
  • [FEAT] Native Downloader add Retry Config parameters @samster25 (#1244)
  • [FEAT] (Single partition only) DataFrame.sum() via Rust planner @xcharleslin (#1230)
  • [FEAT] [New Query Planner] Logical --> physical translation, physical plan execution. @clarkzinzow (#1232)
  • [FEAT] native parquet correctness checks @samster25 (#1225)
  • [FEAT] add session token as input to io config @samster25 (#1224)

🚀 Performance Improvements

👾 Bug Fixes

📖 Documentation

🧰 Maintenance

  • [CHORE] Update test to only use store_schema kwarg for pa>=11 @jaychia (#1253)
  • [FEAT] (Single partition only) DataFrame.sum() via Rust planner @xcharleslin (#1230)
  • [CHORE] [New Query Planner] Introduce LogicalPlanBuilder and QueryPlanner interfaces to hide query planner implementations. @clarkzinzow (#1245)
  • [CHORE] LogicalPlan: Add display improvements, and Filter @xcharleslin (#1221)
  • [CHORE] Add unit tests for int96 timestamps @jaychia (#1229)
  • [DOCS][CHORE] Add docs for IOConfig and S3Config @jaychia (#1227)
  • [CHORE] disable mac test for lack of docker @samster25 (#1223)
  • [CHORE] Begin integrating Rust Logical Plan with Dataframe API @xcharleslin (#1207)
  • [CHORE] integration tests for nightly platform wheels @samster25 (#1219)
  • [CHORE] Remove existing LogicalPlan from all execution concepts @xcharleslin (#1208)
  • [CHORE] Add endpoints to simulate rate-limiting on AWS S3 buckets @jaychia (#1220)
  • [CHORE] Add pytest marker for integration @jaychia (#1211)
  • [CHORE] Add s3 fixtures for retrying logic @jaychia (#1206)
  • [CHORE] Add developer flag to use Rust query planner @xcharleslin (#1205)
  • [CHORE] Rust Logical plan skeleton @xcharleslin (#1192)

⬆️ Dependencies

7 changes

v0.1.10

31 Jul 18:36
bc11e57

Changes

✨ New Features

  • [FEAT] Enable feature-flagged native downloader in daft.read_parquet @jaychia (#1190)
  • [FEAT] parquet reader refactor, add parquet_stats_reader and parquet_schema_reader (1/2) @samster25 (#1191)

🚀 Performance Improvements

🧰 Maintenance

⬆️ Dependencies

6 changes

v0.1.9

25 Jul 04:16
751b839

Changes

🏆 Highlights

  • [FEAT] [Tensor] Add support for Tensor and FixedShapeTensor types. @clarkzinzow (#1073)

✨ New Features

🚀 Performance Improvements

  • [PERF] Simple Read Planner and RangeReader for Native Parquet Reader @samster25 (#1172)

👾 Bug Fixes

📖 Documentation

🧰 Maintenance

⬆️ Dependencies

10 changes

v0.1.8

05 Jul 18:53
447cb2f

Changes

✨ New Features

👾 Bug Fixes

  • [BUG] S3 Downloader set default region when region not detected @samster25 (#1100)

📖 Documentation

  • [CHORE] Update README.rst for image downloading @jaychia (#1109)
  • [DOCS] Update image tutorials with .image namespaced expressions @jaychia (#1110)

🧰 Maintenance

  • [CHORE] Tidy up typing of binary ops [1/2] @xcharleslin (#1114)
  • [CHORE] Pin Pydantic to < 2 @jaychia (#1115)
  • [CHORE] Remove rogue print statement @jaychia (#1112)
  • [CHORE] Install wheel together with requirements in release build @jaychia (#1111)
  • [CHORE] Update README.rst for image downloading @jaychia (#1109)
  • [CHORE] Adding more test fixtures for different I/O sources @jaychia (#1083)
  • [CHORE] Cache build artifacts in target folder @jaychia (#1104)
  • [CHORE] Fix CI caching to cache integration test builds separately @jaychia (#1101)
  • [CHORE] Use maturin directly instead of multiplatform build step @jaychia (#1099)

⬆️ Dependencies

v0.1.7

26 Jun 18:36
9063395

Changes

🏆 Highlights

  • [FEAT] Add DataFrame.to_torch_map_dataset and .to_torch_iter_dataset. @xcharleslin (#1086)
  • [PERF] Rust based url downloading with error handling @samster25 (#1061)

✨ New Features

  • [FEAT] Enable Native Downloader IO Config @samster25 (#1090)
  • [FEAT] Add DataFrame.to_torch_map_dataset and .to_torch_iter_dataset. @xcharleslin (#1086)
  • [FEAT] DataFrame.__iter__() and .iter_partitions() @xcharleslin (#1062)
  • [FEAT] New DataType: Duration (without arithmetic) @xcharleslin (#1051)
  • [FEAT] [Images] [9/N] Infer Image type for PIL images on ingress. @clarkzinzow (#1067)
  • [FEAT] Automatically cast logical types to Python objects on Series.to_pylist(). @clarkzinzow (#1063)
  • [FEAT] [Images] [8/N] Add encoding and resizing support for fixed-shape images. @clarkzinzow (#1052)
  • Dataframe Iter 1/n: Physical plan streams results into Runner. @xcharleslin (#1060)

🚀 Performance Improvements

👾 Bug Fixes

📖 Documentation

  • [FEAT] Add DataFrame.to_torch_map_dataset and .to_torch_iter_dataset. @xcharleslin (#1086)
  • [CHORE] Fix filepath for autogeneration of .list.join docs @jaychia (#1084)
  • In CI, limit tutorial to 500 rows @xcharleslin (#1076)
  • [DOCS] Embeddings tutorial: Temporarily remove full dataset @xcharleslin (#1039)
  • [DOCS] Remove release notes from documentation, link to Github instead @jaychia (#1049)

🧰 Maintenance

  • [CHORE] set dependabot schedule to weekly @samster25 (#1085)
  • [CHORE] Refactor integration test to use wheel built for release @jaychia (#1087)
  • [CHORE] unpin numpy version for py<3.8 @jaychia (#1088)
  • [CHORE] Fix filepath for autogeneration of .list.join docs @jaychia (#1084)
  • [CHORE] Crate Smash v1 @samster25 (#1080)
  • [CHORE] Scheduler cleanup: merge logical_op_runners.py into execution_step @xcharleslin (#1020)
  • [CHORE] Inline the label enforcer into the release drafter wf @jaychia (#1057)
  • [CHORE] Fix naming of "Release Drafter" workflow in trigger @jaychia (#1055)
  • [CHORE] Add new trigger to run PR label enforcement after Release Drafter @jaychia (#1054)
  • [CHORE][CI] Use pyarrow Table sort API that's compatible with older pyarrow versions @clarkzinzow (#1053)

⬆️ Dependencies

4 changes

v0.1.6

14 Jun 22:12
23f784f

Changes

🏆 Highlights

✨ New Features

  • [FEAT] Support for Timestamp datatype. @xcharleslin (#1032)
  • [FEAT] Thread user-provided schema through to DataFrame reads @jaychia (#1024)
  • [FEAT] Daft Image viz support. Remove Tabulate dependency. @xcharleslin (#1027)
  • [FEAT] Dataframe Concats @jaychia (#1023)
  • [FEAT] Add kernels for .list.join on a list[utf8] column @jaychia (#989)
  • [FEAT][Table-Read-Schema 2/3] Add table casting logic @jaychia (#1012)
  • [FEAT][Table-Read-Schema 1/3] Split reading tabular file formats into 2 method calls @jaychia (#1010)
  • [FEAT][Images] [7/N] Add image encoding support. @clarkzinzow (#1013)
  • [FEAT] Visualization cleanup 2/n: Add repr_html to Series, Table, and PyO3 @xcharleslin (#1018)
  • [FEAT] Visualization cleanup (1/n): Use Table for repr @xcharleslin (#1011)

📖 Documentation

🧰 Maintenance

  • [CHORE] Update cargo version to v0.1.6 @jaychia (#1047)
  • [CHORE] Add a GitHub action to enforce labels are added to the PR before merging @jaychia (#1045)
  • [CHORE] Fix CI TPCH data generation for old deprecated kwarg @jaychia (#1044)
  • [CHORE] Fix footer of release-drafter @jaychia (#1043)
  • [CHORE] Add release-drafter files @jaychia (#1042)
  • [CHORE][CI] Fix flakiness in Datasets integration tests. @clarkzinzow (#1017)

⬆️ Dependencies

7 changes

v0.1.5

14 Jun 02:30
a54c534

The Daft 0.1.5 release features better series exporting, bug fixes, and improved documentation.

Enhancements

  • Enable Cast from Image to Python via Numpy #990

Bug Fixes

  • Fix Image Resize/Decode Expressions #1001

Build Changes

  • Python script for subprefixing s3 tpch files #997
  • Update pyo3-log from 0.8.1 to 0.8.2 #996
  • Update hypothesis from 6.75.9 to 6.76.0 #995

Documentation

  • Include Dataframe comparison and related projects in readme #1005
  • Include Benchmarks in Readme #1003
  • Add Red Pajamas Tutorial to docs #1002
  • Include Blog in docs #1000
  • Update Datatype docs for complex types #999