ColumnStats on Append and Timestamp handling #205

Merged
merged 8 commits into from
Aug 1, 2023


osopardo1
Member

@osopardo1 osopardo1 commented Aug 1, 2023

Description

This PR fixes bugs #204 and #195.

Type of change

It is a bug fix for Timestamp and Column Stats handling. Both issues touched the same code, so I found it more useful to solve them together.

  1. Issue #195 (Appending data with new columnStats fails in some cases) was caused by inconsistent creation of Revisions when appending new data. Revisions were not merged correctly when the user specified a wider region of the space.
  2. Issue #204 (Column stats for Timestamp are not saved properly) was caused by incorrect parsing of JSON Timestamps.
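
To illustrate the two failure modes, here is a minimal, language-agnostic sketch in Python. It is not the code from this PR (the project itself is Scala) and every name in it is hypothetical. It assumes that a column's indexed space can be modelled as a `(min, max)` pair, and that a timestamp stat may arrive in JSON either as epoch milliseconds or as an ISO-8601 string.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helpers for illustration only; they are not qbeast-spark APIs.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def needs_new_revision(current, requested):
    """True when the requested (min, max) range is not contained in the current one."""
    return requested[0] < current[0] or requested[1] > current[1]


def merge_ranges(current, requested):
    """Union of two (min, max) ranges, so the merged space covers both."""
    return (min(current[0], requested[0]), max(current[1], requested[1]))


def parse_timestamp_stat(value):
    """Convert a JSON stat value to epoch milliseconds.

    Assumption: numbers are already epoch millis, strings are ISO-8601,
    and strings without an offset are treated as UTC.
    """
    if isinstance(value, (int, float)):
        return int(value)
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)
```

In this sketch, an append whose stats widen the space (for example `(-5, 20)` against an existing `(0, 10)`) reports that a new revision is needed and merges to `(-5, 20)`, while a timestamp given as a string is normalized to the same numeric form as one given as a number before any comparison takes place.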

Checklist:

Here is the list of things you should do before submitting this pull request:

  • New feature / bug fix has been committed following the Contribution guide.
  • Add comments to the code (make it easier for the community!).
  • Change the documentation.
  • Add tests.
  • Your branch is updated to the main branch (dependent changes have been merged).

How Has This Been Tested? (Optional)

This is tested with two different tests in SparkRevisionFactoryTest.

Test Configuration:

  • Spark Version: 3.3.x
  • Hadoop Version: 3.3.4
  • Cluster or local? Local

@osopardo1 osopardo1 marked this pull request as ready for review August 1, 2023 07:17

codecov bot commented Aug 1, 2023

Codecov Report

Merging #205 (1f67422) into main (35bc1f0) will decrease coverage by 0.12%.
The diff coverage is 94.44%.

❗ Current head 1f67422 differs from pull request most recent head 72ae325. Consider uploading reports for the commit 72ae325 to get more accurate results

@@            Coverage Diff             @@
##             main     #205      +/-   ##
==========================================
- Coverage   93.87%   93.76%   -0.12%     
==========================================
  Files          85       85              
  Lines        2091     2100       +9     
  Branches      172      173       +1     
==========================================
+ Hits         1963     1969       +6     
- Misses        128      131       +3     
Files Changed                                            Coverage Δ
...ain/scala/io/qbeast/spark/table/IndexedTable.scala    94.33% <93.33%> (-0.61%) ⬇️
...scala/io/qbeast/spark/internal/QbeastOptions.scala    96.42% <100.00%> (+0.27%) ⬆️

... and 1 file with indirect coverage changes


@osopardo1 osopardo1 merged commit 3edd55b into Qbeast-io:main Aug 1, 2023
1 check passed
@osopardo1 osopardo1 deleted the 195-columnStats-and-timestamp branch August 2, 2023 05:51
osopardo1 added a commit to osopardo1/qbeast-spark that referenced this pull request Aug 2, 2023
2 participants