Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QuerySpace creation fix #212

Merged
merged 1 commit into from
Aug 31, 2023
Merged

QuerySpace creation fix #212

merged 1 commit into from
Aug 31, 2023

Conversation

osopardo1
Copy link
Member

Description

In this PR, we fixed a bug in which the creation of the range filter that creates a QuerySpace wasn't taking into account the possibility of column transformations in Spark.

For example, imagine that we index the parameter named column1. If we filter by:

regex("prefix", "column1", 1) == "a"

Instead of filtering by the regex expression, QuerySpaceBuilder would understand that we are trying to filter all the values that contain column1 == "a"

Type of change

Describe the change you're making: how it affects the API, and user experience...

Is a bug fix.

Checklist:

Here is the list of things you should do before submitting this pull request:

  • New feature / bug fix has been committed following the Contribution guide.
  • Add comments to the code (make it easier for the community!).
  • Change the documentation.
  • Add tests.
  • Your branch is updated to the main branch (dependent changes have been merged).

How Has This Been Tested? (Optional)

Please describe the tests that you ran to verify your changes.

Change test that (unfortunately) passed because the filter range belonged to the same file (prefix_versace and versace have very similar values of hash and had been stored together).
Now the test checks if we return the whole dataset in that type of scenarios.

Later on, we can investigate how to manipulate more complex predicates. But let's start with the basics.

Test Configuration:

  • Spark Version: 3.2.x
  • Hadoop Version: 3.4.x
  • Cluster or local? Local

@codecov
Copy link

codecov bot commented Aug 31, 2023

Codecov Report

Merging #212 (555705c) into main (0d679db) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 555705c differs from pull request most recent head a552b98. Consider uploading reports for the commit a552b98 to get more accurate results

@@           Coverage Diff           @@
##             main     #212   +/-   ##
=======================================
  Coverage   93.59%   93.59%           
=======================================
  Files          85       85           
  Lines        2123     2123           
  Branches      171      171           
=======================================
  Hits         1987     1987           
  Misses        136      136           
Files Changed Coverage Δ
...io/qbeast/spark/index/query/QuerySpecBuilder.scala 100.00% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@osopardo1 osopardo1 merged commit f9c7ab0 into Qbeast-io:main Aug 31, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants