Questions about the project #38
Comments
Hi @MadL1me thank you for the interest!
Federated joins are not optimized yet; see #23 for some discussion of this. For most queries, each federated table is scanned in full and DataFusion performs the join locally. It would be an awesome improvement to push down more work to the federated table providers.
If datafusion-federation as a whole matures and proves useful to a significant portion of the overall user base, then yes, it could be merged into the upstream repo. We could also continue upstreaming small bits of functionality over time. We have actually already done this for the Plan->SQL code in apache/datafusion#9494.
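To make the current join behaviour concrete, here is a minimal, standard-library-only sketch of "scan both remote tables in full, then hash join locally". It is an illustration of the idea only, not code from DataFusion or datafusion-federation: the `Row` type and the `full_scan` and `local_hash_join` functions are hypothetical names invented for this example.

```rust
use std::collections::HashMap;

// Hypothetical row type standing in for rows fetched from a remote table.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    key: i64,
    value: String,
}

// Stand-in for a full table scan of a remote source: every row comes back,
// because no filter or join condition is pushed down to the remote side.
fn full_scan(remote: &[Row]) -> Vec<Row> {
    remote.to_vec()
}

// Local inner hash join on `key`: build a hash table from the left input,
// then probe it with each row of the right input.
fn local_hash_join(left: &[Row], right: &[Row]) -> Vec<(i64, String, String)> {
    let mut build: HashMap<i64, Vec<&Row>> = HashMap::new();
    for row in left {
        build.entry(row.key).or_default().push(row);
    }

    let mut out = Vec::new();
    for r in right {
        if let Some(matches) = build.get(&r.key) {
            for l in matches {
                out.push((r.key, l.value.clone(), r.value.clone()));
            }
        }
    }
    out
}

fn main() {
    // Two in-memory vectors play the role of tables on two different servers.
    let server_a = vec![
        Row { key: 1, value: "alice".to_string() },
        Row { key: 2, value: "bob".to_string() },
    ];
    let server_b = vec![
        Row { key: 2, value: "order-17".to_string() },
        Row { key: 3, value: "order-18".to_string() },
    ];

    // Both tables are transferred in full before any join work happens.
    let joined = local_hash_join(&full_scan(&server_a), &full_scan(&server_b));
    println!("{:?}", joined);
}
```

Only one row matches here, yet all four rows cross the network. Pushing more work down to the table providers would reduce exactly that transfer cost.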
Thanks for the info! I look forward to your progress in experimenting with query federation.
Hi there! I stumbled upon this project from this discussion: apache/datafusion#970
First of all, thanks everybody for this repo! Federation support for DataFusion is insanely relevant to me, and I was thinking about building a similar thing until I found this project. I have a few questions about performance in certain situations that matter most to me.
The first one is: how performant is a join of two remote tables, and how does it work? Are you doing something like querying the join's equality operands, doing a hash join in memory, and fetching only the relevant rows? (For example, a join of two PostgreSQL tables from different servers, a.k.a. FDW.) Or are there no optimisations yet in this regard?
Also, I was curious about the final goal of the project: will it be merged into the main DataFusion repo, or is it expected to stay in a separate repo and crate? Thanks in advance.