Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for USING to SQL unparser #11636

Merged
merged 3 commits into from
Jul 26, 2024
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
122 changes: 89 additions & 33 deletions datafusion/sql/src/unparser/plan.rs
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,9 @@
// specific language governing permissions and limitations
// under the License.

use datafusion_common::{internal_err, not_impl_err, plan_err, DataFusionError, Result};
use datafusion_common::{
internal_err, not_impl_err, plan_err, Column, DataFusionError, Result,
};
use datafusion_expr::{
expr::Alias, Distinct, Expr, JoinConstraint, JoinType, LogicalPlan, Projection,
};
Expand Down Expand Up @@ -368,37 +370,11 @@ impl Unparser<'_> {
self.select_to_sql_recursively(input, query, select, relation)
}
LogicalPlan::Join(join) => {
match join.join_constraint {
JoinConstraint::On => {}
JoinConstraint::Using => {
return not_impl_err!(
"Unsupported join constraint: {:?}",
join.join_constraint
)
}
}

// parse filter if exists
let join_filter = match &join.filter {
Some(filter) => Some(self.expr_to_sql(filter)?),
None => None,
};

// map join.on to `l.a = r.a AND l.b = r.b AND ...`
let eq_op = ast::BinaryOperator::Eq;
let join_on = self.join_conditions_to_sql(&join.on, eq_op)?;

// Merge `join_on` and `join_filter`
let join_expr = match (join_filter, join_on) {
(Some(filter), Some(on)) => Some(self.and_op_to_sql(filter, on)),
(Some(filter), None) => Some(filter),
(None, Some(on)) => Some(on),
(None, None) => None,
};
let join_constraint = match join_expr {
Some(expr) => ast::JoinConstraint::On(expr),
None => ast::JoinConstraint::None,
};
let join_constraint = self.join_constraint_to_sql(
join.join_constraint,
&join.on,
join.filter.as_ref(),
)?;

let mut right_relation = RelationBuilder::default();

Expand Down Expand Up @@ -582,9 +558,89 @@ impl Unparser<'_> {
}
}

/// Convert the components of a USING clause to the USING AST
fn join_using_to_sql(
&self,
join_conditions: &[(Expr, Expr)],
) -> Result<ast::JoinConstraint> {
let mut idents = Vec::with_capacity(join_conditions.len());
for (left, right) in join_conditions {
match (left, right) {
(
Expr::Column(Column {
relation: _,
name: left_name,
}),
Expr::Column(Column {
relation: _,
name: right_name,
}),
) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than returning "not implemented" it would be nice to just fall back to the existing code

I think you could do it with a match guard like

Suggested change
) => {
) if left_name == right_name => {

if left_name != right_name {
// USING is only valid when the column names are the
// same, so they should never be different
return not_impl_err!(
"Unsupported USING with different column names"
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, not_impl_err is for some todo items. If this case never happens, I prefer to use internal_err here.

Suggested change
return not_impl_err!(
"Unsupported USING with different column names"
);
return internal_err!(
"Unsupported USING with different column names"
);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this case can happen -- like JOIN ON (t1.a = t2.b)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That can happen with ON, but it can't happen with USING, right? This function is only for USING.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I thought about this some more - these two cases should be impossible for a LogicalPlan made from SQL (you can't do JOIN … USING (t1.a = t1.b), but could be created by a user directly creating a LogicalPlan::Join(Join { … }).

Given that, fallback makes sense, and I'll add a comment to explain.

}
idents.push(self.new_ident_quoted_if_needs(left_name.to_string()));
}
// USING is only valid with column names; arbitrary expressions
// are not allowed
_ => {
return not_impl_err!("Unsupported USING with non-column expressions")
}
}
}
Ok(ast::JoinConstraint::Using(idents))
}

/// Convert a join constraint and associated conditions and filter to a SQL AST node
fn join_constraint_to_sql(
&self,
constraint: JoinConstraint,
conditions: &[(Expr, Expr)],
filter: Option<&Expr>,
) -> Result<ast::JoinConstraint> {
match (constraint, conditions, filter) {
// No constraints
(JoinConstraint::On, [], None) => Ok(ast::JoinConstraint::None),
// Only equi-join conditions
(JoinConstraint::On, conditions, None) => {
let expr =
self.join_conditions_to_sql(conditions, ast::BinaryOperator::Eq)?;
match expr {
Some(expr) => Ok(ast::JoinConstraint::On(expr)),
None => Ok(ast::JoinConstraint::None),
}
}
// More complex filter with non-equi-join conditions; so we combine
// all conditions into a single AST Expr
(JoinConstraint::On, conditions, Some(filter)) => {
let filter_expr = self.expr_to_sql(filter)?;
let expr =
self.join_conditions_to_sql(conditions, ast::BinaryOperator::Eq)?;
match expr {
Some(expr) => {
let join_expr = self.and_op_to_sql(filter_expr, expr);
Ok(ast::JoinConstraint::On(join_expr))
}
None => Ok(ast::JoinConstraint::On(filter_expr)),
}
}

(JoinConstraint::Using, conditions, None) => {
self.join_using_to_sql(conditions)
}
(JoinConstraint::Using, _, Some(_)) => {
not_impl_err!("Unsupported USING with filter")
}
}
}

fn join_conditions_to_sql(
&self,
join_conditions: &Vec<(Expr, Expr)>,
join_conditions: &[(Expr, Expr)],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree -- this is a nice refactoring and very nicely commented code.

eq_op: ast::BinaryOperator,
) -> Result<Option<ast::Expr>> {
// Only support AND conjunction for each binary expression in join conditions
Expand Down
2 changes: 2 additions & 0 deletions datafusion/sql/tests/cases/plan_to_sql.rs
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,7 @@ fn roundtrip_statement() -> Result<()> {
"select 1;",
"select 1 limit 0;",
"select ta.j1_id from j1 ta join (select 1 as j1_id) tb on ta.j1_id = tb.j1_id;",
"select ta.j1_id from j1 ta join (select 1 as j1_id) tb using (j1_id);",
"select ta.j1_id from j1 ta join (select 1 as j1_id) tb on ta.j1_id = tb.j1_id where ta.j1_id > 1;",
"select ta.j1_id from (select 1 as j1_id) ta;",
"select ta.j1_id from j1 ta;",
Expand Down Expand Up @@ -142,6 +143,7 @@ fn roundtrip_statement() -> Result<()> {
r#"SELECT id, count(distinct id) over (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING),
sum(id) OVER (PARTITION BY first_name ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) from person"#,
"SELECT id, sum(id) OVER (PARTITION BY first_name ROWS BETWEEN 5 PRECEDING AND 2 FOLLOWING) from person",
"WITH t1 AS (SELECT j1_id AS id, j1_string name FROM j1), t2 AS (SELECT j2_id AS id, j2_string name FROM j2) SELECT * FROM t1 JOIN t2 USING (id, name)",
];

// For each test sql string, we transform as follows:
Expand Down