I have searched in the issues and found no similar issues.
Describe the bug
Kyuubi: 1.9.0
Spark: 3.4.2
When connecting Spark to the Hive Metastore through kyuubi-hive-connector and running the TPC-DS benchmark, some queries do not get dynamic partition pruning in their execution plans.
q33.sql
with ss as (
select
i_manufact_id,sum(ss_ext_sales_price) total_sales
from
store_sales,
date_dim,
customer_address,
item
where
i_manufact_id in (select
i_manufact_id
from
item
where i_category in ('Home'))
and ss_item_sk = i_item_sk
and ss_sold_date_sk = d_date_sk
and d_year = 1998
and d_moy = 5
and ss_addr_sk = ca_address_sk
and ca_gmt_offset = -6
group by i_manufact_id),
cs as (
select
i_manufact_id,sum(cs_ext_sales_price) total_sales
from
catalog_sales,
date_dim,
customer_address,
item
where
i_manufact_id in (select
i_manufact_id
from
item
where i_category in ('Home'))
and cs_item_sk = i_item_sk
and cs_sold_date_sk = d_date_sk
and d_year = 1998
and d_moy = 5
and cs_bill_addr_sk = ca_address_sk
and ca_gmt_offset = -6
group by i_manufact_id),
ws as (
select
i_manufact_id,sum(ws_ext_sales_price) total_sales
from
web_sales,
date_dim,
customer_address,
item
where
i_manufact_id in (select
i_manufact_id
from
item
where i_category in ('Home'))
and ws_item_sk = i_item_sk
and ws_sold_date_sk = d_date_sk
and d_year = 1998
and d_moy = 5
and ws_bill_addr_sk = ca_address_sk
and ca_gmt_offset = -6
group by i_manufact_id)
select i_manufact_id ,sum(total_sales) total_sales
from (select * from ss
union all
select * from cs
union all
select * from ws) tmp1
group by i_manufact_id
order by total_sales
limit 100
If kyuubi-hive-connector is not used, the execution plan contains dynamic partition pruning.
If kyuubi-hive-connector is used, the execution plan contains no dynamic partition pruning.
pan3793 changed the title from "[Bug] when I use kyuubi's connector, Spark dynamic partitioning cannot be used." to "Spark Hive connector supports dynamic partition prunning" on Feb 27, 2025
Affects Version(s)
1.9.0
Additional context
Core logic of Spark's dynamic partition pruning rule (excerpted from `PartitionPruning`):
private def prune(plan: LogicalPlan): LogicalPlan = {
  plan transformUp {
    // ...
        var filterableScan = getFilterableTableScan(l, left)
        if (filterableScan.isDefined && canPruneLeft(joinType) &&
            hasPartitionPruningFilter(right)) {
          newLeft = insertPredicate(l, newLeft, r, right, rightKeys, filterableScan.get)
        } else {
          filterableScan = getFilterableTableScan(r, right)
          if (filterableScan.isDefined && canPruneRight(joinType) &&
              hasPartitionPruningFilter(left)) {
            newRight = insertPredicate(r, newRight, l, left, leftKeys, filterableScan.get)
          }
        }
      case _ =>
    }
    // ...
    Join(newLeft, newRight, joinType, Some(condition), hint)
  }
}
def getFilterableTableScan(a: Expression, plan: LogicalPlan): Option[LogicalPlan] = {
  val srcInfo: Option[(Expression, LogicalPlan)] = findExpressionAndTrackLineageDown(a, plan)
  srcInfo.flatMap {
    case (resExp, l: LogicalRelation) =>
      l.relation match {
        case fs: HadoopFsRelation =>
          val partitionColumns = AttributeSet(
            l.resolve(fs.partitionSchema, fs.sparkSession.sessionState.analyzer.resolver))
          if (resExp.references.subsetOf(partitionColumns)) {
            return Some(l)
          } else {
            None
          }
        case _ => None
      }
    case (resExp, l: HiveTableRelation) =>
      if (resExp.references.subsetOf(AttributeSet(l.partitionCols))) {
        return Some(l)
      } else {
        None
      }
    case (resExp, r @ DataSourceV2ScanRelation(_, scan: SupportsRuntimeV2Filtering, _, _, _)) =>
      val filterAttrs = V2ExpressionUtils.resolveRefs[Attribute](scan.filterAttributes, r)
      if (resExp.references.subsetOf(AttributeSet(filterAttrs))) {
        Some(r)
      } else {
        None
      }
    case _ => None
  }
}
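The match above admits only three shapes of scan as prunable: a `LogicalRelation` over a `HadoopFsRelation`, a `HiveTableRelation`, or a `DataSourceV2ScanRelation` whose scan implements `SupportsRuntimeV2Filtering`. A V2 scan that does not implement that interface, which appears to be the situation with the kyuubi-hive-connector scan reported here, falls through to `case _ => None`, so no pruning predicate is ever inserted. A minimal self-contained sketch of that type dispatch (the types below are simplified stand-ins for illustration, not Spark's actual classes):

```scala
// Hypothetical stand-ins for the three relation shapes matched in
// getFilterableTableScan; these are NOT Spark classes.
sealed trait Relation
case object HadoopFsRel extends Relation        // LogicalRelation + HadoopFsRelation
case object HiveTableRel extends Relation       // HiveTableRelation
case class V2ScanRel(supportsRuntimeFiltering: Boolean) extends Relation

// A relation is eligible for dynamic partition pruning only in the three
// matched cases; a V2 scan must additionally implement runtime filtering.
def filterable(r: Relation): Option[Relation] = r match {
  case HadoopFsRel                       => Some(r)
  case HiveTableRel                      => Some(r)
  case v: V2ScanRel if v.supportsRuntimeFiltering => Some(r)
  case _                                 => None
}

// A V2 scan without SupportsRuntimeV2Filtering falls through to the
// default case, so the planner never inserts a pruning filter for it.
assert(filterable(V2ScanRel(supportsRuntimeFiltering = false)).isEmpty)
assert(filterable(V2ScanRel(supportsRuntimeFiltering = true)).isDefined)
```

Under this reading, making the connector's scan implement `SupportsRuntimeV2Filtering` (exposing the partition columns via `filterAttributes`) would let it take the third branch and regain pruning.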
Are you willing to submit PR?