[SPARK-48530][SQL] Support for local variables in SQL Scripting #49445

Open. Wants to merge 56 commits into base: master.

Commits
73cb01b
first commit
dusantism-db Dec 24, 2024
813d282
POC works
dusantism-db Dec 24, 2024
1c08f57
make column res helper more functional
dusantism-db Dec 25, 2024
18da02f
move variables map to SqlScriptingScope
dusantism-db Dec 25, 2024
cee5f1a
implement proper namespace (scope label name) for local variables
dusantism-db Dec 27, 2024
47934ab
qualified names
dusantism-db Dec 30, 2024
399d4e8
update todos
dusantism-db Jan 3, 2025
6efe764
resolve catalogs + check for duplicates
dusantism-db Jan 3, 2025
769607d
set variable and normalized identifiers
dusantism-db Jan 3, 2025
6225956
resolve fully qualified session vars in tempvarManager only and updat…
dusantism-db Jan 6, 2025
241fc05
tests first batch
dusantism-db Jan 6, 2025
068e1ec
add more tests
dusantism-db Jan 8, 2025
60335db
add error messages, more tests and some comments
dusantism-db Jan 8, 2025
65b69d3
rename TempVariableManager.scala and add more tests
dusantism-db Jan 8, 2025
fe5dc7b
remove old logic for dropping variables, update tests and add more tests
dusantism-db Jan 8, 2025
4f8d2c1
add cleanup for scripting execution, separate drop and create variabl…
dusantism-db Jan 9, 2025
ba5b8d2
fix resolvecatalogs and add more tests
dusantism-db Jan 9, 2025
33f0aac
refactor to support properly setting variables
dusantism-db Jan 9, 2025
be6052f
add error message for system and session label names
dusantism-db Jan 9, 2025
4b1e8e1
small fixes and cleanup
dusantism-db Jan 10, 2025
90b106b
Fix duplicate detection for set variablwe
dusantism-db Jan 10, 2025
7ba0923
Add test for DECLARE OR REPLACE but ignore it until FOR is fixed
dusantism-db Jan 10, 2025
cd4e932
execute immediate don't resolve vars from scripts. Problem remains wi…
dusantism-db Jan 10, 2025
fdf3c5a
cleanup
dusantism-db Jan 10, 2025
52cbd17
Merge remote-tracking branch 'upstream/master' into scripting-local-v…
dusantism-db Jan 13, 2025
c134fd4
fix merge mistake
dusantism-db Jan 13, 2025
3ea762d
fix merge mistake 2
dusantism-db Jan 13, 2025
8e9352a
fix comments
dusantism-db Jan 15, 2025
78042e3
Update CreateVar, SetVar and lookupVariable to work with Execute Imme…
dusantism-db Jan 16, 2025
40ffa83
add enum for lookup variable mode
dusantism-db Jan 17, 2025
4a546a4
convert scripting variable manager to threadlocal
dusantism-db Jan 17, 2025
15d5554
fix e2e test
dusantism-db Jan 17, 2025
a2b20c5
add comment
dusantism-db Jan 17, 2025
e3077a4
add comment and regenerate golden files
dusantism-db Jan 21, 2025
6ce8f9c
fix failing test
dusantism-db Jan 21, 2025
ccab52c
refactor SqlScriptingVariableManager to be LexicalThreadLocal singlet…
dusantism-db Jan 22, 2025
370bf65
renames
dusantism-db Jan 23, 2025
0cea838
tagging approach
dusantism-db Jan 24, 2025
9895c69
Revert "tagging approach"
dusantism-db Jan 24, 2025
cd888dd
analysiscontext withExecuteImmediate
dusantism-db Jan 24, 2025
4fe7ab5
remove into clause flag
dusantism-db Jan 25, 2025
8a6b536
address comments
dusantism-db Jan 27, 2025
db573c1
remove parameter from lookupVariable
dusantism-db Jan 27, 2025
dadd517
Merge remote-tracking branch 'upstream/master' into scripting-local-v…
dusantism-db Jan 27, 2025
680e5d7
resolve comments 1
dusantism-db Feb 6, 2025
7d3008e
Merge remote-tracking branch 'upstream/master' into scripting-local-v…
dusantism-db Feb 6, 2025
901aa6c
improve logic to work with exception handlers
dusantism-db Feb 7, 2025
34677c7
add check for existing variable in set, and add comments to findVariable
dusantism-db Feb 7, 2025
b814b97
Introduce FakeLocalCatalog, remove sessionVariablesOnly flags and upd…
dusantism-db Feb 7, 2025
8074c63
resolve comments
dusantism-db Feb 7, 2025
220aeae
throw error if var not found in setvarexec
dusantism-db Feb 7, 2025
45ca867
resolve comments again
dusantism-db Feb 10, 2025
e1f1098
update resolvecatalogs according to wenchens comments, and forbid dro…
dusantism-db Feb 12, 2025
61a753f
change variableManager api to use only nameParts and VariableDefiniti…
dusantism-db Feb 12, 2025
f29a8fc
add test for drop session var
dusantism-db Feb 12, 2025
e184f8c
forbid session, builtin and sys* label names
dusantism-db Feb 12, 2025
@@ -97,7 +97,7 @@ class ResolveCatalogs(val catalogManager: CatalogManager)

private def resolveCreateVariableName(nameParts: Seq[String]): ResolvedIdentifier = {
val ident = SqlScriptingLocalVariableManager.get()
.filterNot(AnalysisContext.get.isExecuteImmediate)
.filterNot(_ => AnalysisContext.get.isExecuteImmediate)
.getOrElse(catalogManager.tempVariableManager)
.createIdentifier(nameParts.last)
Contributor

An Identifier is already created, so we can directly return ResolvedIdentifier(FakeSystemCatalog, ident) here.

Contributor Author

Sorry, what do you mean by "already created"? Here we create the identifier, which depends on the scripting context in the case of local variables, and then return it in ResolvedIdentifier(FakeSystemCatalog, ident) after checking for errors.
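
For readers following the thread: the one-line change in resolveCreateVariableName appears to replace `.filterNot(AnalysisContext.get.isExecuteImmediate)` with `.filterNot(_ => AnalysisContext.get.isExecuteImmediate)`. `Option.filterNot` takes a predicate, so a bare Boolean does not type-check there. The sketch below illustrates only that selection pattern. `VarManager`, `LocalManager`, `SessionManager`, and `pick` are hypothetical stand-ins, not Spark classes, and the Boolean parameter stands in for the flag read from AnalysisContext.

```scala
// Hypothetical stand-ins for the scripting-local and session variable managers.
sealed trait VarManager
case object LocalManager extends VarManager
case object SessionManager extends VarManager

object ManagerSelection {
  // Use the local manager only when one exists and we are not inside
  // EXECUTE IMMEDIATE; otherwise fall back to the session manager.
  def pick(local: Option[VarManager], isExecuteImmediate: Boolean): VarManager =
    local
      .filterNot(_ => isExecuteImmediate) // predicate form: ignores the element
      .getOrElse(SessionManager)
}
```

Under these assumptions, `ManagerSelection.pick(Some(LocalManager), isExecuteImmediate = true)` falls back to `SessionManager`, which matches the behavior the "execute immediate" tests in this PR seem to rely on.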


@@ -23,7 +23,7 @@ import org.apache.spark.sql.catalyst.catalog.{VariableDefinition, VariableManage
import org.apache.spark.sql.catalyst.expressions.Literal
import org.apache.spark.sql.connector.catalog.Identifier
import org.apache.spark.sql.errors.DataTypeErrorsBase
import org.apache.spark.sql.errors.QueryCompilationErrors.unresolvedVariableError
import org.apache.spark.sql.errors.QueryCompilationErrors.unresolvedVariableError

class SqlScriptingLocalVariableManager(context: SqlScriptingExecutionContext)
extends VariableManager with DataTypeErrorsBase {
@@ -60,42 +60,52 @@ class SqlScriptingLocalVariableManager(context: SqlScriptingExecutionContext)
initValue: Literal,
identifier: Identifier): Unit = {
def varDef = VariableDefinition(identifier, defaultValueSQL, initValue)
nameParts match {
// Unqualified case.
case Seq(name) =>
context.currentFrame.scopes
.findLast(_.variables.contains(name))
// Throw error if variable is not found. This shouldn't happen as the check is already
// done in SetVariableExec.
.orElse(throw unresolvedVariableError(nameParts, identifier.namespace().toIndexedSeq))
.map(_.variables.put(name, varDef))
findScopeOfVariable(nameParts)
.getOrElse(throw unresolvedVariableError(nameParts, identifier.namespace().toIndexedSeq))
.variables.put(nameParts.last, varDef)
}

override def get(nameParts: Seq[String]): Option[VariableDefinition] = {
findScopeOfVariable(nameParts).flatMap(_.variables.get(nameParts.last))
}

private def findScopeOfVariable(nameParts: Seq[String]): Option[SqlScriptingExecutionScope] = {
def isScopeOfVar(
nameParts: Seq[String],
scope: SqlScriptingExecutionScope
): Boolean = nameParts match {
case Seq(name) => scope.variables.contains(name)
// Qualified case.
case Seq(label, name) =>
context.currentFrame.scopes
.findLast(_.label == label)
.filter(_.variables.contains(name))
// Throw error if variable is not found. This shouldn't happen as the check is already
// done in SetVariableExec.
.orElse(throw unresolvedVariableError(nameParts, identifier.namespace().toIndexedSeq))
.map(_.variables.put(name, varDef))
case Seq(label, _) => scope.label == label
case _ =>
throw SparkException.internalError("ScriptingVariableManager.set expects 1 or 2 nameParts.")
throw SparkException.internalError("ScriptingVariableManager expects 1 or 2 nameParts.")
}
}

override def get(nameParts: Seq[String]): Option[VariableDefinition] = nameParts match {
// Unqualified case.
case Seq(name) =>
context.currentFrame.scopes
.findLast(_.variables.contains(name))
.flatMap(_.variables.get(name))
// Qualified case.
case Seq(label, name) =>
context.currentFrame.scopes
.findLast(_.label == label)
.flatMap(_.variables.get(name))
case _ =>
throw SparkException.internalError("ScriptingVariableManager.get expects 1 or 2 nameParts.")
// First search for variable in entire current frame.
val resCurrentFrame = context.currentFrame.scopes
.findLast(scope => isScopeOfVar(nameParts, scope))
if (resCurrentFrame.isDefined) {
return resCurrentFrame
}

// When searching in previous frames, for each frame we have to check only scopes before and
// including the scope where the previously checked frame is defined, as the frames
// should not access variables from scopes which are nested below its definition.
var previousFrameDefinitionLabel = context.currentFrame.scopeLabel

context.frames.dropRight(1).reverseIterator.foreach(frame => {
val candidateScopes = frame.scopes.reverse.dropWhile(
scope => !previousFrameDefinitionLabel.contains(scope.label))

val scope = candidateScopes.findLast(scope => isScopeOfVar(nameParts, scope))
if (scope.isDefined) {
return scope
}
if (candidateScopes.nonEmpty) {
previousFrameDefinitionLabel = frame.scopeLabel
}
})
None
}
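
The lookup order described in the comments above (current frame first, then earlier frames restricted to the scope where the later frame was defined and everything outside it) can be sketched in isolation. This is a simplified illustration under stated assumptions, not the PR's implementation: `Scope`, `Frame`, and `ScopeLookup` are hypothetical stand-ins for SqlScriptingExecutionScope and the execution frames, variables are plain Ints, and candidate scopes are searched innermost-first.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for the PR's scope and frame classes.
final case class Scope(
    label: String,
    variables: mutable.Map[String, Int] = mutable.Map.empty)
// scopeLabel: label of the scope in which this frame (e.g. a handler) was defined.
final case class Frame(scopes: Seq[Scope], scopeLabel: Option[String] = None)

object ScopeLookup {
  private def isScopeOfVar(nameParts: Seq[String], scope: Scope): Boolean =
    nameParts match {
      case Seq(name) => scope.variables.contains(name) // unqualified
      case Seq(label, _) => scope.label == label       // qualified by label
      case _ => throw new IllegalArgumentException("expects 1 or 2 nameParts")
    }

  // frames is assumed non-empty; the last element is the current frame.
  def findScopeOfVariable(frames: Seq[Frame], nameParts: Seq[String]): Option[Scope] = {
    val current = frames.last
    // 1. Search every scope of the current frame, innermost first.
    current.scopes.reverse.find(isScopeOfVar(nameParts, _)).orElse {
      // 2. Walk earlier frames, newest first. In each one, skip scopes nested
      //    below the scope where the previously checked frame was defined.
      var definitionLabel = current.scopeLabel
      var found: Option[Scope] = None
      val it = frames.dropRight(1).reverseIterator
      while (found.isEmpty && it.hasNext) {
        val frame = it.next()
        val candidates = frame.scopes.reverse
          .dropWhile(s => !definitionLabel.contains(s.label))
        found = candidates.find(isScopeOfVar(nameParts, _))
        if (candidates.nonEmpty) definitionLabel = frame.scopeLabel
      }
      found
    }
  }
}
```

As a usage example, take a main frame with scopes `outer`, `l1`, `l2` (outermost to innermost) and a handler frame whose `scopeLabel` is `Some("l1")`. A lookup from the handler resolves a variable declared in `l1` or `outer`, but not one declared in `l2`, because `l2` is nested below the handler's definition.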

override def createIdentifier(name: String): Identifier =
@@ -2381,22 +2381,25 @@ class SqlScriptingExecutionSuite extends QueryTest with SharedSparkSession {
}

test("local variable - execute immediate create session var") {
val sqlScript =
"""
|BEGIN
| EXECUTE IMMEDIATE 'DECLARE sessionVar = 5';
| SELECT system.session.sessionVar;
| SELECT sessionVar;
|END
|""".stripMargin
val expected = Seq(
Seq(Row(5)), // select system.session.sessionVar
Seq(Row(5)) // select sessionVar
)
verifySqlScriptResult(sqlScript, expected)
withSessionVariable("sessionVar") {
val sqlScript =
"""
|BEGIN
| EXECUTE IMMEDIATE 'DECLARE sessionVar = 5';
| SELECT system.session.sessionVar;
| SELECT sessionVar;
|END
|""".stripMargin
val expected = Seq(
Seq(Row(5)), // select system.session.sessionVar
Seq(Row(5)) // select sessionVar
)
verifySqlScriptResult(sqlScript, expected)
}
}

test("local variable - execute immediate create qualified session var") {
withSessionVariable("sessionVar") {
val sqlScript =
"""
|BEGIN
@@ -2410,6 +2413,7 @@ class SqlScriptingExecutionSuite extends QueryTest with SharedSparkSession {
Seq(Row(5)) // select sessionVar
)
verifySqlScriptResult(sqlScript, expected)
}
}

test("local variable - execute immediate set session var") {
@@ -2435,4 +2439,59 @@ class SqlScriptingExecutionSuite extends QueryTest with SharedSparkSession {
verifySqlScriptResult(sqlScript, expected)
}
}

test("local variable - handlers - triple chained handlers") {
val sqlScript =
"""
|BEGIN
| DECLARE OR REPLACE VARIABLE varOuter INT = 0;
| l1: BEGIN
| DECLARE OR REPLACE VARIABLE varL1 INT = 1;
| DECLARE EXIT HANDLER FOR SQLEXCEPTION
| BEGIN
| SELECT varOuter;
| SELECT varL1;
| END;
| l2: BEGIN
| DECLARE OR REPLACE VARIABLE varL2 = 2;
| DECLARE EXIT HANDLER FOR SQLEXCEPTION
| BEGIN
| SELECT varOuter;
| SELECT varL1;
| SELECT varL2;
| SELECT 1/0;
| END;
| l3: BEGIN
| DECLARE OR REPLACE VARIABLE varL3 = 3;
| DECLARE EXIT HANDLER FOR SQLEXCEPTION
| BEGIN
| SELECT varOuter;
| SELECT varL1;
| SELECT varL2;
| SELECT varL3;
| SELECT 1/0;
| END;

| SELECT 5;
| SELECT 1/0;
| SELECT 6;
| END;
| END;
| END;
|END
|""".stripMargin
val expected = Seq(
Seq(Row(5)),
Seq(Row(0)),
Seq(Row(1)),
Seq(Row(2)),
Seq(Row(3)),
Seq(Row(0)),
Seq(Row(1)),
Seq(Row(2)),
Seq(Row(0)),
Seq(Row(1))
)
verifySqlScriptResult(sqlScript, expected = expected)
}
}