[SPARK-49984][CORE] Fix duplicate JVM options #48488

Open
wants to merge 2 commits into
base: master

Kimahriman
Contributor

What changes were proposed in this pull request?

Fix how the JVM options are supplemented with Java module options and IPv6 options. Read and write the exact value of spark.*.extraJavaOptions instead of going through the helper config, which combines both the default and extra Java options.

Why are the changes needed?

The current approach using DRIVER_JAVA_OPTIONS and EXECUTOR_JAVA_OPTIONS loads both the spark.*.extraJavaOptions and spark.*.defaultJavaOptions, adds the "supplemental" options, and then resaves the result in spark.*.extraJavaOptions. This is because DRIVER_JAVA_OPTIONS is a config like:

  private[spark] val DRIVER_JAVA_OPTIONS =
    ConfigBuilder(SparkLauncher.DRIVER_EXTRA_JAVA_OPTIONS)
      .withPrepended(SparkLauncher.DRIVER_DEFAULT_JAVA_OPTIONS)

The result is that spark.*.defaultJavaOptions gets added multiple times to spark.*.extraJavaOptions, and when the full options are actually pulled to launch a process, three copies of what was in spark.*.defaultJavaOptions end up on the command line.
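
To make the duplication easier to follow, here is a toy Python model of the read-combine-write-back pattern described above. It is only a sketch, not Spark's code: the dict-based conf, the function names, and the placeholder supplemental strings are all hypothetical, and the ordering (supplemental options first, default options prepended to the extra ones) is an assumption chosen to match the "Before" output in this PR.

```python
# Toy model only: a plain dict stands in for SparkConf, and every name
# below is hypothetical rather than taken from Spark.
DEFAULT_KEY = "spark.driver.defaultJavaOptions"
EXTRA_KEY = "spark.driver.extraJavaOptions"


def join_options(*parts):
    """Join the non-empty option strings with single spaces."""
    return " ".join(p for p in parts if p)


def get_combined(conf):
    """Mimic a config entry that prepends the default options to the extra ones."""
    return join_options(conf.get(DEFAULT_KEY, ""), conf.get(EXTRA_KEY, ""))


def supplement_buggy(conf, supplemental):
    """Read the combined value, add the supplemental options, save into the extra key."""
    conf[EXTRA_KEY] = join_options(supplemental, get_combined(conf))


def supplement_fixed(conf, supplemental):
    """Read and write only the extra key, so the default options are never copied."""
    conf[EXTRA_KEY] = join_options(supplemental, conf.get(EXTRA_KEY, ""))


def run(supplement):
    """Apply two supplemental steps, then build the options used at launch."""
    conf = {DEFAULT_KEY: "-Dfoo=bar"}
    supplement(conf, "--module-opts")  # placeholder for the Java module options
    supplement(conf, "--ipv6-opts")    # placeholder for the IPv6 options
    return conf[EXTRA_KEY], get_combined(conf)


if __name__ == "__main__":
    for supplement in (supplement_buggy, supplement_fixed):
        extra, launch = run(supplement)
        print(supplement.__name__, "->", launch.split().count("-Dfoo=bar"), "copies at launch")
```

In this model each buggy step copies the default options into the extra key, so two supplemental steps leave two copies there, and the combined read at launch adds a third. The fixed variant leaves exactly one.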

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manual testing

Before:

$ pyspark --conf spark.driver.defaultJavaOptions="-Dfoo=bar"
>>> spark.conf.get('spark.driver.extraJavaOptions')
'-Djava.net.preferIPv6Addresses=false -Dfoo=bar -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/jdk.internal.ref=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED -Djdk.reflect.useDirectMethodHandle=false -Dfoo=bar'
>>> spark.conf.get('spark.driver.defaultJavaOptions')
'-Dfoo=bar'

After:

$ pyspark --conf spark.driver.defaultJavaOptions="-Dfoo=bar"
>>> spark.conf.get('spark.driver.extraJavaOptions')
'-Djava.net.preferIPv6Addresses=false -XX:+IgnoreUnrecognizedVMOptions --add-modules=jdk.incubator.vector --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/jdk.internal.ref=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED -Djdk.reflect.useDirectMethodHandle=false -Dio.netty.tryReflectionSetAccessible=true'
>>> spark.conf.get('spark.driver.defaultJavaOptions')
'-Dfoo=bar'
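
Since the strings above are long, a small helper can make the manual check less error-prone. This is a sketch and not part of the patch: `repeated_options` is a hypothetical name, and it assumes each JVM option is a single whitespace-separated token, as in the output shown here.

```python
import shlex
from collections import Counter


def repeated_options(java_options):
    """Return {option: count} for every token that appears more than once."""
    counts = Counter(shlex.split(java_options))
    return {opt: n for opt, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Shortened version of the "Before" value, for illustration only.
    before = "-Djava.net.preferIPv6Addresses=false -Dfoo=bar -XX:+IgnoreUnrecognizedVMOptions -Dfoo=bar"
    print(repeated_options(before))
```

In a pyspark session this could be called as `repeated_options(spark.conf.get('spark.driver.extraJavaOptions'))`. An empty dict means no option is repeated.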

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the CORE label Oct 16, 2024
@Kimahriman
Contributor Author

@LuciferYang @dongjoon-hyun since you created these methods

Member

@dongjoon-hyun dongjoon-hyun left a comment

Thank you. Could you add a test case, @Kimahriman ?

@LuciferYang
Contributor

+1, Agree with @dongjoon-hyun

@Kimahriman
Contributor Author

Didn't see any tests when these were added, but added a test to the SparkContextSuite

@Kimahriman Kimahriman force-pushed the java-option-overrides branch from 0783937 to ce7212d Compare October 18, 2024 15:15
@Kimahriman Kimahriman force-pushed the java-option-overrides branch from 346993e to f375fad Compare February 8, 2025 14:00