Fix Spark 3.5.0 shell classloader issue with the plugin #9500
Conversation
Signed-off-by: Thomas Graves <[email protected]>
LGTM
I am checking if there is an easy smoke test we could put in premerge
Manually tested on Databricks 13.3
The base branch was changed.
I accidentally had this targeted for 23.12; changed to 23.10. @gerashegalov @abellina mind looking again?
Contributes to #5704. This PR aims to catch issues like #9500. It modifies run_pyspark_from_build, mostly to avoid recreating the logic of figuring out the jar location etc. Currently it may not catch this issue because we do not have Spark 3.5.0 CI yet, but this is how it reproduces #9500:

```bash
$ SPARK_HOME=~/dist/spark-3.1.1-bin-hadoop3.2 SPARK_SHELL_SMOKE_TEST=1 ./integration_tests/run_pyspark_from_build.sh
...
+ grep -F 'res0: Array[org.apache.spark.sql.Row] = Array([4950])'
res0: Array[org.apache.spark.sql.Row] = Array([4950])
+ echo 'SUCCESS spark-shell smoke test...'
SUCCESS spark-shell smoke test
$ echo $?
0

$ SPARK_HOME=~/dist/spark-3.5.0-bin-hadoop3 SPARK_SHELL_SMOKE_TEST=1 ./integration_tests/run_pyspark_from_build.sh
$ echo $?
1

$ SPARK_SHELL_SMOKE_TEST=1 \
  PYSP_TEST_spark_shuffle_manager=com.nvidia.spark.rapids.spark311.RapidsShuffleManager \
  SPARK_HOME=~/dist/spark-3.1.1-bin-hadoop3.2 \
  ./integration_tests/run_pyspark_from_build.sh
+ echo 'SUCCESS spark-shell smoke test'
SUCCESS spark-shell smoke test
$ echo $?
0

$ SPARK_SHELL_SMOKE_TEST=1 \
  PYSP_TEST_spark_shuffle_manager=com.nvidia.spark.rapids.spark350.RapidsShuffleManager \
  SPARK_HOME=~/dist/spark-3.5.0-bin-hadoop3 \
  ./integration_tests/run_pyspark_from_build.sh
$ echo $?
1
```

Signed-off-by: Gera Shegalov <[email protected]>
Fixes #9498.
Spark 3.5.0 (apache/spark@1486835) changed the package name of ExecutorClassLoader. Our ShimLoader code explicitly looks for it by the old package name, so we have to update ShimLoader to look for either name.
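The either-name check could be sketched roughly as follows. This is not the actual ShimLoader code; the object name, the helper name, and the exact package strings used below are assumptions made for illustration:

```scala
// Hypothetical sketch: classify a classloader as Spark's ExecutorClassLoader
// by matching its class name against both the pre-3.5.0 and 3.5.0+ package
// names, instead of a single hard-coded name.
object ExecutorClassLoaderMatcher {
  // Candidate fully qualified names (assumed here; verify against the
  // actual Spark versions in use).
  private val candidateNames = Set(
    "org.apache.spark.repl.ExecutorClassLoader",    // assumed: Spark < 3.5.0
    "org.apache.spark.executor.ExecutorClassLoader" // assumed: Spark >= 3.5.0
  )

  // True if the given class name matches either known location.
  def isExecutorClassLoaderName(className: String): Boolean =
    candidateNames.contains(className)

  // Convenience overload for an actual ClassLoader instance.
  def isExecutorClassLoader(cl: ClassLoader): Boolean =
    cl != null && isExecutorClassLoaderName(cl.getClass.getName)
}
```

Matching on the class name string (rather than an `isInstanceOf` check) avoids a compile-time dependency on either package, which matters when the same plugin jar must load against multiple Spark versions.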
Manually tested with spark-shell on Spark 3.5.0 and on the older 3.3.0; both worked.