Problem:
Executing "system.add_files(... parallelism => 2)" fails with a NotSerializableException for an ExecutorService instance:
Task not serializable: java.io.NotSerializableException: java.util.concurrent.Executors$DelegatedExecutorService
in MapPartitionsRDD[16] at collectAsList at SparkTableUtil.java:792, org.apache.spark.ShuffleDependency@12d6880f
Expectations:
add_files runs without an exception, even when parallelism > 1.
Suggestions:
Don't pass the ExecutorService instance from the Spark driver as an argument to listPartition in SparkTableUtil.java:759; instead, create the ExecutorService inside listPartitions on the Spark executor.
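To illustrate the suggestion, here is a minimal sketch using only the JDK, not Iceberg's or Spark's actual code. It assumes the closure is shipped with plain Java serialization, and every name in it (ExecutorCaptureSketch, SerializableFunction, capturingPool, creatingPool, isJavaSerializable) is hypothetical. The first closure captures a pool created by the caller and cannot be serialized; the second captures only the parallelism value and builds its pool where it runs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

class ExecutorCaptureSketch {

  // Hypothetical stand-in for a function that has to be shipped to an executor.
  interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

  // Problematic pattern: the pool is created by the caller (the "driver")
  // and captured by the closure, so serializing the closure fails.
  static SerializableFunction<String, Integer> capturingPool(ExecutorService driverPool) {
    return path -> {
      Callable<Integer> task = path::length;
      try {
        return driverPool.submit(task).get();
      } catch (Exception e) {
        throw new RuntimeException(e);
      }
    };
  }

  // Suggested pattern: capture only the int and create the pool inside the
  // closure body, i.e. on whichever JVM ends up running it.
  static SerializableFunction<String, Integer> creatingPool(int parallelism) {
    return path -> {
      ExecutorService localPool = Executors.newFixedThreadPool(parallelism);
      Callable<Integer> task = path::length;
      try {
        return localPool.submit(task).get();
      } catch (Exception e) {
        throw new RuntimeException(e);
      } finally {
        localPool.shutdown();
      }
    };
  }

  // Returns false if Java serialization rejects the closure or anything it captured.
  static boolean isJavaSerializable(Object closure) throws IOException {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(closure);
      return true;
    } catch (NotSerializableException e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    ExecutorService driverPool = Executors.newFixedThreadPool(2);
    try {
      System.out.println("captured pool serializable: "
          + isJavaSerializable(capturingPool(driverPool)));
    } finally {
      driverPool.shutdown();
    }
    System.out.println("local pool serializable: " + isJavaSerializable(creatingPool(2)));
  }
}
```

The sketch creates a pool on every call for brevity; in a real job the pool would more plausibly be created once per partition and shut down when that partition has been processed.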
Willingness to contribute
I can contribute a fix for this bug independently
I would be willing to contribute a fix for this bug with guidance from the Iceberg community
I cannot contribute a fix for this bug at this time
Apache Iceberg version
1.6.1 (latest release)
Query engine
Spark