
procedure add_files parallelism > 1 -> NotSerializableException #11147

Open
3 tasks
zzeekk opened this issue Sep 16, 2024 · 2 comments · May be fixed by #11157
Labels
bug Something isn't working

Comments

@zzeekk

zzeekk commented Sep 16, 2024

Apache Iceberg version

1.6.1 (latest release)

Query engine

Spark

Please describe the bug 🐞

Problem:
Executing "system.add_files(... parallelism => 2)" results in a NotSerializableException for an ExecutorService instance:
Task not serializable: java.io.NotSerializableException: java.util.concurrent.Executors$DelegatedExecutorService
in MapPartitionsRDD[16] at collectAsList at SparkTableUtil.java:792, org.apache.spark.ShuffleDependency@12d6880f
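For reference, a call along these lines reproduces it; the catalog, table, and source path below are placeholders, and spark is an existing SparkSession:

```java
// Placeholder names; any add_files call with parallelism > 1 triggers the issue.
spark.sql(
    "CALL my_catalog.system.add_files("
        + "table => 'db.target_table', "
        + "source_table => '`parquet`.`/tmp/parquet_files`', "
        + "parallelism => 2)");
```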

Expectations:
add_files runs without exception, even when parallelism > 1.

Suggestions:
Don't pass the ExecutorService instance from the Spark driver as an argument to listPartition in SparkTableUtil.java:759; instead, create the ExecutorService inside listPartition on the Spark executor.
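Sketched as a self-contained pattern rather than the actual SparkTableUtil code (class and method names below are invented for illustration), the idea is to ship only the parallelism value into the closure and build the pool where the work runs:

```java
import java.io.Serializable;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Illustration only: an int is Serializable, an ExecutorService is not,
// so the pool must be created on the JVM that actually does the listing.
public class PartitionLister implements Serializable {
  private final int parallelism;

  public PartitionLister(int parallelism) {
    this.parallelism = parallelism;
  }

  public <T> void processAll(List<T> partitions, Function<T, ?> listOnePartition)
      throws InterruptedException {
    // Created here, after deserialization, instead of being captured from the driver.
    ExecutorService service = Executors.newFixedThreadPool(parallelism);
    try {
      for (T partition : partitions) {
        service.submit(() -> listOnePartition.apply(partition));
      }
    } finally {
      service.shutdown();
      service.awaitTermination(10, TimeUnit.MINUTES);
    }
  }
}
```

Passing the parallelism as a plain number and constructing the pool locally avoids serializing Executors$DelegatedExecutorService with the Spark closure, which is what the exception above complains about.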

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time
@zzeekk zzeekk added the bug Something isn't working label Sep 16, 2024
@manuzhang
Contributor

@zzeekk Thanks for reporting this bug. I will look into it.

@zzeekk
Author

zzeekk commented Sep 21, 2024

Thanks a lot @manuzhang, looks good.
