Releases: apache/iceberg
Apache Iceberg 1.7.0
What's Changed
- [DOCS] Fix link on Concepts page by @gaborkaszab in #10718
- Core: Support appending files with different specs by @fqaiser94 in #9860
- Remove unnecessary class-level synchronized in ManifestFiles by @findepi in #10544
- Spec: Clarify which columns can be used for equality delete files. by @emkornfield in #8981
- Build: Bump nessie from 0.92.1 to 0.93.1 by @dependabot in #10727
- Build: Bump org.testcontainers:testcontainers from 1.19.8 to 1.20.0 by @dependabot in #10730
- Build: Bump mkdocs-material from 9.5.28 to 9.5.29 by @dependabot in #10734
- Build: Bump org.roaringbitmap:RoaringBitmap from 1.2.0 to 1.2.1 by @dependabot in #10733
- Build: Bump software.amazon.awssdk:bom from 2.26.20 to 2.26.21 by @dependabot in #10729
- Build: Bump io.netty:netty-buffer from 4.1.111.Final to 4.1.112.Final by @dependabot in #10726
- Build: Bump orc from 1.9.3 to 1.9.4 by @dependabot in #10728
- Build: Bump com.google.errorprone:error_prone_annotations from 2.28.0 to 2.29.2 by @dependabot in #10731
- Flink: Migrate remaining classes to JUnit5 by @tomtongue in #10684
- Api, Build: Fix typo in comments in Table and gradlew by @hantangwangd in #10744
- Flink: parameterize Flink table source tests to test both old and FLIP-27 source implementations by @stevenzwu in #10741
- Core: Limit memory used by ParallelIterable by @findepi in #10691
- Flink: handle rescale properly and refactor statistics by @stevenzwu in #10457
- Flink: Backport other remaining classes by @tomtongue in #10749
- Spec: Clarify time travel implementation in Iceberg by @emkornfield in #8982
- Build: Let revapi compare against 1.6.0 by @ajantha-bhat in #10754
- Support for Flink's SpeculativeExecution in batch execution mode by @venkata91 in #10548
- API: Update StatisticsFile javadoc by @rice668 in #10769
- Flink: Remove JUnit4 dependency by @nastra in #10770
- Docs: Add bodo to iceberg vendors by @ritwika314 in #10756
- Core: Add estimateRowCount for Files and Entries Metadata Tables by @szehon-ho in #10759
- Update checkstyle definition by @attilakreiner in #10681
- API: Fix typo in RewriteManifestFiles java doc by @amogh-jahagirdar in #10778
- Update .asf.yaml by @ajantha-bhat in #10767
- Support building with Java 21 by @findepi in #10474
- Infra, Docs: Publish Apache Iceberg 1.6.0 release by @jbonofre in #10752
- Update iceberg version on site to 1.6.0 by @jbonofre in #10783
- Hive: close the fileIO client when closing the hive catalog by @hussein-awala in #10771
- Add jbonofre as collaborator on the project by @jbonofre in #10782
- Docs: Make compatibility example consistent. by @emkornfield in #10781
- mr:Fix issues 10639 by @lurnagao-dahua in #10661
- Flink: backport PR #10331 and PR #10457 by @stevenzwu in #10757
- Build: Bump software.amazon.awssdk:bom from 2.26.21 to 2.26.25 by @dependabot in #10800
- Build: Bump nessie from 0.93.1 to 0.94.2 by @dependabot in #10798
- Build: Bump net.snowflake:snowflake-jdbc from 3.17.0 to 3.18.0 by @dependabot in #10801
- Build: Bump mkdocs-material from 9.5.29 to 9.5.30 by @dependabot in #10796
- Flink: Disabling flaky test TestIcebergSourceFailover.testBoundedWithSavepoint by @pvary in #10802
- Build: Bump mkdocs-awesome-pages-plugin from 2.9.2 to 2.9.3 by @dependabot in #10795
- Kafka Connect: Runtime distribution with integration tests by @bryanck in #10739
- Flink: improve snapshot compatibility check by comparing projected sort schema in SortKeySerializer by @stevenzwu in #10794
- Flink: support limit pushdown in FLIP-27 source by @stevenzwu in #10748
- Flink: Remove MiniClusterResource in Flink by @tomtongue in #10817
- Docs: Use link addresses instead of descriptions in releases.md by @lurnagao-dahua in #10815
- Build: Declare avro as an api dependency of iceberg-core by @devinrsmith in #10573
- Flink: backport PR #10748 for limit pushdown by @stevenzwu in #10813
- [DOCS] Fix header for entries metadata table by @gaborkaszab in #10826
- Support Spark Column Stats by @huaxingao in #10659
- Support for Flink's SpeculativeExecution in batch execution mode - Backport of PR #10548 by @venkata91 in #10776
- Infra: Improve feature request template by @nastra in #10825
- Core: (unit test) Replace the duplicated ALL_DATA_FILES with ALL_DELETE_FILES by @hsiang-c in #10836
- Core: Adds Basic Classes for Iceberg Table Version 3 by @RussellSpitzer in #10760
- Core: Allow SnapshotProducer to skip uncommitted manifest cleanup after commit by @grantatspothero in #10523
- Flink: a few small fixes or tuning for range partitioner by @stevenzwu in #10823
- Core: Drop support for Java 8 by @findepi in #10518
- Build: Bump com.adobe.testing:s3mock-junit5 from 2.11.0 to 2.17.0 by @nastra in #10851
- Core: Upgrade Jetty and Servlet API by @nastra in #10850
- Build for Java 11 by @snazy in #10849
- Build: Bump kafka from 3.7.1 to 3.8.0 by @dependabot in #10797
- Build: Update baseline gradle plugin to 5.58.0 by @findepi in #10788
- Flink: refactor sink tests to reduce the number of combinations with parameterized tests by @stevenzwu in #10777
- Flink: backport PR #10823 for range partitioner fixup by @stevenzwu in #10847
- Core: Remove reflection from TestParallelIterable by @findepi in #10857
- Spec: Deprecate the file system table scheme. by @rdblue in #10833
- Core, API: Add addNonDefaultSpec to UpdatePartitionSpec to not set the new partition spec as default by @shanielh in #10736
- Build: Bump com.palantir.baseline:gradle-baseline-java from 5.58.0 to 5.61.0 by @dependabot in #10864
- Build: Bump nessie from 0.94.2 to 0.94.4 by @dependabot in #10869
- Build: Bump org.xerial:sqlite-jdbc from 3.46.0.0 to 3.46.0.1 by @dependabot in #10871
- Build: Bump org.apache.commons:commons-compress from 1.26.0 to 1.26.2 by @dependabot in #10868
- Build: Bump software.amazon.awssdk:bom from 2.26.25 to 2.26.29 by @dependabot in #10866
- Build: Bump mkdocs-material from 9.5.30 to 9.5.31 by @dependabot in #10863
- Build: Fix Scala compilation by @snazy in #10860
- Build: Enable FormatStringAnnotation error-prone check by @findepi in #10856
- Core: Use encoding/decoding methods for namespaces and deprecate Splitter/Joiner by @nastra in #10858
- Aliyun: Replace assert usage with assertThat by @nastra in #10880
- Core: Extract filePath comparator into it's own class by @deniskuzZ in #10664
- Docs: Fix SQL in branching docs by @nakaken-churadata in #10876
- API: Add SupportsRecoveryOperations mixin for FileIO by @amogh-jahagirdar in #10711
- Spec: Clarify identity partition edge cases. by @emkornfield in #10835
- Build: Bump org.testcontainers:testcontainers from 1.20.0 to 1.20.1 by @dependabot in #10865
- Flink: add 1.20 support and remove 1.17 by @stevenzwu in #10881
- Build: Add checkstyle rule to ba...
Apache Iceberg 1.6.1
What's Changed
- Core: Limit ParallelIterable memory consumption by yielding in tasks by @raunaqmorarka in #10787
- [1.6] Core: Drop ParallelIterable's queue low water mark by @findepi in #10979
- Build: Bump orc from 1.9.3 to 1.9.4 (#10728) by @Fokko in #10988
New Contributors
- @raunaqmorarka made their first contribution in #10787
Full Changelog: apache-iceberg-1.6.0...apache-iceberg-1.6.1
Apache Iceberg 1.6.0
What's Changed
- API, Spark 3.3: Remove all usages of deprecated AssertHelpers by @findepi in #10500
- API: Fix default FileIO#newInputFile ManifestFile, DataFile and DeleteFile implementation to pass lengths by @amogh-jahagirdar in #9953
- AWS, Core: Replace .withFailMessage() usage with .as() by @nastra in #10000
- AWS: Fix TestGlueCatalogTable#testCreateTable by @aajisaka in #10221
- AWS: Make sure Signer + User Agent config are both applied by @nastra in #10198
- AWS: Retain Glue Catalog column comment after updating Iceberg table by @lawofcycles in #10276
- AWS: Retain Glue Catalog table description after updating Iceberg table by @aajisaka in #10199
- AWS: Support S3 DSSE-KMS encryption by @aajisaka in #8370
- AWS: close underlying executor for DynamoDb LockManager by @regadas in #10132
- Add 13 Dremio Blogs + Fix a few incorrect dates by @AlexMercedCoder in #9967
- Add EnumConfParser to SparkConfParser by @huaxingao in #10311
- Add Pagination To List Apis by @rahil-c in #9782
- Add bloom filter fpp config by @huaxingao in #10149
- Add checkstyle rule for uppercase constant fields by @attilakreiner in #10673
- Add issue template and docs for iceberg proposals by @danielcweeks in #9932
- Add local nightly build to test current docs changes by @bitsondatadev in #9943
- Add stale PRs management by @jbonofre in #10134
- Add support for providing output-spec-id during rewrite - spark 3.5 by @himadripal in #9803
- Address Intellij inspection findings by @snazy in #10583
- Allow Java 17 in contribute.md by @findepi in #10545
- Apply IntelliJ inspection findings to older Spark + Flink versions by @snazy in #10625
- Avoid adding a closed client to the pool by @flyrain in #10337
- Aws: Add Iceberg version to UserAgent in S3 requests by @CsengerG in #9963
- Backport Flink 1.18 JUnit5 migration to Flink 1.17 by @tomtongue in #10163
- Backport HadoopCatalog related classes in Flink by @tomtongue in #10620
- Backport source package changes in Flink to other versions by @tomtongue in #10663
- Basic manifest encryption by @ggershinsky in #8252
- Build: Align Jackson versions by @nastra in #9925
- Build: Bump Nessie to 0.90.4 by @adutra in #10492
- Build: Bump Nessie to 0.91.2 by @adutra in #10563
- Build: Bump Spark 3.5 to 3.5.1 by @manuzhang in #9832
- Build: Bump arrow from 15.0.0 to 15.0.1 by @dependabot in #9910
- Build: Bump arrow from 15.0.1 to 15.0.2 by @dependabot in #10034
- Build: Bump com.azure:azure-sdk-bom from 1.2.20 to 1.2.21 by @dependabot in #9857
- Build: Bump com.azure:azure-sdk-bom from 1.2.21 to 1.2.22 by @dependabot in #10071
- Build: Bump com.azure:azure-sdk-bom from 1.2.22 to 1.2.23 by @dependabot in #10238
- Build: Bump com.azure:azure-sdk-bom from 1.2.23 to 1.2.24 by @dependabot in #10420
- Build: Bump com.azure:azure-sdk-bom from 1.2.24 to 1.2.25 by @dependabot in #10652
- Build: Bump com.esotericsoftware:kryo from 4.0.2 to 4.0.3 by @dependabot in #9984
- Build: Bump com.google.cloud:libraries-bom from 26.28.0 to 26.43.0 by @dependabot in #10699
- Build: Bump com.google.errorprone:error_prone_annotations from 2.24.1 to 2.26.1 by @dependabot in #9972
- Build: Bump com.google.errorprone:error_prone_annotations from 2.26.1 to 2.27.0 by @dependabot in #10236
- Build: Bump com.google.errorprone:error_prone_annotations from 2.27.0 to 2.28.0 by @dependabot in #10418
- Build: Bump com.gorylenko.gradle-git-properties:gradle-git-properties from 2.4.1 to 2.4.2 by @dependabot in #10239
- Build: Bump com.palantir.gradle.gitversion:gradle-git-version from 3.0.0 to 3.1.0 by @dependabot in #10468
- Build: Bump datamodel-code-generator from 0.25.4 to 0.25.5 by @dependabot in #9979
- Build: Bump datamodel-code-generator from 0.25.5 to 0.25.6 by @dependabot in #10242
- Build: Bump datamodel-code-generator from 0.25.6 to 0.25.7 by @dependabot in #10507
- Build: Bump datamodel-code-generator from 0.25.7 to 0.25.8 by @dependabot in #10649
- Build: Bump gradle.plugin.io.morethan.jmhreport:gradle-jmh-report from 0.9.0 to 0.9.6 by @dependabot in #10193
- Build: Bump guava from 33.0.0-jre to 33.1.0-jre by @dependabot in #9977
- Build: Bump guava from 33.1.0-jre to 33.2.0-jre by @dependabot in #10271
- Build: Bump guava from 33.2.0-jre to 33.2.1-jre by @dependabot in #10414
- Build: Bump io.airlift:aircompressor from 0.26 to 0.27 by @dependabot in #10383
- Build: Bump io.delta:delta-spark_2.12 from 3.1.0 to 3.2.0 by @dependabot in #10320
- Build: Bump io.delta:delta-standalone_2.12 from 3.1.0 to 3.2.0 by @dependabot in #10321
- Build: Bump io.github.goooler.shadow:shadow-gradle-plugin from 8.1.7 to 8.1.8 by @dependabot in #10612
- Build: Bump io.netty:netty-buffer from 4.1.107.Final to 4.1.108.Final by @dependabot in #10032
- Build: Bump io.netty:netty-buffer from 4.1.108.Final to 4.1.109.Final by @dependabot in #10191
- Build: Bump io.netty:netty-buffer from 4.1.109.Final to 4.1.110.Final by @dependabot in #10384
- Build: Bump io.netty:netty-buffer from 4.1.110.Final to 4.1.111.Final by @dependabot in #10504
- Build: Bump jetty from 9.4.53.v20231009 to 9.4.54.v20240208 by @dependabot in #9982
- Build: Bump jetty from 9.4.54.v20240208 to 9.4.55.v20240627 by @dependabot in #10654
- Build: Bump kafka from 3.6.1 to 3.7.0 by @dependabot in #9855
- Build: Bump kafka from 3.7.0 to 3.7.1 by @dependabot in #10653
- Build: Bump mkdocs-material from 9.5.14 to 9.5.15 by @dependabot in #10031
- Build: Bump mkdocs-material from 9.5.15 to 9.5.17 by @dependabot in #10092
- Build: Bump mkdocs-material from 9.5.17 to 9.5.18 by @dependabot in #10189
- Build: Bump mkdocs-material from 9.5.18 to 9.5.19 by @dependabot in #10241
- Build: Bump mkdocs-material from 9.5.19 to 9.5.21 by @dependabot in #10272
- Build: Bump mkdocs-material from 9.5.21 to 9.5.23 by @dependabot in #10353
- Build: Bump mkdocs-material from 9.5.23 to 9.5.25 by @dependabot in #10413
- Build: Bump mkdocs-material from 9.5.25 to 9.5.26 by @dependabot in #10464
- Build: Bump mkdocs-material from 9.5.26 to 9.5.27 by @dependabot in #10555
- Build: Bump mkdocs-material from 9.5.27 to 9.5.28 by @dependabot in #10648
- Build: Bump mkdocs-material from 9.5.9 to 9.5.14 by @dependabot in #9983
- Build: Bump nessie from 0.77.1 to 0.79.0 by @dependabot in #9976
- Build: Bump nessie from 0.79.0 to 0.80.0 by @dependabot in #10237
- Build: Bump nessie from 0.80.0 to 0.81.1 by @dependabot in #10267
- Build: Bump nessie from 0.81.1 to 0.82.0 by @dependabot in #10318
- Build: Bump nessie from 0.82.0 to 0.83.2 by @dependabot in #10381
- Build: Bump nessie from 0.90.4 to 0.91.1 by @dependabot in #10551
- Build: Bump nessie from 0.91.2 to 0.91.3 by @dependabot in #10608
- Build: Bump nessie from 0.92.0 to 0.92.1 by @dependabot in #10697
- Build: Bump net.snowflake:snow...
Apache Iceberg 1.5.2
The 1.5.2 release contains the same changes as the 1.5.1 release. The 1.5.1 release had issues with the Spark runtime artifacts: certain artifacts were built with the wrong Scala version. Upgrading to 1.5.2 is strongly recommended for any system that is using 1.5.1.
Apache Iceberg 1.5.1
What's Changed
- [1.5.x] API: Fix default FileIO#newInputFile ManifestFile, DataFile and DeleteFile implementations by @amogh-jahagirdar in #10114
- [1.5.x] Core: Mark 502 and 504 failures as retryable to the exponential retry strategy by @amogh-jahagirdar in #10113
- Core: Fix JDBC Catalog table commit when migrating from schema V0 to V1 (#101111) by @jbonofre in #10152
- Core: Fix namespace SQL statement using ESCAPE character that works with MySQL/PostgreSQL (#10167) by @jbonofre in #10169
- (1.5.x cherry-pick) Spark 3.5: Fix system function pushdown in CoW row-level commands by @amogh-jahagirdar in #10170
- (1.5.x Cherry-pick) Spark 3.4: Fix system function pushdown in CoW row-level commands (#10119) by @amogh-jahagirdar in #10171
Full Changelog: apache-iceberg-1.5.0...apache-iceberg-1.5.1
Apache Iceberg 1.5.0
Apache Iceberg 1.5.0 was released on March 11, 2024.
The 1.5.0 release adds a variety of new features and bug fixes.
- API
- Core
- Add view support for REST catalog (#7913)
- Add view support for JDBC catalog (#9487)
- Add catalog type for glue,jdbc,nessie (#9647)
- Support Avro file encryption with AES GCM streams (#9436)
- Add ApplyNameMapping for Avro (#9347)
- Add StandardEncryptionManager (#9277)
- Add REST catalog table session cache (#8920)
- Support view metadata compression (#8552)
- Track partition statistics in TableMetadata (#8502)
- Enable column statistics filtering after planning (#8803)
- Spark
- Remove support for Spark 3.2 (#9295)
- Support views via SQL for Spark 3.4 and 3.5 (#9423, #9421, #9343, #9513, #9582)
- Support executor cache locality (#9563)
- Added support for delete manifest rewrites (#9020)
- Support encrypted output files (#9435)
- Add Spark UI metrics from Iceberg scan metrics (#8717)
- Parallelize reading files in add_files procedure (#9274)
- Support file and partition delete granularity (#9384)
- Flink
- Parquet
- Kafka-Connect
- Spec
- Vendor Integrations
- AWS: Support setting description for Glue table (#9530)
- AWS: Update S3FileIO test to run when CLIENT_FACTORY is not set (#9541)
- AWS: Add S3 Access Grants Integration (#9385)
- AWS: Glue catalog strip trailing slash on DB URI (#8870)
- Azure: Add FileIO that supports ADLSv2 storage (#8303)
- Azure: Make ADLSFileIO implement DelegateFileIO (#8563)
- Nessie: Support views for NessieCatalog (#8909)
- Nessie: Strip trailing slash for warehouse location (#9415)
- Nessie: Infer default API version from URI (#9459)
- Dependencies
- Bump Nessie to 0.77.1
- Bump ORC to 1.9.2
- Bump Arrow to 15.0.0
- Bump AWS Java SDK to 2.24.5
- Bump Azure Java SDK to 1.2.20
- Bump Google cloud libraries to 26.28.0
Note:
- To enable view support for JDBC catalog, configure jdbc.schema-version to V1 in catalog properties.
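As a rough illustration of the note above, the sketch below shows one way the jdbc.schema-version property might be passed to a Spark-managed Iceberg catalog, where catalog properties are given as spark.sql.catalog.<name>.<property> entries. The helper spark_catalog_conf, the catalog name my_catalog, and the connection URI are hypothetical and exist only for this example; other properties a real deployment would need (such as a warehouse location or credentials) are omitted.

```python
def spark_catalog_conf(catalog_name, properties):
    """Hypothetical helper: prefix Iceberg catalog properties for Spark."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    # Register the catalog itself with Iceberg's Spark catalog implementation.
    conf = {prefix: "org.apache.iceberg.spark.SparkCatalog"}
    for key, value in properties.items():
        conf[f"{prefix}.{key}"] = value
    return conf


# Assumed values: the catalog name and JDBC URI are placeholders.
conf = spark_catalog_conf(
    "my_catalog",
    {
        "type": "jdbc",
        "uri": "jdbc:postgresql://localhost:5432/iceberg",
        # The setting the release note refers to.
        "jdbc.schema-version": "V1",
    },
)

# Print the entries in a form that could be copied into spark-defaults.conf.
for key in sorted(conf):
    print(f"{key}={conf[key]}")
```

The resulting key/value pairs could equally be supplied with --conf flags on the Spark command line; the dictionary form is used here only to keep the example self-contained.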
New Contributors
- @reswqa made their first contribution in #7745
- @maxdebayser made their first contribution in #7796
- @mderoy made their first contribution in #7801
- @cxzl25 made their first contribution in #7825
- @tilman151 made their first contribution in #7781
- @TaoZex made their first contribution in #7761
- @Rondiz made their first contribution in #7829
- @grobgl made their first contribution in #7645
- @guiyanakuang made their first contribution in #7839
- @littlecatjianjiao made their first contribution in #7908
- @DaVincii made their first contribution in #7874
- @mumuhhh made their first contribution in #7866
- @Ewan-Keith made their first contribution in #7917
- @nikam14 made their first contribution in #7093
- @hsiang-c made their first contribution in #7920
- @ktk1012 made their first contribution in #8026
- @joan38 made their first contribution in #8002
- @coded9 made their first contribution in #8058
- @rustyconover made their first contribution in #8074
- @mr-brobot made their first contribution in #8061
- @Neuw84 made their first contribution in #7988
- @lintingbin made their first contribution in #8111
- @mrcnc made their first contribution in #8193
- @s-akhtar-baig made their first contribution in #8205
- @MaxNevermind made their first contribution in #7694
- @bmaisonn made their first contribution in #8209
- @HonahX made their first contribution in #8215
- @onerishabh made their first contribution in #8214
- @kengtin made their first contribution in #7161
- @aless10 made their first contribution in #8286
- @advancedxy made their first contribution in #8320
- @dacort made their first contribution in #8341
- @gegef2009 made their first contribution in #8154
- @TjuAachen made their first contribution in #8401
- @baiyangtx made their first contribution in #8416
- @hiteshbedre made their first contribution in #8491
- @harshm-dev made their first contribution in #8385
- @wForget made their first contribution in #8445
- @andreacfm made their first contribution in #8528
- @Paddy0523 made their first contribution in #8547
- @rushilshah1 made their first contribution in #8589
- @lanemoseley made their first contribution in #8618
- @tlm365 made their first contribution in #8447
- @jbonofre made their first contribution in #8612
- @jayceslesar made their first contribution in #8558
- @MehulBatra made their first contribution in #8408
- @clettieri made their first contribution in #8192
- @nk1506 made their first contribution in #8640
- @johanhenriksson made their first contribution in #8751
- @ashutosh-roy made their first contribution in #8707
- @Priyansh121096 made their first contribution in #8748
- @PickBas made their first contribution in #8819
- @jongwooo made their first contribution in #8666
- @rice668 made their first contribution in #8873
- @geruh made their first contribution in #8914
- @bknbkn made their first contribution in #8868
- @wangtaohz made their ...
Apache Iceberg 1.4.3
What's Changed
- Core: Scan only live entries in partitions table (#8969) by @Fokko in #9197
- [1.4.x] Core: Fix missing files from transaction retries with conflicting manifest merges (#9230) by @nastra in #9337
- [1.4.x] JDBC Catalog: Fix namespaceExists check with special characters (#8340) by @ismailsimsek in #9291
- [1.4.x] Core: Expired Snapshot files in a transaction should be deleted by @bartash in #9223
- [1.4.x] Core: Fix missing delete files from transaction (#9354) by @nastra in #9356
Full Changelog: apache-iceberg-1.4.2...apache-iceberg-1.4.3
Apache Iceberg 1.4.2
What's Changed
- Core: Ignore split offsets array when split offset is past file length by @amogh-jahagirdar in #8938
Full Changelog: apache-iceberg-1.4.1...apache-iceberg-1.4.2
Apache Iceberg 1.4.1
What's Changed
- Core: Do not use a lazy split offset list in manifests (#8834) by @nastra in #8845
- Core: Ignore split offsets when the last split offset is past the file length by @amogh-jahagirdar in #8861
- AWS: avoid static global credentials provider which doesn't play well with lifecycle management (#8677) by @nastra in #8843
- Flink: Reverting the default custom partitioner for bucket column (#8848) by @nastra in #8858
Full Changelog: apache-iceberg-1.4.0...apache-iceberg-1.4.1
Apache Iceberg 1.4.0
- API
- Core
- Use V2 format by default in new tables (#8381)
- Use zstd compression for Parquet by default in new tables (#8593)
- Add strict metadata cleanup mode and enable it by default (#8397) (#8599)
- Avoid generating huge manifests during commits (#6335)
- Add a writer for unordered position deletes (#7692)
- Optimize DeleteFileIndex (#8157)
- Optimize lookup in DeleteFileIndex without useful bounds (#8278)
- Optimize split offsets handling (#8336)
- Optimize computing user-facing state in data tasks (#8346)
- Don't persist useless file and position bounds for deletes (#8360)
- Don't persist counts for paths and positions in position delete files (#8590)
- Support setting system-level properties via environmental variables (#5659)
- Add JSON parser for ContentFile and FileScanTask (#6934)
- Add REST spec and request for commits to multiple tables (#7741)
- Add REST API for committing changes against multiple tables (#7569)
- Default to exponential retry strategy in REST client (#8366)
- Support registering tables with REST session catalog (#6512)
- Add last updated timestamp and snapshot ID to partitions metadata table (#7581)
- Add total data size to partitions metadata table (#7920)
- Extend ResolvingFileIO to support bulk operations (#7976)
- Key metadata in Avro format (#6450)
- Add AES GCM encryption stream (#3231)
- Fix a connection leak in streaming delete filters (#8132)
- Fix lazy snapshot loading history (#8470)
- Fix unicode handling in HTTPClient (#8046)
- Fix paths for unpartitioned specs in writers (#7685)
- Fix OOM caused by Avro decoder caching (#7791)
- Spark
- Added support for Spark 3.5
- Code for DELETE, UPDATE, and MERGE commands has moved to Spark, and all related extensions have been dropped from Iceberg.
- Support for WHEN NOT MATCHED BY SOURCE clause in MERGE.
- Column pruning in merge-on-read operations.
- Ability to request a bigger advisory partition size for the final write to produce well-sized output files without harming the job parallelism.
- Dropped support for Spark 3.1
- Deprecated support for Spark 3.2
- Support vectorized reads for merge-on-read operations in Spark 3.4 and 3.5 (#8466)
- Increase default advisory partition size for writes in Spark 3.5 (#8660)
- Support distributed planning in Spark 3.4 and 3.5 (#8123)
- Support pushing down system functions by V2 filters in Spark 3.4 and 3.5 (#7886)
- Support fanout position delta writers in Spark 3.4 and 3.5 (#7703)
- Use fanout writers for unsorted tables by default in Spark 3.5 (#8621)
- Support multiple shuffle partitions per file in compaction in Spark 3.4 and 3.5 (#7897)
- Output net changes across snapshots for carryover rows in CDC (#7326)
- Display read metrics on Spark SQL UI (#7447) (#8445)
- Adjust split size to benefit from cluster parallelism in Spark 3.4 and 3.5 (#7714)
- Add fast_forward procedure (#8081)
- Support filters when rewriting position deletes (#7582)
- Support setting current snapshot with ref (#8163)
- Make backup table name configurable during migration (#8227)
- Add write and SQL options to override compression config (#8313)
- Correct partition transform functions to match the spec (#8192)
- Enable extra commit properties with metadata delete (#7649)
- Flink
- Add possibility of ordering the splits based on the file sequence number (#7661)
- Fix serialization in TableSink with anonymous object (#7866)
- Switch to FileScanTaskParser for JSON serialization of IcebergSourceSplit (#7978)
- Custom partitioner for bucket partitions (#7161)
- Implement data statistics coordinator to aggregate data statistics from operator subtasks (#7360)
- Support alter table column (#7628)
- Parquet
- ORC
- Handle filters with transforms by assuming the filter matches (#8244)
- Vendor Integrations
- GCP: Fix single byte read in GCSInputStream (#8071)
- GCP: Add properties for OAuth2 and update library (#8073)
- GCP: Add prefix and bulk operations to GCSFileIO (#8168)
- GCP: Add bundle jar for GCP-related dependencies (#8231)
- GCP: Add range reads to GCSInputStream (#8301)
- AWS: Add bundle jar for AWS-related dependencies (#8261)
- AWS: support config storage class for S3FileIO (#8154)
- AWS: Add FileIO tracker/closer to Glue catalog (#8315)
- AWS: Update S3 signer spec to allow an optional string body in S3SignRequest (#8361)
- Azure: Add FileIO that supports ADLSv2 storage (#8303)
- Azure: Make ADLSFileIO implement DelegateFileIO (#8563)
- Nessie: Provide better commit message on table registration (#8385)
- Dependencies
- Bump Nessie to 0.71.0
- Bump ORC to 1.9.1
- Bump Arrow to 12.0.1
- Bump AWS Java SDK to 2.20.131