Release v0.6.0

@kaijchen kaijchen released this 27 Oct 10:50

Apache Uniffle (Incubating) Release v0.6.0

Highlights

  • Optimize the assignment strategy

  • Improve stability and performance

  • Add a plugin mechanism for SelectStorageStrategy

  • Add LowestIOSampleCostSelectStorageStrategy

  • Support Kerberos HDFS

ChangeLog

  • Change license owner to ASF by @kaijchen in #5
  • Trivial code improvements by @wForget in #7
  • [Minor] Store shuffleId int to be consistent with other data structure by @zuston in #10
  • Introduce the asList method in ConfigOptions by @zuston in #9
  • Rename package by @jerqi in #6
  • Minimize apache-rat excluded files by @kaijchen in #11
  • Update module names by @kaijchen in #12
  • Convert PartitionAssignmentInfo to static inner class by @pan3793 in #15
  • [Followup] Migrate to Junit5 by @zuston in #14
  • [Bug] Fix NPE when processing an event after the application was already cleared by @colinmjj in #16
  • [CI] Enable codecov report by @kaijchen in #17
  • Correct the config description and fix typo by @zuston in #19
  • Add CI and Codecov badges in README by @kaijchen in #20
  • [Followup] Use asList method in some existing configOptions by @zuston in #18
  • Move rss-integration-spark-common-test module package by @wForget in #23
  • [INFRA] Improve asf.yaml to reduce the notifications by @jerryshao in #25
  • [TEST] Improve code coverage in rss-common by @kaijchen in #26
  • Remove redundant package by @wForget in #27
  • [CI] Switch to temurin JDK by @kaijchen in #24
  • [INFRA] Improve asf.yaml to reduce the notifications (another-try) by @jerryshao in #33
  • Bump commons-lang3 from 3.5 to 3.10 by @wForget in #28
  • Fix the log of incorrectly bound class by @wForget in #35
  • [TYPO] Fix misspelled word "integration" by @kaijchen in #34
  • Fix some hyperlink in README.md by @daugraph in #32
  • Upgrade gRPC to support Apple Silicon by @pan3793 in #13
  • Allow to specify custom tags to decide the assignment of servers by @zuston in #30
  • Optimize the bash script by @zuston in #29
  • [Improvement] reduce compiler warnings by @advancedxy in #46
  • [Chore]: document update and build time optimize by @advancedxy in #45
  • Supplement doc about assignment tags by @zuston in #47
  • [Bug] Fix skip() API possibly skipping unexpected bytes, which leads to inconsistent data by @colinmjj in #40
  • [improvement] Remove experimental feature with ShuffleUploader by @colinmjj in #51
  • [Improvement] Provides utility classes for creating thread factories by @smallzhongfeng in #49
  • Enable spotbugs and fix high priority bugs by @kaijchen in #38
  • [CI] Change default checkstyle severity to error by @kaijchen in #57
  • [Style] Check indentation by @kaijchen in #56
  • [Experimental Feature] MR Supports Remote Spill by @frankliee in #55
  • [Improvement] Log indicate the shuffle server host:port when doing re… by @zuston in #58
  • Send commit concurrently in client side by @zuston in #59
  • Explicitly set the constructor with AccessManager when extending AccessChecker by @zuston in #43
  • [DOC] Replace Firestorm with Uniffle by @jerqi in #60
  • Introduce the extraProperties to support user-defined pluggable accessCheckers by @zuston in #42
  • Log enhancement: Merge multiple logs into oneline and add more description by @zuston in #62
  • [TEST] Add more unit tests in rss-common by @kaijchen in #63
  • [MINOR] Comments of PartitionBalanceAssignmentStrategy miss byte units by @smallzhongfeng in #68
  • [Minor] Make config keys and default values finalized by @kaijchen in #70
  • [Log Improvement] Add more detailed debug info for MR client by @frankliee in #84
  • [Improvement] Shutdown the grpc executors pool when closing by @zuston in #83
  • Log enhancement: return error message when getting assignment servers and log exception when initializing by @zuston in #64
  • [ISSUE-48] [Feature] Init Kubernetes operator directory by @jerqi in #75
  • [Improvement] No need to use synchronized lock of the method scope when getting client by @zuston in #82
  • [DOC] Remove Wechat group in README by @jerqi in #88
  • [Performance Optimization] Improve the speed of writing index file in shuffle server by @zuston in #91
  • [DOC] Update title and description in README by @kaijchen in #94
  • [Improvement] ShuffleBlock should be released when finished reading by @xianjingfeng in #74
  • [IMPROVEMENT][COMMON] Fix common module code style by @jerqi in #99
  • [Improvement]LocalStorage init use multi thread #71 by @xianjingfeng in #72
  • [Improvement] Use OR operation instead of serialization for cloning BitMaps by @kaijchen in #103
  • [Improvement] Ignore partial failure on initializing local storage in shuffle server side by @zuston in #102
  • [CI] Test compile in Java 11 and Java 17 by @kaijchen in #105
  • Sleep less time but try more times when stopping by @xianjingfeng in #112
  • [Improvement] Use ConfigBuilder to rewrite the class RssSparkConfig by @smallzhongfeng in #104
  • [Improvement] Introduce config to customize assignment server numbers in client side by @zuston in #100
  • Assign partition again if registerShuffleServers failed by @xianjingfeng in #115
  • [ISSUE-106][IMPROVEMENT] Set rpc timeout for all rpc interface by @xianjingfeng in #113
  • [MINOR][IMPROVEMENT] Avoid CoordinatorServer#initialization multiple new Configuration() by @zwangsheng in #118
  • [Improve] Remove useless server id from StorageManagerFactory#createStorageManager by @zwangsheng in #119
  • [MINOR][IMPROVEMENT][COORD] Fix coordinator module code style by @jerqi in #122
  • [Improvement] Set heartBeatExecutorService as daemon thread by @smallzhongfeng in #121
  • [JUnit] Introduce the property of trimStackTrace to show error stacktrace in mvn-test by @zuston in #126
  • Make the conf of rss.storage.basePath as list by @zuston in #130
  • [MINOR][IMPROVEMENT][STORAGE] Fix storage module code style by @jerqi in #131
  • [Improvement] Add timeout reconnection when DelegationRssShuffleManager send the request of AccessCluster by @smallzhongfeng in #139
  • [MINOR] Fix flaky test testGetHostIp by @izchen in #141
  • [Improvement] Add the number of unhealthy nodes in CoordinatorMetrics by @smallzhongfeng in #147
  • [ISSUE-48][FEATURE] Add Uniffle Dockerfile by @wangao1236 in #132
  • [BUGFIX] Fix memory leak that causes OOM by @summaryzb in #145
  • [Log Improvement] Output the registering/lost/exclude nodes in log by @zuston in #148
  • [MINOR] Tagged spark hadoop version in release package by @izchen in #149
  • [DOC] Migrate the coordinator doc from README to docs page by @zuston in #153
  • [MINOR][DOC] Remove spaces when reading file of excluded nodes by @smallzhongfeng in #155
  • [Improvement] Filter null value when selecting remote storage in ApplicationManager by @smallzhongfeng in #156
  • Introduce more grpc server metrics by @zuston in #150
  • [Improvement] Introduce a new class ShuffleTaskInfo by @smallzhongfeng in #158
  • [ISSUE-76] Disallow sendShuffleData if requireBufferId expired by @xianjingfeng in #159
  • Support storing shuffle data to secured dfs cluster by @zuston in #53
  • [FOLLOWUP] Delete hdfs shuffle data files using proxy user by @zuston in #170
  • [ISSUE-48][FEATURE] Init Operator Directory by @wangao1236 in #161
  • PID file name should contain program name by @zuston in #165
  • [BUGFIX] Fix resource leak when shuffle read by @izchen in #174
  • [Improvement] ShuffleBufferManager supports triggering flush according to the size of single ShuffleBuffer by @leixm in #176
  • [Improvement] Should match from pathToStorages when appId does not exist in appIdToStorages by @smallzhongfeng in #168
  • [ISSUE-173][FOLLOWUP] The size of single buffer flush should reach rss.server.flush.cold.storage.threshold.size by @leixm in #178
  • Revert "[ISSUE-173][FOLLOWUP] The size of single buffer flush should reach rss.server.flush.cold.storage.threshold.size " by @jerqi in #179
  • [ISSUE-173][FOLLOWUP] The size of single buffer flush should reach rss.server.flush.cold.storage.threshold.size by @leixm in #180
  • [FOLLOWUP] Store app user in ShuffleTaskInfo by @smallzhongfeng in #181
  • [ISSUE-123] Fix all test code style by @macroguo-ghy in #185
  • [ISSUE-48][FEATURE][FOLLOW UP] Add RemoteShuffleService CRD by @wangao1236 in #175
  • [FOLLOWUP] Add the conf of rss.security.hadoop.krb5-conf.file by @zuston in #184
  • Fix flaky test about kerberos by @zuston in #191
  • [Improvement] Add optional environment variables by @izchen in #187
  • [MINOR] Fix some bad practices reported by spotbugs by @kaijchen in #177
  • [ISSUE-48][FEATURE][FOLLOW UP] Add webhook component by @wangao1236 in #188
  • [Log-Improvement] Log the newly registered app id by @zuston in #193
  • [MINOR] Replace HashSet with ImmutableSet in configs by @kaijchen in #195
  • [IMPROVEMENT] Introduce the enumType in ConfigOptions by @zuston in #199
  • [ISSUE-48][FEATURE][FOLLOW UP] Generate informer and lister for crd by @wangao1236 in #202
  • [ISSUE-144] Fix flaky test RssShuffleUtilsTest#testDestroyDirectByteBuffer by @LuciferYang in #203
  • [Issue-194][Feature] Support spark 3.2.0 by @leixm in #201
  • [ISSUE-186][Feature] Use I/O cost time to select storage paths by @smallzhongfeng in #192
  • [Improvement][AQE] Avoid calling getShuffleResult multiple times by @leixm in #190
  • Fix flaky test of heartbeatTimeoutTest by @zuston in #206
  • [IMPROVEMENT] Add more metrics about local storage info by @zuston in #205
  • [MINOR][IMPROVEMENT] Return index-file size of n*SEGMENT_SIZE in HDFS reading by @zuston in #204
  • Add DISCLAIMER by @jerqi in #212
  • [TEST] Improve SimpleClusterManagerTest by @kaijchen in #216
  • [Minor] Modify the format of DISCLAIMER by @jerqi in #217
  • Add NOTICE and DISCLAIMER files by @frankliee in #215

Credits

The release of Uniffle 0.6.0 would not have been possible without the Uniffle community. Thanks to all the contributors!