Releases: real-logic/aeron
1.42.1
- [Java] Ensure that the LoggingErrorHandler instance is closed when wrapping a user-defined ErrorHandler.
- [Java] Invoke background work when idle and service is active so external connections from a cluster can be maintained.
- [Doc] JavaDoc clarifications on MediaDriver dirDeleteOnStart behavior.
- Upgrade to Agrona 1.19.2.
- Upgrade to ByteBuddy 1.14.7.
1.42.0
- Add port manager to C and Java drivers to allow for a limited range of ports that can be used when subscription endpoint addresses and publication control addresses specify a port of 0. This can make network access configuration simpler (Driver).
- Wait for end of stream event on the log instead of the counter becoming unavailable before entering into a new Election (Cluster).
- Close log publications before trying to stop recording to speed up graceful leader step down (Cluster).
- Ignore resting subscription positions when calculating join position or draining driver entities when tethering is being used (C Driver).
- Extension points to support Standby Snapshots (Cluster).
- Ensure that the clusterDirectoryName field is synced with the clusterDir configuration parameter during conclude in ConsensusModule and ClusterBackup (Cluster).
- Cluster tutorial updates (Documentation).
- Extract invocation of background work in the service container.
- Do not fail if Aeron directory exists, but the CnC file does not (C Driver).
- Use aeron_errcode to get the latest error code when reporting problems creating the Aeron directory (C Driver).
- Don't let a bounded replay go beyond counter value to stop position so commit position can be used for replay (Archive).
- Add additional example Authentication and Authorisation services (Archive/Cluster).
- Add additional documentation to specify details when Authorisation is used (Archive/Cluster).
- Clarify logic about extending a replay while being bounded (Archive).
- Changes to host name resolution on start up (Driver).
- Perform host name resolution once and track its execution time via the duty cycle tracker and the event log.
- Use <unresolved> as a host name if it cannot be resolved instead of null.
- Fix GCC 9 warning
- Await initial counters being updated by the conductor thread before reading the values.
- Trigger graceful leader close election based on recording stop signal (Archive).
- Add end of stream position to Subscriptions (Client).
- Allow for relocatable mark files for the ConsensusModule, ClusteredServiceContainer, and ClusterBackup (Cluster).
- Move close operations on MDS transports on the conductor not the receiver thread (C Driver).
- Allow preparing for new leadership to be less serial to reduce election time (Cluster).
- Async removal of destinations as log adaptor closes when preparing for election (Cluster).
- Fix CMD_OUT_ERROR log format (C Driver).
- Fix aeron_err_clear() (C).
- C driver handle untethered subscriptions correctly, ensure that tethering behaviour matches Java driver (C Driver).
- Fix name in toString (Java Client).
- Improve error message (C++ Archive Client).
- Clean up Apple M1 compilation. Move loss reporter function into compilation unit. Add pragmas to account for unused but set variables (C).
- Add logging to track information when a ReplicationSession ends (Archive).
- Add deduplication step for aeron-agent jar.
- Add deduplication step for aeron-all jar.
- Add an annotation processor to generate a version class for a number of the Aeron packages.
- Include the git sha in the version C binaries.
- Extract untetheredWindowLimit method.
- Upgrade to SBE 1.29.0.
- Upgrade to JUnit 5.10.0.
- Upgrade to Agrona 1.19.1.
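
The port manager entry above describes choosing a port from a limited range when an address specifies port 0. A minimal sketch of that idea follows. It is an illustration only, not Aeron's implementation: the class PortRangeAllocator and its methods are hypothetical names, and it assumes explicitly requested ports pass through untouched and are not tracked.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical illustration of a wildcard port range; not part of Aeron.
 * A requested port of 0 is replaced by the next free port in [low, high].
 */
final class PortRangeAllocator
{
    private final int low;
    private final int high;
    private final Set<Integer> inUse = new HashSet<>();
    private int next;

    PortRangeAllocator(final int low, final int high)
    {
        if (low < 1 || high > 65535 || low > high)
        {
            throw new IllegalArgumentException("invalid port range: " + low + "-" + high);
        }

        this.low = low;
        this.high = high;
        this.next = low;
    }

    /** Returns the requested port unchanged unless it is 0, in which case a free port from the range is chosen. */
    int allocate(final int requestedPort)
    {
        if (0 != requestedPort)
        {
            return requestedPort; // explicit ports are not managed in this sketch
        }

        final int size = high - low + 1;
        for (int i = 0; i < size; i++)
        {
            final int candidate = next;
            next = (next == high) ? low : next + 1; // wrap around at the top of the range

            if (inUse.add(candidate))
            {
                return candidate;
            }
        }

        throw new IllegalStateException("no free ports in range " + low + "-" + high);
    }

    /** Returns a previously allocated port to the pool. */
    void free(final int port)
    {
        inUse.remove(port);
    }
}
```

With a range this narrow, a firewall rule only needs to open the configured ports rather than the whole ephemeral range, which is the simplification the release note refers to.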
1.41.4
- Store Aeron.NULL_VALUE for lastActivityTimestamp in the snapshot for sessions as the value should be transient (Cluster).
- On snapshot of PendingServiceMessageTracker, check to see if the buffer is empty and the nextServiceSessionId is lower than what it should be. Correct this and log an error that follower may not have executed the service IPC logic deterministically (Cluster).
- Include the recording position in the debug log when replicate session state changes (Archive).
- Don't cool down an Image that is EOS (Driver)
- Reset eos_flagged when adding a receiver (C Driver)
- Filter images by session id when adding new network subscriptions (C Driver)
- Allow the session-id of the publication used during replication to be specified to avoid session-id clashes on replication retries (Archive).
- Allow the authorisation credentials to be specified on a replication request to support simple authentication between the destination and source archives (Archive).
- Use custom session-ids when replicating snapshots and logs (Cluster).
- SBE 1.28.3
- Agrona 1.18.2
- ByteBuddy 1.14.5
- Version 0.47.0.
1.41.3
- [Java] Add debug method for a replication completing within Cluster.
- [Java] Let Archive take care of handling Aeron exceptions when client is embedded.
- [C] Prevent double free of the aeron_udp_channel_t when creating a send_channel_endpoint_t.
- [C] Upgrade to HDR Histogram 0.11.8
- [Java] Upgrade to JUnit 5.9.3 (platform console 1.9.3)
- [Java/C] Name resolution execution time tracking
- [Java] Simplify Client Close
- [Java] Perform Runtime.exit(int) on another thread in default error handler to avoid deadlocks if Aeron instances are used on JVM shutdown hooks.
- [C] Fix potential memory leak. Use correct action with publication notification.
- [Java] Ensure segment file write of final byte for extension is successful
1.39.1
1.41.2
- [Java] Fixed ClusterBackup backwards compatibility issue caused by the missing memberId field.
- [Java] Set isLeader flag on the list of cluster members passed to ClusterBackupEventsListener.
- [C] Prevent negative shifts when receiving some messages from ATS.
- Upgrade to ByteBuddy 1.14.4.
- Upgrade to Gradle 8.1.
- Upgrade to Shadow 8.1.1.
1.41.1
- Upgrade to Agrona 1.18.1.
- Upgrade to SBE 1.28.2.
1.41.0
- Allow NameResolver to be configured for the ConsensusModule in order to support custom name resolution when configuring the ingress channel.
- Delay election state transitions if there is an active leader to avoid unnecessary reset and new election.
- Make AeronCluster.asyncConnect work completely asynchronously. Don't report exceptions to the error handler that are used for async resources.
- Add a system property and API to allow changing a directory where an Archive mark file (archive-mark.dat) is stored.
- Check the state of the interface when trying to resolve the multicast interface. Only use interfaces that are up. Issue #1387
- CnC file length validation. Issue #1410
- Fix issue of not capturing return code when recording signal arrives after an error to the archive client.
- Support migrating segments to the beginning or end of an existing archive recording.
- [C] Fix issue of using transport after it had been removed.
- [Java] Fix concurrent close of receive destination counters on multi-destination subscriptions.
- [C] Fix remove_if methods on pointer value maps which previously could miss an item.
- Add debug logging for clustered service acking.
- Add a specific error for archive replication failing to create a remote connection.
- Fix leak with Archive replay session if the async publication has a session clash.
- Shorten duration of cluster election after a leader has closed gracefully.
- [C] Fix image rejoin by swapping correcting cooldown map insertion and removal. PR #1338
- Candidate ballot for 5+ node cluster cannot be cut short on quorum otherwise most up to date member may not be elected.
- [C] Allow for attempted recreation of an Image if initial attempt fails. PR #1435
- Perform most replay validations before sending OK to the client so errors are synchronous when starting a replay.
- Delete all recording segment files when a recording is truncated to its start position.
- Close ArchiveMarkFile last when shutting down Archive to capture all errors.
- [C++] Apply std::forward to fragment handler to avoid unnecessary copy. PR #1405
- Fix handling of padding greater than max message length in Archive replay.
- Add debug logging for Archive recording signals.
- Close log subscription first when clustered service is cleanly closed to drop follower out of flow control as soon as possible.
- Drop cluster follower as soon as possible out of flow control to allow cluster to progress when follower is cleanly closed.
- [C] Report timeout accurately when driver keepalive beyond timeout. PR #1429
- Add ability to run Archive with only IPC control channels for clients.
- Add ClusterTool.isLeader method.
- Add Image to Subscription before calling available handler rather than after.
- Set URI in receiver counters to match subscription channel.
- Add cluster member node state file and migrate out state that needs to be persistent, such as candidateTermId and member list, so the mark file can be in /dev/shm.
- [C] Fix issue with removing naming resolver neighbor that deleted adjacent memory.
- [C] Improve socket error handling on Windows.
- [Java] Add toString() to many Aeron classes to help debugging.
- [C] Improve parsing of unsigned 32-bit integers.
- [C] Set max of resource free queue length and resource free limit to INT32_MAX. This stops them being incorrectly set to 0 by aeron_config_parse_uint32 when comparing against int32 0. PR #1421
- Deprecate cluster dynamic join feature. This is to be replaced with a more robust and user friendly premium offering.
- [C] Fix counter leak when subscription fails.
- [C] Fix spy channel memory leak when destination is removed for multi-destination subscription.
- [C] Fix channel memory leak on error when creating publications or subscriptions.
- Fix NPE on timeout exception for cluster client in some connect states.
- [Java] Improve efficiency of URI parsing.
- [C] Fix error messages with incorrect varargs.
- Warnings clean up in codebase to have less noisy CodeQL analysis.
- Support having mark files for Archive, ConsensusModule, and ClusteredServiceContainer in an alternative directory such as /dev/shm so timeouts can be avoided when recording writes queue up on a network filesystem.
- Add timestamp params to stripped channel for pass through to Archive operations.
- Queue resource freeing operations in driver to avoid timeouts when unmapping operations are slow.
- [C++] Work around compiler concurrency bug for AtomicArrayUpdater that can impact client Subscriptions causing image list to become corrupted.
- Improve javadoc for recording signal usage.
- Be strict on handling cluster leader liveness to the current leadership term.
- Only try unblocking a client command after liveness timeout to avoid "lost" commands. PR #1369
- Make archive counters unique so multiple archives can run on the same media driver.
- Truncate files after ArchiveTool.compact is invoked to free disk space.
- Fix basic auction cluster tutorial configuration.
- Improve ClusterConfig sample to allow for ingress configuration.
- Add counters for the number of active recordings or replays in an Archive.
- Add counters for reporting on read and write operations in an Archive.
- Support allowing a ClusteredService being started before the ConsensusModule.
- Improve false sharing protections for more consistent latency.
- Simplify ReplayMerge samples to not require entity tags.
- Add batch script for launching low-latency media driver on Windows.
- Support message lengths greater than MTU in ping pong samples.
- Fix options handling in cping sample.
- Improve handling of timeouts in cluster elections for more robust state transitions when network is unstable. Effects are more pronounced in 5+ member clusters.
- [Java] Add Aeron.asyncAddSubscription for non-blocking setup.
- Compute source identity of images more precisely based on channel configuration.
- Improved handling of out of disk space errors.
- Support taking a cluster consensus module snapshot when member names are greater than MTU in length.
- Allow a follower to veto a member being elected cluster leader if they believe the leader is not valid. This is important in 5+ node clusters.
- Extend debugging for voting in cluster elections.
- Increment error counter when invalid version exceptions occur.
- Handle backpressure from commands between dedicated threads in driver with controlled polls to avoid live locks.
- [C] Add support for controlled poll operations on SPSC and MPSC ring buffers.
- Increase command queues to allow for more concurrent active changes in publications and images.
- Serve cluster backup queries from followers to take load from the leader.
- [C] Fix build when dot is used as thousands separator. PR #1372
- Upgrade to JUnit 5.9.2.
- Upgrade to BND 6.4.0.
- Upgrade to ByteBuddy 1.14.3.
- Upgrade to Mockito 4.11.0.
- Upgrade to Version 0.46.0.
- Upgrade to Gradle 7.6.
- Upgrade to SBE 1.28.1.
- Upgrade to Agrona 1.18.0.
1.40.0
- Memory align allocated buffers in PublicationTest so it works on Apple M1 processors.
- Check that NoOpLock is only allowed to be used when using Aeron client in invoker mode.
- Handle case of a delayed concurrent offer to a publication in which other threads have raced terms ahead without throwing an exception.
- Collapse term appenders into publications to reduce memory footprint and avoid data dependent loads.
- Short circuit Image polling operation when bound limit is less than current position to prevent term overrun.
- Add different aliases for consensus module/service container subscriptions. PR #1366.
- Stop an active cluster log replay when ClusterBackup is closed rather than waiting for timeout.
- Send unavailable counter events to Aeron clients when a client closes or times out.
- Allow Consensus Module Agent to be run via an Invoker in addition to having its own thread.
- Apply liveness checks to Archive and Cluster mark files so that multiple instances cannot be run in the same directory and corrupt files.
- [Java] Use fixed format for timestamps in agent debug logs.
- Allow Archive replicate to overwrite all metadata for an empty recording.
- [C] Handle log buffer files with term_length == AERON_LOGBUFFER_TERM_MAX_LENGTH on Windows. PR #1360.
- [C] Fix inclusion of symbols for debug builds on Windows.
- Remove localhost defaults for Archive and Cluster to help avoid mis-configuration in production. PR #1356.
- Await 'REPLICATE_END' when catching up as a follower across multiple leadership terms to avoid clashing session-id.
- Allow setting of receive socket buffer and window on cluster log channel subscribers. PR #1345.
- Fix application of send socket buffer lengths as configured when using MDC.
- Fix ArchiveTool.dump when fragment length is set <= 0.
- Capture closing sessions into snapshot so session close event is not lost on cluster shutdown.
- Remove brackets from counters labels to make it easier to extract to Prometheus.
- Send cluster client session open acknowledgement before appending to the log to avoid race with service sending egress on open event. Issue #1351.
- [C] Fix off by one error local socket address into channel indicator counter.
- Add protocol version support to cluster consensus protocol.
- Add more context to error messages on Archive ReplaySession. PR #1349.
- Apply strict validation of consensus module snapshot state when messages are offered from clustered services. A number of customers have not been strict with all cluster nodes being deterministic and doing exactly the same thing which can result in corrupted and diverged snapshots.
- Consensus module state snapshot can be inspected with the describe-latest-cm-snapshot option to ClusterTool.
- If a consensus module snapshot is shown to be corrupt it may be fixed by running ConsensusModuleSnapshotPendingServiceMessagesPatch and if non-support customers wish to have help then they can contact [email protected]. The patch can fix the leader and the fixed snapshot then needs to be replicated to the followers which can be done with AeronArchive.replicate using the correct recording ids.
- Add a tool to replicate a specific recording between archives. PR #1363.
- [C++] Use getAsString calls for pollers for record descriptors for channel fields. Add test from PR #1348.
- Add ClusteredService.doBackgroundWork which can be used for maintaining external connections beyond ingress and egress.
- Increase default message timeout from 5 to 10 seconds for Archive clients.
- Add EOS flag to status messages (SMs) once a stream is totally received so the sender can take clean up action.
- When EOS status message is received by a sender then allow the publication linger on unicast to be cut short so resources are reclaimed sooner.
- When EOS status message is received by a sender then remove the receiver from flow control for multicast and MDC with tagged and min FC.
- Fix the closing of session specific subscriptions to prevent resource leak.
- Add scripts for testing raw network performance on Windows.
- Close egress from cluster on change of leader so clients can detect it before a new leader is elected.
- Don't timeout and close cluster client session if quorum cannot be temporarily reached.
- Add logging support for ClusterBackup state changes.
- Close cluster clients when complete cluster is restarted.
- Support automatic reconnect from cluster client when the same leader is re-elected after a net split or temporarily losing quorum.
- Add authentication for ClusterBackup to a cluster.
- Validate Archive mark file length before reading when mapped read-only to avoid access violations.
- Preserve iteration order for cluster client session based on session id so snapshots can have binary compatibility.
- Capture leadership term id for cluster backup queries.
- Account for padding when sweeping pending services messages to avoid out of bounds exception.
- Prevent -1 leadership term ids appearing in the RecordingLog.
- Allow Archive replication and replay request to specify session level file IO max buffer length for throttling a stream.
- Add support for custom app version validation to clustered services with AppVersionValidator.
- Add false sharing protection to DutyCycleTracker.
- Update doc on ReplayMerge to indicate the AeronArchive client should not be shared. Issue #1340.
- Upgrade to Versions 0.43.0.
- Upgrade to Mockito 4.8.1.
- Upgrade to Google Test 1.12.1.
- Upgrade to JUnit 5.9.1.
- Upgrade to ByteBuddy 1.12.18.
- Upgrade to Gradle 7.5.1.
- Upgrade to SBE 1.27.0.
- Upgrade to Agrona 1.17.1.
Java binaries can be found here.
1.39.0
- [Java] Fix IllegalStateException that could exist for an MDS subscription on the rapid recycling of ReplayMerge operations.
- [C] Align ring buffer implementations and feature set with Java.
- [Java] Make sure that C and Java are aligned on resend window. Re-instate the max message length being accounted in the bottom of the resend window for Java.
- Add duty cycle duration tracking to all agents across all modules.
- [C++] Improve efficiency by reducing the number of copy operations for fragment assembly when a stream has many fragmented messages.
- [C] Default to CLOCK_REALTIME for send/receive timestamps.
- [Java] Add setters for send/receive timestamp clocks to the MediaDriver.Context.
- Fix handling of fragment assemble when reliable=false is set for a channel and loss occurs.
- Improve handling of short sends on MDC publication to backoff from overloading a socket.
- Add round-robin facility to MDC publication for increased fairness.
- [Java] Publish aeron-test-support package as a JAR.
- [Java] Downgrade "unknown replay" errors to warnings for cluster catchup.
- [Java] Add appVersion to event logging for consensus module and check for correct app version when replaying log.
- [Java] Prevent timeout warnings with cluster dynamic nodes and log replication.
- [Java] Add cluster dynamic join state change logging events.
- Add counters for the number of receivers in min and tagged flow control strategies.
- [Java] Avoid race unmapping buffers on concurrent close of media drivers.
- Modify flow control strategies to have new method for when elicited setups are sent and add counters manager to init methods. Modify Min and Tagged flow control to use setup snd-lmt as min position until timeout or receiver added on SM.
- [Java] Account for possible padding in log buffer when checking for bottom resend window for retransmits.
- [C] Flush output when printing configuration.
- [C] Raise warning on failure to setup media timestamping.
- [Java] Update recordingId on any signal with a valid recording id when handling signals for snapshot replication.
- [Java] When attempting ClientSession.tryClaim, ensure that there is enough buffer space when returning a mocked offer for a follower.
- [C] Ensure publication image is released before it is freed.
- [C] Fix scanf that could result in buffer overflow when parsing HTTP for configuration.
- [Java] Change default cluster session timeout from 5 to 10 seconds.
- Prevent receiver joining min/tagged flow control if they are more than a window behind.
- [C] Add sample for working with large messages.
- [Java] Add logging event for appending a cluster session close.
- Upgrade to BND 6.3.1.
- Upgrade to Mockito 4.6.1.
- Upgrade to ByteBuddy 1.12.10.
- Upgrade to SBE 1.26.0.
- Upgrade to Agrona 1.16.0.
Java binaries can be found here.