-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Accumulation of gigabytes of .tmp files after daily rollover #921
Comments
While testing the above, I used a very simple config: <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>logs/diagnostic.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
<fileNamePattern>logs/diagnostic-%d{yyyy-MM-dd'T'HHmmss, UTC}.%i.log.gz</fileNamePattern> <!-- rollover every second -->
<maxFileSize>1KB</maxFileSize>
<maxHistory>10</maxHistory>
<totalSizeCap>10MB</totalSizeCap>
</rollingPolicy>
<encoder>
<pattern>%date{"yyyy-MM-dd'T'HH:mm:ss.SSSXXX", UTC} %-5level [%thread] %logger{36}.%M\(%line\) - %msg%n</pattern>
</encoder>
</appender> I then used this config in a program like for (int i = 0; i < 1_000_000; i++) {
LOG.info("log message here - {}", UUID.randomUUID());
Thread.sleep(5);
} This would occasionally (i.e. a few times in 100 rollovers) leave
It seems this is related to running the Compressor asynchronously, i.e. by the time the Compressor starts, the log file has already rolled over and created a new file. |
As of version 1.5.8, Which version of logback-classic are you using in the above test? In addition, I would also recommend leaving the |
Thanks for your comment. |
Description
We are logging on software running on devices with intermittent connectivity.
We are using
SizeAndTimeBasedRollingPolicy
to roll over the log file either daily, or when it gets too big.Recently we noticed some devices with completely full storage. This was caused by the presence of thousands of
.tmp
files in the logging directory, consuming many gigabytes of space.Steps to Reproduce
The devices occasionally restart after a few days, which also causes their system clock to reset. The limited connectivity means they cannot sync their time correctly with NTP. This causes logs to be written out with the same date each time as last time.
This does not immediately cause a problem, because when the app restarts
SizeAndTimeBasedFileNamingAndTriggeringPolicy::computeCurrentPeriodsHighestCounterValue()
finds the latest counter value and continues logging from there.However, when the time-based rollover occurs at midnight, then internally the counter is just set back to 0. But because the device has restarted many times before, logs already exist for this new day.
Then, when the next size-based rollover occurs, the main log file (
app.log
) is renamed toapp.0.log
+<nanoTime>
+.tmp
. This file is sent for compression, with a target name ofapp.0.log.gz
. Normally, the compressor will delete this.tmp
file on success. However, the target file already exists (having been written before at some point in the past). Its existence is checked for inCompressor::gzCompress()
, where a warning is logged and the method returns immediately. This means the '.tmp' file is never removed.The logic there is understandable, as the Compressor should probably not delete the file upon error.
It is probably the responsibility of the
TimeBasedRollingPolicy
class to ensure its temp file is removed.Expected Behavior
Firstly, erratic clocks could be compensated for by checking what the next counter should be on date rollover, rather than assuming it is 0.
The expected behaviour if the target file already exists during compression is harder to define. As compression occurs asynchronously, the
.tmp
file can no longer be renamed back toapp.log
as this probably already exists with new entries. Maybe they could be compressed to the target name plus a suffix?Leaving the
.tmp
files to accumulate indefinitely is certainly not the expected behavior. They could periodically be cleaned up instead perhaps.Bonus problem
When the devices' connectivity is eventually restored, the time will synchronise and be persisted to a time point a few months in the future. However, then the reset-rollover cycle continues again.
This time, any old log files from the previous month are also kept indefinitely. This seems to be because
TimeBasedArchiveRemover
has a constantINACTIVITY_TOLERANCE_IN_MILLIS
of 32 days. Files much older than this do not get considered for removal, despite the<maxHistory>100</maxHistory>
in our config. Our expectation was that we would have a maximum of 100 files, whatever happens.Logs and Configuration
Environment
Logback Version: seen in 1.5.6 but likely present in current (v1.5.16) too
Java Version: 17
Operating System: Ubuntu Core 22.04
Related
Other issues with .tmp files are:
The text was updated successfully, but these errors were encountered: