
Understanding the tracking output

Kevin Brubeck Unhammer edited this page Nov 3, 2015 · 1 revision

Bedup runs in two stages: first a scan, then the actual deduplication.

The scan will show output like

Scanning volume /mnt/disk/snapshots/a_snapshot generations from 77367 to 87609, with size cutoff 8388608
03:50:33 Scanned 874737 retained 0

for every volume involved. Because bedup has been run on this volume before, it can skip some generations. The size cutoff means ????. The "retained" means ?????.
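To pull the numbers out of the scan header line shown above, a small parser along these lines may help. This is only a sketch: the regular expression is derived from that single sample line, `parse_scan_line` is a hypothetical helper (not part of bedup), and the conversion to MiB assumes the cutoff is given in bytes, which this page does not confirm.

```python
import re

# Pattern derived from the one sample line on this page; other bedup
# versions may word the message differently (assumption).
SCAN_RE = re.compile(
    r"Scanning volume (?P<volume>.+) generations from (?P<start>\d+) "
    r"to (?P<end>\d+), with size cutoff (?P<cutoff>\d+)"
)


def parse_scan_line(line):
    """Return the fields of a scan header line, or None if it does not match."""
    m = SCAN_RE.search(line)
    if m is None:
        return None
    return {
        "volume": m.group("volume"),
        "first_generation": int(m.group("start")),
        "last_generation": int(m.group("end")),
        "size_cutoff": int(m.group("cutoff")),
    }


sample = (
    "Scanning volume /mnt/disk/snapshots/a_snapshot generations "
    "from 77367 to 87609, with size cutoff 8388608"
)
info = parse_scan_line(sample)
print(info["volume"])
# Number of generations between the two bounds in the sample line.
print(info["last_generation"] - info["first_generation"])
# Cutoff in MiB, assuming the raw value is in bytes.
print(info["size_cutoff"] / 2**20)
```

For the sample line, the generation span is 10242 and the cutoff works out to 8.0 under the bytes assumption.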

The deduplication will show output like

Deduplicated: 
- '/mnt/disk/snapshots/a_snapshot/f/a'
- '/mnt/disk/snapshots/another_snapshot/g/a'

which means just what it says, followed by a line of counters at the bottom, e.g.

19:22:47 Size group 2878/40715 (28745388) sampled 228328 hashed 36079 freed 10059170835

Here there are 40715 size groups to go through, and bedup is currently checking group number 2878, which contains all the files of size 28745388. So far it has sampled 228328 files (to check whether they are worth hashing) and hashed 36079 files (reading them in full). If two files in the same size group have the same hash, bedup does a complete comparison, since there may be a hash collision, and deduplicates only if that comparison passes. At this point bedup has freed 10059170835 bytes (about 10 GB).
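The size group, sample, hash, and full-comparison steps described above can be sketched roughly as follows. This is an illustration of the general idea, not bedup's actual code: the function names, the 4096-byte sample taken from the start of each file, and the use of SHA-1 are all assumptions made for the example, and the sketch only reports duplicate candidates rather than deduplicating anything on disk.

```python
import filecmp
import hashlib
import os
from collections import defaultdict

SAMPLE_BYTES = 4096  # hypothetical sample size, not bedup's real value


def read_sample(path):
    """Cheap check: read only the first few bytes of the file."""
    with open(path, "rb") as f:
        return f.read(SAMPLE_BYTES)


def full_hash(path):
    """Expensive check: read the whole file and hash it."""
    h = hashlib.sha1()  # hash choice is an assumption for this sketch
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.digest()


def group_by(paths, key):
    """Group paths by key(path), keeping only groups with 2+ members."""
    groups = defaultdict(list)
    for p in paths:
        groups[key(p)].append(p)
    return [g for g in groups.values() if len(g) > 1]


def find_duplicates(paths):
    """Return lists of paths whose contents are byte-for-byte identical."""
    duplicates = []
    for size_group in group_by(paths, os.path.getsize):
        for sample_group in group_by(size_group, read_sample):
            for hash_group in group_by(sample_group, full_hash):
                first = hash_group[0]
                # Equal hashes could still be a collision, so compare fully.
                same = [first] + [
                    p for p in hash_group[1:]
                    if filecmp.cmp(first, p, shallow=False)
                ]
                if len(same) > 1:
                    duplicates.append(same)
    return duplicates
```

Each stage is cheaper than the next, so most files are ruled out before they are ever read in full; that is why the "sampled" counter in the output above is much larger than the "hashed" one.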
