Add zfs_recover_ms parameter #17094
Signed-off-by: Igor Ostapenko <[email protected]>
ac2b7cc to 963c816
ZFS_MODULE_PARAM(zfs, zfs_, recover_ms, INT, ZMOD_RW,
	"Set to attempt to recover from fatal errors during metaslab loading");
I am not sure it is really about metaslab loading, or even about metaslabs. Doesn't it apply to all spacemap operations, or even some non-spacemap ones? Also, as I mentioned during the call, I recall it happening more often not during metaslab loading (by which I guess you mean spacemap condensing, or whatever it is called), but when deleting snapshots and moving their lists of blocks to the parent or to free space.
Yes, currently it focuses on ms_allocatable only, but it does indeed cover all operations on it, not only loading.
zfs_panic_recover_ms("zfs: adding segment "
    "(offset=%llx size=%llx) overlapping with "
    "existing one (offset=%llx size=%llx)",
I wish we could say something about which range tree it was and what we were doing; otherwise a message that something in the pool overlapped somewhere does not help us with debugging. In case of a panic we should get a stack, but I wonder if we could get more, maybe by giving range trees types, names, etc. for debugging.
Thinking of a more generalized approach, I guess we could cover both comments with something like zfs_recover_range_tree_mask (instead of zfs_recover_ms), which would allow selecting up to 64 of the most interesting range tree classes: ms_allocatable, ms_freeing, ms_freed, and so on. I think 64 bits are enough to cover all the range tree use cases we currently have, even the ones not related to spacemaps. Each instance of the selected tree classes would take the warning path instead of the panic one. In addition, each range tree instance could carry a description giving its actual name, like ms_allocatable, and probably extra info such as the spa/vdev/metaslab id. Does this seem like one of the options we could consider implementing?
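For readers who want to see the shape of the mask proposal, here is a minimal standalone sketch. It is not code from the patch: the enum, the `toy_` names, and the bit assignments are all hypothetical, and only the idea of one bit per range tree class in a 64-bit mask comes from the comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical range tree classes; the bit positions are illustrative. */
typedef enum toy_rt_class {
	TOY_RT_MS_ALLOCATABLE = 0,
	TOY_RT_MS_FREEING,
	TOY_RT_MS_FREED,
	TOY_RT_CLASS_COUNT
} toy_rt_class_t;

/* Hypothetical stand-in for the suggested zfs_recover_range_tree_mask. */
static uint64_t toy_recover_range_tree_mask = 0;

/*
 * Returns true if trees of the given class should take the warning
 * path instead of the panic path.
 */
static bool
toy_rt_class_recovers(toy_rt_class_t cls)
{
	/* A 64-bit mask can describe at most 64 classes. */
	assert((unsigned)cls < 64);
	return (((toy_recover_range_tree_mask >> (unsigned)cls) & 1) != 0);
}
```

Setting the mask to `(1ULL << TOY_RT_MS_ALLOCATABLE)` would make only that class recoverable in this sketch, while every other class would still panic.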
I think what @amotin had in mind was just the one tunable, but with the log output mentioning which range tree is involved and as many details as possible, so we have something to track down when a user reports encountering the problem. Knowing which of the range trees it was would be a good first step, but it would also be nice to show the range that was being added/removed and the full details of the range it overlapped with (do we have a birth time available for each?), and in general to leave as many breadcrumbs as possible to chase this problem down.
Yes, it seems that a range tree "selector" using a mask is unnecessary extra complexity.
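The kind of log output asked for in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: `toy_rt_ctx_t`, its fields, and `toy_format_overlap()` are hypothetical names, and the real range tree may expose its identity differently. Only the wording of the overlap message follows the snippet quoted from the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-tree debugging context; not an OpenZFS structure. */
typedef struct toy_rt_ctx {
	const char *name;	/* e.g. "ms_allocatable" */
	uint64_t vdev_id;	/* assumed identifier of the owning vdev */
	uint64_t ms_id;		/* assumed identifier of the owning metaslab */
} toy_rt_ctx_t;

/*
 * Formats an overlap report that names the tree and both segments.
 * Returns the snprintf() result: the length the full message needs.
 */
static int
toy_format_overlap(char *buf, size_t len, const toy_rt_ctx_t *ctx,
    uint64_t new_off, uint64_t new_size, uint64_t old_off, uint64_t old_size)
{
	assert(buf != NULL && ctx != NULL && ctx->name != NULL);
	return (snprintf(buf, len,
	    "zfs: %s (vdev=%" PRIu64 " ms=%" PRIu64 "): adding segment "
	    "(offset=%" PRIx64 " size=%" PRIx64 ") overlapping with "
	    "existing one (offset=%" PRIx64 " size=%" PRIx64 ")",
	    ctx->name, ctx->vdev_id, ctx->ms_id,
	    new_off, new_size, old_off, old_size));
}
```

Carrying the name and owner ids in the tree itself means every warning or panic message can identify the tree without the caller passing extra arguments.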
Motivation and Context
There are production cases where loading a metaslab leads to a ZFS panic, presumably due to unexpected entries in its spacemap. The assertions in zfs_range_tree_add_impl() and zfs_range_tree_remove_impl() fail due to overlapping or missing segments, etc. A business would like to go ahead with such pools while the root cause is being investigated.
Description
The idea is to allow loading such metaslabs with a potential space leak as a trade-off instead of a potential data loss.
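The trade-off can be pictured with a small standalone model. This is a sketch under assumptions, not the patch itself: the array-backed `toy_tree_t`, the `toy_recover_ms` variable, and the choice to skip the offending segment and count it as leaked are all hypothetical simplifications of what the real range tree code might do.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the proposed zfs_recover_ms tunable. */
static int toy_recover_ms = 0;

/* Toy segment list; the real range tree is not a flat array. */
#define	TOY_MAX_SEGS	16
typedef struct toy_seg {
	uint64_t start;
	uint64_t size;
} toy_seg_t;

typedef struct toy_tree {
	toy_seg_t segs[TOY_MAX_SEGS];
	size_t count;
	uint64_t leaked;	/* bytes skipped on the recovery path */
} toy_tree_t;

static bool
toy_overlaps(const toy_seg_t *s, uint64_t start, uint64_t size)
{
	return (start < s->start + s->size && s->start < start + size);
}

/*
 * Adds a segment. On overlap it aborts unless toy_recover_ms is set,
 * in which case the segment is skipped (leaked) and -1 is returned.
 */
static int
toy_tree_add(toy_tree_t *t, uint64_t start, uint64_t size)
{
	assert(size != 0);
	for (size_t i = 0; i < t->count; i++) {
		if (!toy_overlaps(&t->segs[i], start, size))
			continue;
		fprintf(stderr, "toy: adding segment (offset=%" PRIx64
		    " size=%" PRIx64 ") overlapping with existing one "
		    "(offset=%" PRIx64 " size=%" PRIx64 ")\n",
		    start, size, t->segs[i].start, t->segs[i].size);
		if (!toy_recover_ms)
			abort();	/* models the panic path */
		t->leaked += size;	/* models the space leak */
		return (-1);
	}
	assert(t->count < TOY_MAX_SEGS);
	t->segs[t->count++] = (toy_seg_t){ start, size };
	return (0);
}
```

With the toy tunable set, an overlapping add leaves the tree unchanged and only grows the leak counter, which is the space-for-availability trade described above.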
We already have the zfs_recover module parameter to mitigate various issues, including some range tree cases, and this patch adds the zfs_recover_ms parameter to localize the recovery behavior to the metaslab loading process only.
The following diagrams are expected to help with the details: