Improved handling of enum columns in pk #412

tgriffitts-vs · 2019-09-01T18:46:16Z

https://jira.percona.com/browse/PT-1591
https://jira.percona.com/browse/PT-1572

These tickets describe a problem with the underlying NibbleIterator in the way it treats enum values in keys. In short, e.g., c enum('a', 'z', 'b'), the issue is that MySQL ORDER BY clause orders by enum index value ('a', 'z', 'b') and the NibbleIterator specifies boundaries as enum name value strings, e.g., WHERE c <= 'b', which are treated as alphanumeric comparisons-- not enum index value comparisons. In MySQL, this returns only 'a' because 'a' is alphanumerically <= 'b', but the ORDER BY returns 'z' rows before 'b' rows because ORDER BY is sorting by enum index values; thus, 'z' rows may be missed in the nibble. The solution provided in the tickets quoted above changes the ORDER BY clause to force conversion of the enum to a string, which in turn forces creation of a new sort index each nibble. On large tables, this is unreasonably slow (it would have taken 32 years to convert our table of 1.7billion rows).

The solution in this patch assures all references to enum columns are referenced by their enum index value. In MySQL, this is done by casting an enum column to an UNSIGNED; when an enum value is cast to an UNSIGNED, its enum index value is used. This allows the ORDER BY clause to naturally sort by the PK sort order using the enum index value, and also allows comparisons to operate correctly on enum index values, since they are comparing the index values as UNSIGNEDs instead of performing an alphanumeric comparison on the enum name values. This patch removes all references to enum columns by their name values. This enables the natural order of the key to be used in the ORDER BY, removing the forced conversion of enum values to strings, which in turn removes the need for MySQL to create a sort index for each nibble. The practical result of this patch reduces the duration for a schema update of our tables from years to hours, and I hope is a more desired fix for the issue recorded in the tickets mentioned above. In summary: reference enums by enum index values, always.

it-percona · 2019-09-01T18:46:23Z

All committers have signed the CLA.

svetasmirnova

Thank you for your contribution.

You modified the auto-generated code. It will be removed once we run the update-modules utility. Please put your changes into the TableNibbler package instead.

percona-csalguero · 2022-09-20T11:23:03Z

Hi,
As Sveta mentioned, these changes should be applied on NibbleIterator.pm and then they must be applied to all programs using that package with something like:

for t in bin/*; do util/update-modules $t NibbleIterator; done

percona-csalguero · 2022-09-20T11:23:55Z

Please also:

Fix merge conflicts
This should be merged into the 3.x branch, not in 3.0

percona-csalguero · 2023-02-22T14:04:51Z

@svetasmirnova Since this issue seems to be abandoned, can we close it?

svetasmirnova · 2023-02-22T16:37:07Z

@percona-csalguero let me review if changes are safe first. We can probably adjust the patch ourselves if it can help us to improve speed of handling ENUM indexes without loosing data.

Improved handling of enum columns in pk

781e5c7

rnovikovP requested a review from percona-csalguero September 2, 2019 09:21

svetasmirnova requested changes Sep 14, 2022

View reviewed changes

svetasmirnova changed the base branch from 3.0 to 3.x November 7, 2023 13:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improved handling of enum columns in pk #412

Improved handling of enum columns in pk #412

tgriffitts-vs commented Sep 1, 2019 •

edited

Loading

it-percona commented Sep 1, 2019 •

edited

Loading

svetasmirnova left a comment

percona-csalguero commented Sep 20, 2022

percona-csalguero commented Sep 20, 2022

percona-csalguero commented Feb 22, 2023

svetasmirnova commented Feb 22, 2023

Improved handling of enum columns in pk #412

Are you sure you want to change the base?

Improved handling of enum columns in pk #412

Conversation

tgriffitts-vs commented Sep 1, 2019 • edited Loading

it-percona commented Sep 1, 2019 • edited Loading

svetasmirnova left a comment

Choose a reason for hiding this comment

percona-csalguero commented Sep 20, 2022

percona-csalguero commented Sep 20, 2022

percona-csalguero commented Feb 22, 2023

svetasmirnova commented Feb 22, 2023

tgriffitts-vs commented Sep 1, 2019 •

edited

Loading

it-percona commented Sep 1, 2019 •

edited

Loading