gh-126835: make CFG optimizer skip over NOP's when looking for const sequence construction #129703

WolframAlph · 2025-02-05T20:58:48Z

We need to account for optimizations interfering with each other when moving more and more AST optimizations to CFG. Currently is_constant_sequence & (fold_tuple_on_constants & optimize_if_const_list_or_set that use it) do not take into account NOPs in between. This becomes problematic when there are several optimizations performed one after another as they NOP out unused instructions. Example:

(1, (1+2,),)[1][0]

This expression will trigger subscript & binop foldings. With current is_constant_sequence implementation, resulting instruction sequence cannot be optimized correctly. Here is final dis:

  1           RESUME                   0

  2           LOAD_SMALL_INT           1
              LOAD_CONST               1 ((3,))
              BUILD_TUPLE              2
              LOAD_SMALL_INT           1
              BINARY_SUBSCR
              LOAD_SMALL_INT           0
              BINARY_SUBSCR
              RETURN_VALUE

When correct sequence should be:

  1           RESUME                   0

  2           LOAD_SMALL_INT           3
              RETURN_VALUE

Here is basic block dump in a state when BUILD_TUPLE fails to be optimized by fold_tuple_on_constants:

As you can see, BUILD_TUPLE 2 has 2 consts before it but they are separated by NOPs due to previous binop folding. Obvious solution would be to remove NOPs on every iteration in optimize_basic_block which makes sense, but we cannot do that because we would be changing basic block size while iterating over it which breaks everything. So a different approach should be implemented. This PR presents one of the possible approaches to prepare us for the next optimizations migration to CFG. My main concern is complexity & readability as I think they suffer a bit from it. Another cleaner way I see is to use PyMem_Malloc to store indexes of corresponding instructions which seems cleaner but should be discussed.

cc @Eclips4 @tomasr8 @iritkatriel

Issue: Move const folding to the peephole optimizer #126835

Eclips4 · 2025-02-05T21:07:28Z

Good job. This problem also blocks #128802 where's NOP isn't taken into account in between.
I'll take a closer look tomorrow.

Python/flowgraph.c

iritkatriel · 2025-02-05T22:14:35Z

One thing to keep in mind about NOPs is that they are not necessarily going to be removed. When we replace an instruction by a NOP it retains the source location of the instruction we overwrote. If there is no other instruction with the same line, the NOP stays so that tracing/debugging sees that an instruction executed on that line. For example:

>>> def f():
...     1
...     2
...     3
...     return 4
...     
>>> dis.dis(f)
  1           RESUME                   0

  2           NOP

  3           NOP

  4           NOP

  5           LOAD_SMALL_INT           4
              RETURN_VALUE

I don't see a problem related to this here, just want you to be aware of this.

markshannon · 2025-02-06T13:39:49Z

I don't think adding complexity to the individual optimizations is the way to go.
All these optimizations can expose the opportunity for other optimizations, so ultimately we want to iterate to a fixed point.
We already have a pass to remove NOPs from a basic block, so we should incorporate that into the loop.

Take the constant 3 if (1 if True else 0, 1)[0 if 2 == 2 else 1] else 2 which evaluates to 3. To preform that evaluation in the CFG requires tuple folding, flow control simplification, comparison folding and subscript folding. All of which depend on each other to get the final result. The only way to fold this is iteratively.

iritkatriel · 2025-02-06T13:44:52Z

I don't think adding complexity to the individual optimizations is the way to go.
All these optimizations can expose the opportunity for other optimizations, so ultimately we want to iterate to a fixed point.
We already have a pass to remove NOPs from a basic block, so we should incorporate that into the loop.

The issue here is skipping NOPs that are not going to be removed (because source locations), but they are in the middle of a const sequence construction.

WolframAlph · 2025-02-06T13:50:56Z

@markshannon

3 if (1 if True else 0, 1)[0 if 2 == 2 else 1] else 2 which evaluates to 3

Current Python version does not fold it to constant 3. Constant folding is not done across basic block boundaries in cases where control flow present (like you posted) IIRC, and AST optimizations being moved to CFG operate on single blocks.

Constant folding is not done across basic block boundaries

Do you suggest this is something we will be adding?

markshannon · 2025-02-06T13:58:10Z

The example I give could be folded using AST transforms, even if it happens not to be.
So I think we should do the same in the CFG optimizer, which will need iteration.

@iritkatriel what's your take on this?

WolframAlph · 2025-02-06T14:04:37Z

But this particular logic does not interfere with potential control flow simplification. We just fold inside individual basic blocks. Sometimes multiple such foldings happen together and we need to take into account intermediate NOPs they leave so this is all this PR is about. Or am I missing something?

iritkatriel · 2025-02-06T14:10:31Z

The example I give could be folded using AST transforms, even if it happens not to be. So I think we should do the same in the CFG optimizer, which will need iteration.

@iritkatriel what's your take on this?

We have some of that in inline_small_or_no_lineno_blocks. We can put more steps into that loop.
But as I said, the issue here is NOPs that are not going to be removed, so fixed point won't help them.

Python/flowgraph.c

WolframAlph · 2025-02-08T11:00:18Z

Oh, turns out @iritkatriel merged a1417b2, so tests are failing because BINARY_SUBSCR does not exist anymore. I will rebase & update the PR.

WolframAlph · 2025-02-08T11:42:56Z

Done.

Lib/test/test_peepholer.py

Eclips4

LGTM. Thanks, Yan! Just to make sure there are no refleaks, I will run the test suite locally in huntrefleak mode.

Lib/test/test_peepholer.py

Co-authored-by: Irit Katriel <[email protected]>

Eclips4 · 2025-02-09T17:47:16Z

LGTM. Thanks, Yan! Just to make sure there are no refleaks, I will run the test suite locally in huntrefleak mode.

Results:

== Tests result: SUCCESS ==

22 tests skipped:
    test.test_asyncio.test_windows_events
    test.test_asyncio.test_windows_utils test_android test_apple
    test_devpoll test_free_threading test_idle test_kqueue
    test_launcher test_msvcrt test_smtpnet test_ssl test_startfile
    test_tcl test_tkinter test_ttk test_ttk_textonly test_turtle
    test_winapi test_winconsoleio test_winreg test_wmi

8 tests skipped (resource denied):
    test_curses test_peg_generator test_pyrepl test_socketserver
    test_urllib2net test_urllibnet test_winsound test_zipfile64

454 tests OK.

Total duration: 13 min 49 sec
Total tests: run=44,920 skipped=2,362
Total test files: run=476/484 skipped=22 resource_denied=8
Result: SUCCESS

WolframAlph requested review from markshannon and iritkatriel as code owners February 5, 2025 20:58

bedevere-app bot added the awaiting review label Feb 5, 2025

bedevere-app bot mentioned this pull request Feb 5, 2025

Move const folding to the peephole optimizer #126835

Open

Eclips4 added the skip news label Feb 5, 2025

Eclips4 reviewed Feb 5, 2025

View reviewed changes

Python/flowgraph.c Show resolved Hide resolved

iritkatriel reviewed Feb 5, 2025

View reviewed changes

Python/flowgraph.c Outdated Show resolved Hide resolved

Python/flowgraph.c Outdated Show resolved Hide resolved

python deleted a comment from raymondiiii Feb 6, 2025

iritkatriel reviewed Feb 6, 2025

View reviewed changes

Python/flowgraph.c Outdated Show resolved Hide resolved

Python/flowgraph.c Show resolved Hide resolved

Python/flowgraph.c Outdated Show resolved Hide resolved

Python/flowgraph.c Show resolved Hide resolved

Python/flowgraph.c Outdated Show resolved Hide resolved

WolframAlph requested a review from iritkatriel February 8, 2025 10:47

WolframAlph added 8 commits February 8, 2025 12:01

Make is_constant_sequence NOP aware

20e61ee

implement get_constant_sequence

8b6d294

disable debugging code

0845cd7

address review

d3768ac

address review

96ff172

add tests

ed7431d

disable debug code

f539224

update tests

5866faa

WolframAlph force-pushed the adapt-const-seq branch from 2fe4e7d to 5866faa Compare February 8, 2025 11:24

iritkatriel reviewed Feb 8, 2025

View reviewed changes

Lib/test/test_peepholer.py Show resolved Hide resolved

update tests

578d66b

WolframAlph requested a review from iritkatriel February 8, 2025 16:34

update tests

cd92614

iritkatriel approved these changes Feb 9, 2025

View reviewed changes

bedevere-app bot added awaiting merge and removed awaiting review labels Feb 9, 2025

Eclips4 approved these changes Feb 9, 2025

View reviewed changes

iritkatriel reviewed Feb 9, 2025

View reviewed changes

Lib/test/test_peepholer.py Outdated Show resolved Hide resolved

Lib/test/test_peepholer.py Outdated Show resolved Hide resolved

Update Lib/test/test_peepholer.py

1574122

iritkatriel changed the title ~~gh-126835: Redesign is_constant_sequence to take NOP's into account~~ gh-126835: make CFG optimizer skip over NOP's when looking for const sequence construction Feb 9, 2025

Update Lib/test/test_peepholer.py

16c66f9

Co-authored-by: Irit Katriel <[email protected]>

Eclips4 enabled auto-merge (squash) February 9, 2025 17:48

Eclips4 merged commit 91d9544 into python:main Feb 9, 2025
41 checks passed

bedevere-app bot removed the awaiting merge label Feb 9, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gh-126835: make CFG optimizer skip over NOP's when looking for const sequence construction #129703

gh-126835: make CFG optimizer skip over NOP's when looking for const sequence construction #129703

WolframAlph commented Feb 5, 2025 •

edited

Loading

Eclips4 commented Feb 5, 2025 •

edited

Loading

iritkatriel commented Feb 5, 2025

markshannon commented Feb 6, 2025

iritkatriel commented Feb 6, 2025

WolframAlph commented Feb 6, 2025 •

edited

Loading

markshannon commented Feb 6, 2025

WolframAlph commented Feb 6, 2025

iritkatriel commented Feb 6, 2025

WolframAlph commented Feb 8, 2025 •

edited

Loading

WolframAlph commented Feb 8, 2025

Eclips4 left a comment

Eclips4 commented Feb 9, 2025

gh-126835: make CFG optimizer skip over NOP's when looking for const sequence construction #129703

gh-126835: make CFG optimizer skip over NOP's when looking for const sequence construction #129703

Conversation

WolframAlph commented Feb 5, 2025 • edited Loading

Eclips4 commented Feb 5, 2025 • edited Loading

iritkatriel commented Feb 5, 2025

markshannon commented Feb 6, 2025

iritkatriel commented Feb 6, 2025

WolframAlph commented Feb 6, 2025 • edited Loading

markshannon commented Feb 6, 2025

WolframAlph commented Feb 6, 2025

iritkatriel commented Feb 6, 2025

WolframAlph commented Feb 8, 2025 • edited Loading

WolframAlph commented Feb 8, 2025

Eclips4 left a comment

Choose a reason for hiding this comment

Eclips4 commented Feb 9, 2025

WolframAlph commented Feb 5, 2025 •

edited

Loading

Eclips4 commented Feb 5, 2025 •

edited

Loading

WolframAlph commented Feb 6, 2025 •

edited

Loading

WolframAlph commented Feb 8, 2025 •

edited

Loading