[AMDGPU] Push amdgpu-preload-kern-arg-prolog after livedebugvalues (llvm#126148)

This is effectively a workaround for a bug in livedebugvalues, but it also
seems to be a general improvement, since BB sections looks like it could
break the special 256-byte prelude scheme that
amdgpu-preload-kern-arg-prolog requires anyway. Moving the pass even later
doesn't seem to have any material impact, and just adds livedebugvalues
to the list of passes that no longer have to deal with pseudo
multiple-entry functions.

AMDGPU debug-info isn't supported upstream yet, so the bug being avoided
isn't testable here. I am posting the patch upstream to avoid an
unnecessary diff with AMD's fork.
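
As a rough illustration of the reordering described above, here is a minimal, self-contained sketch of a hook-based pipeline builder. It is not LLVM's TargetPassConfig: the names ToyPassConfig, ToyGCNPassConfig, and runsBefore are hypothetical, and the fixed generic sequence is an assumption that only loosely mirrors the order shown in llc-pipeline.ll. The point is that a pass registered from a later hook lands after the generic passes scheduled between the two hooks.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy stand-in for a codegen pipeline builder (hypothetical, not LLVM's API).
class ToyPassConfig {
public:
  virtual ~ToyPassConfig() = default;

  // Builds the pass list: target hooks are interleaved with a fixed generic
  // sequence. The generic pass names are copied from llc-pipeline.ll; the
  // exact sequence here is a simplification.
  std::vector<std::string> build() {
    Passes.clear();
    addPreEmitPass();                          // early target hook
    add("Live DEBUG_VALUE analysis");          // generic pass
    add("Machine Sanitizer Binary Metadata");  // generic pass
    addPostBBSections();                       // late target hook
    add("Stack Frame Layout Analysis");        // generic pass
    return Passes;
  }

protected:
  virtual void addPreEmitPass() {}
  virtual void addPostBBSections() {}
  void add(const std::string &Name) { Passes.push_back(Name); }

private:
  std::vector<std::string> Passes;
};

// Toy target configuration: the prolog pass is registered from the late hook,
// so it is scheduled after the generic passes above.
class ToyGCNPassConfig final : public ToyPassConfig {
protected:
  void addPreEmitPass() override { add("Branch relaxation pass"); }
  void addPostBBSections() override {
    add("AMDGPU Preload Kernel Arguments Prolog");
  }
};

// Returns true only if both passes are present and A is scheduled before B.
bool runsBefore(const std::vector<std::string> &Pipeline,
                const std::string &A, const std::string &B) {
  auto ItA = std::find(Pipeline.begin(), Pipeline.end(), A);
  auto ItB = std::find(Pipeline.begin(), Pipeline.end(), B);
  return ItA != Pipeline.end() && ItB != Pipeline.end() && ItA < ItB;
}
```

Calling build() on a ToyGCNPassConfig and passing the result to runsBefore with "Live DEBUG_VALUE analysis" and "AMDGPU Preload Kernel Arguments Prolog" checks the same ordering property that the updated llc-pipeline.ll expectations encode.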
slinder1 authored Feb 17, 2025
1 parent eaa460c commit 29ca3b8
Showing 2 changed files with 11 additions and 5 deletions.
6 changes: 6 additions & 0 deletions llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -1151,6 +1151,7 @@ class GCNPassConfig final : public AMDGPUPassConfig {
   void addPostRegAlloc() override;
   void addPreSched2() override;
   void addPreEmitPass() override;
+  void addPostBBSections() override;
 };

 } // end anonymous namespace
@@ -1690,6 +1691,11 @@ void GCNPassConfig::addPreEmitPass() {
   addPass(&AMDGPUInsertDelayAluID);

   addPass(&BranchRelaxationPassID);
+}
+
+void GCNPassConfig::addPostBBSections() {
+  // We run this later to avoid passes like livedebugvalues and BBSections
+  // having to deal with the apparent multi-entry functions we may generate.
   addPass(createAMDGPUPreloadKernArgPrologLegacyPass());
 }

10 changes: 5 additions & 5 deletions llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -145,11 +145,11 @@
 ; GCN-O0-NEXT: Post RA hazard recognizer
 ; GCN-O0-NEXT: AMDGPU Insert waits for SGPR read hazards
 ; GCN-O0-NEXT: Branch relaxation pass
-; GCN-O0-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O0-NEXT: Register Usage Information Collector Pass
 ; GCN-O0-NEXT: Remove Loads Into Fake Uses
 ; GCN-O0-NEXT: Live DEBUG_VALUE analysis
 ; GCN-O0-NEXT: Machine Sanitizer Binary Metadata
+; GCN-O0-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O0-NEXT: Lazy Machine Block Frequency Analysis
 ; GCN-O0-NEXT: Machine Optimization Remark Emitter
 ; GCN-O0-NEXT: Stack Frame Layout Analysis
@@ -430,11 +430,11 @@
 ; GCN-O1-NEXT: AMDGPU Insert waits for SGPR read hazards
 ; GCN-O1-NEXT: AMDGPU Insert Delay ALU
 ; GCN-O1-NEXT: Branch relaxation pass
-; GCN-O1-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O1-NEXT: Register Usage Information Collector Pass
 ; GCN-O1-NEXT: Remove Loads Into Fake Uses
 ; GCN-O1-NEXT: Live DEBUG_VALUE analysis
 ; GCN-O1-NEXT: Machine Sanitizer Binary Metadata
+; GCN-O1-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O1-NEXT: Lazy Machine Block Frequency Analysis
 ; GCN-O1-NEXT: Machine Optimization Remark Emitter
 ; GCN-O1-NEXT: Stack Frame Layout Analysis
@@ -743,11 +743,11 @@
 ; GCN-O1-OPTS-NEXT: AMDGPU Insert waits for SGPR read hazards
 ; GCN-O1-OPTS-NEXT: AMDGPU Insert Delay ALU
 ; GCN-O1-OPTS-NEXT: Branch relaxation pass
-; GCN-O1-OPTS-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O1-OPTS-NEXT: Register Usage Information Collector Pass
 ; GCN-O1-OPTS-NEXT: Remove Loads Into Fake Uses
 ; GCN-O1-OPTS-NEXT: Live DEBUG_VALUE analysis
 ; GCN-O1-OPTS-NEXT: Machine Sanitizer Binary Metadata
+; GCN-O1-OPTS-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O1-OPTS-NEXT: Lazy Machine Block Frequency Analysis
 ; GCN-O1-OPTS-NEXT: Machine Optimization Remark Emitter
 ; GCN-O1-OPTS-NEXT: Stack Frame Layout Analysis
@@ -1062,11 +1062,11 @@
 ; GCN-O2-NEXT: AMDGPU Insert waits for SGPR read hazards
 ; GCN-O2-NEXT: AMDGPU Insert Delay ALU
 ; GCN-O2-NEXT: Branch relaxation pass
-; GCN-O2-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O2-NEXT: Register Usage Information Collector Pass
 ; GCN-O2-NEXT: Remove Loads Into Fake Uses
 ; GCN-O2-NEXT: Live DEBUG_VALUE analysis
 ; GCN-O2-NEXT: Machine Sanitizer Binary Metadata
+; GCN-O2-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O2-NEXT: Lazy Machine Block Frequency Analysis
 ; GCN-O2-NEXT: Machine Optimization Remark Emitter
 ; GCN-O2-NEXT: Stack Frame Layout Analysis
@@ -1394,11 +1394,11 @@
 ; GCN-O3-NEXT: AMDGPU Insert waits for SGPR read hazards
 ; GCN-O3-NEXT: AMDGPU Insert Delay ALU
 ; GCN-O3-NEXT: Branch relaxation pass
-; GCN-O3-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O3-NEXT: Register Usage Information Collector Pass
 ; GCN-O3-NEXT: Remove Loads Into Fake Uses
 ; GCN-O3-NEXT: Live DEBUG_VALUE analysis
 ; GCN-O3-NEXT: Machine Sanitizer Binary Metadata
+; GCN-O3-NEXT: AMDGPU Preload Kernel Arguments Prolog
 ; GCN-O3-NEXT: Lazy Machine Block Frequency Analysis
 ; GCN-O3-NEXT: Machine Optimization Remark Emitter
 ; GCN-O3-NEXT: Stack Frame Layout Analysis
