
[Loads] Support dereferenceable assumption with variable size. #128436

Open · fhahn wants to merge 3 commits into main from loads-deref-variable-size

Conversation

@fhahn (Contributor) commented Feb 23, 2025

Update isDereferenceableAndAlignedPointer to make use of dereferenceable assumptions with variable sizes via SCEV.

To do so, factor out the logic to check via an assumption to a helper, and use SE to check if the access size is less than the dereferenceable size.
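
For reference, a variable-size dereferenceable assumption is expressed as an assume operand bundle whose size operand is an IR value rather than a constant. A minimal illustrative example (function and value names are made up; the bundle syntax is the documented one):

```llvm
declare void @llvm.assume(i1)

define i32 @example(ptr %p, i64 %n) {
entry:
  ; %p is assumed to be dereferenceable for %n bytes (a variable size)
  ; and aligned to 4 bytes.
  call void @llvm.assume(i1 true) [ "align"(ptr %p, i64 4),
                                    "dereferenceable"(ptr %p, i64 %n) ]
  %v = load i32, ptr %p, align 4
  ret i32 %v
}
```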

@llvmbot (Member) commented Feb 23, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Author: Florian Hahn (fhahn)

Full diff: https://github.com/llvm/llvm-project/pull/128436.diff

4 Files Affected:

  • (modified) llvm/include/llvm/Analysis/AssumeBundleQueries.h (+1)
  • (modified) llvm/lib/Analysis/AssumeBundleQueries.cpp (+1)
  • (modified) llvm/lib/Analysis/Loads.cpp (+75-28)
  • (modified) llvm/test/Transforms/LoopVectorize/dereferenceable-info-from-assumption-variable-size.ll (+42-8)
diff --git a/llvm/include/llvm/Analysis/AssumeBundleQueries.h b/llvm/include/llvm/Analysis/AssumeBundleQueries.h
index f7a893708758c..8577fc72ecd0f 100644
--- a/llvm/include/llvm/Analysis/AssumeBundleQueries.h
+++ b/llvm/include/llvm/Analysis/AssumeBundleQueries.h
@@ -99,6 +99,7 @@ void fillMapFromAssume(AssumeInst &Assume, RetainedKnowledgeMap &Result);
 struct RetainedKnowledge {
   Attribute::AttrKind AttrKind = Attribute::None;
   uint64_t ArgValue = 0;
+  Value *IRArgValue = nullptr;
   Value *WasOn = nullptr;
   bool operator==(RetainedKnowledge Other) const {
     return AttrKind == Other.AttrKind && WasOn == Other.WasOn &&
diff --git a/llvm/lib/Analysis/AssumeBundleQueries.cpp b/llvm/lib/Analysis/AssumeBundleQueries.cpp
index c27bfa6f3cc2c..7366fabca3eeb 100644
--- a/llvm/lib/Analysis/AssumeBundleQueries.cpp
+++ b/llvm/lib/Analysis/AssumeBundleQueries.cpp
@@ -114,6 +114,7 @@ llvm::getKnowledgeFromBundle(AssumeInst &Assume,
   };
   if (BOI.End - BOI.Begin > ABA_Argument)
     Result.ArgValue = GetArgOr1(0);
+  Result.IRArgValue = getValueFromBundleOpInfo(Assume, BOI, ABA_Argument);
   if (Result.AttrKind == Attribute::Alignment)
     if (BOI.End - BOI.Begin > ABA_Argument + 1)
       Result.ArgValue = MinAlign(Result.ArgValue, GetArgOr1(1));
diff --git a/llvm/lib/Analysis/Loads.cpp b/llvm/lib/Analysis/Loads.cpp
index b461c41d29e84..3b9df62f3a0bd 100644
--- a/llvm/lib/Analysis/Loads.cpp
+++ b/llvm/lib/Analysis/Loads.cpp
@@ -31,6 +31,35 @@ static bool isAligned(const Value *Base, Align Alignment,
   return Base->getPointerAlignment(DL) >= Alignment;
 }
 
+static bool isDereferenceableAndAlignedPointerViaAssumption(
+    const Value *Ptr, Align Alignment,
+    function_ref<bool(const RetainedKnowledge &RK)> CheckSize,
+    const DataLayout &DL, const Instruction *CtxI, AssumptionCache *AC,
+    const DominatorTree *DT) {
+  if (!CtxI || Ptr->canBeFreed())
+    return false;
+  /// Look through assumes to see if both dereferencability and alignment can
+  /// be proven by an assume if needed.
+  RetainedKnowledge AlignRK;
+  RetainedKnowledge DerefRK;
+  bool IsAligned = Ptr->getPointerAlignment(DL) >= Alignment;
+  return getKnowledgeForValue(
+      Ptr, {Attribute::Dereferenceable, Attribute::Alignment}, AC,
+      [&](RetainedKnowledge RK, Instruction *Assume, auto) {
+        if (!isValidAssumeForContext(Assume, CtxI, DT))
+          return false;
+        if (RK.AttrKind == Attribute::Alignment)
+          AlignRK = std::max(AlignRK, RK);
+        if (RK.AttrKind == Attribute::Dereferenceable)
+          DerefRK = std::max(DerefRK, RK);
+        IsAligned |= AlignRK && AlignRK.ArgValue >= Alignment.value();
+        if (IsAligned && DerefRK && CheckSize(DerefRK))
+          return true; // We have found what we needed so we stop looking
+        return false;  // Other assumes may have better information. so
+                       // keep looking
+      });
+}
+
 /// Test if V is always a pointer to allocated and suitably aligned memory for
 /// a simple load or store.
 static bool isDereferenceableAndAlignedPointer(
@@ -174,33 +203,41 @@ static bool isDereferenceableAndAlignedPointer(
   // information for values that cannot be freed in the function.
   // TODO: More precisely check if the pointer can be freed between assumption
   // and use.
-  if (CtxI && !V->canBeFreed()) {
-    /// Look through assumes to see if both dereferencability and alignment can
-    /// be proven by an assume if needed.
-    RetainedKnowledge AlignRK;
-    RetainedKnowledge DerefRK;
-    bool IsAligned = V->getPointerAlignment(DL) >= Alignment;
-    if (getKnowledgeForValue(
-            V, {Attribute::Dereferenceable, Attribute::Alignment}, AC,
-            [&](RetainedKnowledge RK, Instruction *Assume, auto) {
-              if (!isValidAssumeForContext(Assume, CtxI, DT))
-                return false;
-              if (RK.AttrKind == Attribute::Alignment)
-                AlignRK = std::max(AlignRK, RK);
-              if (RK.AttrKind == Attribute::Dereferenceable)
-                DerefRK = std::max(DerefRK, RK);
-              IsAligned |= AlignRK && AlignRK.ArgValue >= Alignment.value();
-              if (IsAligned && DerefRK &&
-                  DerefRK.ArgValue >= Size.getZExtValue())
-                return true; // We have found what we needed so we stop looking
-              return false;  // Other assumes may have better information. so
-                             // keep looking
-            }))
-      return true;
+  if (CtxI) {
+    const Value *UO = getUnderlyingObjectAggressive(V);
+    if (!V->canBeFreed() || (UO && !UO->canBeFreed())) {
+      /// Look through assumes to see if both dereferencability and alignment
+      /// can be proven by an assume if needed.
+      RetainedKnowledge AlignRK;
+      RetainedKnowledge DerefRK;
+      bool IsAligned = V->getPointerAlignment(DL) >= Alignment;
+      if (getKnowledgeForValue(
+              V, {Attribute::Dereferenceable, Attribute::Alignment}, AC,
+              [&](RetainedKnowledge RK, Instruction *Assume, auto) {
+                if (!isValidAssumeForContext(Assume, CtxI, DT))
+                  return false;
+                if (RK.AttrKind == Attribute::Alignment)
+                  AlignRK = std::max(AlignRK, RK);
+                if (RK.AttrKind == Attribute::Dereferenceable)
+                  DerefRK = std::max(DerefRK, RK);
+                IsAligned |= AlignRK && AlignRK.ArgValue >= Alignment.value();
+                if (IsAligned && DerefRK &&
+                    DerefRK.ArgValue >= Size.getZExtValue())
+                  return true; // We have found what we needed so we stop
+                               // looking
+                return false;  // Other assumes may have better information. so
+                               // keep looking
+              }))
+        return true;
+    }
   }
 
-  // If we don't know, assume the worst.
-  return false;
+  return isDereferenceableAndAlignedPointerViaAssumption(
+      V, Alignment,
+      [Size](const RetainedKnowledge &RK) {
+        return RK.ArgValue >= Size.getZExtValue();
+      },
+      DL, CtxI, AC, DT);
 }
 
 bool llvm::isDereferenceableAndAlignedPointer(
@@ -317,8 +354,8 @@ bool llvm::isDereferenceableAndAlignedInLoop(
     return false;
 
   const SCEV *MaxBECount =
-      Predicates ? SE.getPredicatedConstantMaxBackedgeTakenCount(L, *Predicates)
-                 : SE.getConstantMaxBackedgeTakenCount(L);
+      Predicates ? SE.getPredicatedSymbolicMaxBackedgeTakenCount(L, *Predicates)
+                 : SE.getSymbolicMaxBackedgeTakenCount(L);
   if (isa<SCEVCouldNotCompute>(MaxBECount))
     return false;
 
@@ -334,9 +371,11 @@ bool llvm::isDereferenceableAndAlignedInLoop(
 
   Value *Base = nullptr;
   APInt AccessSize;
+  const SCEV *AccessSizeSCEV = nullptr;
   if (const SCEVUnknown *NewBase = dyn_cast<SCEVUnknown>(AccessStart)) {
     Base = NewBase->getValue();
     AccessSize = MaxPtrDiff;
+    AccessSizeSCEV = PtrDiff;
   } else if (auto *MinAdd = dyn_cast<SCEVAddExpr>(AccessStart)) {
     if (MinAdd->getNumOperands() != 2)
       return false;
@@ -360,12 +399,20 @@ bool llvm::isDereferenceableAndAlignedInLoop(
       return false;
 
     AccessSize = MaxPtrDiff + Offset->getAPInt();
+    AccessSizeSCEV = SE.getAddExpr(PtrDiff, Offset);
     Base = NewBase->getValue();
   } else
     return false;
 
   Instruction *HeaderFirstNonPHI = &*L->getHeader()->getFirstNonPHIIt();
-  return isDereferenceableAndAlignedPointer(Base, Alignment, AccessSize, DL,
+  return isDereferenceableAndAlignedPointerViaAssumption(
+             Base, Alignment,
+             [&SE, PtrDiff](const RetainedKnowledge &RK) {
+               return SE.isKnownPredicate(CmpInst::ICMP_ULE, PtrDiff,
+                                          SE.getSCEV(RK.IRArgValue));
+             },
+             DL, HeaderFirstNonPHI, AC, &DT) ||
+         isDereferenceableAndAlignedPointer(Base, Alignment, AccessSize, DL,
                                             HeaderFirstNonPHI, AC, &DT);
 }
 
diff --git a/llvm/test/Transforms/LoopVectorize/dereferenceable-info-from-assumption-variable-size.ll b/llvm/test/Transforms/LoopVectorize/dereferenceable-info-from-assumption-variable-size.ll
index d1cbe02192e31..344f4c5bb0d79 100644
--- a/llvm/test/Transforms/LoopVectorize/dereferenceable-info-from-assumption-variable-size.ll
+++ b/llvm/test/Transforms/LoopVectorize/dereferenceable-info-from-assumption-variable-size.ll
@@ -185,15 +185,32 @@ define void @deref_assumption_in_preheader_too_small_non_constant_trip_count_acc
 ; CHECK-NEXT:    [[N_VEC:%.*]] = sub i64 [[N]], [[N_MOD_VF]]
 ; CHECK-NEXT:    br label %[[VECTOR_BODY:.*]]
 ; CHECK:       [[VECTOR_BODY]]:
-; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[PRED_LOAD_CONTINUE2:.*]] ]
 ; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[INDEX]], 0
-; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr i32, ptr [[A]], i64 [[TMP0]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[TMP0]]
 ; CHECK-NEXT:    [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[TMP2]], i32 0
 ; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <2 x i32>, ptr [[TMP3]], align 1
 ; CHECK-NEXT:    [[TMP4:%.*]] = icmp sge <2 x i32> [[WIDE_LOAD]], zeroinitializer
-; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr i32, ptr [[TMP1]], i32 0
-; CHECK-NEXT:    [[WIDE_LOAD1:%.*]] = load <2 x i32>, ptr [[TMP5]], align 1
+; CHECK-NEXT:    [[TMP15:%.*]] = xor <2 x i1> [[TMP4]], splat (i1 true)
+; CHECK-NEXT:    [[TMP5:%.*]] = extractelement <2 x i1> [[TMP15]], i32 0
+; CHECK-NEXT:    br i1 [[TMP5]], label %[[PRED_LOAD_IF:.*]], label %[[PRED_LOAD_CONTINUE:.*]]
+; CHECK:       [[PRED_LOAD_IF]]:
+; CHECK-NEXT:    [[TMP16:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[TMP0]]
+; CHECK-NEXT:    [[TMP17:%.*]] = load i32, ptr [[TMP16]], align 1
+; CHECK-NEXT:    [[TMP18:%.*]] = insertelement <2 x i32> poison, i32 [[TMP17]], i32 0
+; CHECK-NEXT:    br label %[[PRED_LOAD_CONTINUE]]
+; CHECK:       [[PRED_LOAD_CONTINUE]]:
+; CHECK-NEXT:    [[TMP9:%.*]] = phi <2 x i32> [ poison, %[[VECTOR_BODY]] ], [ [[TMP18]], %[[PRED_LOAD_IF]] ]
+; CHECK-NEXT:    [[TMP10:%.*]] = extractelement <2 x i1> [[TMP15]], i32 1
+; CHECK-NEXT:    br i1 [[TMP10]], label %[[PRED_LOAD_IF1:.*]], label %[[PRED_LOAD_CONTINUE2]]
+; CHECK:       [[PRED_LOAD_IF1]]:
+; CHECK-NEXT:    [[TMP11:%.*]] = add i64 [[INDEX]], 1
+; CHECK-NEXT:    [[TMP12:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[TMP11]]
+; CHECK-NEXT:    [[TMP13:%.*]] = load i32, ptr [[TMP12]], align 1
+; CHECK-NEXT:    [[TMP14:%.*]] = insertelement <2 x i32> [[TMP9]], i32 [[TMP13]], i32 1
+; CHECK-NEXT:    br label %[[PRED_LOAD_CONTINUE2]]
+; CHECK:       [[PRED_LOAD_CONTINUE2]]:
+; CHECK-NEXT:    [[WIDE_LOAD1:%.*]] = phi <2 x i32> [ [[TMP9]], %[[PRED_LOAD_CONTINUE]] ], [ [[TMP14]], %[[PRED_LOAD_IF1]] ]
 ; CHECK-NEXT:    [[PREDPHI:%.*]] = select <2 x i1> [[TMP4]], <2 x i32> [[WIDE_LOAD]], <2 x i32> [[WIDE_LOAD1]]
 ; CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i32, ptr [[C]], i64 [[TMP0]]
 ; CHECK-NEXT:    [[TMP7:%.*]] = getelementptr inbounds i32, ptr [[TMP6]], i32 0
@@ -268,15 +285,32 @@ define void @deref_assumption_in_preheader_too_small2_non_constant_trip_count_ac
 ; CHECK-NEXT:    [[N_VEC:%.*]] = sub i64 [[N]], [[N_MOD_VF]]
 ; CHECK-NEXT:    br label %[[VECTOR_BODY:.*]]
 ; CHECK:       [[VECTOR_BODY]]:
-; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[PRED_LOAD_CONTINUE2:.*]] ]
 ; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[INDEX]], 0
-; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr i32, ptr [[A]], i64 [[TMP0]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[TMP0]]
 ; CHECK-NEXT:    [[TMP3:%.*]] = getelementptr inbounds i32, ptr [[TMP2]], i32 0
 ; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <2 x i32>, ptr [[TMP3]], align 1
 ; CHECK-NEXT:    [[TMP4:%.*]] = icmp sge <2 x i32> [[WIDE_LOAD]], zeroinitializer
-; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr i32, ptr [[TMP1]], i32 0
-; CHECK-NEXT:    [[WIDE_LOAD1:%.*]] = load <2 x i32>, ptr [[TMP5]], align 1
+; CHECK-NEXT:    [[TMP15:%.*]] = xor <2 x i1> [[TMP4]], splat (i1 true)
+; CHECK-NEXT:    [[TMP5:%.*]] = extractelement <2 x i1> [[TMP15]], i32 0
+; CHECK-NEXT:    br i1 [[TMP5]], label %[[PRED_LOAD_IF:.*]], label %[[PRED_LOAD_CONTINUE:.*]]
+; CHECK:       [[PRED_LOAD_IF]]:
+; CHECK-NEXT:    [[TMP16:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[TMP0]]
+; CHECK-NEXT:    [[TMP17:%.*]] = load i32, ptr [[TMP16]], align 1
+; CHECK-NEXT:    [[TMP18:%.*]] = insertelement <2 x i32> poison, i32 [[TMP17]], i32 0
+; CHECK-NEXT:    br label %[[PRED_LOAD_CONTINUE]]
+; CHECK:       [[PRED_LOAD_CONTINUE]]:
+; CHECK-NEXT:    [[TMP9:%.*]] = phi <2 x i32> [ poison, %[[VECTOR_BODY]] ], [ [[TMP18]], %[[PRED_LOAD_IF]] ]
+; CHECK-NEXT:    [[TMP10:%.*]] = extractelement <2 x i1> [[TMP15]], i32 1
+; CHECK-NEXT:    br i1 [[TMP10]], label %[[PRED_LOAD_IF1:.*]], label %[[PRED_LOAD_CONTINUE2]]
+; CHECK:       [[PRED_LOAD_IF1]]:
+; CHECK-NEXT:    [[TMP11:%.*]] = add i64 [[INDEX]], 1
+; CHECK-NEXT:    [[TMP12:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[TMP11]]
+; CHECK-NEXT:    [[TMP13:%.*]] = load i32, ptr [[TMP12]], align 1
+; CHECK-NEXT:    [[TMP14:%.*]] = insertelement <2 x i32> [[TMP9]], i32 [[TMP13]], i32 1
+; CHECK-NEXT:    br label %[[PRED_LOAD_CONTINUE2]]
+; CHECK:       [[PRED_LOAD_CONTINUE2]]:
+; CHECK-NEXT:    [[WIDE_LOAD1:%.*]] = phi <2 x i32> [ [[TMP9]], %[[PRED_LOAD_CONTINUE]] ], [ [[TMP14]], %[[PRED_LOAD_IF1]] ]
 ; CHECK-NEXT:    [[PREDPHI:%.*]] = select <2 x i1> [[TMP4]], <2 x i32> [[WIDE_LOAD]], <2 x i32> [[WIDE_LOAD1]]
 ; CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i32, ptr [[C]], i64 [[TMP0]]
 ; CHECK-NEXT:    [[TMP7:%.*]] = getelementptr inbounds i32, ptr [[TMP6]], i32 0
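
As a side note on how the new RetainedKnowledge::IRArgValue field is meant to be consumed: a hypothetical helper (not part of this patch; the in-tree consumer is the CheckSize lambda in Loads.cpp above) could turn the variable size into a SCEV like so:

```cpp
#include "llvm/Analysis/AssumeBundleQueries.h"
#include "llvm/Analysis/ScalarEvolution.h"
using namespace llvm;

// Hypothetical: where ArgValue only captures a constant size, IRArgValue also
// carries non-constant ones, which ScalarEvolution can then reason about.
static const SCEV *getAssumedDerefBytes(const RetainedKnowledge &RK,
                                        ScalarEvolution &SE) {
  if (RK.AttrKind != Attribute::Dereferenceable || !RK.IRArgValue)
    return nullptr;
  return SE.getSCEV(RK.IRArgValue); // the assumed size, possibly variable
}
```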

A reviewer (Contributor) commented on these lines of the test diff:

; CHECK-NEXT:    br i1 [[TMP5]], label %[[PRED_LOAD_IF:.*]], label %[[PRED_LOAD_CONTINUE:.*]]
; CHECK:       [[PRED_LOAD_IF]]:
; CHECK-NEXT:    [[TMP16:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[TMP0]]
; CHECK-NEXT:    [[TMP17:%.*]] = load i32, ptr [[TMP16]], align 1

I'm a bit confused. If you're increasing the number of places where we can treat the load as dereferenceable, why does it regress the code here to use conditional loads?

@fhahn (Contributor Author) replied:
Yep, the current code handles assumptions with variable sizes incorrectly due to wrapping in getStartAndEndForAccess: in this case we pass the unsigned max (-1) as the trip count, and the computed end wraps around to 0 (fixed by #128061).
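
To make the wrap concrete, here is a minimal standalone sketch (the exact end-of-access formula lives in getStartAndEndForAccess; this just assumes something of the shape start + (BTC + 1) * element-size in 64-bit unsigned arithmetic):

```cpp
#include <cassert>
#include <cstdint>

int main() {
  uint64_t Start = 0x1000;
  uint64_t BTC = UINT64_MAX; // conservative backedge-taken count, i.e. "-1"
  uint64_t EltSize = 4;      // an i32 access
  // BTC + 1 wraps to 0, so the computed end collapses instead of covering
  // the full assumed range.
  uint64_t End = Start + (BTC + 1) * EltSize;
  assert(End == Start);
  return 0;
}
```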

@fhahn force-pushed the loads-deref-variable-size branch from 46d4a8c to beabe4d on February 24, 2025 at 11:28
A reviewer (Contributor) commented on this hunk:

@@ -334,9 +342,11 @@ bool llvm::isDereferenceableAndAlignedInLoop(
 
   Value *Base = nullptr;
   APInt AccessSize;
+  const SCEV *AccessSizeSCEV = nullptr;

This seems to be unused

@fhahn (Contributor Author) replied:
Replaced the PtrDiff use below with AccessSizeSCEV, thanks.
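
Presumably the call in isDereferenceableAndAlignedInLoop then reads roughly as follows after the rename (a sketch, not the final diff):

```cpp
// Sketch: CheckSize now compares AccessSizeSCEV (which includes the constant
// offset, when present) against the assumed dereferenceable size.
return isDereferenceableAndAlignedPointerViaAssumption(
           Base, Alignment,
           [&SE, AccessSizeSCEV](const RetainedKnowledge &RK) {
             return SE.isKnownPredicate(CmpInst::ICMP_ULE, AccessSizeSCEV,
                                        SE.getSCEV(RK.IRArgValue));
           },
           DL, HeaderFirstNonPHI, AC, &DT) ||
       isDereferenceableAndAlignedPointer(Base, Alignment, AccessSize, DL,
                                          HeaderFirstNonPHI, AC, &DT);
```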

@nikic (Contributor) left a comment:
This looks conceptually fine to me, but we should wait for #128061 to land first to clarify the wrapping situation.
