Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the fmod intrinsic #130320

Merged
merged 11 commits into from
Mar 12, 2025
Merged

Conversation

kmpeng
Copy link
Contributor

@kmpeng kmpeng commented Mar 7, 2025

Replaced the current fmod definition with a templatized version, implemented fmod algorithm for DirectX targets that matches the DXC implementation, added corresponding tests in clang/test/CodeGenHLSL/builtins/fmod.hlsl and clang/test/SemaHLSL/BuiltIns/fmod-errors.hlsl.

Closes #99118.

Copy link

github-actions bot commented Mar 7, 2025

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:X86 clang:headers Headers provided by Clang, e.g. for intrinsics HLSL HLSL Language Support labels Mar 7, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 7, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-hlsl

Author: Kaitlin Peng (kmpeng)

Changes

Replaced the current fmod definition with a templatized version, implemented fmod algorithm for DirectX targets that matches the DXC implementation, added corresponding tests in clang/test/CodeGenHLSL/builtins/fmod.hlsl and clang/test/SemaHLSL/BuiltIns/fmod-errors.hlsl.


Full diff: https://github.com/llvm/llvm-project/pull/130320.diff

5 Files Affected:

  • (modified) clang/lib/Headers/hlsl/hlsl_alias_intrinsics.h (-34)
  • (modified) clang/lib/Headers/hlsl/hlsl_detail.h (+25)
  • (modified) clang/lib/Headers/hlsl/hlsl_intrinsics.h (+32)
  • (modified) clang/test/CodeGenHLSL/builtins/fmod.hlsl (+111-23)
  • (added) clang/test/SemaHLSL/BuiltIns/fmod-errors.hlsl (+33)
diff --git a/clang/lib/Headers/hlsl/hlsl_alias_intrinsics.h b/clang/lib/Headers/hlsl/hlsl_alias_intrinsics.h
index 7573f6e024167..cd59fef82d9f9 100644
--- a/clang/lib/Headers/hlsl/hlsl_alias_intrinsics.h
+++ b/clang/lib/Headers/hlsl/hlsl_alias_intrinsics.h
@@ -1237,40 +1237,6 @@ float3 floor(float3);
 _HLSL_BUILTIN_ALIAS(__builtin_elementwise_floor)
 float4 floor(float4);
 
-//===----------------------------------------------------------------------===//
-// fmod builtins
-//===----------------------------------------------------------------------===//
-
-/// \fn T fmod(T x, T y)
-/// \brief Returns the linear interpolation of x to y.
-/// \param x [in] The dividend.
-/// \param y [in] The divisor.
-///
-/// Return the floating-point remainder of the x parameter divided by the y
-/// parameter.
-
-_HLSL_16BIT_AVAILABILITY(shadermodel, 6.2)
-_HLSL_BUILTIN_ALIAS(__builtin_elementwise_fmod)
-half fmod(half, half);
-_HLSL_16BIT_AVAILABILITY(shadermodel, 6.2)
-_HLSL_BUILTIN_ALIAS(__builtin_elementwise_fmod)
-half2 fmod(half2, half2);
-_HLSL_16BIT_AVAILABILITY(shadermodel, 6.2)
-_HLSL_BUILTIN_ALIAS(__builtin_elementwise_fmod)
-half3 fmod(half3, half3);
-_HLSL_16BIT_AVAILABILITY(shadermodel, 6.2)
-_HLSL_BUILTIN_ALIAS(__builtin_elementwise_fmod)
-half4 fmod(half4, half4);
-
-_HLSL_BUILTIN_ALIAS(__builtin_elementwise_fmod)
-float fmod(float, float);
-_HLSL_BUILTIN_ALIAS(__builtin_elementwise_fmod)
-float2 fmod(float2, float2);
-_HLSL_BUILTIN_ALIAS(__builtin_elementwise_fmod)
-float3 fmod(float3, float3);
-_HLSL_BUILTIN_ALIAS(__builtin_elementwise_fmod)
-float4 fmod(float4, float4);
-
 //===----------------------------------------------------------------------===//
 // frac builtins
 //===----------------------------------------------------------------------===//
diff --git a/clang/lib/Headers/hlsl/hlsl_detail.h b/clang/lib/Headers/hlsl/hlsl_detail.h
index 39254a3cc3a0a..470cc4db24cd2 100644
--- a/clang/lib/Headers/hlsl/hlsl_detail.h
+++ b/clang/lib/Headers/hlsl/hlsl_detail.h
@@ -97,6 +97,31 @@ constexpr vector<T, L> reflect_vec_impl(vector<T, L> I, vector<T, L> N) {
 #endif
 }
 
+template <typename T>
+constexpr enable_if_t<is_same<float, T>::value || is_same<half, T>::value, T>
+fmod_impl(T X, T Y) {
+#if !defined(__DIRECTX__)
+  return __builtin_elementwise_fmod(X, Y);
+#else 
+  T div = X / Y;
+  bool ge = div >= -div;
+  T frc = frac(abs(div));
+  return select<T>(ge, frc, -frc) * Y;
+#endif
+}
+
+template <typename T, int N>
+constexpr vector<T, N> fmod_vec_impl(vector<T, N> X, vector<T, N> Y) {
+#if !defined(__DIRECTX__)
+  return __builtin_elementwise_fmod(X, Y);
+#else 
+  vector<T, N> div = X / Y;
+  vector<bool, N> ge = div >= -div;
+  vector<T, N> frc = frac(abs(div)); 
+  return select<T>(ge, frc, -frc) * Y;
+#endif
+}
+
 } // namespace __detail
 } // namespace hlsl
 #endif //_HLSL_HLSL_DETAILS_H_
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsics.h b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
index fe9441080433d..3393c955e9a4b 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsics.h
+++ b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
@@ -109,6 +109,38 @@ const inline float distance(vector<float, N> X, vector<float, N> Y) {
   return __detail::distance_vec_impl(X, Y);
 }
 
+//===----------------------------------------------------------------------===//
+// fmod builtins
+//===----------------------------------------------------------------------===//
+
+/// \fn T fmod(T x, T y)
+/// \brief Returns the linear interpolation of x to y.
+/// \param x [in] The dividend.
+/// \param y [in] The divisor.
+///
+/// Return the floating-point remainder of the x parameter divided by the y
+/// parameter.
+
+_HLSL_16BIT_AVAILABILITY(shadermodel, 6.2)
+const inline half fmod(half X, half Y) {
+  return __detail::fmod_impl(X, Y);
+}
+
+const inline float fmod(float X, float Y) {
+  return __detail::fmod_impl(X, Y);
+}
+
+template <int N>
+_HLSL_16BIT_AVAILABILITY(shadermodel, 6.2)
+const inline vector<half, N> fmod(vector<half, N> X, vector<half, N> Y) {
+  return __detail::fmod_vec_impl(X, Y);
+}
+
+template <int N>
+const inline vector<float, N> fmod(vector<float, N> X, vector<float, N> Y) {
+  return __detail::fmod_vec_impl(X, Y);
+}
+
 //===----------------------------------------------------------------------===//
 // length builtins
 //===----------------------------------------------------------------------===//
diff --git a/clang/test/CodeGenHLSL/builtins/fmod.hlsl b/clang/test/CodeGenHLSL/builtins/fmod.hlsl
index b62967114d456..359463b914c98 100644
--- a/clang/test/CodeGenHLSL/builtins/fmod.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/fmod.hlsl
@@ -4,16 +4,16 @@
 //
 // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
 // RUN:   dxil-pc-shadermodel6.3-library %s -fnative-half-type \
-// RUN:   -emit-llvm -disable-llvm-passes -o - | FileCheck %s \
-// RUN:   -DFNATTRS="noundef nofpclass(nan inf)" -DTYPE=half
+// RUN:   -emit-llvm -o - | FileCheck %s -DFNATTRS="noundef nofpclass(nan inf)" \
+// RUN:   -DTYPE=half --check-prefixes=DXCHECK,DXNATIVE_HALF
 
 //
 // ---------- No Native Half support test -----------
 //
 // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
-// RUN:   dxil-pc-shadermodel6.3-library %s -emit-llvm -disable-llvm-passes \
-// RUN:   -o - | FileCheck %s \
-// RUN:   -DFNATTRS="noundef nofpclass(nan inf)" -DTYPE=float
+// RUN:   dxil-pc-shadermodel6.3-library %s -emit-llvm \
+// RUN:   -o - | FileCheck %s -DFNATTRS="noundef nofpclass(nan inf)" \
+// RUN:   -DTYPE=float --check-prefixes=DXCHECK,DXNO_HALF
 
 
 // Spirv target:
@@ -22,56 +22,144 @@
 //
 // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
 // RUN:   spirv-unknown-vulkan-compute %s -fnative-half-type \
-// RUN:   -emit-llvm -disable-llvm-passes -o - | FileCheck %s \
+// RUN:   -emit-llvm -o - | FileCheck %s \
 // RUN:   -DFNATTRS="spir_func noundef nofpclass(nan inf)" -DTYPE=half
 
 //
 // ---------- No Native Half support test -----------
 //
 // RUN: %clang_cc1 -finclude-default-header -x hlsl -triple \
-// RUN:   spirv-unknown-vulkan-compute %s -emit-llvm -disable-llvm-passes \
+// RUN:   spirv-unknown-vulkan-compute %s -emit-llvm \
 // RUN:   -o - | FileCheck %s \
 // RUN:   -DFNATTRS="spir_func noundef nofpclass(nan inf)" -DTYPE=float
 
 
 
+// DXCHECK: define [[FNATTRS]] [[TYPE]] @
+// DXCHECK: %div1.i = fdiv reassoc nnan ninf nsz arcp afn [[TYPE]]
+// DXCHECK: %fneg.i = fneg reassoc nnan ninf nsz arcp afn [[TYPE]]
+// DXCHECK: %cmp.i = fcmp reassoc nnan ninf nsz arcp afn oge [[TYPE]]
+// DXNATIVE_HALF: %elt.abs.i = call reassoc nnan ninf nsz arcp afn [[TYPE]] @llvm.fabs.f16([[TYPE]]
+// DXNO_HALF: %elt.abs.i = call reassoc nnan ninf nsz arcp afn [[TYPE]] @llvm.fabs.f32([[TYPE]]
+// DXNATIVE_HALF: %hlsl.frac.i = call reassoc nnan ninf nsz arcp afn [[TYPE]] @llvm.dx.frac.f16([[TYPE]]
+// DXNO_HALF: %hlsl.frac.i = call reassoc nnan ninf nsz arcp afn [[TYPE]] @llvm.dx.frac.f32([[TYPE]]
+// DXCHECK: %fneg2.i = fneg reassoc nnan ninf nsz arcp afn [[TYPE]]
+// DXCHECK: %hlsl.select.i = select reassoc nnan ninf nsz arcp afn i1
+// DXCHECK: %mul.i = fmul reassoc nnan ninf nsz arcp afn [[TYPE]]
+// DXCHECK: ret [[TYPE]] %mul.i
 // CHECK: define [[FNATTRS]] [[TYPE]] @
-// CHECK: %fmod = frem reassoc nnan ninf nsz arcp afn [[TYPE]]
-// CHECK: ret [[TYPE]] %fmod
+// CHECK: %fmod.i = frem reassoc nnan ninf nsz arcp afn [[TYPE]]
+// CHECK: ret [[TYPE]] %fmod.i
 half test_fmod_half(half p0, half p1) { return fmod(p0, p1); }
 
+// DXCHECK: define [[FNATTRS]] <2 x [[TYPE]]> @
+// DXCHECK: %div1.i = fdiv reassoc nnan ninf nsz arcp afn <2 x [[TYPE]]>
+// DXCHECK: %fneg.i = fneg reassoc nnan ninf nsz arcp afn <2 x [[TYPE]]>
+// DXCHECK: %cmp.i = fcmp reassoc nnan ninf nsz arcp afn oge <2 x [[TYPE]]>
+// DXNATIVE_HALF: %elt.abs.i = call reassoc nnan ninf nsz arcp afn <2 x [[TYPE]]> @llvm.fabs.v2f16(<2 x [[TYPE]]>
+// DXNO_HALF: %elt.abs.i = call reassoc nnan ninf nsz arcp afn <2 x [[TYPE]]> @llvm.fabs.v2f32(<2 x [[TYPE]]>
+// DXNATIVE_HALF: %hlsl.frac.i = call reassoc nnan ninf nsz arcp afn <2 x [[TYPE]]> @llvm.dx.frac.v2f16(<2 x [[TYPE]]>
+// DXNO_HALF: %hlsl.frac.i = call reassoc nnan ninf nsz arcp afn <2 x [[TYPE]]> @llvm.dx.frac.v2f32(<2 x [[TYPE]]>
+// DXCHECK: %fneg2.i = fneg reassoc nnan ninf nsz arcp afn <2 x [[TYPE]]>
+// DXCHECK: %hlsl.select.i = select reassoc nnan ninf nsz arcp afn <2 x i1>
+// DXCHECK: %mul.i = fmul reassoc nnan ninf nsz arcp afn <2 x [[TYPE]]>
+// DXCHECK: ret <2 x [[TYPE]]> %mul.i
 // CHECK: define [[FNATTRS]] <2 x [[TYPE]]> @
-// CHECK: %fmod = frem reassoc nnan ninf nsz arcp afn <2 x [[TYPE]]>
-// CHECK: ret <2 x [[TYPE]]> %fmod
+// CHECK: %fmod.i = frem reassoc nnan ninf nsz arcp afn <2 x [[TYPE]]>
+// CHECK: ret <2 x [[TYPE]]> %fmod.i
 half2 test_fmod_half2(half2 p0, half2 p1) { return fmod(p0, p1); }
 
+// DXCHECK: define [[FNATTRS]] <3 x [[TYPE]]> @
+// DXCHECK: %div1.i = fdiv reassoc nnan ninf nsz arcp afn <3 x [[TYPE]]>
+// DXCHECK: %fneg.i = fneg reassoc nnan ninf nsz arcp afn <3 x [[TYPE]]>
+// DXCHECK: %cmp.i = fcmp reassoc nnan ninf nsz arcp afn oge <3 x [[TYPE]]>
+// DXNATIVE_HALF: %elt.abs.i = call reassoc nnan ninf nsz arcp afn <3 x [[TYPE]]> @llvm.fabs.v3f16(<3 x [[TYPE]]>
+// DXNO_HALF: %elt.abs.i = call reassoc nnan ninf nsz arcp afn <3 x [[TYPE]]> @llvm.fabs.v3f32(<3 x [[TYPE]]>
+// DXNATIVE_HALF: %hlsl.frac.i = call reassoc nnan ninf nsz arcp afn <3 x [[TYPE]]> @llvm.dx.frac.v3f16(<3 x [[TYPE]]>
+// DXNO_HALF: %hlsl.frac.i = call reassoc nnan ninf nsz arcp afn <3 x [[TYPE]]> @llvm.dx.frac.v3f32(<3 x [[TYPE]]>
+// DXCHECK: %fneg2.i = fneg reassoc nnan ninf nsz arcp afn <3 x [[TYPE]]>
+// DXCHECK: %hlsl.select.i = select reassoc nnan ninf nsz arcp afn <3 x i1>
+// DXCHECK: %mul.i = fmul reassoc nnan ninf nsz arcp afn <3 x [[TYPE]]>
+// DXCHECK: ret <3 x [[TYPE]]> %mul.i
 // CHECK: define [[FNATTRS]] <3 x [[TYPE]]> @
-// CHECK: %fmod = frem reassoc nnan ninf nsz arcp afn <3 x [[TYPE]]>
-// CHECK: ret <3 x [[TYPE]]> %fmod
+// CHECK: %fmod.i = frem reassoc nnan ninf nsz arcp afn <3 x [[TYPE]]>
+// CHECK: ret <3 x [[TYPE]]> %fmod.i
 half3 test_fmod_half3(half3 p0, half3 p1) { return fmod(p0, p1); }
 
+// DXCHECK: define [[FNATTRS]] <4 x [[TYPE]]> @
+// DXCHECK: %div1.i = fdiv reassoc nnan ninf nsz arcp afn <4 x [[TYPE]]>
+// DXCHECK: %fneg.i = fneg reassoc nnan ninf nsz arcp afn <4 x [[TYPE]]>
+// DXCHECK: %cmp.i = fcmp reassoc nnan ninf nsz arcp afn oge <4 x [[TYPE]]>
+// DXNATIVE_HALF: %elt.abs.i = call reassoc nnan ninf nsz arcp afn <4 x [[TYPE]]> @llvm.fabs.v4f16(<4 x [[TYPE]]>
+// DXNO_HALF: %elt.abs.i = call reassoc nnan ninf nsz arcp afn <4 x [[TYPE]]> @llvm.fabs.v4f32(<4 x [[TYPE]]>
+// DXNATIVE_HALF: %hlsl.frac.i = call reassoc nnan ninf nsz arcp afn <4 x [[TYPE]]> @llvm.dx.frac.v4f16(<4 x [[TYPE]]>
+// DXNO_HALF: %hlsl.frac.i = call reassoc nnan ninf nsz arcp afn <4 x [[TYPE]]> @llvm.dx.frac.v4f32(<4 x [[TYPE]]>
+// DXCHECK: %fneg2.i = fneg reassoc nnan ninf nsz arcp afn <4 x [[TYPE]]>
+// DXCHECK: %hlsl.select.i = select reassoc nnan ninf nsz arcp afn <4 x i1>
+// DXCHECK: %mul.i = fmul reassoc nnan ninf nsz arcp afn <4 x [[TYPE]]>
+// DXCHECK: ret <4 x [[TYPE]]> %mul.i
 // CHECK: define [[FNATTRS]] <4 x [[TYPE]]> @
-// CHECK: %fmod = frem reassoc nnan ninf nsz arcp afn <4 x [[TYPE]]>
-// CHECK: ret <4 x [[TYPE]]> %fmod
+// CHECK: %fmod.i = frem reassoc nnan ninf nsz arcp afn <4 x [[TYPE]]>
+// CHECK: ret <4 x [[TYPE]]> %fmod.i
 half4 test_fmod_half4(half4 p0, half4 p1) { return fmod(p0, p1); }
 
+// DXCHECK: define [[FNATTRS]] float @
+// DXCHECK: %div1.i = fdiv reassoc nnan ninf nsz arcp afn float
+// DXCHECK: %fneg.i = fneg reassoc nnan ninf nsz arcp afn float
+// DXCHECK: %cmp.i = fcmp reassoc nnan ninf nsz arcp afn oge float
+// DXCHECK: %elt.abs.i = call reassoc nnan ninf nsz arcp afn float @llvm.fabs.f32(float
+// DXCHECK: %hlsl.frac.i = call reassoc nnan ninf nsz arcp afn float @llvm.dx.frac.f32(float
+// DXCHECK: %fneg2.i = fneg reassoc nnan ninf nsz arcp afn float
+// DXCHECK: %hlsl.select.i = select reassoc nnan ninf nsz arcp afn i1 
+// DXCHECK: %mul.i = fmul reassoc nnan ninf nsz arcp afn float 
+// DXCHECK: ret float %mul.i
 // CHECK: define [[FNATTRS]] float @
-// CHECK: %fmod = frem reassoc nnan ninf nsz arcp afn float
-// CHECK: ret float %fmod
+// CHECK: %fmod.i = frem reassoc nnan ninf nsz arcp afn float
+// CHECK: ret float %fmod.i
 float test_fmod_float(float p0, float p1) { return fmod(p0, p1); }
 
+// DXCHECK: define [[FNATTRS]] <2 x float> @
+// DXCHECK: %div1.i = fdiv reassoc nnan ninf nsz arcp afn <2 x float>
+// DXCHECK: %fneg.i = fneg reassoc nnan ninf nsz arcp afn <2 x float>
+// DXCHECK: %cmp.i = fcmp reassoc nnan ninf nsz arcp afn oge <2 x float>
+// DXCHECK: %elt.abs.i = call reassoc nnan ninf nsz arcp afn <2 x float> @llvm.fabs.v2f32(<2 x float>
+// DXCHECK: %hlsl.frac.i = call reassoc nnan ninf nsz arcp afn <2 x float> @llvm.dx.frac.v2f32(<2 x float>
+// DXCHECK: %fneg2.i = fneg reassoc nnan ninf nsz arcp afn <2 x float>
+// DXCHECK: %hlsl.select.i = select reassoc nnan ninf nsz arcp afn <2 x i1>
+// DXCHECK: %mul.i = fmul reassoc nnan ninf nsz arcp afn <2 x float>
+// DXCHECK: ret <2 x float> %mul.i
 // CHECK: define [[FNATTRS]] <2 x float> @
-// CHECK: %fmod = frem reassoc nnan ninf nsz arcp afn <2 x float>
-// CHECK: ret <2 x float> %fmod
+// CHECK: %fmod.i = frem reassoc nnan ninf nsz arcp afn <2 x float>
+// CHECK: ret <2 x float> %fmod.i
 float2 test_fmod_float2(float2 p0, float2 p1) { return fmod(p0, p1); }
 
+// DXCHECK: define [[FNATTRS]] <3 x float> @
+// DXCHECK: %div1.i = fdiv reassoc nnan ninf nsz arcp afn <3 x float>
+// DXCHECK: %fneg.i = fneg reassoc nnan ninf nsz arcp afn <3 x float>
+// DXCHECK: %cmp.i = fcmp reassoc nnan ninf nsz arcp afn oge <3 x float>
+// DXCHECK: %elt.abs.i = call reassoc nnan ninf nsz arcp afn <3 x float> @llvm.fabs.v3f32(<3 x float>
+// DXCHECK: %hlsl.frac.i = call reassoc nnan ninf nsz arcp afn <3 x float> @llvm.dx.frac.v3f32(<3 x float>
+// DXCHECK: %fneg2.i = fneg reassoc nnan ninf nsz arcp afn <3 x float>
+// DXCHECK: %hlsl.select.i = select reassoc nnan ninf nsz arcp afn <3 x i1>
+// DXCHECK: %mul.i = fmul reassoc nnan ninf nsz arcp afn <3 x float>
+// DXCHECK: ret <3 x float> %mul.i
 // CHECK: define [[FNATTRS]] <3 x float> @
-// CHECK: %fmod = frem reassoc nnan ninf nsz arcp afn <3 x float>
-// CHECK: ret <3 x float> %fmod
+// CHECK: %fmod.i = frem reassoc nnan ninf nsz arcp afn <3 x float>
+// CHECK: ret <3 x float> %fmod.i
 float3 test_fmod_float3(float3 p0, float3 p1) { return fmod(p0, p1); }
 
+// DXCHECK: define [[FNATTRS]] <4 x float> @
+// DXCHECK: %div1.i = fdiv reassoc nnan ninf nsz arcp afn <4 x float>
+// DXCHECK: %fneg.i = fneg reassoc nnan ninf nsz arcp afn <4 x float>
+// DXCHECK: %cmp.i = fcmp reassoc nnan ninf nsz arcp afn oge <4 x float>
+// DXCHECK: %elt.abs.i = call reassoc nnan ninf nsz arcp afn <4 x float> @llvm.fabs.v4f32(<4 x float>
+// DXCHECK: %hlsl.frac.i = call reassoc nnan ninf nsz arcp afn <4 x float> @llvm.dx.frac.v4f32(<4 x float>
+// DXCHECK: %fneg2.i = fneg reassoc nnan ninf nsz arcp afn <4 x float>
+// DXCHECK: %hlsl.select.i = select reassoc nnan ninf nsz arcp afn <4 x i1>
+// DXCHECK: %mul.i = fmul reassoc nnan ninf nsz arcp afn <4 x float>
+// DXCHECK: ret <4 x float> %mul.i
 // CHECK: define [[FNATTRS]] <4 x float> @
-// CHECK: %fmod = frem reassoc nnan ninf nsz arcp afn <4 x float>
-// CHECK: ret <4 x float> %fmod
+// CHECK: %fmod.i = frem reassoc nnan ninf nsz arcp afn <4 x float>
+// CHECK: ret <4 x float> %fmod.i
 float4 test_fmod_float4(float4 p0, float4 p1) { return fmod(p0, p1); }
 
diff --git a/clang/test/SemaHLSL/BuiltIns/fmod-errors.hlsl b/clang/test/SemaHLSL/BuiltIns/fmod-errors.hlsl
new file mode 100644
index 0000000000000..86f5a6f7bea9c
--- /dev/null
+++ b/clang/test/SemaHLSL/BuiltIns/fmod-errors.hlsl
@@ -0,0 +1,33 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple dxil-pc-shadermodel6.6-library %s -fnative-half-type -emit-llvm-only -disable-llvm-passes -verify
+
+float test_no_second_arg(float2 p0) {
+  return fmod(p0);
+  // expected-error@-1 {{no matching function for call to 'fmod'}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 1 was provided}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function template not viable: requires 2 arguments, but 1 was provided}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function template not viable: requires 2 arguments, but 1 was provided}}
+}
+
+float test_too_many_arg(float2 p0) {
+  return fmod(p0, p0, p0);
+  // expected-error@-1 {{no matching function for call to 'fmod'}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 3 were provided}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function not viable: requires 2 arguments, but 3 were provided}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function template not viable: requires 2 arguments, but 3 were provided}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function template not viable: requires 2 arguments, but 3 were provided}}
+}
+
+float test_double_inputs(double p0, double p1) {
+  return fmod(p0, p1);
+  // expected-error@-1  {{call to 'fmod' is ambiguous}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function}}
+}
+
+float test_int_inputs(int p0, int p1) {
+  return fmod(p0, p1);
+  // expected-error@-1  {{call to 'fmod' is ambiguous}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function}}
+  // expected-note@hlsl/hlsl_intrinsics.h:* {{candidate function}}
+}

Copy link
Contributor

@Icohedron Icohedron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the most part, this PR looks good! Just needs some changes to the codegen test to make it more clear that the implementation is correctly being tested.

Copy link

github-actions bot commented Mar 7, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@kmpeng kmpeng force-pushed the users/kmpeng/fmod_intrinsic branch from f39f647 to f190de5 Compare March 10, 2025 20:31
@kmpeng kmpeng force-pushed the users/kmpeng/fmod_intrinsic branch 2 times, most recently from fd36c1f to 214cb4a Compare March 10, 2025 21:27
@kmpeng kmpeng requested a review from Icohedron March 10, 2025 21:31
@kmpeng kmpeng requested a review from Icohedron March 11, 2025 20:15
@farzonl farzonl requested a review from Icohedron March 12, 2025 00:39
@farzonl
Copy link
Member

farzonl commented Mar 12, 2025

@kmpeng could you rebase and squash these commits. If I try and squash merge its going to set your personal email as the author because your first commit was using your personal: https://github.com/llvm/llvm-project/commit/ef16da4479d9d2034f3b6072410b488d7568ce88.patch

but looks like you switch to your work email later: https://github.com/llvm/llvm-project/commit/bd1a02b0e7bacf1ecbc54de8356aef0a5df8bd5d.patch

The only way I know how to fix this is if the first commit was using your work email hence my ask to rebase and squash to one commit.

@kmpeng kmpeng force-pushed the users/kmpeng/fmod_intrinsic branch from bd1a02b to 4b62159 Compare March 12, 2025 16:38
@kmpeng kmpeng force-pushed the users/kmpeng/fmod_intrinsic branch from 4b62159 to 8aeb5bd Compare March 12, 2025 16:56
@farzonl farzonl merged commit 184f944 into llvm:main Mar 12, 2025
7 of 10 checks passed
Copy link

@kmpeng Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

frederik-h pushed a commit to frederik-h/llvm-project that referenced this pull request Mar 18, 2025
Replaced the current `fmod` definition with a templatized version,
implemented `fmod` algorithm for DirectX targets that matches the DXC
implementation, added corresponding tests in
`clang/test/CodeGenHLSL/builtins/fmod.hlsl` and
`clang/test/SemaHLSL/BuiltIns/fmod-errors.hlsl`.

Closes llvm#99118.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 clang:headers Headers provided by Clang, e.g. for intrinsics clang Clang issues not falling into any other category HLSL HLSL Language Support
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

Implement the fmod HLSL Function
4 participants