fix: broadcast vectors for grad calculation #1535

polvalente · 2024-09-15T05:24:45Z

nx/lib/nx/defn/grad.ex

josevalim · 2024-09-16T10:19:03Z

nx/lib/nx/defn/grad.ex

-    Expr.constant(%T{shape: shape, type: {:f, 32}, names: names}, float, [])
+    case shape do
+      %T{vectorized_axes: [_ | _]} = t ->
+        Expr.tensor(Nx.fill(t, float, type: :f32))


We should probably get rid of the names here too.

I also wonder if we should move the check for vectorized_axes into constant. Today, if someone passes vectorized_axes, Expr.constant is broken. So maybe we should create a tensor there if vectorized axes are given?

josevalim · 2024-09-16T10:20:07Z

nx/lib/nx/defn/grad.ex

@@ -338,6 +333,8 @@ defmodule Nx.Defn.Grad do
  @verify_grad Application.compile_env(:nx, :verify_grad, false)

  defp update_grads(op, args, ans, g, _to_grad_ids, grads) do
+    args = revectorize_args(args, ans)


I would prefer not to revectorize everything on every operation. Is there any chance we could do it in broadcast only?

[unbroadcast(x, Nx.multiply(g, y), ans), unbroadcast(y, Nx.multiply(g, x), ans)]

Lines like this one make it so that g is vectorized and y is unvectorized but has axes with the same name, so things break there.
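For reference, a minimal sketch of the two tensor states described above, assuming an Nx release with vectorization support. The axis name `:x` and the values are made up for illustration and are not taken from the PR; the sketch only inspects the metadata and does not try to reproduce the failure inside grad.

```elixir
# Illustration only: the axis name :x and the data are hypothetical.
Mix.install([:nx])

# g carries :x as a vectorized axis.
g = Nx.vectorize(Nx.tensor([[1.0, 2.0], [3.0, 4.0]]), :x)

# Devectorizing with keep_names: true turns :x into an ordinary named axis,
# so y is unvectorized but still has an axis called :x.
y = Nx.devectorize(g, keep_names: true)

IO.inspect(g.vectorized_axes, label: "g vectorized axes")
IO.inspect(y.vectorized_axes, label: "y vectorized axes")
IO.inspect(y.names, label: "y names")
```

Combining a pair like `g` and `y` is the situation the comment says breaks, which is the motivation for revectorizing the args before that point.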

josevalim · 2024-09-16T20:08:51Z

nx/lib/nx/defn/expr.ex

@@ -1394,6 +1394,11 @@ defmodule Nx.Defn.Expr do

  ## Constant helpers and related optimizations

+  defp constant(%{vectorized_axes: [_ | _]} = out, number) do
+    out = %{out | names: Enum.map(out.names, fn _ -> nil end)}


I don't think this part should be done here; we should preserve the names. Sorry for the confusion.

josevalim · 2024-09-16T20:15:18Z

nx/lib/nx/defn/grad.ex

@@ -1343,9 +1334,77 @@ defmodule Nx.Defn.Grad do

  ## General helpers

-  defp unbroadcast(%{shape: shape} = x, res, %{shape: shape}), do: {x, res}
+  defp revectorize_args(args, ans) do


Let's only apply this if args has more than one element and there are vectorized axes.

Also please test x * sin(y) where y is vectorized.
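A rough sketch of what such a check could look like, assuming an Nx version that includes this fix and calling `Nx.Defn.grad/2` directly. The input values and the axis name `:y` are hypothetical, and this is not the test that was added to the suite. It only inspects the vectorized axes of the results rather than asserting specific values.

```elixir
# Sketch only: inputs and the axis name :y are made up for illustration.
Mix.install([:nx])

# x is a plain (unvectorized) vector.
x = Nx.tensor([1.0, 2.0, 3.0])

# y is a batch of two scalars along the vectorized axis :y.
y = Nx.vectorize(Nx.tensor([0.5, 1.5]), :y)

# Gradient of sum(x * sin(y)) with respect to both inputs.
grad_fun = fn x, y ->
  Nx.Defn.grad({x, y}, fn {a, b} ->
    Nx.sum(Nx.multiply(a, Nx.sin(b)))
  end)
end

{grad_x, grad_y} = grad_fun.(x, y)

IO.inspect(grad_x.vectorized_axes, label: "grad_x vectorized axes")
IO.inspect(grad_y.vectorized_axes, label: "grad_y vectorized axes")
```

A real test would compare each gradient against the equivalent computation done on fully broadcast, devectorized inputs.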

fix: broadcast vectors for grad calculation

394a12d

polvalente self-assigned this Sep 15, 2024

polvalente commented Sep 15, 2024

polvalente added 7 commits September 15, 2024 02:45

fix attempt

414726b

test: make core tests pass

a08d0fd

fix: inspect vectorized axes as usual

2f7c5f1

chore: revert some changes

d87ffa1

chore: remove commented code

7fbdffd

chore: remove stray comments

db0b6f0

chore: remove more stray comments

20cc168

polvalente requested a review from josevalim September 16, 2024 09:46

josevalim reviewed Sep 16, 2024

refactor: support vectorized constant

22b9a24

josevalim reviewed Sep 16, 2024

josevalim approved these changes Sep 16, 2024

test: add x * sin(y) grad test

8f60a71
