torch_xla scan forces inputs to have gradients #8783

Open
tengyifei opened this issue Mar 4, 2025 · 0 comments
Assignees: tengyifei
Labels: bug (Something isn't working)

Comments

@tengyifei (Collaborator) commented:

The snippet

# Make some fake tensors to trace the user function and obtain the
# forward and backward graphs. Note that the init/carry fake tensor
# always requires grad. That's because even if the user passed in some
# `init` that does not require grad, we still want gradients to flow
# through the `carry` from one iteration of the user function to the
# next. In summary, the `carry` argument used to trace a user function
# to get a correct backward pass always requires grad.
def make_fake_tensor(v: torch.Tensor, requires_grad=True) -> torch.Tensor:
  return torch.empty_like(
      v, dtype=v.dtype, device=v.device, requires_grad=requires_grad)

is probably wrong.

The most obvious example: if one of the inputs is an integer tensor, it cannot possibly require gradients, since PyTorch only supports autograd for floating point and complex dtypes.
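
A minimal standalone repro sketch (hypothetical example mirroring the make_fake_tensor call above, not code from the repository):

import torch

# An integer tensor, e.g. a step counter threaded through the scan carry.
v = torch.zeros(3, dtype=torch.int64)

# Unconditionally requesting gradients, as make_fake_tensor does, raises:
# RuntimeError: Only Tensors of floating point and complex dtype can require gradients
fake = torch.empty_like(v, dtype=v.dtype, device=v.device, requires_grad=True)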

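One possible direction for a fix (a sketch only, not a confirmed design) would be to request gradients only for dtypes that can carry them:

def make_fake_tensor(v: torch.Tensor) -> torch.Tensor:
  # Sketch: only floating point and complex dtypes support autograd,
  # so integer/bool carries are traced without requires_grad.
  requires_grad = v.is_floating_point() or v.is_complex()
  return torch.empty_like(
      v, dtype=v.dtype, device=v.device, requires_grad=requires_grad)

Gradients still would not flow through such a carry, but they cannot for integer dtypes anyway; whether the backward trace then handles a mix of grad/no-grad carries correctly is a separate question.
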
@tengyifei tengyifei self-assigned this Mar 4, 2025
@ysiraichi ysiraichi added the bug Something isn't working label Mar 5, 2025
Projects: None yet
Development: No branches or pull requests
2 participants