AdamaxOptimizer#applyGradients fails when the gradient's order changes #8379

Comments
Note: As a workaround, I simply replace my
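For illustration, one workaround of this kind (a sketch of the general approach only, not necessarily the exact change referred to above) is to always hand the gradients to the optimizer in a fixed, sorted key order:

```js
// Hypothetical workaround sketch: normalize the key order before every
// applyGradients call so the optimizer always sees the gradients in the
// same order. Not necessarily the exact change made in this thread.
const tf = require('@tensorflow/tfjs-node');

function applyGradientsSorted(optimizer, gradients) {
  const sorted = {};
  for (const name of Object.keys(gradients).sort()) {
    sorted[name] = gradients[name]; // string keys keep insertion order
  }
  optimizer.applyGradients(sorted);
}
```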
Hi @benoitkoenig, We are pleased to hear that your issue has been resolved. Please consider closing this issue. If you encounter any further difficulties, please feel free to raise a new issue. Thank You!!
Hello @shmishra99 and thanks for your answer. The issue is not resolved ^^' My second comment points out that there is a work-around possible, at least in my case. However, the current behavior, which is that the Adamax optimizer will fail if the gradients are not consistently passed in the same order, still seems like a bug to me. Let me know if I can help fix this. I've checked the code and would be happy to submit a pull request if that can help :-)
Hi @benoitkoenig, Thank you for expressing your interest in contributing to tfjs. Could you please share a small, reproducible code snippet? This will help me to verify the behavior from my end. Thank you!
Hi @shmishra99, Here is a code snippet to reproduce the issue:
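A minimal sketch of such a reproduction, assuming @tensorflow/tfjs-node, two trainable variables of different shapes, and the Adamax optimizer (an illustration of the failure described below, not necessarily the exact snippet), could look like this:

```js
// Hypothetical minimal reproduction: two variables of different shapes,
// with the gradient keys passed in a different order on the second call.
const tf = require('@tensorflow/tfjs-node');

const a = tf.variable(tf.zeros([2]), true, 'a');
const b = tf.variable(tf.zeros([3]), true, 'b');
const optimizer = tf.train.adamax(0.01);

// First call: keys inserted in the order ['a', 'b'].
optimizer.applyGradients({ a: tf.ones([2]), b: tf.ones([3]) });

// Second call: the same gradients, but keys inserted in the order ['b', 'a'].
// On the affected version (@tensorflow/[email protected] as reported below) this
// should throw an error like
// "Invalid TF_Status: 3 - required broadcastable shapes".
optimizer.applyGradients({ b: tf.ones([3]), a: tf.ones([2]) });
```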
This situation happens in my scenario where I am training an actor-critic asynchronously on multiple threads: sometimes the weights of the actor come first, and sometimes the weights of the critic do. I offered to open a PR to fix this: my idea is to update the Adamax Optimizer so that it no longer depends on the order in which the gradients are passed. Thank you for your time, Benoît
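As an illustration of one possible direction for such a fix (a sketch only, not the actual tfjs implementation), the per-variable optimizer state could be keyed by variable name rather than by position, so the iteration order of the incoming gradients no longer matters:

```js
// Sketch of a name-keyed accumulator (illustrative, not the actual tfjs code).
// Because the state is looked up by variable name, the order of
// Object.keys(variableGradients) has no effect on which state gets updated.
const tf = require('@tensorflow/tfjs-node');

const accumulators = {}; // hypothetical: variable name -> accumulated state

function applyGradientsByName(variableGradients) {
  for (const name of Object.keys(variableGradients)) {
    if (accumulators[name] == null) {
      accumulators[name] = tf.zerosLike(variableGradients[name]);
    }
    accumulators[name] = accumulators[name].add(variableGradients[name]);
  }
}
```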
Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub.
System information
@tensorflow/[email protected]
Describe the current behavior
I have a NamedVariableMap of gradients that I want to apply to my models. When running AdamaxOptimizer#applyGradients, I sometimes get the error Invalid TF_Status: 3 - required broadcastable shapes. When logging Object.keys(gradients), I have noticed that the keys are not always sorted in the same way. This could explain the issue, as AdamaxOptimizer#applyGradients's implementation relies on the order of Object.keys(variableGradients).
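To make that concrete, here is a rough illustration of the failure mode (not the literal tfjs source): when per-variable state is stored in an array indexed by call position, a change in key order pairs a gradient with state created for a differently shaped variable.

```js
// Rough illustration of the failure mode (not the literal tfjs source).
const tf = require('@tensorflow/tfjs-node');

const accumulators = []; // one slot per variable, filled in first-call order

function applyGradientsPositional(variableGradients) {
  Object.keys(variableGradients).forEach((name, i) => {
    if (accumulators[i] == null) {
      // Slot i takes the shape of whichever variable sat at position i
      // on the first call.
      accumulators[i] = tf.zerosLike(variableGradients[name]);
    }
    // If a later call uses a different key order, slot i may hold state of a
    // different shape than this gradient, producing a broadcast/shape error.
    accumulators[i] = accumulators[i].add(variableGradients[name]);
  });
}
```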
Describe the expected behavior
AdamaxOptimizer#applyGradients should not rely on the order of the keys being consistent, especially when calling it with a NamedVariableMap, since the order of keys then depends on Object#keys.

Standalone code to reproduce the issue
My use-case requires worker threads, which makes it hard to reproduce. Let me know if you want me to write a reproduction repo for this.
Other info / logs
To explain my use-case, I am training two models in an actor-critic experiment. The gradients for both the actor and the critic are computed within the same call to optimizer#computeGradients. Since Node.js is mono-threaded, I generate the gradients in three distinct worker threads and periodically send them to the main thread to update a centralized copy of the model. For each model taken individually, it appears that their weights are always in the same order; however, sometimes the weights of the actor appear first, sometimes the weights of the critic appear first. This bug always arises at the start of the training, never on the first call to optimizer#applyGradients, which indicates that the order of the gradients is consistent per thread, so the issue only arises when one of the threads has a different order than the others.