Thanks for your work! I'm confused about the batch normalization layer used in the discriminator of wgan_gp. I think there shouldn't be any batch normalization layer in the discriminator.
My reasoning is that the gradient penalty term directly optimizes the gradient norm at each vector sampled between the data distribution and the generated distribution. Each sampled vector has its own gradient, independent of all the other vectors, so the gradient penalty must be computed separately w.r.t. each sampled vector. If batch normalization is applied in the discriminator, the samples in a batch become coupled, and the region constrained to be 1-Lipschitz would, I guess, end up being somewhere other than "the region between the data distribution and the generated distribution".
I'm not sure whether I have misinterpreted the idea of the WGAN-GP paper. A rough sketch of what I mean is below.
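For illustration only, here is a minimal PyTorch-style sketch of how I understand the penalty to be computed, assuming a discriminator `D` that takes 4D image tensors (the names `gradient_penalty`, `real`, `fake` are just my own placeholders, not from this repo):

```python
import torch

def gradient_penalty(D, real, fake, device="cpu"):
    """Sketch of the WGAN-GP gradient penalty.

    Each interpolated sample gets its own gradient norm, so the penalty is
    inherently per-sample; batch norm in D would couple samples within a
    batch and blur this per-sample 1-Lipschitz constraint.
    """
    batch_size = real.size(0)
    # One random interpolation coefficient per sample (broadcast over C, H, W)
    eps = torch.rand(batch_size, 1, 1, 1, device=device)
    interpolated = eps * real + (1 - eps) * fake
    interpolated.requires_grad_(True)

    scores = D(interpolated)

    # Gradient of D's output w.r.t. each interpolated input
    grads = torch.autograd.grad(
        outputs=scores,
        inputs=interpolated,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,
        retain_graph=True,
    )[0]

    # Per-sample gradient norm, penalized toward 1
    grads = grads.view(batch_size, -1)
    return ((grads.norm(2, dim=1) - 1) ** 2).mean()
```

As far as I can tell, this per-sample view is why the WGAN-GP paper suggests layer normalization (or no normalization) in the critic instead of batch normalization.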