Thanks for your work! I'm confused about the batch normalization layer used in the discriminator of wgan_gp. I think there shouldn't be any batch normalization layer in the discriminator.
My reasoning is that the gradient penalty term directly optimizes the gradient norm at each vector sampled between the data distribution and the generated distribution. Each sampled vector has its own gradient, independent of all the other vectors, so the gradient penalty must be computed separately w.r.t. each sampled vector. If batch normalization is applied in the discriminator, the samples in a batch become coupled, and the region constrained to be 1-Lipschitz would, I guess, end up being somewhere other than "the region between the data distribution and the generated distribution".
I'm not sure whether I have misinterpreted the idea of the WGAN-GP paper. A rough sketch of what I mean is below.
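For illustration only, here is a minimal PyTorch-style sketch of how I understand the penalty to be computed, assuming a discriminator `D` that takes 4D image tensors (the names `gradient_penalty`, `real`, `fake` are just my own placeholders, not from this repo):

```python
import torch

def gradient_penalty(D, real, fake, device="cpu"):
    """Sketch of the WGAN-GP gradient penalty.

    Each interpolated sample gets its own gradient norm, so the penalty is
    inherently per-sample; batch norm in D would couple samples within a
    batch and blur this per-sample 1-Lipschitz constraint.
    """
    batch_size = real.size(0)
    # One random interpolation coefficient per sample (broadcast over C, H, W)
    eps = torch.rand(batch_size, 1, 1, 1, device=device)
    interpolated = eps * real + (1 - eps) * fake
    interpolated.requires_grad_(True)

    scores = D(interpolated)

    # Gradient of D's output w.r.t. each interpolated input
    grads = torch.autograd.grad(
        outputs=scores,
        inputs=interpolated,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,
        retain_graph=True,
    )[0]

    # Per-sample gradient norm, penalized toward 1
    grads = grads.view(batch_size, -1)
    return ((grads.norm(2, dim=1) - 1) ** 2).mean()
```

As far as I can tell, this per-sample view is why the WGAN-GP paper suggests layer normalization (or no normalization) in the critic instead of batch normalization.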