Both of our memorable image generators use the same loss function: the Wasserstein metric combined with a term that measures the difference between the desired and predicted memorability of a generated image. This training mechanism works for both single-score and VMS-map memorability training examples.
The loss function is designed to embed a memorability predictor and contains the following components: a generator network $G$, a discriminator $D$, and a memorability predictor network $M$. Considering the latent code distribution $p_z$, target VMS distribution $p_t$, real image distribution $p_r$, predicted VMS distribution $p_m$, and generated image distribution $p_g$ based upon the latent code $z$ and target VMS map $t$, we define the loss function in Eq. (3). The latent code $z$ is drawn from a Gaussian distribution and $t$ from a distribution of target VMS maps, where the height, width, and intensity of VMS regions are drawn from a uniform distribution. $\mathcal{L}_{GP}$ refers to the gradient penalty loss in [37]. $\gamma$ controls the strength of the memorability loss. $D(G(z,t))$ represents the probability of the generated data and $D(x)$ is the probability of the real data. The additional term controlled by the hyperparameter $\lambda$ prevents the gradients inside the discriminator from violating Lipschitz continuity, whereas the first two terms evaluate the Earth-Mover distance between the generated and real distributions. The additional memorability loss, combined with the Wasserstein loss, constrains image generation by both 'realness' and memorability simultaneously.
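A minimal PyTorch-style sketch of how such a combined objective could be assembled is shown below. It is not the authors' implementation: the names `G`, `D`, `M`, `lambda_gp`, and `lambda_mem` are hypothetical stand-ins for the generator, discriminator, memorability predictor, and the two weighting hyperparameters described above, and the mean-squared error between predicted and target memorability is an assumed form of the memorability term.

```python
# Sketch of a WGAN-GP critic loss plus a memorability penalty.
# G, D, and M are assumed to be pre-built nn.Module instances;
# lambda_gp and lambda_mem are hypothetical hyperparameter names.
import torch
import torch.nn.functional as F


def gradient_penalty(D, real, fake):
    """Gradient penalty enforcing a soft Lipschitz constraint on D."""
    fake = fake.detach()
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    d_interp = D(interp)
    grads = torch.autograd.grad(
        outputs=d_interp, inputs=interp,
        grad_outputs=torch.ones_like(d_interp),
        create_graph=True, retain_graph=True)[0]
    return ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()


def discriminator_loss(D, real, fake, lambda_gp=10.0):
    """Earth-Mover estimate plus gradient penalty (WGAN-GP critic loss)."""
    fake = fake.detach()
    return D(fake).mean() - D(real).mean() + lambda_gp * gradient_penalty(D, real, fake)


def generator_loss(D, M, fake, target_mem, lambda_mem=1.0):
    """Adversarial term plus a penalty on the gap between the predicted
    and target memorability (single score or VMS map)."""
    mem_loss = F.mse_loss(M(fake), target_mem)
    return -D(fake).mean() + lambda_mem * mem_loss
```

In a training loop, `fake = G(z, t)` would be generated from a sampled latent code and target VMS map, the critic updated with `discriminator_loss`, and the generator updated with `generator_loss`, so that realism and memorability are optimised jointly.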