FRAME Revisited: An Interpretation View Based on Particle Evolution

Posted by: 吕梅

Paper

Xu Cai, Yang Wu, Guanbin Li*, Ziliang Chen, Liang Lin, “FRAME Revisited: An Interpretation View Based on Particle Evolution”, Proc. of AAAI Conference on Artificial Intelligence (AAAI), 2019. (camera ready) Slides Code Paper

Abstract

FRAME (Filters, Random fields, And Maximum Entropy)[1] is an energy-based descriptive model that synthesizes realistic visual signals by capturing the mutual patterns of structured input signals. It is trained by maximum likelihood estimation (MLE) by default, which in practice often leads to unstable training energy that wrecks the generated structures, a phenomenon that has remained unexplained. In this paper, we provide a new theoretical insight into FRAME from the perspective of particle physics, ascribing this phenomenon to the KL-vanishing issue. To stabilize the energy dissipation, we propose an alternative based on the Wasserstein distance in discrete time, building on the conclusion that the Jordan-Kinderlehrer-Otto (JKO)[2] discrete flow approximates the KL discrete flow as the time step size tends to 0. This metric also preserves the model’s statistical consistency. Quantitative and qualitative experiments on several widely used datasets demonstrate the effectiveness and superiority of our method.
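For reference, the JKO scheme cited in the abstract can be written out as below. This is a sketch in the notation of Jordan, Kinderlehrer and Otto [2], not necessarily the notation of the paper itself: the potential \Psi, the inverse temperature \beta, the step size \tau, and the stationary density \rho_\infty are symbols assumed here for illustration. The second line shows why the free energy being minimized is a KL divergence up to scaling and an additive constant, which is the link between the Wasserstein step and a KL-based objective.

```latex
% Fokker-Planck equation with potential \Psi and inverse temperature \beta
\frac{\partial \rho}{\partial t}
  = \nabla \cdot \left( \rho \, \nabla \Psi \right) + \beta^{-1} \Delta \rho

% Free energy, and its relation to the KL divergence from the
% stationary density \rho_\infty = Z^{-1} e^{-\beta \Psi}
F(\rho) = \int \Psi \, \rho \, dx + \beta^{-1} \int \rho \log \rho \, dx
        = \beta^{-1} \, \mathrm{KL}(\rho \,\|\, \rho_\infty) - \beta^{-1} \log Z

% One JKO step of size \tau, starting from the current density \rho_k
\rho_{k+1} = \arg\min_{\rho}
  \left\{ \frac{1}{2\tau} W_2^2(\rho, \rho_k) + F(\rho) \right\}
```

As \tau tends to 0, the piecewise-constant interpolation of the iterates \rho_k converges to the solution of the Fokker-Planck equation, which is the result established in [2].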

Motivations

Extensive experiments reveal that FRAME tends to generate inferior synthesized images and is often hard to train to convergence. The synthesized images deteriorate seriously along with the model energy. This phenomenon is caused by KL-vanishing in the stepwise parameter estimation of the model, which arises from the large disparity in filter responses between the model distribution and the real data distribution. Moreover, in the discrete-time setting of the actual iterative training process, the dissipation of the model energy may become considerably unstable, and the stepwise minimization scheme may suffer from a serious KL-vanishing issue during parameter estimation.

[Figure: 30_1]

To tackle the above shortcomings, we first investigate this model from a particle perspective by regarding all the observed signals as Brownian particles (a precondition of the KL discrete flow), which helps explain why the FRAME model collapses. We then examine the model in the discrete-time setting and translate its learning mechanism from the KL discrete flow into the Jordan-Kinderlehrer-Otto (JKO) discrete flow, a procedure for finding time-discrete approximations to solutions of diffusion equations in Wasserstein space.
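As a toy illustration of the particle perspective described above, the sketch below evolves a cloud of Brownian particles with the Euler-Maruyama discretization of Langevin dynamics, dX = -∇Ψ(X) dt + sqrt(2/β) dW, whose density follows a Fokker-Planck (diffusion) equation. Everything specific here is an assumption made for illustration: the quadratic potential Ψ(x) = x²/2 stands in for a real energy function, and the names `grad_potential` and `evolve_particles` are hypothetical. It is not the FRAME model or the paper's training procedure.

```python
import numpy as np


def grad_potential(x):
    """Gradient of the toy potential Psi(x) = x**2 / 2 (assumed, not FRAME's energy)."""
    return x


def evolve_particles(x0, step, n_steps, beta=1.0, seed=0):
    """Evolve particles with Euler-Maruyama steps of Langevin dynamics.

    Returns the final particle positions and the mean potential energy
    recorded after every step.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    mean_energy = []
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Drift down the potential gradient plus Brownian noise.
        x = x - step * grad_potential(x) + np.sqrt(2.0 * step / beta) * noise
        mean_energy.append(float(np.mean(0.5 * x**2)))
    return x, mean_energy


if __name__ == "__main__":
    start = np.full(5000, 4.0)  # all particles start far from equilibrium
    particles, energy = evolve_particles(start, step=0.01, n_steps=2000)
    print(f"mean={particles.mean():.3f}, var={particles.var():.3f}")
    print(f"first energy={energy[0]:.3f}, last energy={energy[-1]:.3f}")
```

With a small step size the particle cloud relaxes toward the stationary density proportional to exp(-βΨ), here approximately a standard normal, and the mean energy dissipates from its initial value.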

Model Collapse Identification

We confirm the existence of model collapse on a subset of SUN[5].

[Figure: 30_2]

Quantitative and Qualitative Experimental Results

On the large-scale datasets CelebA[3] and LSUN-Bedroom[4]

[Figure: 30_3]

On the CIFAR-10 dataset

[Figure: 30_4]

Quantitative results on the three datasets above are reported below. The table gives the Inception scores of the compared generative models on CIFAR-10.

[Figures: 30_5, 30_6]
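For readers unfamiliar with the metric, the Inception score is commonly defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's label distribution for a generated image x and p(y) is the average of those distributions over all generated images. The sketch below computes it from an array of class probabilities. It is a minimal illustration only: the function name `inception_score` and the `eps` smoothing constant are assumptions, the probabilities are supplied directly rather than produced by a pretrained Inception network, and the averaging over several splits that is often used in practice is omitted. It is not the evaluation code behind the reported numbers.

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """Inception score from an (N, K) array of class probabilities.

    Each row is assumed to be p(y|x) for one generated image, i.e. it is
    non-negative and sums to 1. `eps` only guards against log(0).
    """
    probs = np.asarray(probs, dtype=float)
    # Marginal label distribution p(y), averaged over all images.
    marginal = probs.mean(axis=0, keepdims=True)
    # KL(p(y|x) || p(y)) for every image.
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    return float(np.exp(kl.mean()))


if __name__ == "__main__":
    uniform = np.full((100, 10), 0.1)  # classifier is maximally unsure
    confident = np.eye(10)             # confident and evenly spread over classes
    print(inception_score(uniform), inception_score(confident))
```

The score is lowest (1) when every prediction equals the marginal, and it reaches the number of classes K when predictions are one-hot and spread evenly across all K classes, so higher values indicate samples that are both recognizable and diverse.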

References

[1] Zhu, S. C.; Wu, Y. N.; and Mumford, D. 1997. Minimax entropy principle and its application to texture modeling. Neural Computation 9(8):1627–1660.

[2] Jordan, R.; Kinderlehrer, D.; and Otto, F. 1998. The variational formulation of the Fokker–Planck equation. SIAM Journal on Mathematical Analysis 29(1):1–17.

[3] Liu, Z.; Luo, P.; Wang, X.; and Tang, X. 2015. Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision, 3730–3738.

[4] Yu, F.; Zhang, Y.; Song, S.; Seff, A.; and Xiao, J. 2015. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365.

[5] Xiao, J.; Hays, J.; Ehinger, K. A.; Oliva, A.; and Torralba, A. 2010. SUN database: Large-scale scene recognition from abbey to zoo. In 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3485–3492. IEEE.