Improved Autoregressive Modeling with Distribution Smoothing
Chenlin Meng
Jiaming Song
Yang Song
Shengjia Zhao
Stefano Ermon
Stanford University
Paper | GitHub
Abstract
While autoregressive models excel at image compression, their sample quality is often lacking. Generated images frequently have high likelihood under the model despite being unrealistic, resembling the case of adversarial examples. Inspired by a successful adversarial defense method, we incorporate randomized smoothing into autoregressive generative modeling. We first model a smoothed version of the data distribution, and then reverse the smoothing process to recover the original data distribution. This procedure drastically improves the sample quality of existing autoregressive models on several synthetic and real-world image datasets, while obtaining competitive likelihoods on synthetic datasets.
Paper
arXiv 2103.15089, 2021.
Citation
Chenlin Meng, Jiaming Song, Yang Song, Shengjia Zhao, and Stefano Ermon. "Improved Autoregressive Modeling with Distribution Smoothing". arXiv preprint arXiv:2103.15089, 2021.
Stage 1: Learning the smoothed distribution
Instead of directly modeling the data distribution, we propose to first train an autoregressive model on a smoothed version of the data distribution, obtained by perturbing each data point with Gaussian noise.
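The sketch below illustrates one Stage 1 training step under these assumptions: `ARModel` is a hypothetical PixelCNN-style autoregressive model exposing a `log_prob(x)` method, and `sigma` is the smoothing noise level. This is a minimal illustration of the idea, not the paper's actual implementation.

```python
import torch

def train_smoothed_step(model, optimizer, x, sigma=0.3):
    """One maximum-likelihood step on the Gaussian-smoothed data distribution.

    Instead of maximizing log p(x) on clean data, we maximize log p(x_tilde)
    where x_tilde = x + sigma * eps is a noise-perturbed copy of the data,
    i.e., a sample from the smoothed distribution.
    """
    eps = torch.randn_like(x)
    x_tilde = x + sigma * eps                # sample from the smoothed distribution
    loss = -model.log_prob(x_tilde).mean()   # negative log-likelihood on smoothed data
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Smoothing with Gaussian noise makes the target distribution fuller-supported and less concentrated on a low-dimensional manifold, which is what makes it easier for the autoregressive model to fit.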
Stage 2: Reverse smoothing
Although the smoothing process makes the distribution easier to learn, it also introduces bias. We therefore need an extra step that debiases the learned distribution by reversing the smoothing process.
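A minimal sketch of Stage 2 follows, assuming a hypothetical `denoiser` network that maps a smoothed sample `x_tilde` to per-pixel mean and log-scale parameters of a factorized Gaussian for p(x | x_tilde). The factorized-Gaussian parameterization is one simple choice for illustration; the paper's exact parameterization may differ.

```python
import torch

def train_denoiser_step(denoiser, optimizer, x, sigma=0.3):
    """One training step for the reverse-smoothing model p(x | x_tilde)."""
    x_tilde = x + sigma * torch.randn_like(x)
    mu, log_scale = denoiser(x_tilde)  # per-pixel Gaussian mean and log std
    # Per-pixel Gaussian negative log-likelihood of the clean data
    # (up to an additive constant that does not affect gradients).
    nll = 0.5 * ((x - mu) / log_scale.exp()) ** 2 + log_scale
    loss = nll.sum(dim=tuple(range(1, x.dim()))).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the noise scale is known and the posterior p(x | x_tilde) is sharply concentrated around the clean data point, a simple conditionally independent model per pixel can suffice for this inversion step.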
Conclusion
In this paper, we propose to incorporate randomized smoothing techniques into autoregressive modeling. By choosing the smoothness level appropriately, this seemingly simple approach drastically improves the sample quality of existing autoregressive models on several synthetic and real-world datasets while retaining reasonable likelihoods. Our work provides insights into how recent adversarial defense techniques can be leveraged to build more robust generative models. Since we apply randomized smoothing directly to the target data distribution rather than to the model, we believe our approach is also applicable to other generative models such as variational autoencoders (VAEs) and generative adversarial networks (GANs).
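At sampling time, the two stages compose naturally: draw a smoothed sample from the Stage 1 model, then invert the smoothing with the Stage 2 model. The sketch below reuses the hypothetical `ar_model.sample` and `denoiser` interfaces from the earlier sketches.

```python
import torch

@torch.no_grad()
def sample(ar_model, denoiser, n):
    x_tilde = ar_model.sample(n)       # Stage 1: sample the smoothed distribution
    mu, log_scale = denoiser(x_tilde)  # Stage 2: parameters of p(x | x_tilde)
    # Draw x ~ p(x | x_tilde) from the factorized Gaussian denoising model.
    return mu + log_scale.exp() * torch.randn_like(mu)
```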