🧀 BigCheese.ai


Diffusion Is Spectral Autoregression


Researcher Sander Dieleman explores the close relationship between diffusion models and autoregressive models in image generation, arguing that diffusion models of images perform approximate autoregression in the frequency domain. The article also touches on the implications for audio and language modelling, and speculates on the future of generative models for multimodal data.

  • Dieleman is a Research Scientist at Google DeepMind.
  • Diffusion models of images perform approximate autoregression in the frequency domain.
  • A Python notebook is available for reproducing the blog post's experiments.
  • The power spectra of natural images tend to follow a power law: power falls off steadily as spatial frequency increases.
  • The spectral autoregression interpretation doesn't quite carry over to audio waveforms.
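
The power-law and frequency-domain points above can be illustrated with a small NumPy sketch. This is not the code from the blog's notebook: the function names are hypothetical, the "image" is a synthetic stand-in made by filtering white noise, and the spectral exponent of 2 is an assumption chosen for illustration. It estimates the radially averaged power spectrum, fits its log-log slope, and then finds, for several noise levels, the lowest spatial frequency at which white Gaussian noise overpowers the signal.

```python
import numpy as np


def synthetic_power_law_image(n, exponent, rng):
    """Hypothetical stand-in for a natural image: white noise filtered
    so that its power spectrum falls off roughly as f ** (-exponent)."""
    fy = np.fft.fftfreq(n)[:, None]
    fx = np.fft.fftfreq(n)[None, :]
    f = np.hypot(fy, fx)
    f[0, 0] = 1.0                      # avoid dividing by zero at DC
    amplitude = f ** (-exponent / 2)   # power is amplitude squared
    amplitude[0, 0] = 0.0              # remove the mean
    white = np.fft.fft2(rng.standard_normal((n, n)))
    image = np.real(np.fft.ifft2(white * amplitude))
    return image / image.std()         # normalise to unit variance


def radial_power_spectrum(image):
    """Radially averaged power spectrum of a square 2-D array.
    Entry i holds the mean power at integer radius i + 1 (DC is dropped)."""
    n = image.shape[0]
    # Dividing by image.size makes white noise of variance s**2 flat at s**2.
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2 / image.size
    y, x = np.indices(image.shape)
    radius = np.hypot(y - n // 2, x - n // 2).astype(int)
    sums = np.bincount(radius.ravel(), weights=power.ravel())
    counts = np.bincount(radius.ravel())
    return sums[1 : n // 2] / counts[1 : n // 2]


def first_noisy_frequency(signal_spectrum, noise_std):
    """Lowest integer radius at which white-noise power exceeds signal power,
    or None if the signal dominates at every measured frequency."""
    below = np.nonzero(signal_spectrum < noise_std ** 2)[0]
    return int(below[0]) + 1 if below.size else None


rng = np.random.default_rng(0)
n = 128
image = synthetic_power_law_image(n, exponent=2.0, rng=rng)
spectrum = radial_power_spectrum(image)

# Fit a line in log-log space; the lowest few radii are skipped because
# they average over very few frequency bins.
radii = np.arange(1, n // 2)
mask = radii >= 4
slope = np.polyfit(np.log(radii[mask]), np.log(spectrum[mask]), 1)[0]
print("fitted log-log slope:", slope)

# As the noise level grows, the crossover moves to lower frequencies.
noise_levels = (0.5, 1.0, 2.0, 4.0)
cutoffs = [first_noisy_frequency(spectrum, s) for s in noise_levels]
for s, c in zip(noise_levels, cutoffs):
    print(f"noise std {s}: noise dominates from radius {c}")
```

Because the signal spectrum decays while white noise is flat, increasing the noise level pushes the crossover frequency downward. Read in reverse, a denoising process recovers low frequencies first and high frequencies last, which is the sense in which the summary describes diffusion as approximately autoregressive in the frequency domain.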