# Diffusion Process

Diffusion is the movement of a substance from a region of high concentration to a region of low concentration without bulk motion.

1. Forward Process
2. Reverse Process
   - Loss Function
   - Model Architecture
3. Train and Sample

## Forward Process

Each forward step adds a small amount of Gaussian noise, with variance $\beta_t$, to the previous sample:

$$q\left(\mathbf{x}_t \mid \mathbf{x}_{t-1}\right)=\mathcal{N}\left(\mathbf{x}_t ; \sqrt{1-\beta_t} \mathbf{x}_{t-1}, \beta_t \mathbf{I}\right) \tag{1}$$

Equivalently, in sampling form:

$$\mathbf{x}_t =\sqrt{1-\beta_t} \mathbf{x}_{t-1}+\sqrt{\beta_t} \epsilon, \quad \epsilon \sim \mathcal{N}(0, \mathbf{I}) \tag{2}$$
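As a minimal sketch of Eq. (2), the snippet below applies one noising step to a toy four-pixel "image" using only the Python standard library. The function name `forward_step`, the value `beta_t=0.02`, and the toy input are illustrative assumptions, not part of the original notes.

```python
import math
import random


def forward_step(x_prev, beta_t, rng):
    """One noising step, Eq. (2): x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps."""
    keep = math.sqrt(1.0 - beta_t)   # scale applied to the previous sample
    noise = math.sqrt(beta_t)        # standard deviation of the injected noise
    return [keep * v + noise * rng.gauss(0.0, 1.0) for v in x_prev]


rng = random.Random(0)               # seeded only so the sketch is repeatable
x0 = [0.8, -0.3, 0.1, 0.5]           # toy "image" of four pixels (assumed data)
x1 = forward_step(x0, beta_t=0.02, rng=rng)
```

With `beta_t = 0` the step returns its input unchanged, and with `beta_t = 1` it returns pure Gaussian noise, which are the two extremes of Eq. (2).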

Rewrite the $\beta_t$ term in Eq. (2) by defining:

$$\begin{aligned} \alpha_t & =1-\beta_t \\ \bar{\alpha}_t & =\prod_{i=1}^t \alpha_i \end{aligned}$$

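Applying Eq. (2) recursively and merging the independent Gaussian noise terms allows $\mathbf{x}_t$ to be sampled directly from $\mathbf{x}_0$:

$$\mathbf{x}_t =\sqrt{\bar{\alpha}_t} \mathbf{x}_0+\sqrt{1-\bar{\alpha}_t} \epsilon, \quad \epsilon \sim \mathcal{N}(0, \mathbf{I}) \tag{3}$$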
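The sketch below computes $\bar{\alpha}_t$ from a noise schedule and samples $\mathbf{x}_t$ in one shot via Eq. (3). The notes do not specify a schedule, so the linear schedule from $10^{-4}$ to $0.02$ over $T = 1000$ steps is an assumption here, as are the helper names `linear_betas`, `cumulative_alphas`, and `q_sample`. The loop in the middle numerically checks that the noise variance accumulated by repeating Eq. (2) matches the $1-\bar{\alpha}_t$ that appears in Eq. (3).

```python
import math
import random


def linear_betas(T, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced beta_1..beta_T (an assumed schedule, not given in the notes)."""
    if T == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + i * step for i in range(T)]


def cumulative_alphas(betas):
    """alpha_bar_t = prod_{i<=t} (1 - beta_i); index 0 holds alpha_bar_1."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


def q_sample(x0, alpha_bar_t, rng):
    """Eq. (3): draw x_t directly from x_0 in a single step."""
    signal = math.sqrt(alpha_bar_t)
    noise = math.sqrt(1.0 - alpha_bar_t)
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]


betas = linear_betas(T=1000)
alpha_bars = cumulative_alphas(betas)

# Iterating Eq. (2) gives noise variance var_t = (1 - beta_t) * var_{t-1} + beta_t.
# Track the largest deviation from the closed-form value 1 - alpha_bar_t.
var = 0.0
max_gap = 0.0
for beta, a_bar in zip(betas, alpha_bars):
    var = (1.0 - beta) * var + beta
    max_gap = max(max_gap, abs(var - (1.0 - a_bar)))

rng = random.Random(0)
x_t = q_sample([0.8, -0.3, 0.1, 0.5], alpha_bars[499], rng)  # t = 500
```

Because $\bar{\alpha}_t$ shrinks toward zero as $t$ grows, the signal term in Eq. (3) fades and $\mathbf{x}_t$ approaches pure Gaussian noise.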

## Reverse Process

Once the forward process has finished, $\mathbf{x}_T$ can be treated as essentially a pure-noise image whose pixels follow a standard Gaussian distribution:

$$p(\mathbf{x}_T) = \mathcal{N}(\mathbf{x}_T; 0, \mathbf{I})$$

The reverse process starts from $p(\mathbf{x}_T)$ and denoises step by step through learned Gaussian transitions:

$$\begin{gathered}p_\theta\left(\mathbf{x}_0\right):=\int p_\theta\left(\mathbf{x}_{0: T}\right) d \mathbf{x}_{1: T} \\ p_\theta\left(\mathbf{x}_{0: T}\right):=p\left(\mathbf{x}_T\right) \prod_{t=1}^T p_\theta\left(\mathbf{x}_{t-1} \mid \mathbf{x}_t\right), \quad p(\mathbf{x}_T) = \mathcal{N}(\mathbf{x}_T;0, \mathbf{I}) \\ p_\theta\left(\mathbf{x}_{t-1} \mid \mathbf{x}_t\right):=\mathcal{N}\left(\mathbf{x}_{t-1} ; \boldsymbol{\mu}_\theta\left(\mathbf{x}_t, t\right), \boldsymbol{\Sigma}_\theta\left(\mathbf{x}_t, t\right)\right) \end{gathered}\tag{4}$$
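The chain in Eq. (4) can be sampled by drawing $\mathbf{x}_T$ from the prior and then repeatedly drawing $\mathbf{x}_{t-1}$ from the Gaussian transition. The following is a rough sketch under stated assumptions: the covariance is taken to be diagonal with a single standard deviation per step, and `toy_mu` and `toy_sigma` are hypothetical placeholders standing in for a trained $\boldsymbol{\mu}_\theta$ and $\boldsymbol{\Sigma}_\theta$, not anything defined in these notes.

```python
import random


def p_sample_loop(mu_fn, sigma_fn, T, dim, rng):
    """Ancestral sampling through the chain in Eq. (4), assuming a diagonal covariance."""
    # Start from the prior p(x_T) = N(0, I).
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in range(T, 0, -1):
        mean = mu_fn(x, t)       # stands in for mu_theta(x_t, t)
        std = sigma_fn(t)        # per-step std; Sigma_theta = std**2 * I (assumption)
        x = [m + std * rng.gauss(0.0, 1.0) for m in mean]
    return x                     # a sample of x_0


# Hypothetical placeholder for a trained mean network: shrink toward zero.
def toy_mu(x, t):
    return [0.9 * v for v in x]


# Hypothetical placeholder for the step noise: small and fixed, none on the last step.
def toy_sigma(t):
    return 0.05 if t > 1 else 0.0


sample = p_sample_loop(toy_mu, toy_sigma, T=50, dim=4, rng=random.Random(0))
```

In a real model, `mu_fn` would be computed from the network output; the loop structure itself follows directly from the product in Eq. (4).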

### Loss Function

The network $\epsilon_\theta$ is trained to predict the noise $\epsilon$ that Eq. (3) mixes into $\mathbf{x}_0$:

$$L = \left\|\epsilon-\epsilon_\theta\left(\sqrt{\bar{\alpha}_t} \mathbf{x}_0+\sqrt{1-\bar{\alpha}_t} \epsilon, t\right)\right\|^2 \tag{5}$$
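To make Eq. (5) concrete, here is a single-sample sketch of the loss: draw noise, build $\mathbf{x}_t$ with Eq. (3), and measure the squared distance between the true noise and the prediction. The names `noise_prediction_loss` and `zero_model` are hypothetical, `zero_model` is only a stand-in for a real network, and `alpha_bar_t=0.08` is an arbitrary illustrative value.

```python
import math
import random


def noise_prediction_loss(x0, t, alpha_bar_t, eps_model, rng):
    """Single-sample Eq. (5): squared L2 distance between true and predicted noise."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    signal = math.sqrt(alpha_bar_t)
    noise = math.sqrt(1.0 - alpha_bar_t)
    x_t = [signal * v + noise * e for v, e in zip(x0, eps)]   # Eq. (3)
    eps_hat = eps_model(x_t, t)                               # network prediction
    return sum((e - p) ** 2 for e, p in zip(eps, eps_hat))


# Hypothetical stand-in for the network: always predicts zero noise.
def zero_model(x_t, t):
    return [0.0 for _ in x_t]


x0 = [0.8, -0.3, 0.1, 0.5]           # toy data point (assumed)
loss = noise_prediction_loss(
    x0, t=500, alpha_bar_t=0.08, eps_model=zero_model, rng=random.Random(0)
)
```

In training, this quantity would typically be averaged over many data points, time steps, and noise draws; a model that recovers the noise exactly drives it to zero.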

### Model Architecture

The UNet used in DDPM extends the original UNet by adding modules such as attention and residual connections to each block.
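The toy sketch below illustrates only the two ideas named above, a residual connection and self-attention, on a tiny list of feature vectors. It is not the DDPM UNet: convolutions, normalization, time embeddings, and down/upsampling are all omitted, the attention uses identity query/key/value projections as a simplification, and every function name here is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections (a simplification)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def residual(fn, tokens):
    """Residual connection: output = input + fn(input)."""
    return [[x + y for x, y in zip(row, frow)] for row, frow in zip(tokens, fn(tokens))]


def pointwise(tokens):
    """Placeholder for a learned layer: elementwise tanh."""
    return [[math.tanh(v) for v in row] for row in tokens]


def block(tokens):
    """Toy block: a residual pointwise layer followed by residual self-attention."""
    hidden = residual(pointwise, tokens)
    return residual(self_attention, hidden)


features = [[0.1, 0.2], [0.0, -0.4], [0.3, 0.3]]   # 3 spatial positions, 2 channels
out = block(features)
```

The residual wrapper keeps the input on a skip path, so each sub-layer only has to learn a correction, while attention lets every spatial position mix information from all the others.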