DreamFusion (SDS loss)

DreamFusion: Text-to-3D using 2D Diffusion (ICLR 2023)

Ben Poole, Ajay Jain, Jonathan T. Barron, Ben Mildenhall

paper: https://arxiv.org/abs/2209.14988
project website: https://dreamfusion3d.github.io/
PyTorch code: https://github.com/ashawkey/stable-dreamfusion

Contribution

Overview

Random camera, light sampling
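
At each iteration, a camera pose and a point light are sampled at random so the NeRF is supervised from many viewpoints. A minimal NumPy sketch, assuming spherical coordinates around the object (the ranges roughly follow the paper's description and are illustrative, not exact):

import numpy as np

def sample_camera_and_light():
    # Sample a camera on a sphere around the object.
    azimuth = np.random.uniform(0.0, 2 * np.pi)
    elevation = np.random.uniform(np.deg2rad(-10.0), np.deg2rad(90.0))
    radius = np.random.uniform(1.0, 1.5)
    cam_pos = radius * np.array([
        np.cos(elevation) * np.cos(azimuth),
        np.cos(elevation) * np.sin(azimuth),
        np.sin(elevation),
    ])
    # Point light sampled near the camera so shading reveals geometry.
    light_pos = cam_pos + np.random.normal(scale=0.2, size=3)
    return cam_pos, light_pos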

NeRF Rendering with shading

albedo: the color predicted by the NeRF (before shading)
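
A sketch of the diffuse (Lambertian) shading step, assuming the NeRF also provides a surface normal at each point; the function name and the ambient/light values here are illustrative:

import numpy as np

def shade_lambertian(albedo, normal, point, light_pos,
                     light_color=1.0, ambient=0.1):
    # albedo: color predicted by the NeRF; normal: unit surface normal.
    light_dir = light_pos - point
    light_dir = light_dir / np.linalg.norm(light_dir)
    diffuse = max(0.0, float(normal @ light_dir))  # Lambertian term
    # DreamFusion also randomly replaces albedo with white ("textureless"
    # shading), which forces the geometry, not the texture, to explain
    # the prompt.
    return albedo * (ambient + light_color * diffuse)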

Diffusion loss with conditioning
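
The text conditioning uses classifier-free guidance with an unusually large guidance weight (the paper uses $\omega = 100$), trading diversity for fidelity:

$\hat{\epsilon}_\phi(z_t; y, t) = \epsilon_\phi(z_t; t) + \omega\,\big(\epsilon_\phi(z_t; y, t) - \epsilon_\phi(z_t; t)\big)$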

Sample in Parameter Space, not Pixel Space
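
That is, the frozen diffusion model only scores a rendered image $x = g(\theta)$; the loss gradient is pushed back through the differentiable renderer $g$ into the NeRF parameters $\theta$:

$\theta^{*} = \arg\min_{\theta}\ \mathcal{L}_{\mathrm{SDS}}\big(x = g(\theta);\, y\big)$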

Optimization

Rendering

Structure

Geometry Regularizer
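
DreamFusion adopts the orientation regularizer from Ref-NeRF, which penalizes normals that face away from the camera (a sketch of the penalty, with $w_i$ the rendering weights along a ray and $\mathbf{v}$ the view direction):

$\mathcal{L}_{\mathrm{orient}} = \sum_i \operatorname{stop\_grad}(w_i)\, \max(0,\ \mathbf{n}_i \cdot \mathbf{v})^2$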

SDS Loss

Simple Derivation of SDS Loss

x is the image rendered by the NeRF, and y is the text embedding vector.
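
With this notation, the SDS gradient that updates the NeRF parameters $\theta$ is (note that the U-Net Jacobian is omitted):

$\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\phi, x = g(\theta)) = \mathbb{E}_{t,\epsilon}\!\left[\, w(t)\,\big(\hat{\epsilon}_\phi(z_t; y, t) - \epsilon\big)\,\frac{\partial x}{\partial \theta} \,\right]$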

Derivation of SDS Loss

Pseudo Code

params = generator.init()                           # NeRF parameters
opt_state = optimizer.init(params)                  # optimizer state
diffusion_model = diffusion.load_model()            # frozen Imagen diffusion model
for i in range(num_iters):
  t = random.uniform(0., 1.)                        # noise level (time step)
  alpha_t, sigma_t = diffusion_model.get_coeffs(t)  # mean/std coefficients for noisy z_t
  eps = random.normal(img_shape)                    # Gaussian noise (epsilon)
  x = generator(params, ...)                        # NeRF-rendered image
  z_t = alpha_t * x + sigma_t * eps                 # noised NeRF image
  epshat_t = diffusion_model.epshat(z_t, y, t)      # denoising U-Net prediction
  # SDS gradient: stop_gradient because the diffusion model stays frozen
  # and the U-Net Jacobian is dropped.
  g = grad(weight(t) * dot(stop_gradient(epshat_t - eps), x), params)
  params, opt_state = optimizer.update(g, opt_state)  # update NeRF parameters
return params
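
In PyTorch (e.g., in the stable-dreamfusion repo linked above), the same update is typically implemented as a surrogate loss whose gradient with respect to x equals w(t)(ε̂ − ε). A minimal sketch, where diffusion_eps, alphas, sigmas, and w are hypothetical placeholders for the frozen U-Net and its precomputed noise schedule:

import torch

def sds_surrogate_loss(x, y, diffusion_eps, alphas, sigmas, w):
    # x: rendered image with requires_grad=True; y: text embedding.
    t = torch.randint(0, len(alphas), (1,), device=x.device)
    eps = torch.randn_like(x)
    z_t = alphas[t] * x + sigmas[t] * eps
    with torch.no_grad():                   # diffusion model stays frozen
        eps_hat = diffusion_eps(z_t, y, t)
    grad = w[t] * (eps_hat - eps)
    # d/dx of (grad.detach() * x).sum() equals grad, so backprop injects
    # exactly the SDS gradient into the NeRF parameters without
    # differentiating through the U-Net.
    return (grad.detach() * x).sum()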

Experiment

Metric

Result

A high CLIP R-Precision score on the Geo(metry) renders (textureless, shading only) means the model captured actual 3D shape information rather than producing a flat model with the texture painted on!

Ablation Study

Limitation

Latest Papers

Question