We introduce a method for human image animation that uses the SMPL 3D parametric human model within a latent diffusion framework to improve shape alignment and motion guidance. Several guidance maps, together with skeleton-based motion guidance, condition the model on detailed 3D shape and pose attributes; a multi-layer motion fusion module with self-attention fuses these conditions.
ECCV 2024
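To make the fusion step described in the abstract more concrete, here is a minimal NumPy sketch of one way to combine several guidance feature maps with self-attention. It is an illustration under assumptions, not the paper's implementation: the names `map_0`, `map_1`, and `skeleton`, the single-head attention, the residual connection, and the summation across guidance types are all hypothetical choices, and the sketch covers a single feature resolution rather than multiple layers.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head self-attention over tokens of shape (N, C)."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def fuse_guidance(guidance_maps, params):
    """Fuse guidance feature maps (each of shape (H, W, C)) into one (H, W, C) map.

    Assumption: each guidance type gets its own spatial self-attention,
    and the attended features are summed. The source does not specify this.
    """
    fused = None
    for name, fmap in guidance_maps.items():
        h, w, c = fmap.shape
        tokens = fmap.reshape(h * w, c)           # flatten spatial positions
        attended = tokens + self_attention(tokens, *params[name])  # residual
        attended = attended.reshape(h, w, c)
        fused = attended if fused is None else fused + attended
    return fused

# Hypothetical setup: three guidance types with random features and weights.
rng = np.random.default_rng(0)
H, W, C = 8, 8, 16
names = ["map_0", "map_1", "skeleton"]  # placeholder guidance names
guidance_maps = {n: rng.standard_normal((H, W, C)) for n in names}
params = {n: tuple(rng.standard_normal((C, C)) * 0.1 for _ in range(3))
          for n in names}

fused = fuse_guidance(guidance_maps, params)
print(fused.shape)
```

In a latent diffusion setting, a fused map like this would typically be added to, or concatenated with, the denoiser's input features; how the conditions are actually injected is not stated in the text above.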