Saliency-based editing of character animations using differentiable rendering

Submitted by Marc CHRISTIE on
Team
Date of the beginning of the PhD (if already known)
1 October 2024
Place
IRISA
Laboratory
IRISA - UMR 6074
Description of the subject

Context

The wide availability of 3D animation datasets has considerably eased the task of animators in crafting animated sequences with synthetic characters. Many algorithms have explored how to smoothly blend between existing animations to produce realistic transitions, or how to adapt existing animations to specific requirements (character morphology, avoidance of self-collisions or of collisions with the environment). While such adaptations are central to easing the creative process of animators and to improving real-time character animation, a number of dimensions have been neglected by the research community. One of them is the saliency of the generated motion, as perceived from the camera or by other characters in the environment. As humans, we have learned through social interactions how to alter our gestures and motions to improve how they are perceived, a capability that today's interactive virtual characters still lack. There is therefore a need for animation methods able to generate such interactions, so that each communicative action performed by a character is properly perceived by its interlocutor or by the spectator.

Recent work in our team [1] has explored this topic by (i) expressing the correlation between a number of visual features of a motion as seen from a viewpoint (spatial coverage, speed, visibility) and the degrees of freedom of the animation, and (ii) performing an inverse kinematics optimization to alter the animation so that the intended visual features are satisfied (a minimal sketch follows). The proposed solution remains limited: the visual features are few (and each requires the design of a specific optimization metric), they are only a proxy for visual saliency, which also encompasses contrast, chrominance, and visual motion, and the optimization process remains computationally expensive, with convergence issues due to the non-differentiability of the rendering process.
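
As an illustration only, here is a minimal sketch of such feature-driven pose optimization. The kinematic chain, the spatial-coverage feature, and the weights are toy placeholders and not the actual formulation of [1]; the point is that, without a differentiable renderer, each feature needs a hand-crafted metric and a derivative-free optimizer, which illustrates the cost and convergence issues mentioned above.

import numpy as np
from scipy.optimize import minimize

BONES = np.ones(5)  # bone lengths of a toy 5-joint planar chain (placeholder)

def forward_kinematics(theta):
    # Toy planar chain: joint angles -> 2D joint positions.
    angles = np.cumsum(theta)
    steps = BONES[:, None] * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    return np.vstack([np.zeros((1, 2)), np.cumsum(steps, axis=0)])

def spatial_coverage(theta):
    # Bounding-box area of the joints as seen from the viewpoint
    # (the camera projection is omitted in this toy example).
    p = forward_kinematics(theta)
    return np.prod(p.max(axis=0) - p.min(axis=0))

theta_ref = np.full(5, 0.3)   # pose taken from the original animation
target = 2.0                  # desired value of the visual feature

def objective(theta):
    # Satisfy the intended feature while staying close to the original pose.
    feature_term = (spatial_coverage(theta) - target) ** 2
    fidelity_term = 0.1 * np.sum((theta - theta_ref) ** 2)
    return feature_term + fidelity_term

# Derivative-free search: no gradients are available through a
# conventional (non-differentiable) rendering process.
result = minimize(objective, theta_ref, method="Nelder-Mead")
print("edited pose:", result.x)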

Research Challenges

The main research challenge of this PhD is to propose a more elaborate and principled representation of the correlation between character motions and perceived saliency. Computational saliency techniques [2] have gained traction with the advent of deep-learning techniques, and demonstrate compelling capabilities in estimating the saliency of static and dynamic content. Establishing the correlation between the saliency of a character in a scene (in its context: lighting, occluders, surrounding motions) and the degrees of freedom of that character would enable the computation of gradients with which to optimize the motion (see the sketch below). The problem remains complex, yet recent neural representations such as NeRFs [3], 3D Gaussian splatting [4] or Gaussian frosting [5] open the possibility of differentiable rendering, which would be key to establishing the required correlation. To date, however, only a few representations (e.g., Gaussian frosting) combine geometric representations that can be animated with neural representations that can be rendered.
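
The following sketch illustrates the intended pipeline: a saliency score is backpropagated through a renderer down to the pose parameters. Both the renderer and the saliency estimator are toy differentiable stand-ins (a fixed linear map and an image-contrast score, respectively); in the actual project they would be replaced by a splatting-based differentiable renderer [4, 5] and a deep saliency model [2].

import torch

torch.manual_seed(0)
N_DOF = 24                        # degrees of freedom of the character (placeholder)
W = torch.randn(N_DOF, 64 * 64)   # fixed weights standing in for scene geometry

def render(pose):
    # Toy differentiable "renderer": pose parameters -> 64x64 image.
    # A real version would pose the character's Gaussians and splat them.
    return torch.sigmoid(pose @ W).view(64, 64)

def saliency(image):
    # Toy differentiable "saliency" score (image contrast); a real version
    # would be a pretrained deep saliency network applied to the rendering.
    return image.var()

pose_ref = torch.randn(N_DOF)                 # frame of the original animation
pose = torch.nn.Parameter(pose_ref.clone())   # degrees of freedom to be edited
optimizer = torch.optim.Adam([pose], lr=1e-2)

for step in range(200):
    optimizer.zero_grad()
    image = render(pose)
    # Increase the character's saliency while staying close to the original pose.
    loss = -saliency(image) + 0.1 * (pose - pose_ref).pow(2).sum()
    loss.backward()   # gradients flow from saliency through the renderer to the pose
    optimizer.step()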

Research Objectives

The research objectives are (i) to provide a more principled approach than [1] to establishing the correlation between visual saliency and the degrees of freedom of the character animation, (ii) to design a character animation process that exploits Gaussian frosting to link geometric and neural representations (see [6] and the skinning sketch after this paragraph), (iii) to design modalities linking the neural rendering process with the computational saliency process, and (iv) to optimize existing character animations by controlling relevant visual-saliency features. The quality of the results will be evaluated through both quantitative metrics, whose criteria remain to be defined, and qualitative metrics based on user evaluations (perceived effect, quality of the motion adaptation, degree of realism, etc.).
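
For objective (ii), one plausible building block is to let a conventional skeletal rig drive the splat representation, for instance by applying linear blend skinning to the Gaussian centres, in the spirit of [5, 6]. The sketch below is a hypothetical illustration of that coupling; the function name, shapes, and weights are assumptions, not the method of those papers.

import torch

def skin_gaussian_centres(mu_rest, skin_weights, bone_transforms):
    # mu_rest: (N, 3) Gaussian centres in the rest pose.
    # skin_weights: (N, B) skinning weights (each row sums to 1).
    # bone_transforms: (B, 4, 4) posed bone matrices from the animation.
    # Returns (N, 3) posed centres, differentiable w.r.t. the bone
    # transforms, so rendering gradients can reach the skeleton.
    ones = torch.ones(mu_rest.shape[0], 1)
    mu_h = torch.cat([mu_rest, ones], dim=1)                       # (N, 4)
    per_bone = torch.einsum('bij,nj->nbi', bone_transforms, mu_h)  # (N, B, 4)
    blended = (skin_weights.unsqueeze(-1) * per_bone).sum(dim=1)   # (N, 4)
    return blended[:, :3]

# Example: 1000 Gaussians skinned to a 24-bone skeleton in its rest pose.
mu = torch.randn(1000, 3)
w = torch.softmax(torch.randn(1000, 24), dim=1)
T = torch.eye(4).expand(24, 4, 4)
posed = skin_gaussian_centres(mu, w, T)   # equals mu for identity transforms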

The PhD will be conducted at the IRISA / INRIA Rennes research facility, supervised by a group of researchers from the Virtus team who focus on character animation and on the perceptual features of animations. We will also collaborate with artists from the Dada Animation company within the framework of the ANR Animation Conductor project.

Bibliography

[1] Jovane, A., Raimbaud, P., Zibrek, K., Pacchierotti, C., Christie, M., Hoyet, L., ... & Pettré, J. (2023). Warping character animations using visual motion features. Computers & Graphics, 110, 38-48.

[2] Le Meur, O., Le Callet, P., Barba, D., & Thoreau, D. (2006). A coherent computational approach to model bottom-up visual attention. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(5), 802-817.

[3] Mildenhall, B., Srinivasan, P. P., Tancik, M., Barron, J. T., Ramamoorthi, R., & Ng, R. (2021). NeRF: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1), 99-106.

[4] Kerbl, B., Kopanas, G., Leimkühler, T., & Drettakis, G. (2023). 3D Gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics, 42(4), 1-14.

[5] Guédon, A., & Lepetit, V. (2024). Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering. arXiv preprint arXiv:2403.14554.

[6] Kocabas, M., Chang, J. H. R., Gabriel, J., Tuzel, O., & Ranjan, A. (2023). HUGS: Human Gaussian Splats. arXiv preprint arXiv:2311.17910.

Researchers

Lastname, Firstname
HOYET Ludovic
Type of supervision
Director
Laboratory
IRISA UMR 6074
Team

Lastname, Firstname
CHRISTIE Marc
Type of supervision
Supervisor (optional)
Laboratory
IRISA UMR 6074
Team
Contacts
Name
HOYET Ludovic
Email
marc.christie@irisa.fr
Phone
0650012922
Name
CHRISTIE Marc
Email
marc.christie@irisa.fr
Phone
0650012922
Keywords
Computer graphics, 3D character animation, deep learning, Gaussian frosting