DReCon: Data-Driven responsive Control of Physics-Based Characters

Blog, Technology

Physics-based animation holds the promise of unlocking unprecedented levels of interaction, fidelity, and variety in games. The intricate interactions between a character and it’s environment can only be faithfully synthesized by respecting real physical principles. On the other hand, data-driven animation systems utilizing large amounts of motion capture data have already shown that artistic style and motion variety can be preserved even when tight constraints on responsiveness and motion control objectives are required by a game’s design.

To combine the strengths of both methods we developed DReCon, a character controller created using deep reinforcement learning. Essentially, simulated human characters learn to move around and balance from precisely controllable motion capture examples. Once trained, gamepad controlled characters can be fully simulated using physics and simultaneously directed with a high level of responsiveness at a surprisingly low runtime cost on today’s hardware.

 


 

Animators spend a lot of time trying to get things to look good. It’s an artistic process without clearly defined rules, however there are basic elements animators try to include in their work, the most well known being:

The twelve basic principles of animation.

 

What are these? Many of the principles are methods of showing interesting dynamic effects that arise from real world physics. A couple of examples: the principle of “Squash and stretch” captures the behaviour of elastic deformation, and the principle of “Follow through and overlapping action” captures the effects of inertia and modal vibration. I would argue that interesting animation usually has some physically plausible aspects – even if movement is often exaggerated for stylistic reasons. Animators working on films can capture all the effects they want in their work, but this can be hard in games because gameplay systems need control over the animation of interactive elements.

Squash and stretch,Follow through and overlapping action

 

The worlds in games are filled with physical phenomena, yet sadly, characters don’t participate with very much complexity. Instead, game developers create an illusion by playing back animations that fit the context of gameplay or in some cases by using forces as puppet strings to pull the characters around. This can sometimes look wrong because movement of the character will be inconsistent with physical laws. Physical inconsistency could manifest as foot sliding, floating, unnatural acceleration, unbalanced poses, limbs clipping geometry, and strange collisions. If real physical principles were used to move characters around in games, rather than current methods, physically accurate character movement would arise naturally.

So why haven’t character animation systems been totally replaced with physics simulations yet? Well, there have been some difficult barriers to overcome. Rigid-body physics can be easily applied to generate chaotic motion of characters when used for something like ragdolls, usually enabled during deaths, – but controlling simulated characters means finding a strategy to intelligently control this chaos. Its been hard to find a good solution which produces high quality motion while offering a degree of control similar to traditional animation, but plenty of researchers have been working hard on this problem.

 


 

Our research at Ubisoft La Forge over the last year has focused on how animation and physics can work together in games. At SIGGRAPH ASIA 2019 I will be presenting a research paper which presents a step towards that direction, titled:

DReCon: Data-Driven Responsive Control of Physics-Based Characters




DReCon in action (Move the camera with right mouse button)
this is only a playback of a recording

This research presents the method we developed for controlling locomotion of physically simulated characters by applying deep reinforcement learning and a data-driven animation technique called Motion Matching. The characters move and balance entirely using their own strength, allowing a realistic simulation of their interactions with the environment. We also kept the unique constraints of games in mind,

 

Performance breakdown for one character (Intel Xeon E5-1650 v3)

 

ensuring our method has a very low runtime cost in order to make it viable to fit the performance budgets of real games on today’s hardware. DReCon is unique in that it maintains a high level of responsiveness to user input while also keeping the simulated movement of the characters as natural looking as possible.

Physically based characters controlled using DReCon retain the movement quality of the animation

 

Approach

When you think about it, controlling a character in a physically simulated world isn’t very different from controlling a legged robot in the real world. Unfortunately, legged robotic locomotion is tough. That’s why lots of robots walk with very slow careful movement, never lifting their feet high off the ground or straightening their knees completely. Legged robots are not only chaotic systems composed of many interconnected pieces – they’re also under-actuated. This means we cannot fully control their trajectory through space, so sophisticated planning and coordination is generally needed.

Fortunately, game characters have an advantage over robots when it comes to solving these problems. Robots need to solve problems in the real world using real hardware. Simulated characters don’t have to worry about things like inaccurate sensors, underpowered actuators, maintenance, or damage. Most importantly simulations can run faster than real-time, which lets us effectively apply deep reinforcement learning (DRL).

DRL essentially works by training a control policy using trial and error experiences. The policy is a neural network which is optimized to output decisions which maximize performance on a task by learning from the experiences. Reward values associated with the decisions let the policy learn from which decisions were good and bad in the long run, and encourage it to repeat those which were good. This has led to breakthroughs on many difficult AI problems, most famously DeepMind’s AlphaGo used DRL to become the first Go AI capable of winning against professional players. Simultaneously, it has also been causing a revolution in robotics.

DReCon trains characters to track randomly generated trajectories like this using DRL. These are gathered thousands of times a second

 

Complex robotic tasks such as locomotion or object manipulation usually have relied on expensive optimization methods to achieve any level of success, which made application to real time systems difficult. An unsatisfactory trade-off between runtime cost and task performance was almost unavoidable, but I would argue DRL has the capability to eliminate this trade-off in some applications. With DRL, expensive operations are performed as precomputation yet result in a controller with high performance and low runtime cost. Doing this precomputation requires extensive trial-and-error – really time consuming on real robots, much faster with simulated ones.

Users defines the short term trajectory / heading they want for the character – an artificial “worst case” user is used during training

 

DReCon trains simulated characters using DRL to use their joint actuators (similar to muscles) to follow a user controlled animated character. The controller automatically learns to correct any physical problems with the motion of the animated character since the goal is to track the motion as best as physically possible. This necessitates learning how to maintain balance and walk around while preserving the overall animation style. We get a high degree of responsiveness by training with an artificial user who varies gamepad input aggressively and randomly. The system has no choice but to learn a strategy that adapts to such a “worst case” user.

Motion Matching continually generates animation through a search process

 

To generate animation we use Motion Matching, a data-driven technique that enables very realistic character animation. “For Honor” was the first game to use this method, and nowadays it’s seeing more widespread adoption. Motion Matching replaces the traditional method of directly choosing animations using a finite state machine. Developers instead choose constraints they want respected. Then, a large dataset of motion-capture is searched and the pieces of animation that violate the constraints the least are continuously combined to generate the resulting character motion. This can be implemented as a nearest neighbor search in a high dimensional space, where each dimension represents one of the constraints. Our implementation constrains these features: the characters path in the next few frames, facing direction along the path, animation style, and continuity of foot placements.

 

A visual example of Motion Matching searching animations from “For Honor”

 

Through this search process, Motion Matching generates very realistic looking animation because large amounts of data are intelligently patched together specifically with the intent of achieving high-level motion objectives and preserving continuity. This allows complex but subtle locomotion behaviours from motion capture to be reproduced. Because of this, using DRL to train a policy which tracks the output of Motion Matching has an advantage over previous work, such as DeepMimic which tracked fixed animations and used rewards to achieve secondary goals. Planning movement (for example steering) is handled by Motion Matching and as a result motion objectives can only be achieved by learning how humans achieve these objectives in the motion capture data – rather than by maximizing arbitrary goal based rewards that will invariably compete with tracking.

 

A simulated character can be controlled precisely (red) to follow a path (black)

 

Nonetheless, learning this way significantly increases the amount of system states the controller must adapt to and makes training more complicated. To solve this, our control policy assumes the joint positions from the animation are somewhat acceptable inputs to a simple open-loop control system, and only train the policy to make small corrections to these to maximize tracking performance. We also assume corrections should be temporally coherent and employ a filtering scheme to force the corrections to be smoothed out over time. In this way the learning is constrained towards more favorable control strategies which are in the neighborhood of the open-loop controller, and cannot exploit noise in the simulation.

 

 

Schematic diagram of DReCon

 

This research is only scratching the surface of what’s possible, but shows physically simulated characters can actually be implemented in games without blowing the performance budget or significantly limiting responsiveness – super cool in our opinion. We want to push things further in the future, improving the range of behaviours simulated characters can recreate and the overall quality of our results. Physically-based animation is going to allow characters to move and interact more realistically than ever before, which will definitely allow interesting applications in the games of the future.

Auteurs

KEVIN BERGAMIN, McGill University, Canada
SIMON CLAVET, Ubisoft La Forge, Canada
DANIEL HOLDEN, Ubisoft La Forge, Canada
JAMES RICHARD FORBES, McGill University, Canada

Menu