Next-generation games need affordable next-generation interactivity

Jorge del Val
Jun 26, 2024

Updated: Jul 17, 2024

Interaction is crucial in games. After all, it is what separates them from other forms of content, like movies or video. Yet even as the industry advances, the way we handle interactivity remains fundamentally unchanged: limited and expensive.

Static interaction: the status quo

Every interaction can be described as a cause and an effect: a situation, representing the state of the game or of a character, leads to a reaction, such as an animation or a line of dialogue. We can define static interaction as the framework in which the available reactions are predefined before the game is played. For example, a sword hits a character on a particular spot, and one of N prebaked animations is chosen in response; a character talks, and one of N possible dialogue responses is selected.

Static interaction can be seen as a mapping from a large, possibly infinite space of possible situations to a limited number of available reactions. This is the status quo in today's industry.
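
To make the mapping concrete, here is a minimal sketch in Python. The `Hit` type, the reaction names, and the bucketing rule are all hypothetical and chosen only for illustration; they do not come from any particular engine or game.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical situation: where and how hard a character was hit."""
    height: float  # 0.0 (feet) to 1.0 (head)
    force: float   # arbitrary units


# The N prebaked reactions, authored before the game is played (names are made up).
PREBAKED_REACTIONS = ["stagger_legs", "stagger_torso", "stagger_head"]


def static_reaction(hit: Hit) -> str:
    """Map any hit to one of the prebaked reactions by bucketing its height."""
    n = len(PREBAKED_REACTIONS)
    index = min(int(hit.height * n), n - 1)
    return PREBAKED_REACTIONS[index]


# One hundred different situations...
hits = [Hit(height=h / 100, force=1.0 + h) for h in range(100)]
# ...collapse onto at most N distinct reactions.
distinct = {static_reaction(hit) for hit in hits}
print(len(hits), "situations ->", len(distinct), "reactions")
```

However many distinct hits the game produces, the output can never contain more than the N authored reactions, and the force of the hit is ignored entirely in this toy rule.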

It is not uncommon to try to circumvent the limitations of this framework by combining existing reactions to obtain new ones. In animation, this can be achieved with techniques such as blend trees or motion matching. However, the set of prebaked reactions remains our only ingredient: nothing fundamentally different can happen. Procedural methods, such as those relying on Inverse Kinematics, can also help adapt reactions to specific situations in an ad hoc way.

Example of static interactions in Elden Ring.

Other approaches are also emerging. For example, companies like Meshcapade, Motorica, or even Unity with Muse are offering a hybrid approach, where AI helps generate a large number of reactions before the game is built. These are then mapped to specific situations, either manually or through techniques like motion matching. This is still static, as defined above, since the reactions are fixed before the game is built. These models are also unable to run in-engine in real time due to their size. However, they can reduce the effort of crafting reactions and therefore enable more of them for the same cost and time.

The problem with static interaction

Two crucial considerations must be kept in mind when it comes to static interaction. Firstly, it is widely acknowledged that games may become monotonous when the range of reactions is limited. After all, reactions must necessarily repeat if there are more situations than reactions — which is typically true.

Addressing this issue by merely increasing the set of available reactions is not a scalable solution. Even when the set of reactions is large, the reactions might not be tuned to specific situations, which breaks immersion. For example, it is infeasible to tune a stumbling animation to every possible object you might stumble over, or to tune every line of dialogue to every possible text input from the player. This status quo asks players to be forgiving, while studios must either shift focus or innovate within constraints. For instance, features like critical hits or counters have emerged as a popular way to add dynamism to a melee fight and provide challenging gameplay. Nevertheless, once a reaction is triggered, players have no control over the outcome.

Example of counter-based static interaction in Assassin’s Creed 3.

Lastly, there has been a noticeable trend in the gaming industry towards more dynamic worlds. Examples include user-generated experiences (such as Roblox or Fortnite Creative), open worlds (like GTA, The Elder Scrolls or No Man’s Sky), or dynamic adaptations of traditional experiences (such as The Finals, a shooter infused with destruction and physics). Here, unexpected situations can emerge, driven by users' creativity, by physics, or by the interaction between several systems. Developing reactions to address an unlimited range of scenarios can be expensive, time-consuming, and often infeasible. Using advanced techniques to design reactions beforehand does not help in this context either, as the situations may be inherently unforeseen.

Dynamic interaction

It is exactly in these contexts that AI can be a powerful tool, as it allows us to extrapolate at runtime without human intervention. It enables a dynamic form of interaction, where the reaction is created as you play.

Dynamic dialogue with Inworld.ai

In this setup we can get reactions to any situation on the fly, whether they are implicitly extrapolated from available reactions or learned from a large dataset. We can even add variance to the reaction, generating a range of possibilities for every situation. When using Machine Learning techniques, the mapping can be grounded more firmly in the game or in the developer's intent through penalties during training or conditions at inference time.
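
As a rough sketch of the idea, the Python snippet below computes a reaction from the situation at runtime instead of looking it up. It is only a stand-in: the hand-written formula replaces what would be a trained model, and `Hit`, `Reaction`, `dynamic_reaction`, and the `max_lean` condition are hypothetical names invented for this example, not Inworld's or Latent Technology's method.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical situation: where and how hard a character was hit."""
    height: float  # 0.0 (feet) to 1.0 (head)
    force: float   # arbitrary units


@dataclass(frozen=True)
class Reaction:
    """Hypothetical reaction parameters produced at runtime."""
    lean_angle: float     # degrees the body leans away from the hit
    recovery_time: float  # seconds before the character regains balance


def dynamic_reaction(hit: Hit, rng: random.Random, max_lean: float = 45.0) -> Reaction:
    """Stand-in for a learned generator (assumption: a real system would use a model)."""
    # The reaction depends continuously on the situation, not on a fixed table.
    lean = hit.force * (0.5 + hit.height) * 10.0
    # Variance: the same situation can yield a range of reactions.
    lean += rng.gauss(0.0, 1.0)
    # A condition applied at inference time keeps the result within design limits.
    lean = max(0.0, min(lean, max_lean))
    return Reaction(lean_angle=lean, recovery_time=0.2 + lean / 60.0)


rng = random.Random(0)  # seeded so runs are reproducible
samples = [dynamic_reaction(Hit(height=0.7, force=2.0), rng) for _ in range(5)]
for reaction in samples:
    print(f"lean {reaction.lean_angle:.1f} deg, recover in {reaction.recovery_time:.2f} s")
```

Calling the function repeatedly with the same hit yields slightly different reactions, while the `max_lean` argument plays the role of a designer-imposed condition at inference time.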

We can see straight away how this paradigm can dramatically save costs when developing an immersive experience. In the case of movement and animation, few to no companies are offering dynamic interaction, with reasons varying from technical complexity to inference costs at runtime. Latent Technology is the exception, building a truly dynamic approach: Generative Physics Animation.

Dynamic interaction with Latent Technology. Reactions are not predefined. Note that the zombies' behavior is far from ragdoll-like.

Emergent interaction through physics

Dynamic interaction is an intriguing paradigm, but it might still become monotonous if it is unable to influence the behavior of the game. However, if we can generate reactions to any situation, we can couple them to the evolution of the game to produce emergent behavior. The concept of emergence is typically defined as "the whole is greater than the sum of its parts" or "complex behavior appearing from simple interactions".

Emergent behavior with Latent Technology. Reactions are unscripted; injured movement emerges from weakened joints.

Why is emergence important? Because, done properly, it enriches the game for free, giving players immersion, surprise and freedom at no additional cost. If the whole is greater than the sum of its parts, we get a bigger experience at a smaller cost.

Nevertheless, it is notoriously difficult to build compelling emergent experiences. The unpredictability they yield is a big drawback for game studios, for whom control over what can and cannot happen is very important. This can be mitigated by: 1. enabling emergence only in desirable parts of the game and 2. understanding the emergent behavior and tuning it to our design. Ragdolling, the most common emergent movement nowadays, follows these two principles, although it is a very limited experience.

A few game studios have proposed experiences with emergent behavior. Within movement, we may find games such as Totally Accurate Battle Simulator, Human: Fall Flat, or Gang Beasts, all of which use physics as the source for emergence.

Emergent movement in Totally Accurate Battle Simulator.

Still, since it is really hard to create compelling animation in a physics-based world, all of these games share the same relaxed, meme-ish feel. There is currently no method, apart from Latent's Generative Physics Animation, that manages to create rich, realistic, and emergent physical movement for complex rigs (such as humanoids) at a feasible cost.

Emergent Cheese Rolling with Latent Technology. Reactions influence the evolution of the game.

Some final remarks

With interactivity being such a fundamental part of video games, it is paramount that the industry works towards bringing fresh, dynamic forms of interaction into the hands of players and game developers. Nowadays only AAA studios can afford the illusion of immersion, which they achieve by brute-forcing interactions for lack of a better method; indies and small to mid-sized studios are left on the sidelines. This barrier sustains a status quo in which players are left to relive the same experiences over and over again, instead of exploring fundamentally different ones.

In particular, seamless emergent movement can lead to natural interactivity at a dramatically lower cost than its traditional counterpart. Animators and game designers could embrace it for the parts of a game that are cumbersome to make, reaping the benefits while focusing on what really matters: art, creativity, expression, and providing unforgettably fun experiences.

Feel free to follow us on Twitter or LinkedIn to see the latest updates and fun snippets of our tech in action!