Understanding Fighting Game Networking

So earlier today I was talking with Zinac about timing and synchronization issues in fighting games. More specifically, the cases that need to be handled with respect to packet loss and frame loss. These are actually very different things!

Then it occurred to me that I haven’t actually written an article on how fighting game netplay really works at a nuts and bolts level, and so here we are.

Fighting games are, for the most part, 1v1 affairs and therefore benefit most from having a direct connection between the two players. In a situation like this, the game can be kept in perfect synchronization, which means that all you need to transmit is inputs. It also means that 'cheating' isn't really possible, outside of multi-frame macros anyway.

For the purposes of this article I’m just going to assume that connections have been established and that we only care about the actual in-match part of the networking. Also, that the game itself is fully deterministic and has nothing other than inputs that need to be transferred.

On the internet, data is transferred in individual packets, each containing a blob of data. Each packet carries a small bandwidth overhead for transport information and has a maximum size. Simple enough: just throw your inputs on there and you're done!

But, the internet being what it is, you have a few problems that all need to be taken care of on the way:

  • Packets take time to reach their destination.
  • Packets can get lost on their way.
  • Packets can get there, but have their data corrupted. (very rare, but it does happen.)
  • Computers can run at different speeds.
  • Computers can occasionally get hung up on doing things and skip a frame or two.

Fortunately you can solve most of these. We’ll start with compensating for packet transit time:

The simplest way of keeping synchronization between two players is to delay both players' inputs by a set number of frames. This way, the game only processes input for a frame once both players' inputs for that frame are in hand.

The number of frames to delay input by is calculated as follows:

InputDelayFrames = Ceiling( (RoundTripLatency + Constant) / (2 * FrameDuration) )

A small constant is added to the latency to give a bit of extra buffering, because computer timing is relatively imprecise. Because data only needs to travel one way to reach its destination, we only need half of the measured round-trip latency (the ping), so it gets halved. We also divide by the duration of each frame, usually 16.667ms at 60fps, to convert the result into frame units, and then round up.
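As a quick sketch in Python — note that the 5ms safety margin is an assumed value for illustration, not one the formula prescribes:

```python
import math

FRAME_DURATION_MS = 1000.0 / 60.0  # ~16.667 ms per frame at 60fps

def input_delay_frames(round_trip_ms, safety_margin_ms=5.0):
    # Half the round trip covers the one-way transit time; the safety
    # margin absorbs timing jitter. Round up to whole frames.
    return math.ceil((round_trip_ms + safety_margin_ms) / (2 * FRAME_DURATION_MS))
```

With these numbers, a 50ms ping works out to 2 frames of input delay, and a 100ms ping to 4.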

Something most people don’t think about is that you generally want to handle receiving packets in a separate high-priority blocking thread in order to keep responsiveness of data requests as close to instantaneous as possible. Without doing that, you’ll be rounding up to the nearest frame in timing. This could make a 35ms ping act exactly like a 67ms ping, and you really don’t want that.
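A minimal sketch of that idea in Python — the UDP socket setup and queue hand-off are my own framing, and a real engine would also raise the thread's OS scheduling priority, which is platform-specific and omitted here:

```python
import queue
import socket
import threading

def start_receiver(sock, inbox):
    # Block on the socket in a dedicated thread so packets are queued
    # the instant they arrive, instead of being noticed only at the
    # next once-per-frame poll.
    def loop():
        while True:
            try:
                data, _addr = sock.recvfrom(2048)
            except OSError:  # socket was closed; end the thread
                return
            inbox.put(data)
    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

The game loop then drains `inbox` each frame; the timestamp of arrival is no longer quantized to the frame boundary.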

Packets can get lost on the way, or just occasionally delayed by networking bottlenecks. An unfortunate fact of life, and you can't do anything about it other than deal with it.

In an input delay model where you synchronize to a given frame value, you have no choice but to wait at the frame where you have yet to receive the data for that frame. This is also a pretty good time to send a request for a resend of the data for that frame.

You can send multiple frames' worth of input data in each packet, so that when one packet gets lost you don't need to wait for a full resend cycle to complete before continuing the game. Input data is cheap, so it costs very little to include 5-10 frames' worth per packet. This is something you should pretty much always do.
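A sketch of that redundancy in Python — the packet layout and names here are illustrative, not any particular game's format:

```python
REDUNDANT_FRAMES = 5  # how many recent inputs ride along in each packet

def build_packet(frame, input_history):
    # Pack the newest frame number plus the inputs for the last few
    # frames, so a single lost packet costs nothing.
    first = max(0, frame - REDUNDANT_FRAMES + 1)
    return (frame, [input_history[f] for f in range(first, frame + 1)])

def apply_packet(packet, received):
    # Merge every input in the packet into the receiver's input table;
    # already-known frames are simply overwritten with the same data.
    frame, inputs = packet
    first = frame - len(inputs) + 1
    for offset, inp in enumerate(inputs):
        received[first + offset] = inp
```

Even if the packet for frame 1 is dropped entirely, the packet for frame 2 still delivers frame 1's input.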

Additionally, while you wait on that frame you aren't sending input to the other side either, so this small stall will be reflected for both players.

There are some tricks that can be done to compensate for this.

By adding an extra frame of input delay, you can compensate for a single frame of dropped data, provided that you include multiple frames' worth of input data per packet. In the image above you can see that, as suggested, each input is sent multiple times: once in the first packet and again in the one after.

As you can see, the Frame+1 packet gets lost to the dimensional cleft, but because input delay is set to 2, the other player’s Frame+2 is using the input from +0. One frame later it will receive a second packet, containing Frame+1 and Frame+2’s data, allowing Frame+3 to proceed as normal with no interruptions.

Simple fix for a simple problem, but there’s a big, big gotcha here that needs to be understood.

If one game is running slower than the other, dropping a frame here or there, that cannot be treated the same as packet loss. If it is, you permanently eat into the extra buffer of delay you added for loss compensation and get effectively nothing of value out of it. Obviously, this isn't desirable.

This leads to the most important rule of this sort of network code: The goal is to maintain complete synchronization with the other system. This includes performance issues. If one has a drop, this drop must be reflected in the other as well. Always. Anything else will lead to a desynchronization of the intended behavior.

In an input delay system, you can handle this by tracking both the minimal input delay the connection requires and the actual amount currently in use. When a packet is received, check whether its frame number is keeping pace with the minimal input delay relative to the current frame; if it isn't, the other side has dropped a frame, so skip the current frame to keep the systems in step.

This is different from waiting for input! The process becomes this:

  • When a packet is received, test whether its frame keeps pace with the minimal input delay; if the other side has dropped a frame, wait a frame yourself.
  • When it's time to run the game, input is required, so test against the real input delay value to make sure you actually have the data for this frame.
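Here's one way those two checks might look in Python — the class and field names are mine, and the exact comparisons each game uses will differ:

```python
class LockstepSync:
    def __init__(self, input_delay, minimal_delay):
        self.input_delay = input_delay      # delay actually applied to inputs
        self.minimal_delay = minimal_delay  # what the measured ping requires
        self.latest_remote_frame = -1       # newest opponent input we hold

    def on_packet(self, current_frame, packet_frame):
        # Pacing check: if the opponent's newest data has fallen behind
        # the minimal delay, they dropped a frame -- mirror it locally
        # by returning True so the caller skips a frame.
        self.latest_remote_frame = max(self.latest_remote_frame, packet_frame)
        return packet_frame < current_frame - self.minimal_delay

    def can_run_frame(self, current_frame):
        # Input check: only simulate this frame if the opponent's input
        # for (current_frame - input_delay) has actually arrived.
        return self.latest_remote_frame >= current_frame - self.input_delay
```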

This simple model can compensate for single packet losses trivially and still retains full timing synchronization across systems. You can periodically send pings to determine what the minimal input delay should be.

But, as we all know, input delay models suck. What we really want is rollbacks, which is the model used by GGPO, Supercade, and RollCaster. This is a system where the game runs ahead of the opponent's input; when new input data is received, it 'rolls back' the game and reruns it with the new input. Think of it like retconning things that didn't really happen.

This graph looks suspiciously similar to our input delay model's. That's because it's basically the same, except that we're waiting for input data for a different frame! In this case, the current frame minus the rollback amount determines the frame that we acquire data from.

You might notice the parallel here: input delay works by adding to the frame number before an input is sent, while rollbacks work by subtracting from the current frame before an input is read. This means you can mix the two as you see fit: as long as the input delay and the number of rollback frames sum to at least the necessary amount of input delay, performance will be smooth and clean. Magic!
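That symmetry can be written down directly; a tiny sketch under my own naming:

```python
def send_frame(current_frame, input_delay):
    # Input delay: a locally pressed input is stamped for a future frame.
    return current_frame + input_delay

def read_frame(current_frame, rollback_frames):
    # Rollback: the simulation only needs confirmed input this far back.
    return current_frame - rollback_frames

def runs_smoothly(input_delay, rollback_frames, needed_delay):
    # Any split works, as long as the two sum to at least the delay
    # the connection's latency actually requires.
    return input_delay + rollback_frames >= needed_delay
```

So a connection needing 5 frames of delay can run as 5+0 (pure delay), 0+5 (pure rollback), or any split in between.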

I’m not going into the details of writing the rollbacks themselves into the game engine in this article, but assuming you have that part, that is literally the only change you need to make in order to make them work.

But of course, if you have rollbacks, you have what is fundamentally a partially asynchronous networking model. This means you can do some extra magic tricks on top that you couldn’t do before!

Because of the way the model works, if you're willing to roll back a few more frames than the set amount, you no longer need any extra input buffering to keep everything running smoothly. Instead you can just keep the game running as normal and do a couple of extra frames of rollback when the data finally arrives.

Obviously you want to put a reasonable limit on this so you don't end up a full second ahead without the correct input. RollCaster allows only one extra frame of rollback before blocking for input and forcing resynchronization. GGPO will run along for quite a while, as long as it can keep verifying that both systems are running at the same timing.
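A sketch of that cap — the function and variable names are mine; the article only says RollCaster stops after one extra frame while GGPO runs much longer:

```python
MAX_EXTRA_ROLLBACK = 1  # RollCaster-style: one extra frame, then block

def may_advance(current_frame, last_confirmed_frame, rollback_frames):
    # Keep simulating speculatively while the gap to the newest
    # confirmed remote frame stays within the rollback window plus
    # the cap; past that, block and wait for the opponent's input.
    gap = current_frame - last_confirmed_frame
    return gap <= rollback_frames + MAX_EXTRA_ROLLBACK
```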

And if you’re wondering, the rule from above regarding keeping synchronization with the other system’s performance must be upheld. If you detect that the other computer is running slowly and has dropped a frame, then you must also wait a frame to keep synchronization, even in a rollback setup. If you don’t do this the two computers will slowly drift out of timing and nobody wants that.

That’s pretty much all I got!

Once you understand all that, the remainder is simply setting up the network connection and figuring out how you're going to implement the rollbacks themselves, if you bother with them at all. And while all of this works well for 2 players, it doesn't scale cleanly to more without a real hassle, so be careful.

Now start making kickass netplay for your games, people!

  1. It actually occurs to me that, for the input delay setup, you don’t really need to check the minimum delay, just check to make sure there’s no divergence after the initial time synchronization.

    Hmm.

  2. Thanks for the interesting write-up. There really don’t seem to be many articles out there that talk about lockstep networking implementations, mostly client-server models for fps type things.

    I’m guessing to minimize bandwidth, mostly only control info is sent, but what kind of frame information is shared to allow each side to know if they’re still in sync or not?

    • To verify sync, Caster uses their X position and their HP value, every 20-ish frames. Other games could probably get away with savestate checksums or whatnot.

      You want to use data from a confirmed fully synchronized frame for this, though.

  3. It’s a good thing that my motivation for working on a certain project of mine is rather low right now (due in part to optimization issues that may become a little easier to fix with SDL 2.0), because if I had tried to implement netplay before reading this, I probably would’ve made quite a few mistakes.
    I can see why good netplay is so rare though, if it takes this much to get things right even before rollbacks; I never even thought about sending multiple frames of input data every frame.

    • I’ve been considering working on an open source netplay library at some point, simply because so many people have trouble getting it right.

      But not right now as I’m busy coding on other projects which do not involve netplay.
