This thread is for people who like technical insights into LFS development. It is probably too technical for most people, but I know that some do want to read this type of information.
As mentioned on the 20th Anniversary page I have been working on a multithreading system. As described on that page, we have been trying a 1000Hz physics update rate. It uses a bit more CPU than the old 100Hz physics rate (nothing like 10 times as much - but more information about that on another thread which I had to close).
The new graphics system also uses a lot more CPU power, primarily because of the shadow maps. People might think that graphics is all done by the GPU, but actually there is a lot of work for the CPU, sending instructions to the GPU (which objects to draw, using which textures and shaders, etc).
So with physics and graphics both consuming more CPU power, there is a great need for multithreading. LFS has always been a single threaded program, which is a good way to make a program when your CPU only has one core. But modern CPUs have more than one core, meaning they can run two or more threads simultaneously. Like two cooks building a complex cake, if they communicate a little, but not too much, they can get that cake made more quickly.
The physics and graphics have requirements that mean they need to run at a different rate. Graphics should run ideally at the refresh rate of your monitor, and physics should run at some high rate that eliminates stability issues. So for a typical monitor, graphics will run at 60Hz (it draws the world 60 times per second) and our experimental physics runs at 1000Hz (time steps of 1ms per physics update).
This suggests an approach where the graphics runs on one thread and the physics runs on another. This way, a CPU with at least two cores can get on with rendering graphical frames and processing physics updates at the same time. There are various advantages. For example, a few cars leaving the pits, but not seen, will not cause your graphical frame rate to drop below your monitor's refresh rate. There are other ways to do multithreading but I am aiming for this traditional approach.
That's great but it's actually a very complicated process to try to get the code separated onto two threads, and never looking at the same data at the same time. As a simple example, the graphics might be half way through drawing the cars, when the physics updates some of the cars, so they glitch forwards relative to other cars. Or worse, the graphics may be drawing an image from one car, the "view car" and that player goes to the pits, suddenly the "view car" is invalid data and the graphics code accesses the memory that is no longer allocated. Or maybe the user clicks a button and some data is added to the multiplayer packet system, at exactly the same moment while the other thread is writing to that system, causing data corruption. I can't describe all the examples but there are so many ways to go wrong unless great care is taken to avoid such issues.
The general principle for the coordination of graphics and physics threads, is that the 'game code' that does the physics, can freely continue most of the time. But when the graphics code wants to draw a frame, it requests a 'snapshot' of the current game state from the physics system. It may have to wait a small amount of time (something less than 1ms) for the physics to finish its current update, then the physics must stop for a tiny slice of time while the snapshot is obtained. That contains all the positions and rotations of the cars, wheels, suspension, physics objects, smoke, skids etc. Anything that can be changed by the game code. Let's call this a 'sync point'.
I made a few false starts on separating LFS as required. One of the earliest approaches was to split the entire program into a physics section and graphics section. But it got out of hand and I reverted around two days of coding. Then I tried something where only the game setup screen and in in-game code would be multithreaded. Running into too many problems, I reverted this attempt as well. There was some other attempt as well but it's a bit of a blur, I can't really remember. But I did learn things while making these false starts. In the end I have had success with an approach that only starts a physics thread when LFS enters the in-game mode. Even the game setup screen stays unchanged. This approach causes the minimum disruption, only changing the code that actually needs to be changed.
I started by preparing parts of the code for this separation. For example, the game timer used to be part of the main loop. I realised there were now different requirements for the graphics timer and the game timer, and that the game timer would need to be run on the game thread, so that code was duplicated and specialised into separate systems. Other code was moved around and restructured in preparation for the multithreading. The part of the code that actually processes multiplayer packets and does physics updates was moved into a separate file, along with associated variables. Various functions needed to be moved over into that separate game file. It's confusing work but the aim was to move all the "game and physics" code and variables into that new file, while leaving the graphics code in the original file.
With the separated code seeming to work as before, it was time to set an option to run that code on a thread. At first that thread was extremely protected from collisions with the graphics thread. It had to be protected so much that no graphics and physics would actually be running at the same time. However, game code could continue to run after the world was drawn, while post-processing was taking place and while LFS submitted the finished image to be applied to the screen. That was enough to cause some thread-related issues and cause me to develop some protection code, that can catch some bugs before they happen. For example, if the game code does some D3D work, for example deleting the mesh of a physics object that stops moving, or loading a car, while the graphics system is also asking D3D to do something, there is a problem and this will sometimes result in corruption. Not all the time, but randomly, when the timing happens to be wrong. So my protection code was designed to make sure the game code never does D3D work unless it has entered a "critical section" that allows it to do so (when the graphics code is in a moment when it is not drawing). Each time a critical section is used, there is a performance issue, because one thread has to wait for the other. So I devise ways to avoid their use when possible. For example when the game wants to delete a car or physics object, now it doesn't immediately delete it. Instead, it adds that object to a list of objects to be deleted. The graphics code will then delete that at the sync point that occurs every frame, just before getting the snapshot for the next frame.
With most issues sorted in that first, non-concurrent, version of actual multithreaded code, it was time to try to get the game code (multiplayer and physics) running simultaneously while the graphical images are drawn. It was very encouraging to see that this worked, to some extent, on the first attempt. But it displayed the expected thread-related issues. I remember the car flicking forward and backward a bit, and a strange effect in the mirror, where the displayed image in the mirror actually showed the back of the mirror, as the mirror view was somehow detached from the actual location of the car. As I say, entirely expected and so I set to work on the "snapshot" code.
The first stage of the snapshot code was to create a list of all the car objects visible at the sync point that occurs every graphical frame. And in all these car objects, store the position and rotation of the main car body. Then make all the graphical code refer only to that list of cars, and only use the "snapshot copy" of position and rotation, and never to the "physics version" of those values. This had the desired effect of stopping the cars jerking around and stabilising the mirrors. Then I saw that there was a tear in the wheel of the car I was viewing. This must be as some values of that wheel were changing, while it was being drawn. To me as a programmer, these bugs are actually quite exciting as they show the threads are really happening. Well I did know that something was going to go wrong with the wheels as they had not been included in the snapshot, and they have their own rotation, position, suspension compression, contact point, contact patch deflection, etc. In fact much more than the actual car body. I started to extract the necessary wheel variables into the snapshot, but reverted my code when that all got out of hand. The following morning I simply included a snapshot of the entire wheel structures, in the car's snapshot, and the wheel problem disappeared.
It seemed that the next step is to do the same for the physics objects, but then I started to encounter some rarer bugs that came up occasionally when I entered the garage screen. Most of the time things worked, but when a bug comes up it's best to sort it out immediately, as with these thread-related issues the bug is not guaranteed to come up every time. The trick is also to recognise a bug as an entire "category" of bugs, and go through the program removing all of those same types of bugs. Another one was when I tried to exit using the X button on the window, and the game froze. I thought it might be a deadlock but actually graphics and physics threads were still running. After some looking around I decided it was probably something to do with adding an "exit game" packet while other packets were being added, or removed, or something... I don't really know as it happened earlier and obviously wasn't something that would be caught in the debugger. So I started work on another protection system which will catch bugs of that type, when the graphics or UI code adds multiplayer packets, that now can only be done in a relevant critical section.
So that's where I am at the moment, the next thing is to take care of the "view car" which is set by the graphics code but can might be changed mid-draw with disastrous consequences. Presumably the solution is to store the view car pointer as part of the snapshot.
As for results, I have not done any performance testing yet. I'm more interested in improving the code structure and removing all the known ways things can go wrong, and also the unknown ways, each time they come up. I got some idea that good things are happening when my frame rate was very limited by being in the debug version (nowhere near monitor refresh rate) and I did a couple of /ai calls. Then overall CPU usage went up but the frame rate stayed constant, as the AI cars were not on screen. It was encouraging to see that the physics processing had increased without affecting the frame rate.
I'm interested in detailed performance monitoring. I'd like to do a timeline for the two threads with some accurate timings, so there are two lines across the screen, and you can see the length of time the graphics thread spent drawing, and for the physics thread, how much time it spent on all those small physics steps and audio updates (which are currently done on the physics thread). And the sync point. So we can start to see how things are really happening, how steady the physics steps are and so on.
That's all for now. I'll keep the thread closed. Partly because some users chose to use previous progress threads as an outlet for their personal issues. But more positively, the next progress post can be added directly after this one and the thread will be kept clean. It's a new approach that may have good results. I don't really need to discuss anything or receive ideas and input, because there is plenty to be getting on with.
Thanks for reading!
As mentioned on the 20th Anniversary page I have been working on a multithreading system. As described on that page, we have been trying a 1000Hz physics update rate. It uses a bit more CPU than the old 100Hz physics rate (nothing like 10 times as much - but more information about that on another thread which I had to close).
The new graphics system also uses a lot more CPU power, primarily because of the shadow maps. People might think that graphics is all done by the GPU, but actually there is a lot of work for the CPU, sending instructions to the GPU (which objects to draw, using which textures and shaders, etc).
So with physics and graphics both consuming more CPU power, there is a great need for multithreading. LFS has always been a single threaded program, which is a good way to make a program when your CPU only has one core. But modern CPUs have more than one core, meaning they can run two or more threads simultaneously. Like two cooks building a complex cake, if they communicate a little, but not too much, they can get that cake made more quickly.
The physics and graphics have requirements that mean they need to run at a different rate. Graphics should run ideally at the refresh rate of your monitor, and physics should run at some high rate that eliminates stability issues. So for a typical monitor, graphics will run at 60Hz (it draws the world 60 times per second) and our experimental physics runs at 1000Hz (time steps of 1ms per physics update).
This suggests an approach where the graphics runs on one thread and the physics runs on another. This way, a CPU with at least two cores can get on with rendering graphical frames and processing physics updates at the same time. There are various advantages. For example, a few cars leaving the pits, but not seen, will not cause your graphical frame rate to drop below your monitor's refresh rate. There are other ways to do multithreading but I am aiming for this traditional approach.
That's great but it's actually a very complicated process to try to get the code separated onto two threads, and never looking at the same data at the same time. As a simple example, the graphics might be half way through drawing the cars, when the physics updates some of the cars, so they glitch forwards relative to other cars. Or worse, the graphics may be drawing an image from one car, the "view car" and that player goes to the pits, suddenly the "view car" is invalid data and the graphics code accesses the memory that is no longer allocated. Or maybe the user clicks a button and some data is added to the multiplayer packet system, at exactly the same moment while the other thread is writing to that system, causing data corruption. I can't describe all the examples but there are so many ways to go wrong unless great care is taken to avoid such issues.
The general principle for the coordination of graphics and physics threads, is that the 'game code' that does the physics, can freely continue most of the time. But when the graphics code wants to draw a frame, it requests a 'snapshot' of the current game state from the physics system. It may have to wait a small amount of time (something less than 1ms) for the physics to finish its current update, then the physics must stop for a tiny slice of time while the snapshot is obtained. That contains all the positions and rotations of the cars, wheels, suspension, physics objects, smoke, skids etc. Anything that can be changed by the game code. Let's call this a 'sync point'.
I made a few false starts on separating LFS as required. One of the earliest approaches was to split the entire program into a physics section and graphics section. But it got out of hand and I reverted around two days of coding. Then I tried something where only the game setup screen and in in-game code would be multithreaded. Running into too many problems, I reverted this attempt as well. There was some other attempt as well but it's a bit of a blur, I can't really remember. But I did learn things while making these false starts. In the end I have had success with an approach that only starts a physics thread when LFS enters the in-game mode. Even the game setup screen stays unchanged. This approach causes the minimum disruption, only changing the code that actually needs to be changed.
I started by preparing parts of the code for this separation. For example, the game timer used to be part of the main loop. I realised there were now different requirements for the graphics timer and the game timer, and that the game timer would need to be run on the game thread, so that code was duplicated and specialised into separate systems. Other code was moved around and restructured in preparation for the multithreading. The part of the code that actually processes multiplayer packets and does physics updates was moved into a separate file, along with associated variables. Various functions needed to be moved over into that separate game file. It's confusing work but the aim was to move all the "game and physics" code and variables into that new file, while leaving the graphics code in the original file.
With the separated code seeming to work as before, it was time to set an option to run that code on a thread. At first that thread was extremely protected from collisions with the graphics thread. It had to be protected so much that no graphics and physics would actually be running at the same time. However, game code could continue to run after the world was drawn, while post-processing was taking place and while LFS submitted the finished image to be applied to the screen. That was enough to cause some thread-related issues and cause me to develop some protection code, that can catch some bugs before they happen. For example, if the game code does some D3D work, for example deleting the mesh of a physics object that stops moving, or loading a car, while the graphics system is also asking D3D to do something, there is a problem and this will sometimes result in corruption. Not all the time, but randomly, when the timing happens to be wrong. So my protection code was designed to make sure the game code never does D3D work unless it has entered a "critical section" that allows it to do so (when the graphics code is in a moment when it is not drawing). Each time a critical section is used, there is a performance issue, because one thread has to wait for the other. So I devise ways to avoid their use when possible. For example when the game wants to delete a car or physics object, now it doesn't immediately delete it. Instead, it adds that object to a list of objects to be deleted. The graphics code will then delete that at the sync point that occurs every frame, just before getting the snapshot for the next frame.
With most issues sorted in that first, non-concurrent, version of actual multithreaded code, it was time to try to get the game code (multiplayer and physics) running simultaneously while the graphical images are drawn. It was very encouraging to see that this worked, to some extent, on the first attempt. But it displayed the expected thread-related issues. I remember the car flicking forward and backward a bit, and a strange effect in the mirror, where the displayed image in the mirror actually showed the back of the mirror, as the mirror view was somehow detached from the actual location of the car. As I say, entirely expected and so I set to work on the "snapshot" code.
The first stage of the snapshot code was to create a list of all the car objects visible at the sync point that occurs every graphical frame. And in all these car objects, store the position and rotation of the main car body. Then make all the graphical code refer only to that list of cars, and only use the "snapshot copy" of position and rotation, and never to the "physics version" of those values. This had the desired effect of stopping the cars jerking around and stabilising the mirrors. Then I saw that there was a tear in the wheel of the car I was viewing. This must be as some values of that wheel were changing, while it was being drawn. To me as a programmer, these bugs are actually quite exciting as they show the threads are really happening. Well I did know that something was going to go wrong with the wheels as they had not been included in the snapshot, and they have their own rotation, position, suspension compression, contact point, contact patch deflection, etc. In fact much more than the actual car body. I started to extract the necessary wheel variables into the snapshot, but reverted my code when that all got out of hand. The following morning I simply included a snapshot of the entire wheel structures, in the car's snapshot, and the wheel problem disappeared.
It seemed that the next step is to do the same for the physics objects, but then I started to encounter some rarer bugs that came up occasionally when I entered the garage screen. Most of the time things worked, but when a bug comes up it's best to sort it out immediately, as with these thread-related issues the bug is not guaranteed to come up every time. The trick is also to recognise a bug as an entire "category" of bugs, and go through the program removing all of those same types of bugs. Another one was when I tried to exit using the X button on the window, and the game froze. I thought it might be a deadlock but actually graphics and physics threads were still running. After some looking around I decided it was probably something to do with adding an "exit game" packet while other packets were being added, or removed, or something... I don't really know as it happened earlier and obviously wasn't something that would be caught in the debugger. So I started work on another protection system which will catch bugs of that type, when the graphics or UI code adds multiplayer packets, that now can only be done in a relevant critical section.
So that's where I am at the moment, the next thing is to take care of the "view car" which is set by the graphics code but can might be changed mid-draw with disastrous consequences. Presumably the solution is to store the view car pointer as part of the snapshot.
As for results, I have not done any performance testing yet. I'm more interested in improving the code structure and removing all the known ways things can go wrong, and also the unknown ways, each time they come up. I got some idea that good things are happening when my frame rate was very limited by being in the debug version (nowhere near monitor refresh rate) and I did a couple of /ai calls. Then overall CPU usage went up but the frame rate stayed constant, as the AI cars were not on screen. It was encouraging to see that the physics processing had increased without affecting the frame rate.
I'm interested in detailed performance monitoring. I'd like to do a timeline for the two threads with some accurate timings, so there are two lines across the screen, and you can see the length of time the graphics thread spent drawing, and for the physics thread, how much time it spent on all those small physics steps and audio updates (which are currently done on the physics thread). And the sync point. So we can start to see how things are really happening, how steady the physics steps are and so on.
That's all for now. I'll keep the thread closed. Partly because some users chose to use previous progress threads as an outlet for their personal issues. But more positively, the next progress post can be added directly after this one and the thread will be kept clean. It's a new approach that may have good results. I don't really need to discuss anything or receive ideas and input, because there is plenty to be getting on with.
Thanks for reading!