Just a guess, but I would imagine that more than controller info is sent to the server. If only controller input were sent, the server would need to calculate all the physics itself to figure out the positions of the cars.
I imagine the client sends speed, orientation, position, revs, steering, brake and throttle in each packet (probably more). The server would then just pass this on to the other clients, so it doesn't need to be that smart. (It may even use keyframes which contain all the data above, plus smaller frames with just position and controller input; info missing from the smaller frames would be interpolated from the keyframes.) :shrug:
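To make the keyframe idea a bit more concrete, here is a rough Python sketch. Everything in it is made up for illustration: the field order, the packet sizes and the names (`CarState`, `pack_keyframe`, `pack_small`, `unpack`) are my assumptions, not the real LFS wire format. It also takes the simplest route for missing fields and just carries them over from the last keyframe rather than properly interpolating.

```python
import struct
from dataclasses import dataclass, replace

# Hypothetical packet layouts, NOT the real LFS format.
KEY, SMALL = 0, 1
KEY_FMT = struct.Struct("<B11f")   # type byte + all 11 fields
SMALL_FMT = struct.Struct("<B6f")  # type byte + position + controller input


@dataclass
class CarState:
    x: float
    y: float
    z: float
    heading: float
    pitch: float
    roll: float
    speed: float
    revs: float
    steering: float
    brake: float
    throttle: float


def pack_keyframe(s):
    """Full state: everything the other clients need to place the car."""
    return KEY_FMT.pack(KEY, s.x, s.y, s.z, s.heading, s.pitch, s.roll,
                        s.speed, s.revs, s.steering, s.brake, s.throttle)


def pack_small(s):
    """Small frame: only position and controller input."""
    return SMALL_FMT.pack(SMALL, s.x, s.y, s.z,
                          s.steering, s.brake, s.throttle)


def unpack(data, last_key=None):
    """Decode a packet; small frames borrow missing fields from last_key."""
    if data[0] == KEY:
        return CarState(*KEY_FMT.unpack(data)[1:])
    if last_key is None:
        raise ValueError("small frame received before any keyframe")
    _, x, y, z, steering, brake, throttle = SMALL_FMT.unpack(data)
    # Simplification: reuse the keyframe's values instead of interpolating.
    return replace(last_key, x=x, y=y, z=z,
                   steering=steering, brake=brake, throttle=throttle)
```

With this made-up layout a keyframe is 45 bytes and a small frame is 25, so the saving is real but not huge; whether it is worth the extra complexity would depend on how often packets are sent.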
The trouble with this is that the position info is always late, and the bit that impresses me about LFS is how well it can approximate where the car will currently be based on old data.
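I have no idea what LFS actually does here, but the simplest form of this kind of guessing is plain dead reckoning: take the last known position, heading and speed, and push the car forward by however old the packet is. A minimal sketch, assuming a flat 2D track, heading in radians, speed in m/s and a hypothetical function name `predict_position`:

```python
import math


def predict_position(x, y, heading, speed, age):
    """Estimate where the car is now, `age` seconds after the sample.

    Assumes the car kept the same speed and heading since the packet
    was sent, which is only roughly true over short delays.
    """
    return (x + speed * math.cos(heading) * age,
            y + speed * math.sin(heading) * age)
```

So a car doing 50 m/s whose last packet is 100 ms old would be drawn about 5 m further along its heading. A better guess would presumably also use the steering, brake and throttle values to bend and stretch that straight line, which is where having the controller input in the packet would pay off.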