This article's purpose is mainly to propagate some correct information, as the tutorial "Performance Scripting in 2k3" by Fhizban is more or less entirely incorrect (unempirical nonsense is probably more accurate). I don't suggest reading that article, as you'll need to unlearn it all anyway. In the explanations to follow I will be citing my findings. If you would like to see how I worked them out, please check out the appendices!
Disclaimer
These calculations hold for a modern-day computer. I expect that on something a bit older the upper limits of rm2k3 are a bit lower (probably on single-core computers below 2000MHz). I'm running a 6-core 3.7GHz processor, so I'm pretty sure I'm somewhere above where the performance of rm2k3 would've plateaued.
Glossary
Parallel Process - the trigger condition of an event
Loop - the Event Command that performs loops
Label Loop - using 2 sets of labels you can make a loop and can jump out of the loop when a condition is true
loop - anything that performs a looping function, including Parallel Processes, Loops and Label Loops
L - the maximum number of lines of code rm2k3 can execute per second
Introduction
Let's start at the beginning. I'm sure all of you have experienced the godforsaken lag that ensues when you start using Loops and Parallel Processes, but very few people understand exactly what is causing this lag. The popular myth is that having too many Parallel Processes running at one time causes lag. For the most part, this is incorrect. It is not the existence of the Parallel Processes that causes the lag, but the total number of event instructions rm2k3 is executing at one time. It sounds so obvious! It isn't. The reality is you can have thousands of Parallel Processes going at once with no lag. More on that below!
Overhead
Every time rm2k3 executes 1 line of code it makes the game lag. Not very much, but lag nonetheless. How many lines per second can rm2k3 take? So far as my data shows... somewhere between 600,166 and 800,222. I'm not sure what the exact number is, but it's somewhere in between those (see Appendix A).
Yeah, that's right. Sounds fricking insane, doesn't it? You'll never be doing that kind of calculation, it's madness, right?! You, my friend, have fallen into the trap of underestimating rm2k3 and computers in general (if rm2k3 is making that many executions a second, how many do you think your computer is making!?). It's in fact extremely easy to reach that cap: if you've made your game lag, you exceeded that magical number, which I will now refer to as L (for lag, you see!).
There are a few ways to exceed L, and yes, it happens when you have a lot of Loops and Parallel Processes running at once. BUT it's not because of how many of them you have, it's the combination of how fast they're doing their calculations! I ran a small test and worked out how fast each kind of loop executes.
-Parallel Processes perform about 60 iterations per second
-Loops inside an event... perform about 200,055!
-Label Loops inside an event........ 300,083!!
This is essentially what causes the lag. It's not the fact that they exist, it's that they're doing so much at once! You can have a huge-ass event with no loops and it'll run almost instantly, no matter how long it is! But the second you introduce a loop, especially an infinite one, code is being executed hundreds of thousands of times a second.
Also, you're probably wondering why the hell Loops and Parallel Processes run at different speeds. It's because, contrary to popular belief, Parallel Processes ARE NOT the same as Loops! When a Parallel Process reaches the end of its code, it doesn't loop back to the start like a Loop does. It calls itself! It basically goes "Call Event: This Event" indefinitely at the end of each iteration. This may or may not be a problem depending on what you're trying to code; I'll go over the implications in a later section. Basically though, when you use the "Call Event" command you introduce an overhead that takes rm2k3 0.0165 seconds to compute, whereas a proper Loop doesn't have that overhead and your calculations go about 333,333% faster!
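The arithmetic behind those two figures can be sketched in a few lines of Python (this is not rm2k3 event code). The constant and function names are made up for illustration, the inputs are just the numbers quoted in this article, and the sketch assumes the 0.0165s overhead is the only cost of a Parallel Process iteration.

```python
# Assumed inputs, taken from the figures quoted in this article.
CALL_OVERHEAD_S = 0.0165   # cost of one implicit "Call Event", in seconds
LOOP_RATE = 200_055        # measured Loop iterations per second
PARALLEL_RATE = 60         # measured Parallel Process iterations per second


def iterations_per_second(overhead_s: float) -> float:
    """Iterations per second if each iteration costs only the overhead."""
    return 1.0 / overhead_s


def percent_faster(fast: float, slow: float) -> float:
    """How much faster `fast` is than `slow`, as a percentage."""
    return (fast - slow) / slow * 100.0


if __name__ == "__main__":
    print(f"{iterations_per_second(CALL_OVERHEAD_S):.1f} iterations/s")
    print(f"{percent_faster(LOOP_RATE, PARALLEL_RATE):,.0f}% faster")
```

Under these assumptions the overhead alone caps a Parallel Process at a little over 60 iterations per second, and the measured Loop rate works out to roughly 333,000% faster.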
You never really hear people talking about Label Loops, as they're believed to be "bad programming". When it comes to rm2k3, the truth is that they're by far the fastest way you can calculate, as they have even less overhead than a Loop! Because people don't really deal with Label Loops I won't be mentioning them again, but they're definitely a nifty tool when it comes to one-off functions that are supposed to execute asap!
Overcoming L
Now we're getting into something you probably knew works, but not WHY it works! You've probably had someone tell you that if a Loop is lagging you should put a wait command of 0.0s at the end of it. Unbeknownst to you, this works because you in fact make the Loop slow down to about the same pace that a Parallel Process runs at (because a 0.0s wait command is actually a wait of 0.0165s), which means fewer calculations per second, which is therefore less laggy!
OMFG IT'S ALL COMING TOGETHER. Are you getting excited? I'm getting excited.
So how many Parallel Processes CAN you run at once? The answer is about 10,000-13,000. Yep. That's LINEAR Parallel Processes though, ones without a Loop inside them. Once you start adding Loops, of course, since they execute SO much faster than Parallel Processes, you start catching up to little ol' L mighty fast. There is an easy way around this though: if you add a 0.0s wait command inside that Loop, you bring it back down to the speed of a regular Parallel Process, so you're back to having as many loops as you want. You could even put an additional 0.0s wait command in each loop, which will double the amount you can have running at once!
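As a rough way to play with these numbers, here is a small Python sketch of the budget arithmetic. It is not rm2k3 code, the names are hypothetical, and it assumes (as this article does) that one iteration of a linear Parallel Process counts as one line against L.

```python
# Figures quoted in this article; everything else is an illustrative assumption.
LOOP_RATE = 200_055                 # iterations/s for one unthrottled Loop
PARALLEL_RATE = 60                  # iterations/s for one linear Parallel Process
L_LOW, L_HIGH = 600_166, 800_222    # measured lower and upper bounds on L


def parallel_budget(l_cap: int, loops: int = 0,
                    per_process_rate: int = PARALLEL_RATE) -> int:
    """Parallel Processes that still fit under l_cap next to `loops` Loops."""
    remaining = l_cap - loops * LOOP_RATE
    return max(remaining, 0) // per_process_rate


if __name__ == "__main__":
    print(parallel_budget(L_LOW), parallel_budget(L_HIGH))
    # An extra 0.0s wait halves the per-process rate, doubling the budget.
    print(parallel_budget(L_LOW, per_process_rate=PARALLEL_RATE // 2))
    print(parallel_budget(L_LOW, loops=2))
```

With no unthrottled Loops this lands in the 10,000-13,000 range quoted above. Halving the per-process rate with an extra wait doubles it, and every Loop left running at full speed eats a big chunk of the budget.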
You ARE allowed to have Loops running rampant in the background, but you can only have 2 of them running at once. With each one executing 200,055 lines per second, 2 won't pass L, but 3 will. That means you can run 2 Loops and still have room for only 3333 Parallel Processes! Of course you can probably have a little more than that. 3 Loops WILL make you exceed L, though.
Practical Applications
The above may be helpful for everyone who is simply experiencing lag, but making use of these principles is for advanced users only. An implication of Parallel Processes having a 0.0165s overhead is that time-related events that use the Parallel Process itself as a loop will run slower than you intended.
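To get a feel for the size of that slowdown, here is a rough Python sketch (not rm2k3 code). The function name and parameters are hypothetical, and it assumes every iteration costs exactly the wait time plus the 0.0165s overhead while only the wait time is counted by the in-game clock.

```python
def clock_drift_seconds(real_seconds: float, wait_s: float = 1.0,
                        overhead_s: float = 0.0165) -> float:
    """Seconds an in-game clock falls behind after `real_seconds` of play.

    Each Parallel Process iteration is assumed to take wait_s + overhead_s
    of real time, but only adds wait_s to the in-game clock.
    """
    ticks = real_seconds / (wait_s + overhead_s)
    return real_seconds - ticks * wait_s


if __name__ == "__main__":
    drift = clock_drift_seconds(10 * 3600)   # ten hours of play
    print(f"{drift / 60:.1f} minutes behind")
```

For a 1.0s wait over ten hours of play this comes out at a little under ten minutes of drift.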
For example, say you had a Parallel Process that tracked time, and every iteration of the Parallel Process it would wait 1.0s and then add 1 to the Seconds variable of your clock. You'd find that you're actually incrementing that clock every 1.0165s, and that after 10 hours of gameplay your clock would be 10 minutes slow! In the world of time, losing 10 minutes in 10 hours means a broken clock. Instead, inside the Parallel Process you should have a Loop that contains your 1.0s wait and Seconds incrementer, so that it is in fact keeping accurate time. Because the Loop only executes once every second it won't slow anything down; it just means that every 1.0 seconds it'll do one really fast calculation with no overhead, unlike the Parallel Process.
Bonus Material! - Linear code optimization
Code doesn't need loops for it to be slow. Linear code will also cause lag if you're trying to do too much at once, not because of how many instructions you're trying to execute, but because you're trying to be clever by putting the code you reuse often into its own common event. This in itself is good practice, but not if you're trying to write efficient code in a loop. Remember how I mentioned that each Call Event command has a 0.0165s overhead? If you put that Call Event in a loop you slow it down by that much... EACH TIME YOU CALL IT! Whereas if you cut and paste the code multiple times, you wouldn't get that slowdown!
On the whole it is better to compartmentalize your functions into different events so you can reuse them or separate your code, but you need to break that rule when writing events that warrant quick calculations, like a damage algorithm in a battle system or a sorting algorithm. These need to be as quick as possible, and Calling Events willy-nilly makes them run a hell of a lot slower!
Another common misconception is that if you nest your if statements a certain way, so that rm2k3 has to check fewer of them, the code will run faster. This is not true at all for linear code. Of course, if there is a Loop inside one of those if statements and that if statement isn't true, then that Loop won't execute and the code will run faster, but line for line rm2k3 takes the same time to execute linear code whether that code is being used or not.
But there is still hope for reducing your code. You achieve this by actually reducing the length of your linear code, but without using Call Event. You take the code that you do want to reuse and put it down the bottom of your event. Then instead of using Call Event to get to it, you use labels! It can be a little confusing, but if you're trying to get as much raw speed as you can it's a must. Then you just put a label (I use label 100) at the very end of the event, so you can jump there to quit when you're done without executing the code you were hiding down there.
Another way to reduce the size of code is to invent the OR operator! This is something that rm2k3 lacks, and it's a shame because it's one of the simplest things most programming languages have. By default in rm2k3, if you want a piece of code to execute when either one thing is true, or another thing is true, but not necessarily both, you have to paste the code in twice. That's twice as many lines of linear code rm2k3 has to read through, and it also means you're making more work for yourself if the code is long or you decide to change it later (you have to change it for each possible if statement)! The way you get around it is this simple trick:
OR = 0
IF thing 1 is equal to 5
  OR = 1
IF thing 2 is equal to 6
  OR = 1
IF OR is equal to 1
  //code for both cases here
It's very simple, your code will run quicker, and it's easier to fix later! You can also do more complicated things, like making the code only run if at least 2 out of maybe 10 things are true. That'd be very annoying to replicate without doing it this way.
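For readers more at home in a regular programming language, the same flag trick and the "at least 2 out of 10" variant can be sketched in Python. This is only an illustration of the pattern: the function names are made up, and in rm2k3 each condition would be its own if statement setting or incrementing a variable.

```python
from typing import Iterable


def or_flag(conditions: Iterable[bool]) -> bool:
    """Mirror of the event code: set a flag if any condition holds."""
    flag = 0                      # OR = 0
    for condition in conditions:
        if condition:             # IF thing N is equal to ...
            flag = 1              #   OR = 1
    return flag == 1              # IF OR is equal to 1


def at_least(conditions: Iterable[bool], needed: int) -> bool:
    """Count the true conditions instead of just flagging them."""
    count = 0
    for condition in conditions:
        if condition:
            count += 1            # add 1 to a counter variable instead
    return count >= needed


if __name__ == "__main__":
    thing_1, thing_2 = 5, 7
    if or_flag([thing_1 == 5, thing_2 == 6]):
        print("shared code runs once")
```

The shared code then lives in exactly one place, however many conditions feed the flag or the counter.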
And that's it for this session. Read on to the Appendix if you want to see how I got my data, and please leave a comment if this was helpful to you!
Appendix A
This was actually really easy to work out once I got my head around it. For the sake of simplicity I'm making the assumption that it takes ~0 seconds to execute a single instruction X.
Aim
To calculate how many times each type of loop iterates per second, I ran an experiment over 60 seconds using multiple loops running concurrently alongside a timer loop. Once the timer reached 60 seconds, another event that was watching the timer would pop up with the number of iterations each loop had performed.
Method
1) Create the Time event as a Parallel Process with the following code:
Timer + 1
2) Create the Loop event as a Parallel Process with the following code:
Loop
  Loop + 1
End Loop
3) Create the Parallel event as a Parallel Process with the following code:
Parallel + 1
4) Create the Label event as a Parallel Process with the following code:
Label 1
Labelcount + 1
Jump to Label 1
5) Create the Popup event as a Parallel Process with the following code:
If Timer is equal to 600
Message(Time: \vTimer Loop: \vLoop Parallel: \vParallel Label: \vLabelcount)
6) Execute the map, wait a minute for the results to appear and record them
Results
Time = 600
Loop = 12003333
Parallel = 3601
Label = 18005000
The per second metric was obtained by dividing these values by 60. Potentially the measurement could've been compromised if the combined execution rate exceeded L; however, it did not, and as such the results are assumed accurate.
The value L was obtained by copying and pasting each event until lag (regular jerkiness in character movement) was observed. The following combinations induced an L state:
2*Loop + 1*Label
2*Label + 1*Loop
No amount of linear Parallel Processes could induce an L state as hypothesized.
As it was impossible to measure the execution rate of each loop with less than 1 instruction per loop, L can only be approximated as being somewhere between the highest non-L state (2*Label, 600,166 lines per second) and the lowest observed L state (3*Loop, 800,222 lines per second).
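The per-second figures and the combined rates can be reproduced from the raw counts with a short Python sketch. The names are hypothetical, and like the appendix it treats every counted iteration as one line and truncates to whole numbers.

```python
from typing import Dict

DURATION_S = 60
# Raw counts as reported in the Results above.
RAW_COUNTS = {"Loop": 12003333, "Parallel": 3601, "Label": 18005000}


def per_second(count: int, duration_s: int = DURATION_S) -> float:
    """Average iterations per second over the measurement window."""
    return count / duration_s


def combined_rate(mix: Dict[str, int]) -> float:
    """Total rate for a mix such as {"Label": 2, "Loop": 1}."""
    return sum(n * per_second(RAW_COUNTS[name]) for name, n in mix.items())


if __name__ == "__main__":
    for name, count in RAW_COUNTS.items():
        print(name, int(per_second(count)))
    print("2*Label", int(combined_rate({"Label": 2})))
    print("2*Loop + 1*Label", int(combined_rate({"Loop": 2, "Label": 1})))
    print("2*Label + 1*Loop", int(combined_rate({"Label": 2, "Loop": 1})))
```

This is only arithmetic on the reported counts. It says nothing about how rm2k3 schedules events internally.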