It’s Saturday morning, cats outside woke me up at 0615, everyone else is in bed. Time for some CC2 modding.
In my previous posts about profiling and CPU usage I made use of a primitive call timer. I’d forgotten at the time that I’d hacked up a basic call count profiler from the lua.org examples.
The lua.org example I had used the os module for the clock, but we don’t have that in CC2. I put it to one side and forgot about it.
Now I want to focus my efforts better. There isn’t a great deal of worth in optimising a function that only gets called once every few seconds and doesn’t take very long anyway. But a function called a lot? My hacky profiler does record call counts using the lua debug library! So I have something I can look at!
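For reference, a call counter in the spirit of the lua.org profiler example can be sketched with debug.sethook. This is a minimal sketch and not the exact code in my mod: the names profiler_start, profiler_stop, profiler_count and profiler_report are made up for illustration, and it only counts calls (no timing, because there is no os clock to lean on).

```lua
-- Minimal call-count profiler sketch built on the standard debug library.
-- All names here are hypothetical, not the mod's actual code.
local counts = {}  -- function value -> number of calls seen
local labels = {}  -- function value -> "source:line" label

local function on_call()
    -- Level 2 is the function whose call triggered the hook.
    local info = debug.getinfo(2, "fS")
    if info == nil then return end
    local f = info.func
    if counts[f] == nil then
        counts[f] = 1
        labels[f] = info.source .. ":" .. info.linedefined
    else
        counts[f] = counts[f] + 1
    end
end

function profiler_start()
    debug.sethook(on_call, "c")  -- "c" = fire the hook on every function call
end

function profiler_stop()
    debug.sethook()  -- calling with no arguments removes the hook
end

function profiler_count(f)
    return counts[f] or 0
end

function profiler_report()
    -- Prints one "source:line count" row per function, in no particular order.
    for f, n in pairs(counts) do
        print(labels[f], n)
    end
end
```

Usage is just profiler_start(), run the thing you care about, profiler_stop(), then profiler_report(). The hook itself costs time on every call, so the counts are useful but any timings taken while it is armed are not.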

Here we go, this is the total call count from calling the screen_vehicle_control script’s “update()” function in a tight loop for 20 seconds.
@scripts/library_vehicle.lua:1554 29445 2
That caught my eye! It’s called 29 thousand times and even with our low-res clock timer it accounts for 2 whole seconds of call time.
But why doesn’t it have a function name? hmm..

Hm. doesn’t look too scary.. oh…

It’s being called via a protected call! These are not fast. I originally added these so that bugs wouldn’t cause the whole screen to stop working. But pcalls are EXPENSIVE.
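To make the trade-off concrete, here is a small sketch of the two shapes of call. update_guarded and update_direct are hypothetical names for illustration, not functions from the game scripts.

```lua
-- Sketch of the guarded-call pattern; both function names are hypothetical.
function update_guarded(update_fn, ...)
    -- pcall runs update_fn in protected mode: an error inside it is
    -- caught and returned as (false, message) instead of propagating.
    local ok, err = pcall(update_fn, ...)
    if not ok then
        return false, err
    end
    return true
end

-- The unguarded version is a plain call: less overhead per call,
-- but any error propagates straight up to the caller.
function update_direct(update_fn, ...)
    update_fn(...)
    return true
end
```

The guarded version keeps a buggy function from taking the whole screen down, which is why I added it in the first place. The direct version gives that safety up in exchange for less work per call.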
With the call counter switched off, we can measure the current update() performance, with this pcall still in place, over a 20 second loop:
calls 11436
calls/sec 571.8
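The measurement itself is nothing clever. A rough sketch of the idea is below; measure_calls is a hypothetical name, and the clock is passed in as a function because where the time comes from is an assumption here (CC2 has no os module, so this sketch does not care how now() is implemented, only that it returns seconds).

```lua
-- Sketch of a "how many calls fit in N seconds" loop.
-- measure_calls and the now() parameter are hypothetical.
function measure_calls(fn, seconds, now)
    local start = now()
    local calls = 0
    -- Keep calling fn until the clock says the window has elapsed.
    while now() - start < seconds do
        fn()
        calls = calls + 1
    end
    -- Return the raw count and the rate, e.g. 11436 calls / 20 s = 571.8.
    return calls, calls / seconds
end
```

With a clock that only ticks in whole seconds the window can be off by up to a second at either end, which is one reason small differences between runs should not be trusted too much.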
Ok, let’s remove the pcall wrapper and see what we get..
calls 11480
calls/sec 574.0
Not much difference.. boo.. It’s still a bit better, but I had hoped for more.. What else can we look at? There are some more pcalls still there, so let’s kill those off.
calls 11639
calls/sec 581.95
A bit better!
Hmm. To reduce the _overall_ use of revolution, I added some randomisation and caching, so the scanning/search/mapping functions don’t all get called every time; in fact, most of the time the expensive ones aren’t called.
So, for timing only, we force the fog-of-war and radar refresh to happen every time. Here we go, this is the timing where we do everything in the update call:
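The shape of that "sometimes skip the expensive bit, unless forced" logic is roughly as follows. This is a sketch under assumptions: g_force_full_update and maybe_run are hypothetical names, and the real scripts may decide when to refresh differently.

```lua
-- Sketch of randomised skipping with a force switch for benchmarking.
-- Both names are hypothetical, not taken from the mod.
g_force_full_update = false

function maybe_run(chance, fn, ...)
    -- math.random() returns a float in [0, 1), so chance = 0 never runs
    -- and chance = 1 always runs.
    if g_force_full_update or math.random() < chance then
        return true, fn(...)
    end
    return false
end
```

For normal play the flag stays false and the expensive work only runs now and then. For timing runs, setting it to true makes every update() do the full amount of work, so the numbers are comparable from run to run.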
------- >
start timer 3 1748090317 10033
done timer 1748090337
calls 5513
calls/sec 275.65
timer armed
------- >
start timer 3 1748090351 10433
done timer 1748090371
calls 5510
calls/sec 275.5
With this (and on the map I have) we have a stable speed of between 270 and 276 calls per second.
Let’s look at the call counts now.

Right, we have a bit more to think about. There are maybe some things we can cache.
- get_is_vehicle_air – 12770
- _get_radar_attachment – 10496
- get_vehicle_team_id – 10676
- get_vehicle_docked – 69743
Ok, let’s tackle get_vehicle_docked(). This is an interesting one, because it has to work in the HUD and in the other scripts, where there are two different ways of detecting that a unit is docked:

Left is before, and right is after. We use a global table to cache for a few seconds (2 sec – 60 ticks) the docked-state of each vehicle we check.
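In plain text, the caching idea looks something like the sketch below. The names g_docked_cache, DOCKED_CACHE_TICKS and get_docked_cached are hypothetical, and the actual docked check is passed in as compute because the two real detection methods are not reproduced here.

```lua
-- Sketch of a per-vehicle cache with a 60 tick lifetime.
-- All names are hypothetical; compute stands in for the real docked check.
g_docked_cache = {}
local DOCKED_CACHE_TICKS = 60

function get_docked_cached(vehicle_id, now_tick, compute)
    local entry = g_docked_cache[vehicle_id]
    -- Reuse the stored answer while it is younger than the lifetime.
    if entry ~= nil and now_tick - entry.tick < DOCKED_CACHE_TICKS then
        return entry.value
    end
    -- Otherwise do the real check and remember when we did it.
    local value = compute(vehicle_id)
    g_docked_cache[vehicle_id] = { tick = now_tick, value = value }
    return value
end
```

Storing the answer inside a small table means a cached false is still a hit. The cost is that a vehicle that docks or undocks can be reported wrongly for up to 60 ticks.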
With these changes we are up to..
------- >
start timer 3 1748092065 31645
done timer 1748092085
calls 5576
calls/sec 278.8
timer armed
------- >
start timer 3 1748092100 32070
done timer 1748092120
calls 5568
calls/sec 278.4
timer armed
------- >
start timer 3 1748092134 32467
done timer 1748092154
calls 5590
calls/sec 279.5
So we’ve gained close to 3 more calls per second.
Ok.. let’s look at “get_is_vehicle_air()” which is a very simple function:
function get_is_vehicle_air(definition_index)
    return definition_index == e_game_object_type.chassis_air_wing_light
        or definition_index == e_game_object_type.chassis_air_wing_heavy
        or definition_index == e_game_object_type.chassis_air_rotor_light
        or definition_index == e_game_object_type.chassis_air_rotor_heavy
end
At first glance, we can’t do much to make this go faster. It’s between 1 and 4 integer comparisons.
Maybe we can..
If we look at “library_enum.lua” we have:
e_game_object_type = {
    chassis_carrier = 0,
    chassis_carrier_broken = 1,
    chassis_land_wheel_light = 2,
    chassis_land_wheel_light_broken = 3,
    chassis_land_wheel_medium = 4,
    chassis_land_wheel_medium_broken = 5,
    chassis_land_wheel_heavy = 6,
    chassis_land_wheel_heavy_broken = 7,
    chassis_air_wing_light = 8,
    chassis_air_wing_light_broken = 9,
    chassis_air_wing_heavy = 10,
    chassis_air_wing_heavy_broken = 11,
    chassis_air_rotor_light = 12,
    chassis_air_rotor_light_broken = 13,
    chassis_air_rotor_heavy = 14,
    chassis_air_rotor_heavy_broken = 15,
    chassis_sea_barge = 16,
    <snip>
The scripts never ever get to see the _broken values, so we can make this go a bit faster by returning true for any value from 8 to 14.
This makes our script look like..
local chassis_air_min = e_game_object_type.chassis_air_wing_light
local chassis_air_max = e_game_object_type.chassis_air_rotor_heavy
function get_is_vehicle_air(definition_index)
    return definition_index >= chassis_air_min and definition_index <= chassis_air_max
end
We have gone from up to 4 global table lookups and up to 4 comparisons to 2 local reads and at most 2 comparisons. I don’t expect this to have made a huge difference, but let’s see if it helps.
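As a sanity check on the range trick, the two versions can be compared over the enum values listed above. This is a standalone sketch: the table e only holds the values shown in the listing (everything after the snip is left out), and is_air_old, is_air_new and air_check_mismatches are hypothetical names.

```lua
-- Compare the four-way equality test with the range test over the
-- enum values quoted in the listing above.
local e = {
    chassis_carrier = 0, chassis_carrier_broken = 1,
    chassis_land_wheel_light = 2, chassis_land_wheel_light_broken = 3,
    chassis_land_wheel_medium = 4, chassis_land_wheel_medium_broken = 5,
    chassis_land_wheel_heavy = 6, chassis_land_wheel_heavy_broken = 7,
    chassis_air_wing_light = 8, chassis_air_wing_light_broken = 9,
    chassis_air_wing_heavy = 10, chassis_air_wing_heavy_broken = 11,
    chassis_air_rotor_light = 12, chassis_air_rotor_light_broken = 13,
    chassis_air_rotor_heavy = 14, chassis_air_rotor_heavy_broken = 15,
    chassis_sea_barge = 16,
}

local function is_air_old(d)
    return d == e.chassis_air_wing_light
        or d == e.chassis_air_wing_heavy
        or d == e.chassis_air_rotor_light
        or d == e.chassis_air_rotor_heavy
end

local air_min = e.chassis_air_wing_light
local air_max = e.chassis_air_rotor_heavy
local function is_air_new(d)
    return d >= air_min and d <= air_max
end

-- Returns the sorted names of every enum entry where the two disagree.
function air_check_mismatches()
    local mismatches = {}
    for name, value in pairs(e) do
        if is_air_old(value) ~= is_air_new(value) then
            mismatches[#mismatches + 1] = name
        end
    end
    table.sort(mismatches)
    return mismatches
end
```

The only disagreements are the _broken entries that sit inside the 8 to 14 range, so the shortcut is safe only as long as the scripts really never receive those values.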
Huh.. maybe it helped more!
start timer 3 1748092857 33063
done timer 1748092877
calls 5713
calls/sec 285.65
timer armed
------- >
start timer 3 1748092892 33491
done timer 1748092912
calls 5667
calls/sec 283.35
timer armed
------- >
start timer 3 1748092932 34073
done timer 1748092952
calls 5616
calls/sec 280.8
We’re now at 280-286 calls/sec.
Ok, let’s do “_get_radar_attachment()”.

Right. timing..
calls 5648
calls/sec 282.4
calls 5417
calls/sec 270.85
calls 5580
calls/sec 279.0
Hm.. not obviously faster.. If anything, worse hmm.
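To put a number on "not obviously faster", here is a small sketch that averages the three runs before and after the radar change, using only the calls/sec figures printed above. summarize, before_radar and after_radar are hypothetical names, and three runs each is a very small sample.

```lua
-- Average and spread of the calls/sec figures from the runs above.
before_radar = { 285.65, 283.35, 280.8 }  -- after the get_is_vehicle_air change
after_radar  = { 282.4, 270.85, 279.0 }   -- after the _get_radar_attachment change

-- Returns mean, lowest and highest value of a non-empty list.
function summarize(values)
    local sum, lo, hi = 0, values[1], values[1]
    for _, v in ipairs(values) do
        sum = sum + v
        if v < lo then lo = v end
        if v > hi then hi = v end
    end
    return sum / #values, lo, hi
end

local b_mean, b_lo, b_hi = summarize(before_radar)
local a_mean, a_lo, a_hi = summarize(after_radar)
print("before", b_mean, b_lo, b_hi)
print("after ", a_mean, a_lo, a_hi)
```

The average drops from roughly 283 to roughly 277 calls/sec, but the two ranges overlap (the best run after the change beats the worst run before it), so this could be a small regression or just noise.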
Hmm.. I will investigate further!


