Looking at CPU/GPU Benchmark Optimizations in Galaxy S4
Running Epic Citadel - 480 MHz
Firing up GLBenchmark 2.5.1, however, triggers a GPU clock not available elsewhere: 532MHz. The same is true for AnTuTu and Quadrant.
Running AnTuTu – 532 MHz SGX Clock
Interestingly enough, GFXBench 2.7.0 (formerly GLBenchmark 2.7.0) is unaffected. We confirmed with Kishonti, the makers of the benchmark, that the low level tests are identical between the two benchmarks. The results of the triangle throughput test offer additional confirmation for the frequency difference:
|GT-I9500 Triangle Throughput Performance|
|Benchmark|GPU Freq|Run 1|Run 2|Run 3|Run 4|Run 5|Average|
|GFXBench 2.7.0 (GLBenchmark 2.7.0)|480MHz|37.9M Tris/s|37.9M Tris/s|37.7M Tris/s|37.7M Tris/s|38.3M Tris/s|37.9M Tris/s|
|GLBenchmark 2.5.1|532MHz|43.1M Tris/s|43.2M Tris/s|42.8M Tris/s|43.4M Tris/s|43.4M Tris/s|43.2M Tris/s|
We should see roughly an 11% increase in performance in GLBenchmark 2.5.1 over GFXBench 2.7.0, and we end up seeing a bit more than that. The reason for the difference? GLBenchmark 2.5.1 appears to be singled out as a benchmark that is allowed to run the GPU at the higher frequency/voltage setting.
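The arithmetic behind that expectation is easy to check. Here is a minimal sketch in Python, assuming triangle throughput scales linearly with GPU clock (a simplification for illustration). The clocks and averages are the ones from the table above; the function name is just a helper for this example.

```python
# Clocks and five-run averages taken from the triangle throughput table.
BASE_CLOCK_MHZ = 480    # GFXBench 2.7.0
BOOST_CLOCK_MHZ = 532   # GLBenchmark 2.5.1
BASE_MTRIS = 37.9       # average, millions of triangles per second
BOOST_MTRIS = 43.2


def percent_gain(new: float, old: float) -> float:
    """Return the relative increase of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


# Assumption: throughput scales linearly with GPU frequency.
expected_gain = percent_gain(BOOST_CLOCK_MHZ, BASE_CLOCK_MHZ)
observed_gain = percent_gain(BOOST_MTRIS, BASE_MTRIS)

print(f"expected from clock alone: {expected_gain:.1f}%")
print(f"observed in triangle test: {observed_gain:.1f}%")
```

The clock ratio alone accounts for a little under 11%, while the measured averages differ by about 14%, which is the "bit more" noted above.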
The CPU is also Affected
The original post on B3D focused on GPU performance, but I was curious to see if CPU performance responded similarly to these benchmarks.
Using System Monitor I kept an eye on CPU frequency while running the same tests. Firing up GLBenchmark 2.5.1 causes a switch to the ARM Cortex A15 cluster, with a default frequency of 1.2GHz. The CPU clocks never drop below that, even when just sitting idle at the menu screen of the benchmark.
Run GFXBench 2.7, however, and the SoC switches over to the Cortex A7s running at 500MHz (250MHz virtual frequency). It would appear that only GLB2.5.1 is allowed to run in this higher performance mode.
A quick check across AnTuTu, Linpack, Benchmark Pi, and Quadrant reveals the same behavior. The CPU governor is fixed at a certain point when any of those benchmarks is launched.
Linpack for Android: Exynos 5 Octa all cores 1.6 GHz (left), Snapdragon 600 all cores 1.9 GHz (right)
Interestingly enough, the same behavior (on the CPU side) can be found on Qualcomm versions of the Galaxy S 4 as well. In these select benchmarks, the CPU is set to its maximum frequency at app launch and stays there for the duration. All cores are plugged in as soon as the application starts, regardless of load.
Note, however, that the CPU behavior is different from what we saw on the GPU side. These CPU frequencies are available for all apps to use; they are simply forced to maximum (and in the case of Snapdragon, all cores are plugged in) for these benchmarks. The 532MHz max GPU frequency, on the other hand, is only available to these specific benchmarks.
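For readers who want to reproduce the CPU observation without a monitoring app, the sketch below samples the standard Linux cpufreq and hotplug sysfs nodes. It is only an illustration: it assumes the device exposes `/sys/devices/system/cpu/online` and per-core `scaling_cur_freq` files and that you have a shell with a Python interpreter able to read them. The helper names are made up for this example, and this is not necessarily how System Monitor gathers its data.

```python
from __future__ import annotations

from pathlib import Path

# Standard Linux sysfs location; assumed to be readable on the device.
CPU_ROOT = Path("/sys/devices/system/cpu")


def parse_cpu_list(text: str) -> list[int]:
    """Expand a kernel CPU list such as '0-1,3' into [0, 1, 3]."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def snapshot(root: Path = CPU_ROOT) -> dict[int, int]:
    """Map each online core to its current frequency in kHz."""
    online = parse_cpu_list((root / "online").read_text())
    freqs: dict[int, int] = {}
    for cpu in online:
        node = root / f"cpu{cpu}" / "cpufreq" / "scaling_cur_freq"
        try:
            freqs[cpu] = int(node.read_text())
        except (OSError, ValueError):
            continue  # core went offline mid-read or node is unreadable
    return freqs


def looks_pinned(samples: list[dict[int, int]], max_khz: int) -> bool:
    """True if every sampled core sat at max_khz in every sample."""
    return bool(samples) and all(
        s and all(f == max_khz for f in s.values()) for s in samples
    )
```

Collecting a handful of `snapshot()` results while a benchmark idles at its menu and passing them to `looks_pinned(samples, 1_600_000)` (1.6GHz on the Exynos 5 Octa) would flag the forced-maximum behavior described above. A normal app should show cores dropping in frequency or going offline between samples.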
At this point the benchmarks allowed to run at higher GPU frequencies would seem arbitrary. AnTuTu, GLBenchmark 2.5.1 and Quadrant get fixed CPU frequencies and a 532MHz max GPU clock, while GFXBench 2.7 and Epic Citadel don’t. Poking around I came across the application changing the DVFS behavior to allow these frequency changes – TwDVFSApp.apk. Opening the file in a hex editor and looking at strings inside (or just running strings on the .odex file) pointed at what appeared to be hard coded profiles/exceptions for certain applications. The string "BenchmarkBooster" is a particularly telling one:
You can see specific Android Java naming conventions immediately in the highlighted section. Quadrant standard, advanced, and professional, Linpack (free, not paid), Benchmark Pi, and AnTuTu are all called out specifically. Nothing for GLBenchmark 2.5.1 though, despite its similar behavior.
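The string hunt described above can be approximated in a few lines of Python. This is a rough sketch of what the `strings` utility does (pulling printable ASCII runs out of a binary), followed by a filter for Java-style dotted package names. The regular expressions and function names are my own assumptions, and the sample bytes are synthetic: `com.example.fakebench.app` is a made-up package, not one found in TwDVFSApp.

```python
import re

# Dotted, Java-style identifiers with at least three components.
PACKAGE_RE = re.compile(r"\b[a-z][a-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*){2,}\b")


def extract_strings(blob: bytes, min_len: int = 4) -> list:
    """Return printable ASCII runs of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(blob)]


def find_package_names(blob: bytes) -> list:
    """Return unique package-like names in order of first appearance."""
    found = []
    for text in extract_strings(blob):
        for match in PACKAGE_RE.finditer(text):
            if match.group() not in found:
                found.append(match.group())
    return found


# Synthetic stand-in for a binary; real use would read the .odex bytes.
sample = b"\x00\x01BenchmarkBooster\x00com.example.fakebench.app\x00\xffab\x00"

print(extract_strings(sample))
print(find_package_names(sample))
print(any("BenchmarkBooster" in s for s in extract_strings(sample)))
```

Against a real file you would pass `Path(...).read_bytes()` instead of `sample`. The package filter is deliberately loose and will pick up class names and other dotted identifiers too, so the output still needs a human eye.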
We can also see the files that get touched by TwDVFSApp while it is running:
When the TwDVFSApp application grants special DVFS status to an application, the boost_mode file goes from value 0 to 1, making it easy to check if an affected application is running. For example, launching and closing Benchmark Pi:
shell@android:/sys/class/thermal/thermal_zone0 $ cat boost_mode
1
shell@android:/sys/class/thermal/thermal_zone0 $ cat boost_mode
0
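The same check can be scripted so that transitions are caught while apps are launched and closed. Below is a small sketch that polls the `boost_mode` node shown in the shell session. The path comes from that session; the function names, polling interval, and sample count are arbitrary choices for illustration, and it assumes a Python interpreter with read access to the node.

```python
import time
from pathlib import Path

# Path taken from the shell session above.
BOOST_MODE = Path("/sys/class/thermal/thermal_zone0/boost_mode")


def is_boosted(node: Path = BOOST_MODE) -> bool:
    """Return True when the node currently reads 1."""
    return node.read_text().strip() == "1"


def watch(node: Path = BOOST_MODE, interval_s: float = 0.5, samples: int = 20):
    """Yield (sample_index, boosted) each time the state changes."""
    last = None
    for i in range(samples):
        state = is_boosted(node)
        if state != last:
            yield i, state
            last = state
        time.sleep(interval_s)
```

Iterating over `watch()` while opening and closing a suspect application should report a change to `True` at launch and back to `False` at exit if the app is being granted the special DVFS status.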
There are strings for Fusion3 (the Snapdragon 600 + MDM9x15 combo) and Adonis (the codename for Exynos 5 Octa):
What's even more interesting is that TwDVFSApp appears to have an architecture that lets other benchmark applications, ones not specifically in the whitelist, request BenchmarkBoost mode via an intent, since the application is also a broadcast receiver.
So not only can we observe the behavior and empirically test which applications are affected, we also have what appears to be the whitelist and the mechanism by which TwDVFSApp grants special DVFS status to certain applications.
Why this Matters & What’s Next
None of this ultimately impacts us. We don’t use AnTuTu, BenchmarkPi or Quadrant, and moved off of GLBenchmark 2.5.1 as soon as 2.7 was available (we dropped Linpack a while ago). The rest of our suite isn’t impacted by the aggressive CPU governor and GPU frequency optimizations on the Exynos 5 Octa based SGS4s. What this does mean however is that you should be careful about comparing Exynos 5 Octa based Galaxy S 4s using any of the affected benchmarks to other devices and drawing conclusions based on that. This seems to be purely an optimization to produce repeatable (and high) results in CPU tests, and deliver the highest possible GPU performance benchmarks.
We’ve said for years now that the mobile revolution has mirrored, and will continue to mirror, the PC industry, and thus it’s no surprise to see optimizations like this employed. Just because we’ve seen things like this happen in the past, however, doesn’t mean they should happen now.
It's interesting that this is sort of the reverse of what we saw GPU vendors do in FurMark. For those of you who aren't familiar, FurMark is a stress testing tool that tries to get your platform to draw as much power as possible. In order to avoid creating a situation where thermals were higher than they'd be while playing a normal game (and to avoid damaging graphics cards without thermal protection), we saw GPU vendors limit the clock frequency of their GPUs when they detected these power-virus style apps. In a mobile device I'd expect even greater sensitivity to something like this. I suspect we'll eventually get to that point. I'd also add that just as we've seen this sort of thing many times in the PC space, the same is likely true for mobile. The difficulty is in uncovering when something strange is going on.
What Samsung needs to do going forward is either open up these settings for all users/applications (e.g. offer a configurable setting that fixes the CPU governor in a high performance mode, and unlocks the 532MHz GPU frequency) or remove the optimization altogether. The risk of doing nothing is that we end up in an arms race between all of the SoC and device makers where significant amounts of time and engineering effort are spent on gaming the benchmarks rather than improving user experience. Optimizing for user experience is all that’s necessary. Good benchmarks benefit indirectly, and those that don’t will eventually become irrelevant.