Mobile benchmark cheating has a long history (at least in smartphone-industry years), and it has been a contentious topic at AnandTech for several years now.

I remember back in 2013 when I tipped off Brian and Anand about some of the shenanigans Samsung was pulling with the GPU of the Exynos chipset in the Galaxy S4, which set off a broader analysis of the practice among many of the mobile device vendors of the time, with all of them found guilty. The Samsung case eventually ended in a successful $13.4 million lawsuit against the company, with yours truly and AnandTech even mentioned in the court case.

Naming and shaming worked in the following years, as vendors quickly abandoned these methods for fear of media backlash: the negatives far outweighed the positives.

In recent years, however, we have seen a strong resurgence of these methods, particularly among Chinese vendors. For our Western audience, the most notable case was Huawei a couple of generations ago, with mechanisms that essentially disabled the phones' thermal throttling, letting the most demanding benchmarks push the SoC to its maximum until thermal shutdown. Naming and shaming helped here again, as the company went from using hidden mechanisms to something much more honest, transparent and far less problematic on follow-up devices.

The problem is that the Chinese vendor market is still huge and we are unable to analyze every single device and vendor out there. Benchmark cheating there has continued to be a very real problem and a common practice. Huawei's rationale at the time was that they felt they needed to do it because others did it too, and they didn't want to lose face against the competition when it comes to the marketing power of benchmark numbers.

The one big difference is that there has always been a bit of a firewall in our coverage between what a device vendor does and what the chip vendors enable them to do, and this is where we come to MediaTek's behavior in recent years. In most cases we have blamed device vendors for cheating, since the mechanisms and the initiative were theirs; we had no evidence implicating chipset vendors, at least until now.

A Helio P95 outperforming the Dimensity 1000L?!

It all came to light when I first received Oppo's new Reno3 Pro, the European version with MediaTek's Helio P95 chipset. The phone surprised me a bit at first, because in system benchmarks such as PCMark it was punching far above its weight and above what I expected from a Cortex-A75 class SoC. Things got stranger when I got a Chinese Reno3 with the MediaTek Dimensity 1000L, a much more powerful and more recent chip that for some reason performed worse than its P95 sibling. It is when you see odd results like these that the alarm bells ring, because something is wrong.

It all ended up as a trip down the rabbit hole.

Real performance vs cheating performance

(Oppo Reno3 Pro P95)

Naturally, and unfortunately, my first thought was that some sort of cheating was going on. We reached out to our friends at UL for an anonymized version of PCMark; the team there has been of great help in the past in discouraging cheating behavior in the industry. Unsurprisingly, the two versions of the benchmark differed in their scores, but I was still amazed at the magnitude of the delta: a 30% difference in the overall score, and up to 75% in major subtests such as the writing workload.
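To make the arithmetic explicit, here is a minimal sketch of how such a delta can be computed. It assumes the anonymized build is the baseline, the function name is hypothetical, and the scores are placeholders chosen only to reproduce the percentages quoted above; they are not measured results.

```python
def boost_percent(public_score: float, anonymized_score: float) -> float:
    """Percentage by which the public (detected) build outscores the anonymized build."""
    if anonymized_score <= 0:
        raise ValueError("anonymized_score must be positive")
    return (public_score - anonymized_score) / anonymized_score * 100.0


# Placeholder numbers for illustration only; they are NOT measured results.
placeholder_scores = {
    "overall": (9100, 7000),   # (public build, anonymized build)
    "writing": (10500, 6000),
}

for test_name, (public, anonymized) in placeholder_scores.items():
    print(f"{test_name}: +{boost_percent(public, anonymized):.0f}%")
```

Any device that treats the benchmark like a normal app should land close to 0% here, give or take run-to-run variance.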

A bit of background on PCMark and why we use it: it is not a benchmark that is usually targeted for detection and cheating, because it is a system benchmark that tries to be representative of real-world workloads and of a device's responsiveness. While hardware definitely plays a role in the score, it is primarily influenced by software and by mechanisms such as DVFS and schedulers. There is also the fact that it is a combined performance and battery benchmark: if you cheat on one aspect of the test by increasing performance, you simply handicap yourself on the battery test. It is therefore unusual for this benchmark to be manipulated, as in a sense you are shooting yourself in the foot at the same time.

I also have a Snapdragon 765G variant of the Reno3 Pro, the Chinese model of the phone (while sharing the same name, they are quite different devices). If Oppo were the source of this mechanism, surely this device would also detect PCMark and cheat. In reality it does not: the device behaves in benchmarks just as it does in any other app.

Digging a little deeper into the MediaTek versions of the Reno3, it turned out that the whole cheating mechanism had seemingly been sitting in plain sight for several years:

Reno3 Pro – Sport Mode Benchmark White List

In the device's firmware files there is a power_whitelist_cfg.xml file, most commonly found in the phone's /vendor/etc folder. Inspecting the file, among what appears to be a list of popular applications with various power-management tweaks applied, lo and behold, there is also a list of various benchmarks. We find the APK ID for PCMark and see that some power-management hints are configured for it, a common one being called "Sport Mode".

The list of benchmarks is not exhaustive, but it contains the most popular benchmarks in the industry today: GeekBench, AnTuTu and 3DBench, PCMark, and some older ones such as Quadrant or the popular Chinese benchmark 鲁大师 / Master Lu. There is also a storage benchmark, AndroBench2, which is a bit odd; more on this later.
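Checking an extracted file for such entries is easy to script. The following is a minimal sketch under stated assumptions: the XML layout (element and attribute names) is invented for illustration because the real schema is not reproduced here, the package IDs are illustrative and should be verified against the actual apps, and the marker list is my own.

```python
import xml.etree.ElementTree as ET

# Hypothetical whitelist excerpt: the element and attribute names are invented
# for illustration and are not taken from a real firmware file.
SAMPLE_XML = """
<whitelist>
  <app package="com.example.chat" hint="power_save"/>
  <app package="com.futuremark.pcmark.android.benchmark" hint="sport_mode"/>
  <app package="com.primatelabs.geekbench5" hint="sport_mode"/>
</whitelist>
"""

# Substrings that suggest a benchmark package (assumed list, extend as needed).
BENCHMARK_MARKERS = ("pcmark", "geekbench", "antutu", "androbench", "gfxbench", "quadrant")


def find_benchmark_entries(xml_text: str) -> list[str]:
    """Return every attribute value or text node that mentions a known benchmark."""
    root = ET.fromstring(xml_text)
    hits = []
    for element in root.iter():
        # Look at attribute values and element text, since the schema is unknown.
        candidates = list(element.attrib.values())
        if element.text and element.text.strip():
            candidates.append(element.text.strip())
        for value in candidates:
            if any(marker in value.lower() for marker in BENCHMARK_MARKERS):
                hits.append(value)
    return hits


print(find_benchmark_entries(SAMPLE_XML))
```

Because the scan walks every attribute and text node rather than relying on specific tag names, it should keep working even if a vendor's file uses a different structure.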

The newest additions are a number of AI benchmarks, including Master Lu's AIMark and the ETH AI Benchmark, both of which we actively use at AnandTech to cover those aspects of SoCs and devices.

Reno3 Pro – Targeting non-public benchmarks

What really shocked me was the inclusion of a corporate version of Kishonti's GFXBench. It does not have the sport mode power hint configured in the list, but the entry evidently alters the default DVFS, thermal and scheduler settings when the app is used. This is a huge red flag, because at this point we are no longer talking only about benchmarks available to the general public, but also about variants that are used only by a small group of people, including media publications like us. This is something to keep in mind for later in the piece.

Sport mode on Reno 3 (Dimensity 1000L)

Sport mode on Reno 3 Pro (P95)

So what does this "sport mode" really do? First, it appears to pin some of the SoC's DVFS features, such as running the memory controller at full frequency all the time. The scheduler is also set to be much more aggressive in its load tracking, which means workloads can make the CPU cores ramp up in frequency faster and stay there for longer, with some kind of boost mechanisms applied on top.
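A toy model helps show why more aggressive load tracking inflates scores. The sketch below is not MediaTek's implementation: the frequency steps, thresholds, hold times and load values are all hypothetical, and real governors are considerably more elaborate. It only illustrates the idea of ramping up sooner and stepping down later.

```python
# Toy DVFS governor: NOT MediaTek's implementation, just a model of the idea.
FREQ_STEPS_MHZ = [800, 1200, 1600, 2000]  # hypothetical CPU frequency steps


def simulate(loads, up_threshold, hold_ticks):
    """Return the frequency chosen at each tick for a sequence of load values (0.0-1.0).

    up_threshold: load above which the governor steps up one frequency level.
    hold_ticks:   how many consecutive low-load ticks must pass before stepping down.
    """
    level, calm_ticks, trace = 0, 0, []
    for load in loads:
        if load > up_threshold:
            level = min(level + 1, len(FREQ_STEPS_MHZ) - 1)
            calm_ticks = 0
        else:
            calm_ticks += 1
            if calm_ticks >= hold_ticks:
                level = max(level - 1, 0)
                calm_ticks = 0
        trace.append(FREQ_STEPS_MHZ[level])
    return trace


# Hypothetical bursty workload, loosely resembling an interactive app.
bursty_load = [0.5, 0.7, 0.9, 0.3, 0.6, 0.2, 0.2, 0.9]

default_trace = simulate(bursty_load, up_threshold=0.8, hold_ticks=1)
sport_trace = simulate(bursty_load, up_threshold=0.4, hold_ticks=4)

print("default:", default_trace)
print("sport:  ", sport_trace)
```

With the lower threshold and longer hold time, the same workload spends most of its time at the top frequency step, which is good for a benchmark score and bad for power draw.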

I'm not sure what the _FPS_ entries do, but given their obvious naming they are altering something to improve the benchmark numbers. The strangest thing here are the entries that increase filesystem speed on devices using F2FS, probably because benchmarks like AndroBench are also targeted.

It's (mostly) all MediaTek devices

Here is the real kicker, though: these files are not only present on OPPO devices, they are found on a whole range of phones from various vendors across the spectrum. I was able to get my hands on firmware extracts from various devices (I did not actually own all the phones listed here), each of which had a similar power_whitelist_cfg.xml in its vendor partition, with almost identical benchmark list entries. Here is a breakdown:

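MediaTek devices and benchmark cheating

| Vendor | Oppo | Oppo | Oppo | Oppo | Vivo | Xiaomi | Realme | iVoomi | Sony |
|---|---|---|---|---|---|---|---|---|---|
| Device | Reno3 Pro | Reno Z | F15 | F9 Pro | S1 | Note 8 Pro | C3 | i2 Lite | XA1 |
| SoC | P95 | P90 | P70 | P60 | P65 | G90 | G70 | A22 | P20 |
| AndroBench2 | ✓ | ✓ | * | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| PCMark | ✓ | ✓ | * | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Antutu | ✓ | ✓ | * | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Antutu 3DBench | ✓ | ✓ | * | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| GeekBench | ✓ | ✓ | * | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Quadrant | ✓ | ✓ | * | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Quadrant Professional | ✓ | ✓ | * | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 鲁大师 / Master Lu | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| 鲁大师 / AIMark | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ |
| AI Benchmark (ETH) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ |
| NeuralScope Benchmark | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ |
| GFXBench 4 Corporate | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ | ✓ | ✗ | ✗ |

* Present but commented out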

What is shocking here is the sheer variety of devices on which this is present. The oldest device is a Sony XA1 with a P20 from 2016, which suggests the mechanism has been around for quite some time. That device also had the least complete list of benchmarks, notably lacking the more recent AI tests.

The fact that Sony had this in its files is very worrying, as it is supposed to be a "clean" vendor that avoids such practices. What is clear is that this mechanism does not originate from the individual vendors, but comes from MediaTek and is integrated into the SoC's BSP (Board Support Package).

Oppo Reno3 Pro (P95) – New firmware vs initial firmware (lists gone)

What is even more suspicious, and where we were very lucky to catch this, is that these lists now seem to be hidden. I had extracted the files from my Reno3 Pro while it was on its out-of-the-box firmware. Over the past few weeks OPPO pushed a firmware update to the phone, and when I went back to double-check something in the file, I was surprised to see that the benchmark entries had disappeared.

Has the mechanism been disabled? Did they stop cheating? Unfortunately not. I don't know where the entries have been moved to, but the phone still activates its Sport mode in the benchmarks, with the same big performance boost. The entries were not removed; they were simply hidden elsewhere.

Reaching out to MediaTek and their response

We were extremely concerned about all these findings and contacted MediaTek several weeks ago. We explained our findings and our concerns about an SoC vendor being the one to provide this mechanism. We recently received an official response from them, quoted as follows:

MediaTek statement for AnandTech

MediaTek follows accepted industry standards and is confident that benchmarking tests accurately represent the capabilities of our chipsets. We work closely with global device manufacturers when it comes to testing and benchmarking devices based on our chipsets, but in the end brands have the flexibility to configure their devices as they see fit. Many companies design devices capable of operating at the highest possible levels of performance when performing benchmarking tests in order to show all the features of the chipset. This reveals what the upper limit of performance capabilities is on a given chipset.

Obviously, in the real world scenarios there are a multitude of factors that will determine the performance of the chipsets. MediaTek chipsets are designed to optimize power and performance to deliver the best possible user experience, maximizing battery life. If someone runs a computationally intensive program like a demanding game, the chipset will intelligently adapt to processing models to deliver sustained performance. This means that a user will see different levels of performance from different apps as the chipset dynamically manages CPU, GPU and memory resources based on the power and performance required for a great user experience. In addition, some brands have different types of modes activated in different regions, so the performance of the device may vary according to the needs of the regional market.

We believe that showing the full capabilities of a chipset in benchmarking tests is in line with other companies’ practices and provides consumers with an accurate picture of the device’s performance.

The statement is generally disappointing, but let’s go over some key points that the company is trying to make.

The statement tries to say that by forcing the various configurable knobs, the benchmark scores will better represent the hardware capabilities of the SoC. In a sense this is actually true, and it has been a contentious point of discussion throughout the whole benchmark cheating debacle with various vendors over the years. It is only when a vendor suddenly unlocks otherwise unattainable performance states in these benchmarks that the argument no longer holds. At least at first glance, this doesn't seem to be the case with MediaTek, although I don't have more detailed technical information on what some of the "Sport mode" configuration options do.

The problem with that argument, however, is that it breaks down when the cheating is not limited to benchmarks of an SoC's actual hardware components, such as GeekBench testing CPU speed or GFXBench testing how fast a GPU can be, but also covers benchmarks that actively seek to measure user experience, such as PCMark. That is a real-world workload that tries to convey the responsiveness of the phone as a whole, not just of the chipset.

The fact that MediaTek cheats in such a test goes directly against the second paragraph's notion of chipsets delivering optimized real-world performance. If that is the case, wouldn't it be better to let the chipset and software demonstrate it honestly? And what does cheating in storage and filesystem benchmarks have to do with chipset capabilities?

MediaTek's claim about vendors offering dedicated performance modes is correct. Notably, these were introduced, at least by vendors like Huawei, as a direct result of us calling them out on their devices' opaque default cheating behavior.

High performance mode required on OPPO devices.

On Oppo devices, and on many other devices from Chinese vendors, a "High performance mode" option is available in the settings. It differs somewhat from the usual "High Performance" modes we are used to from vendors like Samsung or, more recently, Huawei, in that it is essentially just a switch that biases DVFS and scheduling towards performance. It is also present on Snapdragon phones, and we talked about it in our review of the Reno 10x last year. The phone essentially enters a high-power mode that abandons any attempt at efficiency; it is a pointless mode with no use in everyday scenarios other than obtaining high benchmark scores.

The thing is that we, as educated users, and MediaTek, as an SoC vendor, shouldn't be concerned with these operating modes.

I still consider it a good compromise between shipping phones in an honest "default" state and still giving people (and reviewers) the opportunity to reach unrestricted, super-high benchmark figures if they wish. The difference is the transparency of the mechanism: Oppo, for example, tells you that your device will heat up. MediaTek's benchmark detection is hidden.

MediaTek also refers to the "market requirements" that cause them to do this and to it being an "industry standard"; unfortunately this is again true, and it gets to the crux of the matter.

These mechanisms would not exist if vendors had not asked MediaTek to provide such solutions. From MTK's point of view, they are just trying to meet a customer's needs and keep them happy. There is the question of who actually got there first: did MTK develop the detection on its own, or did a customer request it from them at some point in the past?

In the absence of evidence that other SoC vendors provide similar mechanisms to device vendors, what is clear is that MediaTek should simply have stayed out of the mess, as they have more to lose than to gain.

All that has been achieved now is the impression that the company's chipset software is not optimized well enough to offer consistent performance and efficiency by default, and that it needs a manual nudge to meet the benchmark expectations for the chipset.

I've definitely lost a lot of faith in the numbers, and in general I'm now more skeptical of the benchmark figures I'm running, especially at a time when I was excited to see MediaTek return to the high end with the Dimensity 1000 (which is apparently a great chipset; review to follow in the future).

With the cat out of the bag and the evidence out there, I'm sure other media with access to more MediaTek devices will be able to check whether they are cheating or not. Naming and shaming worked in the past for Samsung and other vendors, and it worked for Huawei's missteps a few years ago; both are now on a better path. I just hope MediaTek will be able to correct its trajectory here too, take the high road, remove the mechanisms, and say "no" to its customers when they request such a feature again.