How to benchmark a smartphone
Benchmarking a smartphone would be both useful from a buying point of view and interesting for the technically inclined - the question is how to go about it. What sort of benchmarks are even useful for a smartphone?
Benchmarks should reflect how you're actually going to use the device:
Regardless of whether you're using a synthetic or real world test, the aim of any benchmark is not to generate pretty graphs and big numbers, but to provide insight into how a device is going to perform and which product, out of a field of very similar devices, is better.
You can divide almost all benchmarks into two types, synthetic and real world. We always prefer the latter; if you're looking to buy a graphics card, it's far more useful to know whether graphics card A or B will run Bad Company 2 faster than to know which has the higher fill rate. That's not to say synthetic benchmarks are completely without use. If you're looking at a chip for the first time and want to get an idea of how it might perform in the future, or want to test a company's assertion about its design, then they can be handy - we use synthetic benchmarks in some of our CPU tests, for instance.
There are other elements of browser performance you could test. HTML5’s canvas element lets a page draw graphics from script, and it would be fairly easy to create a webpage with some canvas elements and an FPS counter.
As all smartphones and tablets allow you to access the web, it’s a good way of creating a cross-platform test. Here’s a video of canvas performance on my iPhone 3G (412MHz Samsung-made ARM11 CPU), the iPad (1GHz Apple-made ARM Cortex-A8 CPU) and the JooJoo pad (1.6GHz Intel Atom CPU).
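To give an idea of how little code the FPS counter part of such a page needs, here is a minimal sketch. It is not the code behind the page in the video; the `FpsCounter` name, its `tick` method and the one-second averaging window are all assumptions made for illustration. The counter only does arithmetic on timestamps, so the same logic can run in any browser that can report the time of each drawn frame.

```typescript
// Hypothetical FPS counter: feed it one timestamp per drawn frame.
class FpsCounter {
  private frames = 0;
  private windowStart: number | null = null;
  private windowMs: number;

  // windowMs: how long to accumulate frames before reporting a rate
  // (1000ms is an arbitrary choice for this sketch).
  constructor(windowMs: number = 1000) {
    this.windowMs = windowMs;
  }

  // Call once per frame with a timestamp in milliseconds.
  // Returns the average FPS when a window completes, otherwise null.
  tick(nowMs: number): number | null {
    if (this.windowStart === null) {
      this.windowStart = nowMs; // the first frame only starts the clock
      return null;
    }
    this.frames += 1;
    const elapsed = nowMs - this.windowStart;
    if (elapsed < this.windowMs) {
      return null;
    }
    const fps = (this.frames * 1000) / elapsed;
    this.frames = 0;
    this.windowStart = nowMs;
    return fps;
  }
}

// Assumed browser wiring (not executed here): call counter.tick(now)
// at the end of each drawing callback, for example one scheduled with
// requestAnimationFrame, and display any non-null result on the page.
```

Averaging over a window rather than reporting the gap between two frames keeps the number readable, at the cost of hiding brief stutters.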
Like SunSpider, though, this test is only semi-real world; it isn’t a real website you’d actually access day-to-day. And of course, the web isn’t everything, certainly not on the iPhone, where apps are hugely important. This brings me on to the second point:
Benchmarks need to be comparable:
We stick with the same tests for other components over a number of months, so you can see how a new CPU compares to one from a few months ago, but the variety in the smartphone market makes this difficult. Different operating systems (including one that won’t let you load your own code) mean only in-browser tests are cross-platform compatible. Does that even matter, though? Do you need to see the new iPhone 4 benchmarked against, say, the latest Android phone such as the Droid X, or should the iPhone only be tested against other iOS devices?
Benchmarks should be fair:
Benchmarks need to be repeatable, reliable and not give any one system an inherent advantage. The page I used for canvas performance above features multiple examples – and they all run at different speeds across the different devices. In particular, the JooJoo’s performance seems very variable, depending on which element it’s drawing.
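One rough way to put a number on "repeatable" is to run the same test several times and look at how much the results spread. The sketch below is only an illustration of that idea: the `summariseRuns` and `isRepeatable` names are hypothetical, the 5 per cent cut-off is an arbitrary assumption, and it assumes scores are positive numbers such as frames per second.

```typescript
interface RunSummary {
  mean: number;
  stdDev: number;
  cv: number; // coefficient of variation: stdDev divided by mean
}

// Summarise repeated runs of one test on one device.
function summariseRuns(samples: number[]): RunSummary {
  if (samples.length < 2) {
    throw new Error("need at least two runs to measure spread");
  }
  const mean = samples.reduce((sum, x) => sum + x, 0) / samples.length;
  // Sample variance (divides by n - 1).
  const variance =
    samples.reduce((sum, x) => sum + (x - mean) * (x - mean), 0) /
    (samples.length - 1);
  const stdDev = Math.sqrt(variance);
  return { mean, stdDev, cv: stdDev / mean };
}

// Treat a test as repeatable if the spread is small relative to the mean.
// maxCv = 0.05 is an assumed threshold, not an established standard.
function isRepeatable(samples: number[], maxCv: number = 0.05): boolean {
  return summariseRuns(samples).cv <= maxCv;
}
```

A device whose frame rate swings widely between runs, or between canvas examples, would show a high coefficient of variation here, which is a hint that a single headline figure would be misleading.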
There are some things you can’t benchmark:
Particularly with smartphones, where user experience and hardware are more tightly bound together than with desktop PCs, there are many elements that seem resistant to benchmarking. How do you test a device’s responsiveness? How easy or pleasant it is to use? Can you test these things?
Perhaps. How about measuring how many clicks it takes to send an email, or using a camera to time how long common apps (browser, music player) take to open up? How does this differ between a phone that has been on for an hour and one that has been on for five days? Are download speed tests useful? What about how quickly the GPS can plot a route?
How about how long it takes to sync a “typical” load of files? Since it’s bit-tech, how about how easy a smartphone is to mod?
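The camera idea mentioned above comes down to simple arithmetic: film the screen, count the video frames between the tap and the app being ready, and divide by the camera's frame rate. The sketch below shows that conversion under stated assumptions; the function names, the frame numbers and the 60fps camera are all hypothetical, and deciding which frame counts as "ready" is still a judgement call.

```typescript
// Convert a count of video frames into elapsed milliseconds.
function framesToMs(frameCount: number, cameraFps: number): number {
  if (cameraFps <= 0) throw new Error("cameraFps must be positive");
  if (frameCount < 0) throw new Error("frameCount must not be negative");
  return (frameCount / cameraFps) * 1000;
}

interface LaunchTiming {
  app: string;
  elapsedMs: number;
  resolutionMs: number; // duration of one video frame: the smallest step we can see
}

// tapFrame: frame where the finger touches the icon.
// readyFrame: frame where the app first looks usable.
function timeLaunch(
  app: string,
  tapFrame: number,
  readyFrame: number,
  cameraFps: number
): LaunchTiming {
  return {
    app,
    elapsedMs: framesToMs(readyFrame - tapFrame, cameraFps),
    resolutionMs: framesToMs(1, cameraFps),
  };
}

// Ratio of launch time after long uptime to launch time when freshly booted.
function slowdown(fresh: LaunchTiming, aged: LaunchTiming): number {
  return aged.elapsedMs / fresh.elapsedMs;
}
```

With made-up frame numbers, `timeLaunch("browser", 10, 100, 60)` gives a 1500ms launch, and comparing it with a 2250ms launch after several days of uptime gives a slowdown of 1.5. The resolution figure is a reminder that a faster camera gives a finer measurement.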
There are some apps available across a wide variety of mobile operating systems – Spotify is a good example, Twitter another – but they don’t always use the same codebase. Even so, should we still compare their performance? Arguably, this is a real world test, as you’re testing apps people actually use.
Really though, this brings me to the final point, and one where I’d like your input. I’d like to know whether you think we’re on the right track in thinking about these things, and what sort of tests you’d like to see if and when we do start running more smartphone coverage. Tell me in the forums.