Apple iPhone 6 (Apple A8) performance review: CPU and GPU compared to the best Android phones out there
Looking at numbers without fully understanding them, though, is a dangerous business. This iPhone 6 performance review aims to clear up some of the widespread misunderstandings and give a more detailed overview of the state of mobile CPUs, and of how Apple’s efforts compare to those of its main rival: the mostly Qualcomm-powered Android fleet.
Apple A8 and ARM's architecture license
When it comes to the CPU, it’s worth starting off with a quick refresh on the facts. The overwhelming majority of mobile devices - be it Android, Windows Phone, or iOS ones - are based on ARM-derived architectures. ARM offers two types of licenses to its clients: a processor license and an architecture license.
Most manufacturers use the processor license, which grants them the right to take an ARM-designed core and use it in their SoC. Examples of ARM-designed cores include the battery-optimized Cortex A7 (and its newer, 64-bit successor, the Cortex A53) and the Cortex A15 (with its newer, 64-bit heir, the Cortex A57). Phone makers like Samsung, for instance, take such a pair of cores and combine them in various big.LITTLE configurations to come up with SoCs like the Exynos 5430 in the Galaxy Alpha, where the company combines four power-efficient Cortex A7 cores running at lower clock speeds with four performance-driven Cortex A15 cores that can reach higher clocks, but also draw more power.
The other type of licensees, those under ARM’s architecture license program, take a totally different approach: they use only the ARM instruction set, while building their own CPU core. The most prominent companies that do that are Qualcomm and… Apple. Apple used to operate under an ARM processor license all the way until the iPhone 4s, but decided to switch to an architecture license for the iPhone 5, and has been building its own CPU cores ever since.
The state of 64-bit
Looking over to the Android camp, we’re seeing that the platform lags behind by a full year or more. To date, in late 2014, the biggest Android vendors like Samsung, HTC, LG, and others are all releasing their flagships with 32-bit chips like the Snapdragon 805 and Snapdragon 801. Both of those chips are based on the now 3-year-old Krait core (with some tweaks, of course), and later on in this article you’ll be able to spot the difference in compute power. Naturally, using the 32-bit 805 means those flagships will not be able to benefit from the 64-bit ART optimizations in Android L.
The earliest this could (and likely would) change is in spring of 2015, when the first wave of Android flagships for next year is expected to arrive. Some (and hopefully most) of those devices are said to feature the Snapdragon 810, Qualcomm’s first top-level 64-bit SoC. In just over a year’s time, Qualcomm has overhauled its portfolio to consist of 64-bit chips on practically all levels, from the low end to the high end. However, the Snapdragon 810 does not ship with a custom Qualcomm core (such a core would likely take more time to develop) - instead, the company goes back to using an ARM processor license and equips the 810 with a big.LITTLE setup of four low-power Cortex A53 cores and four performance-driven Cortex A57 cores.
Given the long period of time it takes for the Android install base to switch to an ART-enabled version of the platform in meaningful numbers (let’s keep in mind that we don’t have a minimum target for ART, and chances are that it won’t be KitKat, but Android L), it is clear that Android is in a much less favorable position in terms of 64-bit-readiness.
Apple A8 die break-down
We’re not completely in the dark, though: in the past two release cycles, Apple has been disclosing the number of transistors in its chips. In the Apple A8 there’s now a whopping 2 billion of them, double the number in the A7. As far as we can tell, this is the most ever in a smartphone chip - in comparison, some estimates claim that the Snapdragon 805 chip features 700 million transistors.
From here on, the journey towards a better understanding of the Apple A8 starts with a teardown of the iPhone 6 and images of the A8 die from Chipworks. Those images give us a detailed breakdown of the Apple A8 die and the location of its various components.
Despite the doubling of transistor count, the die has grown smaller and comes in at 89mm² in the A8, down from 102mm² in the A7. Apple has switched the places of components on the die, and the CPU is now on the bottom left (it was on the bottom right), with a large block of L3 cache above it. Despite a 20% decrease in the size of the SRAM block (cells have shrunk by a third, from 0.12µm² to 0.08µm²), it’s likely that more advanced circuitry makes up for the difference and we’re still dealing with 4MB of L3 cache memory. At the time of this writing, we have seen the first benchmarks showing that memory latency has indeed improved by a hefty 20ns once accesses go out to the L2 cache and further.
The most drastic change in size, however, seems to be in the area taken up by the CPU: the new CPU measures 12.2mm², nearly 30% smaller than the 17.1mm² CPU block in the Apple A7. By all visible clues, the rest of the architecture remains the same: we have 64KB/64KB of L1 instruction/data cache (L1 is the fastest cache, located right next to the CPU cores), and a 1MB block of L2 cache shared between the cores.
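To make the size figures above easier to check, here is a minimal Python sketch that recomputes the percentage changes and a rough average transistor density. The areas are the ones quoted in the text; the A7 transistor count of 1 billion is inferred from the "double the number" statement, and the helper names are made up for illustration.

```python
# Figures quoted in the text: areas in mm², transistor counts as stated by Apple.
# The A7 count (1 billion) is inferred from the A8 having "double the number".
A7 = {"die_mm2": 102.0, "cpu_mm2": 17.1, "transistors": 1.0e9}
A8 = {"die_mm2": 89.0, "cpu_mm2": 12.2, "transistors": 2.0e9}


def percent_change(old: float, new: float) -> float:
    """Signed percentage change when going from old to new."""
    return (new - old) / old * 100.0


def density_millions_per_mm2(chip: dict) -> float:
    """Average transistor density over the whole die, in millions per mm²."""
    return chip["transistors"] / chip["die_mm2"] / 1e6


if __name__ == "__main__":
    print(f"Die area change: {percent_change(A7['die_mm2'], A8['die_mm2']):.1f}%")
    print(f"CPU area change: {percent_change(A7['cpu_mm2'], A8['cpu_mm2']):.1f}%")
    print(f"A7 density: {density_millions_per_mm2(A7):.1f}M per mm²")
    print(f"A8 density: {density_millions_per_mm2(A8):.1f}M per mm²")
```

With these inputs the whole die shrinks by roughly 13% and the CPU block by roughly 29%, while average density more than doubles. The density number is only a die-wide average and says nothing about how tightly any individual block is packed.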
Apple has provided a few important details about the CPU performance of its new A8: first, the company says the new CPU comes with a 25% performance improvement, and illustrates this with a chart showing generational improvement all the way since the 2G iPhone (the 25% number is derived by comparing the iPhone 5s’s 40x CPU performance advantage over the 2G iPhone with the 50x peak in the iPhone 6).
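The arithmetic behind that 25% figure is simple enough to spell out. The sketch below uses only the two multipliers from Apple's chart as quoted in the text; the constant and function names are hypothetical.

```python
# Multipliers over the original (2G) iPhone, as read from Apple's chart.
IPHONE_5S_MULTIPLIER = 40.0  # Apple A7
IPHONE_6_MULTIPLIER = 50.0   # Apple A8


def generational_gain(previous: float, current: float) -> float:
    """Fractional improvement of the current generation over the previous one."""
    return current / previous - 1.0


if __name__ == "__main__":
    gain = generational_gain(IPHONE_5S_MULTIPLIER, IPHONE_6_MULTIPLIER)
    print(f"A8 over A7: {gain:.0%}")
```

Since both multipliers are relative to the same baseline, the baseline cancels out and only their ratio matters.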
On clock speeds and deceptive marketing
With a modest boost in CPU clock speeds from 1.3GHz to 1.4GHz (an 8% speed-up), the 25% improvement obviously comes from various other tweaks and tricks. Before diving deeper into benchmarks, though, this is the place for a quick aside about clock speeds and the state of the industry. Commentators in forums are quick to point out the apparent inferiority of Apple's clock speeds in comparison to the much higher speeds declared for rival Snapdragon and Exynos chips, for instance. The most up-to-date example is the Snapdragon 805 with a declared clock speed of ‘up to 2.7GHz’. At first sight, Apple’s Cyclone core looks like a sore loser with its declared speed of roughly half that, 1.4GHz.
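To see how much of the claimed 25% is left once the clock bump is accounted for, one can divide the overall gain by the clock ratio. This is a rough sketch that assumes performance scales linearly with frequency, which is a simplification; the names are made up for illustration.

```python
A7_CLOCK_GHZ = 1.3
A8_CLOCK_GHZ = 1.4
CLAIMED_SPEEDUP = 1.25  # Apple's claimed 25% improvement, as a ratio


def per_clock_speedup(total_speedup: float, old_clock: float, new_clock: float) -> float:
    """Speedup left over after removing the part explained by a higher clock.

    Assumes performance scales linearly with clock frequency.
    """
    return total_speedup / (new_clock / old_clock)


if __name__ == "__main__":
    clock_gain = A8_CLOCK_GHZ / A7_CLOCK_GHZ - 1.0
    residual = per_clock_speedup(CLAIMED_SPEEDUP, A7_CLOCK_GHZ, A8_CLOCK_GHZ) - 1.0
    print(f"Clock gain: {clock_gain:.1%}")
    print(f"Implied per-clock gain: {residual:.1%}")
```

Under that assumption, the clock increase explains a little under 8 points, leaving roughly 16% to come from improvements other than frequency.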
Most people would call it a day at this point - the Snapdragon outperforms the A8 hugely, case closed. This, however, would be naïve: running real-world applications and games shows instantly that the 2.7GHz speed can only be sustained for very short periods of time, and after those short bursts the chip quickly throttles back to a much more sane ~1.3GHz. Put simply, the 2.7GHz number that you read about is not the nominal frequency, but a maxed-out turbo speed that is not sustainable for the long term. In fact, Apple is being much more truthful, as it declares actual nominal (and not turbo) speeds for its chip. On top of that, the company goes on to disclose a second big thing about its chip: sustained performance times. Apple actually claims its A8 is capable of running flat out at its nominal speed for (at least) 20 minutes.
This is the right place to note that ARM, the company that licenses the architecture behind both the Snapdragon and the Apple A8 CPU cores, has actually claimed that the current generation of its processors works best in terms of thermal output/performance at around 1.2GHz. Going above that has big consequences - AnandTech has earlier shared estimates that going above the 1.5GHz threshold by just 100MHz brings a shocking, quadratic increase in voltage and power consumed by the chip.
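Why a small clock bump can cost so much power is easiest to see with the textbook approximation for dynamic CMOS power, P ≈ C·V²·f: power grows linearly with frequency but with the square of voltage, and higher clocks usually need higher voltage. The sketch below applies that approximation. The 10% voltage increase is a purely hypothetical value chosen for illustration, not a measurement of any chip mentioned here.

```python
def relative_dynamic_power(freq_ratio: float, voltage_ratio: float) -> float:
    """Relative dynamic power under P ~ C * V^2 * f, with capacitance C held constant."""
    return voltage_ratio ** 2 * freq_ratio


if __name__ == "__main__":
    # Going from 1.5GHz to 1.6GHz, the 100MHz step mentioned in the text.
    freq_ratio = 1.6 / 1.5
    # HYPOTHETICAL: assume the higher clock needs 10% more voltage.
    hypothetical_voltage_ratio = 1.10

    power_ratio = relative_dynamic_power(freq_ratio, hypothetical_voltage_ratio)
    print(f"Frequency gain: {freq_ratio - 1.0:.1%}")
    print(f"Power increase: {power_ratio - 1.0:.1%}")
```

With these illustrative inputs, a frequency gain of under 7% costs close to 30% more dynamic power, which is the kind of trade-off that makes sustained high clocks hard to maintain in a phone.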