Ten years ago, if you were to buy the best CPU for gaming or otherwise, you'd have chosen AMD's Athlon 64. My, how times have changed. While AMD has struggled to rekindle its glory days as the CPU-performance leader, Intel's CPUs have gone from strength to strength over the past decade. Today, Intel's CPUs perform best, and use the least amount of power, scaling admirably from powerhouse gaming PCs all the way down to thin and light notebooks and tablets--segments that didn't even exist a decade ago. But this return to CPU dominance might never have happened had it not been for the innovations taking place at AMD back in the early half of the 2000s, which makes the company's fall from grace all the more galling.
The 64-bit extensions of AMD's Athlon 64 meant it could run 64-bit operating systems, which could address more than 4GB of RAM, while still being able to run 32-bit games and applications at full speed--all important considerations for PC players at the time. These extensions proved so successful that Intel eventually ended up licensing them for its own compatible x86-64 implementation. Two years after the launch of the Athlon 64, AMD introduced the Athlon 64 X2, the first consumer multicore processor. Its impact on today's CPUs cannot be overstated: everything from huge gaming rigs to tiny mobile phones now uses CPUs with two or more cores. It's a change that even Intel's Gaming Ecosystem Director, Randy Stude, cited when I asked him what had the biggest impact on CPU design over the last decade.
This focus on cores has dominated the last decade of CPU development. Prior to the introduction of multicore CPUs, the focus was very much on increasing clock speeds. This gave games and applications an instant performance boost, with very little effort required from developers to take advantage of it. Moore's Law--which states that the number of transistors in a dense integrated circuit doubles around every two years--was in full swing in the 90s and early 2000s. In the period from 1994 to 1998, CPU clock speeds rose by a massive 300 per cent. However, by the mid 2000s, clock speed improvements had collapsed and power consumption had soared, with both Intel and AMD fighting the laws of physics. The solution was to add more cores, so that multiple tasks could be executed simultaneously on separate cores, thus increasing performance.
The trouble is, unlike increasing clock speed, increasing the number of cores requires developers to change the way their code is written in order to see a performance increase. And, in the case of games development, that's been a slow process.
Games like Battlefield 4 that make use of multiple CPU cores are still the exception, rather than the rule.

"[Multicore CPUs] have required that the software industry come along with us and understand the notion of threading," says Stude. "For gaming, it's been challenging. Threading on gaming is a much more difficult scenario, one that both we and AMD have experienced. In general, you've got one massive workload thread for everything, and up until now that's been handled by, let's say, the zero core. The rest of the workload, whatever it might be for a particular game, goes off to the other cores. Today, game engine success is a bit hit and miss. You have some games, the typical console games that come over, that don't really push performance at all, and aren't threaded or are only lightly threaded."
"[Multicore CPUs] have required that the software industry come along with us and understand the notion of threading. For gaming, it's been challenging." - Intel
"The nature of development work for those platforms, especially in the early years, is that you'd get your game running and publish it and you'd rely heavily on the game engines that you as a publisher own, or that you acquire from third parties like Crytek and Epic," Stude continued. "If Epic and its Unreal engine on console don't have a threaded graphics pipeline--which to date they don't--then you're looking at the same issue that you see on the PC, which is a heavily emphasised single-core performance workload, and then everything else that happens like physics and AI happens on the other cores. It's not a completely balanced scenario, because by far the biggest workload is that render pipeline."
The problem has been more pronounced for AMD. Its Bulldozer CPU architecture (on which all of its recent processors are based, in modified form) tried both to ramp up clock speeds by lengthening the CPU's pipeline, which increased latency (an approach not too dissimilar to Intel's disastrous Prescott Pentium 4), and to increase the number of cores by sharing resources like the scheduler and floating point unit, rather than duplicating them as in a standard multicore CPU. Unfortunately for AMD, Bulldozer's high power consumption meant that clock speeds were limited, leaving the CPU dependent on software that made use of those multiple cores to reach acceptable performance. I asked Richard Huddy, AMD's Gaming Scientist and a former Intel and Nvidia employee, whether chasing more cores was the right decision. After all, to this day, Intel's Core series of CPUs consistently outperforms AMD's.
"So if you talk to games programmers--there are other markets as well--they have typically found it easy to share their work over two or four cores," says AMD's Huddy. "People have changed the way they program for multicore over the last five years to cope with six to eight cores. They understand this number is the kind of thing they need to target. It's actually genuinely difficult to build work-maps of the kind of tasks you have with games to run efficiently on something with 32 cores or more."
AMD's Richard Huddy had a hand in creating DirectX, as well as stints at ATI, Intel, and Nvidia.

"The more cores you have, the harder it gets, so there is a practical limit," continued Huddy. "If we produced 1000-core CPUs then people would find it very hard to drive those efficiently. You'll end up with a lot of idle cores at times and it's difficult. From a programmer's point of view it's super-easy to drive one core. So yeah, if we could produce a 100 GHz single-core processor, we'd have a fantastic machine on our hands. But it's mighty difficult to clock up silicon that fast, as we're up against physical laws here, which make it very difficult. There's only so much you can do that ignores the real world, and in the end you need to help programmers understand the kind of constraints they're building to."
"I'd love for us to build a single-core CPU. Truth is, if you built a single-core CPU, that just took all of the power of the CPU and scaled up in the right kind of way, then no programmer would find it difficult to program, but we have to deal with the real world."
"For the last decade--which is a strong portion of our existence, the dominant decade in terms of our revenues and unit sales--we were told Moore's law was dead and that the physics wouldn't allow us to continue to make those advances, and we've proven everyone wrong," says Intel's Stude. "I'm a futurist as a hobby, and I've learned a lot being at Intel. The day I started we had introduced the Pentium and even then the conversation was about what was possible from a die shrink perspective. I'm not ever going to believe in my mind that the pace of innovation will outstrip the human brain."
"I just don't subscribe to the concept that there isn't a better way. I think the evidence of the last 50 years would argue that we've got a long way to go on silicon engineering. What we think is possible may completely be eclipsed tomorrow if we find a new element or a new process that would just flip everything on its head. I'm not going to play the Moore's Law is dead game, because I don't think it will be dead. Maybe the timeline slows down, but I just can't subscribe to it dying based on what I've seen in my time at Intel."
Intel's "tick-tock" strategy has helped the company stick to Moore's Law, but just how long can it last?

AMD's Huddy shares a similar viewpoint: "Moore's Law looks alive and well, doesn't it? It's always five years from dying. For all practical purposes, I expect us to live on something very much like Moore's Law up until 2020. Our biggest problem is feeding the beast, it's getting memory bandwidth into these designs. I want the manufacturers of DRAM to just keep up with us, and give us not only the higher density--and they do a spectacular job of giving us more memory--but also make that memory work faster. That's a real problem, and if we could just get a lot of super fast memory and not pay the price of that wretched real world physics that gets in the way all the time. I blame them, it's all down to DRAM!"
But when it comes to integrated graphics, AMD is far and away the performance leader. The company's purchase of ATI in 2006--despite some integration issues at the time--has given it quite the performance lead; AMD's APU range of CPUs with built-in Radeon graphics is the best choice for building a small gaming PC without a discrete GPU. It might be just a small win for the company on the CPU side, but it's one that has had a significant impact on the company's focus.
"We took a decision 18 months ago to focus heavily on graphics IP," says Darren Grasby, AMD's VP of EMEA. "Driving the APU, first with Llano, and fast forward to where we are today with Kaveri. Kaveri is the most complex APU ever built, and if you look at the graphics performance within that, you're not going to get the high-end gamers with that. But if you look at mainstream and even performance gaming, an A10 Kaveri is your product to get in there. And you don't have to go and spend $1,500 or $2,000 on a very high-spec gaming rig that, quite frankly, a mainstream or performance gamer isn't going to be using to its full capability."
"If you think about it from a gaming aspect, what are gamers looking for? They're looking for the compute power from the graphics card. The CPU almost becomes secondary to it in my mind." - AMD
"So you're right on the ‘halo effect’ on the CPU side," continued Grasby. "Obviously we can't talk about forward-looking roadmaps, but it's leaning into where the graphics IP is, and where that broader market is, and where the real revenue opportunities sit within that. That's why, if you look at Kaveri, if you look at the mass market and gaming market you're getting right up there. Then you start to get into 295 X2, and then you're talking about where the gamers are. If you think about it from a gaming aspect, what are gamers looking for? They're looking for the compute power from the graphics card. The CPU almost becomes secondary to it in my mind."
AMD, meanwhile, took a different path and signed an ARM license to begin developing its own ARM processors. The question is--with the vast majority of the company's experience being in x86 architecture--why?
Phones and tablets like the Nvidia Shield mostly make use of ARM processors, rather than the traditional x86-based designs that AMD and Intel produce.

"Did you see Intel's earning results yesterday? [Note: this interview took place on July 17, 2014] Just go and have a look at the losses in their mobile division," says AMD's Grasby. "I would suggest at some stage their shareholders are going to have a challenge around it. I can't remember the exact number, it's on public record, but I think it was 1.1 billion dollars they lost on 80 million dollars of turnover. Our clients suggest that isn't the best strategy. I encourage them to keep doing it, because if they keep losing that amount of money, it's definitely not good...the primary reason why we signed the ARM license was because two years ago we bought a company called SeaMicro. We were basically after its Freedom Fabric [storage servers], and that's why we signed the ARM licence, to go after that dense, low-power server opportunity that's out there. It's a huge opportunity."
"As soon as we got the ARM 64-bit license, other opportunities opened up on the side. Think embedded, for example. Embedded from an AMD perspective had always been an x86 play. Just to give you an idea, ARM and x86 together are a nine to ten billion dollar business. Take ARM out of that and it comes to around four to five billion dollars. It's to exercise the opportunity."
Despite a decline in recent years, overclocking is still alive and well.

"The overclocker market certainly is relevant," says Intel's Stude. "Every time we come out with a part there's a fraction of a fraction of people that are the utmost enthusiasts. They care about every last aspect of that processor and they want to push it to the limits. They are tinkerers, they don't mind buying a handful of processors to blow 'em up just to see what they can do, and to make their own living, be it working in Taiwan for the ODMs who make motherboards, or be it in other capacities in the media to submit their opinions on Intel's top end parts."
"We love the boutique nature of it," continued Stude, "because the people in that seat typically have very interesting compute perspectives that influence the decisions that others make. So, if you're very overclockable, you have a very influential position...so we do the best we can to feed this community our best story and we'll continue to do that."
While there's no doubt AMD CPUs offer excellent value for money (we used one to great effect in our budget PC build), they still lag behind Intel when it comes to outright performance and performance per watt; to stay in the PC market, AMD has a much tougher job ahead of it than its rival.
"From an engineering perspective, performance per watt becomes the limiting factor in a lot of situations, so there's no doubt that we need to do a better job," says AMD's Huddy. "It's very clear that Intel and Nvidia, and everyone that competes in the silicon market, has to be more aware of this. If you go back 10, and in particular 20, years ago, performance per watt wasn't a big issue, but it increasingly is, and we aim to do better. I have absolutely no doubt about that. There's a lot of attention being paid to that. There are limits over how much we control our own destiny, but particularly for us, where we use companies such as TSMC as others do, those companies work with the same constraints as us and we should be able to just match them."
"From an engineering perspective, performance per watt becomes the limiting factor in a lot of situations so there's no doubt that we need to do a better job." - AMD
"It's very clear people have seen there's an artificial limitation that really needs to be fixed, and it's not just about giving you more gigahertz on your CPU," says AMD's Huddy. "We can be extremely proud of Mantle, getting the CPU out of the way when there was an artificial bottleneck. There's no doubt that people will use the extra CPU horsepower for good stuff, and we're seeing that in the demos that we're already able to show. However, let's not get hung up on gigahertz, sometimes it's smarts that get you there, and if you're looking for the fastest throughput API on the planet, then you'd have to say it is Mantle, and you'd have to say 'okay, now I get why AMD is leading the way', don't just count the CPU gigahertz, but look at the technology innovation that we're coming up with."
"Amusingly, and I don't know how relevant it is, you can make your own decision on that, for me it's entertaining: one of the companies that approached us [about Mantle] was Intel, and we said to Intel, 'You know what, can you give us some time, to fully stabilise this because this has to be future proof, but we'll publish the API spec before the end of the year.’ And if Intel want to do their own Mantle driver and want to contribute to that they can build their own. We're trying to build a better future."
For more on AMD's Mantle, and why the company thinks Nvidia is doing "something exceedingly worrisome" with its GameWorks technology, check back later in the week for our look at the developing war between PC graphics' most prolific companies.