AMD’s Ryzen 3000 CPUs have been in the market for more than two years, offering as much as twice the performance of competing Intel CPUs in some workloads, at very competitive prices. This has resulted in the red CPUs taking the lead over big blue in many markets, such as Japan, Korea, Germany and the Netherlands. The newly announced Epyc Rome processors are also expected to give Intel a tough time in the server space. However, if you look at AMD’s performance over the past decade, it has been dismal, to say the least. Ever since Intel’s Core architecture landed, AMD has essentially been playing catch-up.
How did AMD go from “manufacturer of cheap, budget CPUs” to being
For starters, let’s clear up a few things. Zen is good, but it wouldn’t have been such a step up for AMD if the older Bulldozer design hadn’t been so inferior, with major design flaws. After the third-gen K10 architecture, team red, short on funds, decided to invest in a narrow, low-IPC, high-clock design. And they ended up paying dearly for it.
You needed high core counts and higher operating clocks to make this work, which isn’t something AMD was able to pull off, and the rest is history. To give a clearer picture of how disadvantaged the Bulldozer chips were compared to their predecessors, here’s an example: to offer performance in line with the older Phenom II processors, Bulldozer needed a 50% higher operating speed on average. This, of course, didn’t happen; the result was power-hungry CPUs that ran hot and unstable.
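To make the clock-speed figure concrete, here is a rough sketch that treats single-thread performance as IPC multiplied by clock speed. That is a simplification that ignores memory, core counts and boost behavior, and the 3.0 GHz reference clock is a hypothetical value chosen only for illustration, not a measured Phenom II spec.

```python
# Toy model: performance ~ IPC x clock frequency.
# This ignores memory latency, core count, boost behavior, and so on.

def implied_relative_ipc(clock_uplift):
    """If a chip needs `clock_uplift` times the clock to match a
    reference chip, its IPC relative to that chip is 1 / clock_uplift."""
    return 1.0 / clock_uplift

def required_clock(reference_clock_ghz, relative_ipc):
    """Clock needed to match the reference chip at the given relative IPC."""
    return reference_clock_ghz / relative_ipc

# From the text: Bulldozer needed roughly 50% higher clocks than Phenom II.
bulldozer_ipc = implied_relative_ipc(1.5)

# Hypothetical reference clock, for illustration only.
phenom_clock_ghz = 3.0
needed_ghz = required_clock(phenom_clock_ghz, bulldozer_ipc)

print(f"Bulldozer IPC relative to Phenom II: {bulldozer_ipc:.2f}")
print(f"Clock needed to match a {phenom_clock_ghz} GHz reference: {needed_ghz:.1f} GHz")
```

Under this model, a chip that needs 50% more clock for the same result is doing only about two-thirds of the work per cycle.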
Have a look at the above IPC chart. Instead of going up, the
This finally resulted in the brand new Zen architecture that ditched all the bottlenecks of Bulldozer and its successors, and here we are, with the 3rd Gen Ryzen lineup leveraging the Zen 2 architecture based on the 7nm node.
Bulldozer Core vs Zen Core
Let’s put the two architectures side-by-side and see how Zen differs from Bulldozer. Notice how the Excavator design (the last of the Bulldozer family) essentially turns one core into two by throwing in an extra integer scheduler and decoder. This may sound like a good idea on paper, but it didn’t quite pan out. Yes, it made the design simpler and cheaper, but the IPC went down the drain. Because of this shared-logic design, the resources available to each core were severely compromised.
An integer cluster basically counted as a “core” in Bulldozer and shared the same L2 cache and FP scheduler with the adjacent core. This prevented the decoders from working efficiently: instead of increasing instruction-level parallelism, the design effectively crippled them by keeping one of them idle most of the time. With two decoders stuffed side by side, the bottleneck became the lone FP scheduler, which had to juggle the workload of the two so-called cores, leading to especially poor floating
Have a look at Zen. With Zen 2, each core has two 256-bit FMACs under the FP scheduler; that’s four times as much as a Bulldozer core. The pipeline is also wider, with a dedicated L2 cache per core, boosting single
Bulldozer’s approach is actually a form of multithreading. Where Intel’s CPUs run two threads on a single, large ALU cluster, AMD had two separate partitions. This is called Clustered Multi-Threading (CMT), while Intel leverages Simultaneous Multi-Threading (SMT, marketed as Hyper-Threading), something the Zen and Zen 2 designs also use. So there you have it. What do you think of AMD’s remarkable recovery in the past 4-5 years?
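As a rough illustration of why sharing hurts, the sketch below assumes that a shared block’s throughput is split evenly between the cores using it at any given moment. The even split and the normalized throughput of 1.0 are assumptions for illustration, not a description of AMD’s actual scheduling logic.

```python
# Toy model (an assumption, not AMD's real arbitration logic):
# a shared unit's throughput is divided evenly among the cores using it.

def per_core_share(unit_throughput, cores_contending):
    """Throughput each core sees from a unit shared by `cores_contending` cores."""
    if cores_contending < 1:
        raise ValueError("at least one core must be using the unit")
    return unit_throughput / cores_contending

# Normalized throughput of one shared block inside a two-core module.
shared_unit = 1.0

one_busy = per_core_share(shared_unit, 1)   # the other core leaves the unit alone
both_busy = per_core_share(shared_unit, 2)  # both cores need the unit at once

print(f"One core using the shared unit:   {one_busy:.2f} per core")
print(f"Both cores using the shared unit: {both_busy:.2f} per core")
```

In this model each core gets the full unit only while its neighbor is doing something else; once both need it, each is left with half.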