2019 has been the most exciting year for consumer CPUs in the better part of a decade. AMD’s Ryzen architecture has finally come into its own. The 7nm Zen 2 core radically improves on Zen+, offering the high clockspeeds and IPC to take Intel down, not just in multithreaded productivity workloads, but in mainstream gaming as well. Meanwhile, Intel hasn’t exactly been standing still. Team Blue, who’ve effectively been warming over Skylake leftovers for the past half-decade, have taken some of their most radical price-performance steps in years.
Their response to Ryzen 3000 has been to shift the entire Core i line up two performance tiers: the 10th-gen Core i3 will offer the same 4 cores and 8 threads that the Kaby Lake 7700K delivered just two years ago. Yes, these are exciting times. Our best PC build features keep getting updated every time AMD or Intel outdoes the other with even better value. But it’s not just price-performance that’s interesting. What’s fascinating is that the two companies have taken very different technical approaches to their products. Ryzen and Intel Core i both offer high performance and (at times) great value, but they get there in very different ways. Let’s find out just how.
Intel vs AMD: Monolithic vs Chiplet
What’s better, a single, gigantic apple or a whole bushel of oranges? This is the analogy to consider when thinking about AMD’s and Intel’s approaches to large, multi-core designs. Intel’s been building CPUs on a monolithic die for quite some time. Well, at least since the infamous Pentium D, which observers in 2005 ridiculed as two processors stuck together with glue.
The term “monolithic” should tell you all you need to know. Every Intel CPU is just that: a chip. Singular. Each design is distinct: you have a design for your dual cores, your quad cores, your hexacores, all the way up to your 28-core enterprise Xeons. Everything, from the cache to the processor cores themselves, shares the same piece of silicon. A high-speed interface called the Ring Interconnect connects cores to resources like L3 and graphics at very low latency. Because everything’s on a single die, the design is inherently simpler. There is a big drawback, though: cost.
We need to talk about silicon yields for a moment. As in any industry, chip manufacturing isn’t a perfect process. When foundries are dealing with new process nodes, yields for a given chip can be as low as 40 percent. This means that more than half of the chips being produced are defective. Even on a mature process node, you’ll likely have double-digit defect rates. Cost per unit obviously increases as yield goes down, since you have to offset the burden of the defective units. The problem with a monolithic die is that larger core counts can cause cost to increase exponentially. There’s a reason Intel’s high core-count parts have cost so much, and it’s not just monopolistic pricing. When you build a large monolithic die, every core has to be functional.
A massive 28-core Xeon die costs much more to fabricate than a quad-core i3, even when it’s fully functional. But the problem with a monolithic design is that if even a single core is defective, you have to throw the whole thing out. When yields are below 90 percent, it is almost certain that at least one of the 28 cores will be defective. This means Intel effectively fabricates 2, 3 or even more Xeon chips before getting a single working one that can be sold. Not only is it cheaper to build a chip with fewer cores, but statistically you’re also less likely to have to throw out any given unit.
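To put rough numbers on this, here is a minimal sketch of the yield argument. It assumes each core fails independently with the same probability, which is a simplification, and the per-core yield figures are hypothetical rather than foundry data.

```python
def die_yield(per_core_yield: float, cores: int) -> float:
    """Probability that every core on a die is defect-free.

    Assumes cores fail independently with the same probability,
    which real defect distributions only approximate.
    """
    return per_core_yield ** cores


def dies_per_good_chip(per_core_yield: float, cores: int) -> float:
    """Expected number of dies fabricated per fully working die."""
    return 1.0 / die_yield(per_core_yield, cores)


if __name__ == "__main__":
    # Hypothetical per-core yields, chosen only for illustration.
    for per_core in (0.90, 0.97, 0.99):
        for cores in (4, 28):
            print(
                f"per-core yield {per_core:.2f}, {cores:2d} cores: "
                f"{die_yield(per_core, cores):.1%} fully working, "
                f"{dies_per_good_chip(per_core, cores):.1f} dies per good chip"
            )
```

Under this toy model, a 97 percent per-core yield works out to between two and three 28-core dies fabricated per fully working one, while a 90 percent per-core yield pushes that to nearly twenty. The quad-core die fares far better in both cases.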
This means that Intel’s margins on smaller chips, such as their quad-core i7s, are far higher than those on their higher core-count parts. This is the main reason why we’ve made do with quad-core designs in the PC space for such a long time: AMD wasn’t a worthwhile contender until Ryzen came out, and it was just cheaper for Intel to deliver fewer cores. The limitations here are clear, though: as manufacturing costs escalate exponentially, it’s not possible for Intel to offer more cores at a competitive price without taking a hit to their bottom line. AMD, on the other hand, has a fundamentally different approach that allows them to do exactly that. Let’s find out how:
AMD Zen CPU Architecture Explained: Chips and Chiplets
Remember what we said about Pentium D being two CPUs glued together? AMD takes that idea and runs with it, to great effect. A Ryzen CPU is made up of multiple CCXs (core complexes). Each core complex is, effectively, its own standalone quad-core, eight-thread CPU, with its own shared L3 cache. Take the Ryzen 9 3900X, for instance: it has 64 MB of L3, split into 16 MB for each CCX. A pair of CCXs is connected via Infinity Fabric to form a CCD (core complex die). These CCDs also talk to each other via Infinity Fabric. Up to 8 dies (each with up to 8 cores) are combined with a separate 14nm die for I/O onto what’s called an MCM, or multi-chip module. The magic sauce here is Infinity Fabric.
This is a high-speed interconnect that allows cores on multiple discrete dies to talk to each other. While it’s efficient, the Infinity Fabric’s biggest enemy is physics: there is some physical distance to be covered between dies that just doesn’t exist in a monolithic architecture. This means that a latency penalty is unavoidable. Earlier Ryzen models suffered from this. Because Infinity Fabric speeds in first and second-gen Ryzen were directly tied to RAM speed, increasing DDR4 speeds as high as they would go actually resulted in a noticeable boost to CPU performance, as the faster Infinity Fabric let different chiplets talk to each other faster.
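As a rough illustration of why faster RAM helped, the sketch below assumes the commonly described 1:1 coupling between the fabric clock and the memory clock on those early parts. DDR memory transfers data twice per clock, so the memory clock is half the rated speed. The function name is made up for this example, and the coupling ratio is an assumption rather than something stated above.

```python
def fabric_clock_mhz(ddr4_rate_mts: int) -> float:
    """Fabric clock in MHz, assuming it runs 1:1 with the memory clock.

    DDR4 is double data rate, so the memory clock is half the rated
    transfer speed (for example, DDR4-3200 runs a 1600 MHz clock).
    """
    return ddr4_rate_mts / 2


if __name__ == "__main__":
    # Common DDR4 speed grades, used here only as sample inputs.
    for rate in (2133, 2666, 3200, 3600):
        print(f"DDR4-{rate}: fabric at about {fabric_clock_mhz(rate):.0f} MHz")
```

Under that assumption, moving from DDR4-2133 to DDR4-3200 raises the fabric clock by about half, which is why memory tuning mattered so much on those chips.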
With Ryzen 3000, AMD has adopted a different approach: vastly increasing the L3 cache allocation. The 3900X has a whopping 64 MB of L3 cache. To put this into perspective, the 9700K has a mere 12 MB of L3. AMD’s older Bulldozer architecture featured 8 MB of L3 at the most for top-end designs. The tremendous amount of cache (which AMD has cheekily branded “GameCache”) means that CPU cores can intelligently buffer data to offset the Infinity Fabric’s latency penalty. The results are clear to see with Ryzen 3000: there’s a huge uplift to IPC relative to earlier designs, and performance per clock is nearly on par with Intel’s Coffee Lake.
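The 3900X layout described above (CCXs paired into CCDs, 16 MB of L3 per CCX) can be modeled in a few lines. This is only a sketch: the class names are invented for illustration, and the figure of three active cores per CCX is inferred from 12 cores spread across four CCXs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CCX:
    """One core complex: a group of cores sharing a slice of L3."""
    active_cores: int
    l3_mb: int
    threads_per_core: int = 2  # simultaneous multithreading


@dataclass(frozen=True)
class CCD:
    """One core complex die: a pair of CCXs on the same piece of silicon."""
    ccxs: Tuple[CCX, ...]


@dataclass(frozen=True)
class Package:
    """A multi-chip module: one or more CCDs (the I/O die is not modeled)."""
    name: str
    ccds: Tuple[CCD, ...]

    def all_ccxs(self) -> List[CCX]:
        return [ccx for ccd in self.ccds for ccx in ccd.ccxs]

    def cores(self) -> int:
        return sum(ccx.active_cores for ccx in self.all_ccxs())

    def threads(self) -> int:
        return sum(ccx.active_cores * ccx.threads_per_core for ccx in self.all_ccxs())

    def l3_mb(self) -> int:
        return sum(ccx.l3_mb for ccx in self.all_ccxs())


# Illustrative 3900X layout: 2 CCDs x 2 CCXs, 3 of 4 cores active per CCX.
ccx = CCX(active_cores=3, l3_mb=16)
ccd = CCD(ccxs=(ccx, ccx))
ryzen_9_3900x = Package(name="Ryzen 9 3900X", ccds=(ccd, ccd))

if __name__ == "__main__":
    # Prints: Ryzen 9 3900X 12 24 64
    print(ryzen_9_3900x.name, ryzen_9_3900x.cores(),
          ryzen_9_3900x.threads(), ryzen_9_3900x.l3_mb())
```

The totals come out to 12 cores, 24 threads and 64 MB of L3, matching the figures quoted for the chip.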
The biggest advantage to a chiplet-based design is that costs scale linearly, not exponentially. AMD doesn’t need every core to work perfectly. As a matter of fact, almost every Ryzen processor sold has some defective cores that have been disabled. The 3900X, for instance, has 12 functional cores, with one core per CCX disabled. This does mean that the minimum cost to fab smaller Ryzen SKUs can be high, since a single chiplet has 8 cores. This partly explains why the Ryzen 3000 series starts with the 6-core 3500. The real benefit to the chiplet design is in the server space.
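Before turning to servers, the linear-versus-exponential claim can be sketched with a toy cost model. It assumes independent core failures, a made-up 97 percent per-core yield and a unit cost per core, and it ignores the I/O die, packaging, and the salvaging of partially defective dies, so treat the output as a shape rather than a price list.

```python
def good_die_cost(cores: int, cost_per_core: float, per_core_yield: float) -> float:
    """Cost of one fully working die: raw silicon cost divided by the
    probability that every core on it works."""
    return cores * cost_per_core / per_core_yield ** cores


def monolithic_cost(total_cores: int, cost_per_core: float = 1.0,
                    per_core_yield: float = 0.97) -> float:
    """All cores on one die, so a single bad core scraps the whole die."""
    return good_die_cost(total_cores, cost_per_core, per_core_yield)


def chiplet_cost(total_cores: int, cores_per_chiplet: int = 8,
                 cost_per_core: float = 1.0, per_core_yield: float = 0.97) -> float:
    """Cores spread over small dies, so a bad core scraps only one chiplet."""
    chiplets = -(-total_cores // cores_per_chiplet)  # ceiling division
    return chiplets * good_die_cost(cores_per_chiplet, cost_per_core, per_core_yield)


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        print(f"{n:2d} cores: monolithic {monolithic_cost(n):7.1f}, "
              f"chiplet {chiplet_cost(n):6.1f}")
```

In this model, doubling the core count exactly doubles the chiplet cost, while the monolithic cost more than doubles each time because the yield term compounds with every added core.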
AMD’s EPYC Rome and Milan designs have single-handedly enabled the company to capture 10 percent of the server market in just two years. The reason is simple: chiplets allow AMD to offer more cores (and therefore significantly more multithreaded performance) at a lower cost. They also allow AMD to build designs at volume that are larger than what’s feasible for Intel. For instance, Intel technically has a 56-core SKU, the Xeon Platinum 9200. But for the reasons outlined above, this particular model costs in excess of $25,000. Yeah, you read that right. In contrast, the 64-core EPYC Rome 7742 costs $6,950. That’s the power of AMD’s chiplet approach.
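A quick back-of-the-envelope calculation makes the gap concrete. It uses only the prices and core counts quoted above, treating $25,000 as a floor for the Xeon since its price is given as "in excess of" that figure.

```python
def price_per_core(price_usd: float, cores: int) -> float:
    """Price divided by core count, in US dollars."""
    return price_usd / cores


# Figures quoted in the article; the Xeon price is a lower bound.
xeon_9200 = price_per_core(25_000, 56)
epyc_7742 = price_per_core(6_950, 64)

if __name__ == "__main__":
    print(f"Xeon Platinum 9200: at least ${xeon_9200:,.2f} per core")
    print(f"EPYC 7742:          ${epyc_7742:,.2f} per core")
    print(f"Ratio:              at least {xeon_9200 / epyc_7742:.1f}x")
```

On these numbers the Xeon costs at least four times as much per core as the EPYC part.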
Which is Better: Intel or AMD?
The big question, then, is how Intel plans to compete with AMD if its dies cost more to make, especially since reports indicate that 10th-gen Comet Lake parts will cost half as much as Coffee Lake Refresh for an equivalent number of cores. The answer is simple: Intel will have to sell at very thin margins or at a loss, and eventually transition away from monolithic fabrication. They’ve already taken steps in the right direction. The Foveros design featured in the Lakefield processors is a clear example of an MCM chip from Intel.
Considering that their cash reserves are several times larger than AMD’s net worth, this is a viable short-term (or medium-term) strategy for Intel. It will hurt their bottom line, though, and shareholders might not be happy with it over the next couple of years. You know who will be happy? Anyone in the market for a CPU. This is the best time to buy one, manufacturing methods be damned.