
Intel Meteor Lake Technical Deep Dive


Introduction


Intel provided us with a close look at its upcoming "Meteor Lake" microarchitecture, which powers the next generation of Intel Core processors for PCs. "Meteor Lake" is the very first processor to fully realize Intel's IDM 2.0 strategy: redesign future processors so that the company's latest in-house foundry capacity is reserved for the specific components that need it the most, carefully disaggregate the remaining components onto slightly older foundry nodes, some even from external foundries, and build an integrated device that's greater than the sum of its parts.

Intel attaches great importance to the success of "Meteor Lake," not just because it is the company's first disaggregated chiplet-based processor, but also because it powers Intel's biggest push yet for consumer AI hardware acceleration, with Intel AI Boost. With this, Intel hopes to mainstream AI in the client space and get the PC ecosystem to embrace on-device accelerated AI for consumer software, spanning everything from limited-function apps to complex productivity suites. "Meteor Lake" is hardly the first client device to accelerate AI, as GPUs with AI acceleration hardware have been around for close to five years now, but Intel still holds the reins to the PC ecosystem with an over 80% share of PC processor shipments, and a majority of PCs don't use discrete GPUs. With Intel taking the lead in consumer AI, software vendors finally have the confidence to invest big in new AI-accelerated features, knowing there is a sizable user base.
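Although Intel didn't go into the software stack in this briefing, the NPU is exposed to applications through runtimes such as Intel's OpenVINO. Below is a minimal sketch, assuming OpenVINO's Python API and an available "NPU" device plugin; the model file name and dummy input are hypothetical and only illustrate the dispatch.

    import numpy as np
    import openvino as ov

    core = ov.Core()
    print(core.available_devices)  # e.g. ['CPU', 'GPU', 'NPU'] on a "Meteor Lake" notebook

    model = core.read_model("image_classifier.xml")          # hypothetical OpenVINO IR model
    compiled = core.compile_model(model, device_name="NPU")   # target the AI Boost NPU

    # Run a single inference on dummy data shaped like the model's first input
    dummy = np.random.rand(*compiled.input(0).shape).astype(np.float32)
    result = compiled([dummy])[compiled.output(0)]
    print(result.shape)

The usual pattern is to fall back to "GPU" or "CPU" when an "NPU" device isn't present, which is exactly the kind of heterogeneous, on-device dispatch Intel hopes software vendors will embrace.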



2023 is a very different point in time from 2008. Back then, Intel had just perfected its 45 nm HKMG foundry node, had a credible path toward 14 nm, and could execute its famous "Tick-Tock" product development strategy, in which the company introduced a new foundry node every two years and a new microarchitecture in the intervening years. Since the company led the industry in foundry nodes, it could build its processors on monolithic dies. The 1st Gen Core "Nehalem" was pivotal in that it saw Intel's first aggregation effort in a decade, with part of the PC's northbridge being integrated into the processor package. Over subsequent generations the aggregation trend continued, driven by clear performance incentives: first the memory controller, followed by the integrated graphics, and then platform I/O. The company's last monolithic processor die, "Raptor Lake," integrates nearly every device of the modern PC except the platform I/O onto a single die built on the Intel 7 node.

Fast forward to 2023: Intel has just emerged from a very slow transition from 14 nm to Intel 7 (10 nm Enhanced SuperFin), and wafer prices on the latest EUV-based nodes are significantly higher than they used to be. Intel is therefore incentivized to disaggregate, identifying the specific components of the processor that don't benefit as much from the latest foundry node and spinning them off into chiplets, or tiles as Intel likes to call them, built on older foundry nodes. Alongside "Meteor Lake," Intel is introducing the new Intel 4 node, the company's first to use EUV lithography, offering transistor density and energy efficiency rivaling TSMC's 4 nm-class nodes. Besides Intel 4, the company has its mature Intel 7 node, and even third-party nodes for specific tiles.

The secret sauce here is the Foveros chip-packaging technology, a combination of high-density inter-die and via-substrate electrical connections between the various tiles that makes them work as if they were a single monolithic die. The goal is to provide bandwidth and latencies close to those of on-die connections; otherwise the processor is essentially an MCM, no more integrated than things were before "Nehalem."

In this article, we dive deep into the fascinating world of "Meteor Lake" and how Intel plans to make this chiplet-based processor greater than the sum of its parts, realizing CEO Pat Gelsinger's vision of IDM 2.0 manufacturing, which will redefine the way chips are built for the foreseeable future. This is purely an architecture-focused article: Intel has not detailed any specific processor models based on "Meteor Lake," nor what its next-gen Core lineup will look like, so we have no performance numbers beyond anything Intel claimed in its presentations to us.

The Meteor Lake Philosophy


Intel "Meteor Lake" is the company's first consumer disaggregated processor architecture, which is not exactly an antithesis of the aggregation effort Intel undertook over the past two decades. Beginning with 2003, Intel aggregated 2P (multiprocessor) into a single package to develop multi-core processors. In 2007, it aggregated a big portion of the northbridge (aka core-logic) into the package, bringing the memory controller next to the CPU cores for superior latencies and performance, than having them on a discrete northbridge. The next step for aggregation was bringing the integrated graphics (iGPU) into the package, where it could have low-latency access to the memory controller, since graphics is a memory-sensitive device. This was done in two steps. "Clarkdale" briefly put the iGPU and memory controller into a separate 45 nm die from the 32 nm CPU complex die; and with the 2011 "Sandy Bridge," the iGPU and memory controllers were fully merged into a monolithic 32 nm silicon alongside the CPU cores.


This would go on to become the basic construct of Intel client processors for the next 12 years, right up to the current "Raptor Lake." 2023 is a different world: Intel has lost its foundry technology leadership to TSMC, there is tremendous demand for the latest foundry nodes to build the chips that pretty much power modern civilization, and although Intel has kept its new Intel 4 node on track, the company can no longer hope to build large monolithic processors on it. This calls for Intel to disaggregate the processor, not back into discrete devices scattered across the motherboard, but into something else.

The Need to Disaggregate and How to Go About it


A disaggregated processor is not a multi-chip module (MCM) in the strictest sense of the term. In an MCM, you bring together, on a single substrate, independent devices that could otherwise exist on their own packages. An example of an MCM would be the mobile Intel Core package that combines the processor die and the PCH die; the PCH can exist as a discrete device, as it does on desktop motherboards, without any performance loss. In a disaggregated processor (or chiplet device), the individual devices residing on separate chiplets are located not just on the same package (as in an MCM) but in extremely close proximity to each other, with high-performance interconnects running between them. These chiplets cannot exist on separate packages, because that would introduce prohibitive amounts of latency and the processor could not attain its desired performance; it would be less than the sum of its parts.

AMD has been disaggregating its processors since the Ryzen 3000 "Zen 2" series, where the CPU cores were spun off into separate 7 nm chiplets that talk to a 12 nm I/O die containing the rest of the processor: the memory controllers, the PCIe interface, and the SoC I/O. It would have cost AMD a lot more at the time to build a monolithic 16-core processor on 7 nm than to take this approach, and AMD found that the components on the 12 nm die didn't benefit enough from a switch to 7 nm to warrant a monolithic die. The resulting consumer processors use up to two 8-core CPU complex dies, each barely the size of a fingernail, while the larger EPYC server processors place up to eight of the same dies on the package, minimizing R&D costs.


Intel's approach to the disaggregated processor is a lot more complex, and is driven by the fact that "Meteor Lake" has three distinct logic devices: the CPU, the iGPU, and the NPU (neural processing unit). Each of these is a bandwidth-hungry device that sits on a separate die, and we'll try to explain why they are arranged the way they are.


The "Meteor Lake" Processor is a collection of four distinct tiles (chiplets), and an base tile that serves as an intelligent interposer, facilitating high-density, low-latency wiring between them. The four tiles are Compute, Graphics, SoC, and I/O.


The Compute tile contains the processor's CPU cores, its main compute machinery. The SoC tile contains the all-important NPU (neural processing unit) that forms the hardware backend of Intel AI Boost, besides the processor's media accelerator and display controller. It also contains the processor's memory controllers and PCI-Express root complex. Besides these, the SoC tile has a surprise component called the Island E-cores (a lot more on this fascinating component later). The Graphics tile, as its name suggests, contains the iGPU, specifically the graphics rendering and graphics compute machinery, minus the display and media accelerators. The I/O tile, although physically separate from the SoC tile, is an extension of it, and contains all the physical-layer interfaces of the processor.
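As a quick illustration of how this hybrid topology surfaces to software, the sketch below (assuming a Linux kernel with hybrid-CPU support) reads which logical CPUs are exposed as P-cores versus E-cores through sysfs. Note that this view lumps the SoC tile's Island E-cores together with the Compute tile's E-cores under "cpu_atom"; these sysfs paths are the standard ones for hybrid Intel CPUs, not something specific to "Meteor Lake."

    from pathlib import Path

    def read_cpu_list(path: str) -> str:
        # Returns the CPU list string (e.g. "0-11"), or a note if the node is absent
        p = Path(path)
        return p.read_text().strip() if p.exists() else "not present"

    print("P-cores (cpu_core):", read_cpu_list("/sys/devices/cpu_core/cpus"))
    print("E-cores (cpu_atom):", read_cpu_list("/sys/devices/cpu_atom/cpus"))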
