Please note that this post is tagged as a rumor.
NVIDIA GeForce RTX 4090 with over 100 TFLOPS of power
NVIDIA's next-gen flagship GPU is rumored to deliver over 2.5 times the raw compute power of RTX 3090 Ti.
While rumors about AMD's RDNA3 flagship processor indicate it will offer over 4 times the single-precision compute power of RDNA2, NVIDIA is supposedly planning a similar upgrade for its upcoming flagship. The AD102 GPU, based on the Ada Lovelace architecture, is expected to deliver over 100 TFLOPS, which is 2.5 times the 40 TFLOPS offered by RTX 3090 Ti and 2.8 times the figure for RTX 3090. Higher FP32 (single-precision) throughput does not automatically guarantee better gaming performance, though.
To be honest, I don't have much information about AMD. Maybe Lisa and Jensen's competition will give us a 100TFLOPS gaming war in a few months.
— kopite7kimi (@kopite7kimi) April 29, 2022
Both Greymon55 and Kopite7kimi agree that the TFLOPS war will be only one part of the battle for the fastest desktop graphics. There is more at play, such as ray tracing acceleration, super-resolution support and other features that may tip the scales in favor of either architecture.
https://twitter.com/greymon55/status/1520473548782927872
To achieve 100 TFLOPS, the AD102 GPU with 18432 CUDA cores would have to be clocked at about 2.7 GHz, but it’s almost certain that RTX 4090 will ship with a partially disabled GPU. With fewer active cores, the clock speed would have to be even higher to reach the same figure. According to Greymon55, next-gen flagship cards might ship with very similar clock speeds, which in the case of AMD's Navi 31 GPU means about 3.0 GHz, and that’s assuming the full GPU is used.
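The clock figures above follow from the usual peak-throughput estimate: FP32 cores × 2 operations per clock (one fused multiply-add) × clock speed. The short Python sketch below reproduces that arithmetic. The function names are made up for illustration, the factor of 2 is the conventional FMA assumption rather than something the leakers stated, and the 16384-core configuration is a purely hypothetical cut-down part, not a rumored specification.

```python
def fp32_tflops(cores: int, clock_ghz: float, flops_per_clock: int = 2) -> float:
    """Peak FP32 throughput in TFLOPS.

    Assumes each core retires one fused multiply-add (2 FLOPs) per clock.
    """
    # cores * FLOPs/clock * GHz gives GFLOPS; divide by 1000 for TFLOPS.
    return cores * flops_per_clock * clock_ghz / 1000.0


def required_clock_ghz(target_tflops: float, cores: int, flops_per_clock: int = 2) -> float:
    """Clock speed (GHz) needed to reach a target peak FP32 throughput."""
    return target_tflops * 1000.0 / (cores * flops_per_clock)


if __name__ == "__main__":
    # Full AD102 (18432 CUDA cores) at the rumored ~2.7 GHz: roughly 99.5 TFLOPS.
    print(f"AD102 @ 2.7 GHz:   {fp32_tflops(18432, 2.7):.1f} TFLOPS")

    # Full Navi 31 (15360 Stream Processors) at ~3.0 GHz: roughly 92.2 TFLOPS.
    print(f"Navi 31 @ 3.0 GHz: {fp32_tflops(15360, 3.0):.1f} TFLOPS")

    # Clock the full AD102 would need for exactly 100 TFLOPS: roughly 2.71 GHz.
    print(f"Full AD102 for 100 TFLOPS: {required_clock_ghz(100, 18432):.2f} GHz")

    # Hypothetical cut-down part with 16384 active cores (illustrative only):
    # fewer cores means a higher clock for the same target, roughly 3.05 GHz.
    print(f"16384 cores for 100 TFLOPS: {required_clock_ghz(100, 16384):.2f} GHz")
```

This is a theoretical peak only. As noted earlier, it says nothing about ray tracing, upscaling or real gaming performance.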
It is true that we already know a lot about next-gen GPUs; some rumors have been around for months. But this does not mean that we know everything yet. The FP32 core count (CUDA cores or Stream Processors) can still change. What all leakers appear to agree on is that next-gen GPUs will require a lot of power.
Next-gen Flagship GPU Comparison (RUMORED)

VideoCardz.com | GeForce RTX 3090 Ti | AD102 | Navi 31 |
---|---|---|---|
Fabrication Node | SAMSUNG 8N | TSMC N5 | TSMC N5/N6 |
Architecture | NVIDIA Ampere | NVIDIA Ada | AMD RDNA3 |
GPU Package | Monolithic | Monolithic | Multi-Chip-Module (MCM) |
Estimated GPU Size | 628mm² | ~600mm² | ~800mm² |
Graphics Dies | 1 | 1 | 2 GCD + 4 MCD + 1 IOD |
GPU Mega Clusters | 7 Graphics Processing Clusters (GPC) | 12 Graphics Processing Clusters (GPC) | 2×3 Shader Engines |
GPU Super Clusters | 42 Texture Processing Clusters (TPC) | 72 Texture Processing Clusters (TPC) | 2×30 RDNA Workgroups (WGP) |
GPU Clusters | 84 Streaming Multiprocessors (SM) | 144 Streaming Multiprocessors (SM) | 120 Compute Units |
FP32 Cores | 10752 CUDA Cores | 18432 CUDA Cores | 15360 Stream Processors |
GPU Clock | 1.86 GHz | ~ 2.7 GHz | ~ 3.0 GHz |
Memory Type | 24 GB GDDR6X | 24 GB GDDR6X | TBC GB GDDR6 |
Memory & Bus | 21 Gbps 384-bit | 21 Gbps 384-bit | TBC Gbps 256-bit |
Cache | 6MB (L2 Cache) | 96MB (L2 Cache) | 256 or 512MB Infinity Cache |
Power Consumption | 450W | 600W | TBC |
Release Date | Q1 2022 | Q3/Q4 2022 | Q3/Q4 2022 |
FP32 Performance | 40 TFLOPS | ~ 100 TFLOPS | ~ 92 TFLOPS |
Source: @kopite7kimi, @greymon55 via Wccftech