When NVIDIA unveiled the GeForce RTX 4000 Series graphics cards as the big announcement of the GTC 2022 GeForce Beyond special broadcast, it was immediately clear that DLSS 3 played an important role in achieving the unprecedented generational performance jump (2x-4x) claimed by NVIDIA.
Almost all of the benchmarks shared by the manufacturer included the new DLSS 3 technology, and the few that didn't showed performance improvements over the GeForce RTX 3000 Series that were more in line with what we have come to expect from a new generation of graphics cards.
Now that the GeForce RTX 4090, the flagship GPU (at least until the inevitable Ti model) and also the first from the brand new Ada Lovelace architecture to launch, has been in reviewers' hands for a while, we've been able to verify just how much DLSS 3 supercharges performance. First things first, though, let's take a look at what's under the hood.
The new GeForce RTX graphics cards are equipped with fourth-generation Tensor Cores, which include a new 8-Bit Floating Point (FP8) Tensor Engine, increasing throughput by up to 5X to an estimated 1.32 Tensor-petaFLOPS on the RTX 4090.
However, with DLSS 3, NVIDIA is taking one step beyond DLSS Super Resolution. There is now a new DLSS Frame Generation convolutional autoencoder that generates an entire frame on its own based on optical flow fields calculated with the Optical Flow Accelerator.
Optical Flow Accelerators have been available in NVIDIA GPUs since the Turing architecture. However, as previously explained by VP of Applied Deep Learning Research Bryan Catanzaro, the new graphics cards are equipped with a significantly faster and more advanced version of the OFA, which is why DLSS 3 is currently exclusive to GeForce RTX 4000 graphics cards.
The generated frame sits between frames reconstructed with DLSS Super Resolution. As such, NVIDIA claims that in every two frames, only one-eighth of the displayed pixels are rendered normally, while the rest are reconstructed by Super Resolution and Frame Generation, delivering massive frame rate improvements.
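The one-eighth figure can be reproduced with simple arithmetic. The sketch below is illustrative only: it assumes a 4K output upscaled from a 1920x1080 internal resolution (the DLSS Performance preset, which is not the preset used for the tests in this article) and one generated frame for every rendered frame. The function name is made up for this example.

```python
from fractions import Fraction


def rendered_fraction(render_w, render_h, out_w, out_h, generated_per_rendered=1):
    """Share of displayed pixels that were rendered the traditional way.

    Assumption: every rendered frame is upscaled from (render_w x render_h)
    to (out_w x out_h), and is followed by `generated_per_rendered` frames
    that contain no traditionally rendered pixels at all.
    """
    # Share of rendered pixels inside a single upscaled frame.
    per_frame = Fraction(render_w * render_h, out_w * out_h)
    # Only one frame out of (1 + generated_per_rendered) carries those pixels.
    return per_frame / (1 + generated_per_rendered)


# Assumed inputs: 1080p internal resolution, 4K output, one generated frame.
share = rendered_fraction(1920, 1080, 3840, 2160)
print(share)  # 1/8
```

With Super Resolution alone (no generated frames), the same function returns one-quarter, which is why adding Frame Generation halves the share again.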
To offset the increased latency caused by Frame Generation, NVIDIA has integrated its latency-lowering Reflex technology into DLSS 3 to keep responsiveness optimal.
Our Hassan has been able to test the GeForce RTX 4090 with all the DLSS 3 compatible games that NVIDIA shared with reviewers. He chose the Quality preset (at 4K resolution, obviously) because he felt that the new graphics card already ran most games fast enough that it wouldn't make sense to drop the base rendering resolution by lowering DLSS presets.
| Configuration | Avg FPS | Min FPS | Latency |
| --- | --- | --- | --- |
| 4090 DLSS3 | 171 | 158 | 8 |
| 4090 DLSS2 | 112 | 94 | 9 |
| 3090 Ti DLSS2 | 90 | 74 | 10 |
| 4090 SMAA | 92 | 80 | 10 |
| 3090 Ti TAA | 55 | 42 | 12 |
With the current game, though, DLSS 3 only improved average FPS by 16.1% and the 1% low frame rate by 15.3% over DLSS 2.
| Configuration | Avg FPS | Min FPS | Latency |
| --- | --- | --- | --- |
| 4090 DLSS3 | 170 | 153 | 5 |
| 4090 DLSS2 | 141 | 125 | 7 |
| 3090 Ti DLSS2 | 85 | 78 | 11 |
| 4090 Native | 88 | 80 | 11 |
| 3090 Ti Native | 52 | 45 | 14 |
In this case, DLSS 3 provides a 29% performance increase over DLSS 2 in average FPS and a 39.1% improvement in the 1% low frame rate. The boost will likely be greater once ray tracing is enabled, though.
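For readers who want to check percentages like these, the uplift is simply the ratio of the two frame rates minus one. The short sketch below uses 142 vs. 110 FPS (average) and 128 vs. 92 FPS (1% low), figures from this article's chart data that are assumed here to be the RTX 4090's DLSS 3 and DLSS 2 results. The helper name is hypothetical.

```python
def uplift_percent(new_fps, old_fps):
    """Relative improvement of new_fps over old_fps, in percent."""
    return (new_fps / old_fps - 1.0) * 100.0


# Assumed pairing: first value is DLSS 3, second is DLSS 2, both on the RTX 4090.
avg_uplift = uplift_percent(142, 110)
low_uplift = uplift_percent(128, 92)

print(f"Average FPS uplift: {avg_uplift:.1f}%")  # Average FPS uplift: 29.1%
print(f"1% low uplift: {low_uplift:.1f}%")       # 1% low uplift: 39.1%
```

The same helper works for any pair of results, for example comparing DLSS 2 against native rendering on the same card.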
| Configuration | Avg FPS | 1% Low FPS | Latency |
| --- | --- | --- | --- |
| 4090 DLSS3 | 142 | 128 | 7 |
| 4090 DLSS2 | 110 | 92 | 10 |
| 3090 Ti DLSS2 | 64 | 52 | 15 |
| 4090 Native | 74 | 60 | 15 |
| 3090 Ti Native | 41 | 30 | 19 |
As such, in this year's edition of the officially licensed Formula 1 game, DLSS 3 can only further boost average FPS by 20.5% and minimum FPS by 22.4%.
| Configuration | Avg FPS | Min FPS | Latency |
| --- | --- | --- | --- |
| 4090 DLSS3 | 170 | 153 | 5 |
| 4090 DLSS2 | 141 | 125 | 7 |
| 3090 Ti DLSS2 | 85 | 78 | 11 |
| 4090 Native | 88 | 80 | 11 |
| 3090 Ti Native | 52 | 45 | 14 |
As such, there is a massive 106% increase in average FPS and an even greater 115% improvement in minimum FPS over the DLSS 2 implementation.
| Configuration | Avg FPS | Min FPS | Latency |
| --- | --- | --- | --- |
| 4090 DLSS3 | 128 | 112 | 9 |
| 4090 DLSS2 | 62 | 52 | 19 |
| 3090 Ti DLSS2 | 32 | 24 | 30 |
| 4090 TAA | 58 | 50 | 20 |
| 3090 Ti TAA | 31 | 22 | 33 |
| Configuration | Avg FPS | 1% Low FPS | Latency |
| --- | --- | --- | --- |
| 4090 DLSS3 | 94 | 88 | 10 |
| 4090 Native | 28 | 21 | 35 |
When tested in titles that already run at very high frame rates, DLSS 3's boost compared to regular DLSS 2 is more limited (at least when using the Quality preset; I reckon the Performance and Ultra Performance presets may widen the gap). That's mostly because the RTX 4090 is a beast of its own, delivering substantial performance gains over the previous generation's top cards even when using DLSS 2 or native rendering. If you've ever wanted to play games at 4K and 144+ FPS with all graphics settings turned to the max, the RTX 4090 and DLSS 3 can easily deliver that.
As first noted in Digital Foundry's initial hands-on with the technology, the Frame Generation component can sometimes introduce artifacts. However, those are really hard to notice during regular gameplay. It's also possible that the Frame Generation algorithm will be improved over time to diminish these glitches, much like NVIDIA did with DLSS Super Resolution.
Last but not least, I must admit that I was most impressed by the latency measurements. During press presentations, NVIDIA engineers had hinted that the lowest latency would be obtained by a combination of DLSS 2 and Reflex rather than DLSS 3, due to the latter's Frame Generation component. However, the data shows DLSS 3 coming out on top in all cases, sometimes with a meaningful difference over DLSS 2 + Reflex. More testing will be required, but it seems like RTX 4000 Series owners may not have a reason to turn off Frame Generation.