The new NVIDIA GeForce RTX 2080 series cards are the first mainstream cards to support real-time ray tracing, a technology developers are eager to use. Beyond that, these cards have drawn plenty of attention for their mesh shaders and Deep Learning Super Sampling (DLSS).
Here we’ll be taking a look at OpenCL/CUDA benchmark scores across the Maxwell, Pascal, and Turing generations, along with an AMD Radeon RX Vega 64 for reference. We’ll also be adding power consumption, performance-per-watt, and performance-per-dollar stats to provide an overall comparison.
For this benchmarking, various OpenCL and CUDA workloads were used. The GeForce GTX 980 Ti, GTX 1080 Ti, and RTX 2080 Ti were each benchmarked with the NVIDIA 410.57 Linux driver and CUDA 10.0 to highlight the generational differences in overall performance. For further reference, an AMD Radeon RX Vega 64 was also added to the comparison, running the latest ROCm 1.9 open-source compute stack on the Linux 4.18 mainline kernel.
The Phoronix Test Suite was used to monitor overall AC system power consumption via a WattsUp Pro power meter, allowing for precise performance-per-watt metrics. The performance-per-dollar statistics are based on current pricing.
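For clarity, these derived metrics are simply the benchmark score divided by the average system power draw, or by the card's price. A minimal sketch of the arithmetic (the function names and numbers below are illustrative placeholders, not from the actual test harness):

```python
def perf_per_watt(score: float, avg_watts: float) -> float:
    """Benchmark score divided by average AC system power draw."""
    return score / avg_watts

def perf_per_dollar(score: float, price_usd: float) -> float:
    """Benchmark score divided by the card's current price."""
    return score / price_usd

# Illustrative only: a card scoring 1000 points at 300 W, priced at $1199
print(round(perf_per_watt(1000, 300), 2))     # 3.33 points per watt
print(round(perf_per_dollar(1000, 1199), 3))  # 0.834 points per dollar
```

Note that because the wattage is measured at the wall for the whole system, the metric reflects platform efficiency as a whole, not the GPU in isolation.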
To start off, let’s look at some CUDA benchmarks on the tested NVIDIA cards. The Radeon card wasn’t used in these tests due to some technical issues.
In the Gridding test with ASKAP, the RTX 2080 Ti is about 40% faster than the GTX 1080 Ti. In the Degridding test, the improvement is even greater, with the RTX 2080 Ti about 64% faster. This points to a more pronounced generational improvement going from Pascal to Turing than from Maxwell to Pascal.
The RTX 2080 Ti was twice as fast as the GTX 1080 Ti in the basic CUDA Mini-Nbody test.
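Mini-Nbody exercises the classic all-pairs gravitational force calculation, an O(N²) workload that maps very well onto GPU parallelism. As a rough illustration of what the kernel computes, here is a plain-Python sketch of one simulation step (not the actual CUDA code, and the parameter values are illustrative):

```python
def nbody_step(pos, vel, mass, dt=0.01, softening=1e-9):
    """Advance all particles one step using direct O(N^2) gravity.

    pos, vel: lists of [x, y, z] per particle; mass: list of floats.
    The softening term avoids a divide-by-zero at tiny separations.
    """
    n = len(pos)
    # Accumulate accelerations from every other particle (all-pairs).
    for i in range(n):
        ax = ay = az = 0.0
        for j in range(n):
            if i == j:
                continue
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            dz = pos[j][2] - pos[i][2]
            inv_r3 = (dx * dx + dy * dy + dz * dz + softening) ** -1.5
            ax += mass[j] * dx * inv_r3
            ay += mass[j] * dy * inv_r3
            az += mass[j] * dz * inv_r3
        vel[i][0] += ax * dt
        vel[i][1] += ay * dt
        vel[i][2] += az * dt
    # Integrate positions after all velocities are updated.
    for i in range(n):
        pos[i][0] += vel[i][0] * dt
        pos[i][1] += vel[i][1] * dt
        pos[i][2] += vel[i][2] * dt
```

On the GPU, the outer loop over `i` becomes one thread per particle, which is why the test scales so readily with the wider Turing part.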
The SHOC OpenCL benchmark showed a 25% improvement over the GTX 1080 Ti, with the RTX 2080 Ti reaching 16.5 TFLOPS of single-precision compute power.
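That 16.5 TFLOPS figure lines up with the card's theoretical FP32 peak, which can be estimated as CUDA cores × clock × 2, since each core can retire one fused multiply-add (two FLOPs) per cycle. A quick sanity check, assuming the RTX 2080 Ti's 4352 CUDA cores at roughly a 1.9 GHz boost clock:

```python
def peak_fp32_tflops(cuda_cores: int, clock_ghz: float) -> float:
    """Theoretical FP32 peak: one FMA (2 FLOPs) per core per cycle."""
    return cuda_cores * clock_ghz * 2 / 1000.0

# RTX 2080 Ti: 4352 CUDA cores, ~1.9 GHz observed boost clock
print(round(peak_fp32_tflops(4352, 1.9), 1))  # 16.5
```

SHOC hitting essentially the theoretical peak suggests the benchmark keeps the shader cores fully fed.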
During the SP FLOPS test, the GeForce RTX 2080 Ti was, on average, drawing about 20 W more power than the GTX 1080 Ti.
Considering the performance-per-watt metric, the RTX 2080 Ti is ahead of the GTX 1080 Ti by 14%.
Finally, the RX Vega 64 was able to run the texture read bandwidth test via SHOC. RX Vega 64 was above the GTX 980 Ti but lagged behind the GTX 1080 Ti. The RTX 2080 Ti was twice as fast as the GTX 1080 Ti in this test, mostly due to its GDDR6 graphics memory.
Performance-per-watt metrics show quite a big improvement for RTX 2080 Ti over the GTX 1080 Ti while RX Vega 64 power efficiency was comparable to that of the GTX 980 Ti.
The RX Vega 64 closed the gap with the GTX 1080 Ti in the SHOC FFT benchmark, while the RTX 2080 Ti delivered a 48% performance improvement. This test too showed a larger generational improvement going from Pascal to Turing than from Maxwell to Pascal.
The MD5 hashing performance also saw an upgrade with the RTX 2080 Ti.
Rodinia + IndigoBench
The Rodinia OpenCL tests saw the AMD RX Vega 64 lagging behind the NVIDIA GPUs in performance. The RX Vega 64 also failed to run some of the other Rodinia OpenCL benchmarks, such as the particle filter test.
Indigo’s Supercar scene rendered about 70% faster on the RTX 2080 Ti than on the GTX 1080 Ti, while the GTX 1080 Ti itself was about 45% faster than the GTX 980 Ti.
Here too, the RTX 2080 Ti holds a sizable advantage over the GTX 1080 Ti in performance-per-watt.
The RTX 2080 Ti also showed a significant improvement over the previous-generation card in IndigoBench’s bedroom scene.
V-RAY + Parboil + Finance Bench
As expected, RTX 2080 Ti was the fastest on the CUDA V-RAY renderer.
Significant improvements were seen on the RTX 2080 Ti over the GTX 1080 Ti in Parboil and FinanceBench, while the RX Vega 64 was unable to run them with the current ROCm release. Again, the generational improvement was greater going from Pascal to Turing than from Maxwell to Pascal.
Darktable + clpeak
Darktable’s RAW photography processing with OpenCL was bottlenecked unless the image resolution being processed was increased. The ROCm 1.9 OpenCL stack worked on the AMD RX Vega 64, but was slower than the NVIDIA GPUs.
In terms of maximum global memory bandwidth, the RX Vega 64 overtakes the GTX 1080 Ti thanks to its HBM2 memory, but falls behind the GDDR6-equipped RTX 2080 Ti.
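The ordering makes sense from the published memory specs: theoretical bandwidth is the bus width (in bytes) times the effective data rate. A quick check, assuming a 352-bit bus at 14 Gbps for the RTX 2080 Ti's GDDR6, 352-bit at 11 Gbps for the GTX 1080 Ti's GDDR5X, and 2048-bit at 1.89 Gbps for the Vega 64's HBM2:

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak memory bandwidth in GB/s."""
    return bus_width_bits / 8 * data_rate_gbps

print(peak_bandwidth_gbs(352, 14.0))   # RTX 2080 Ti (GDDR6):  616.0
print(peak_bandwidth_gbs(352, 11.0))   # GTX 1080 Ti (GDDR5X): 484.0
print(peak_bandwidth_gbs(2048, 1.89))  # RX Vega 64 (HBM2):    ~483.8
```

On paper the Vega 64 and GTX 1080 Ti are essentially tied; HBM2 tends to sustain a higher fraction of its theoretical peak, which helps explain the Vega's measured lead over the GTX 1080 Ti here.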
The single-precision float performance stats show the RTX 2080 Ti again leading the comparison, with the RX Vega 64 in second place.
cl-mem + LuxMark
GeForce RTX 2080 Ti with its GDDR6 memory easily takes the pole position in the OpenCL memory tests ahead of the RX Vega 64 and GTX 1080 Ti.
Radeon GPUs are known to perform quite well in the LuxMark OpenCL tests, but when pitted against the RTX 2080 Ti, the RX Vega 64 came in about 26% slower than NVIDIA’s card. Moreover, the RTX 2080 Ti was twice as fast as the GTX 1080 Ti.
RTX 2080 Ti also tops the performance-per-watt metric.
The ROCm Radeon OpenCL stack ran into problems compiling the OpenCL kernels for the two other LuxMark scenes. For NVIDIA, the RTX 2080 Ti stays ahead of the two previous-generation cards in both raw performance and performance-per-watt.
Power Consumption + Performance-per-dollar
The total AC system power consumption over the course of the entire CUDA/OpenCL benchmarking process is shown here. The Radeon RX Vega 64 was not included due to differences in test selection: some tests were CUDA-only, and others did not work with ROCm 1.9.
On average, the RTX 2080 Ti consumed about 20 Watts more than the GTX 1080 Ti, but the Turing-based GPU still managed to deliver better performance-per-watt. The maximum power draw of the RTX 2080 Ti Founders Edition card, together with an Intel Core i7 8086K system, came out to 344 Watts.
The $1,199 price tag of the RTX 2080 Ti Founders Edition card is quite steep for most users, but if the OpenCL/CUDA performance-per-dollar stats are considered, it’s actually not that bad. The price could be justified if you’ll constantly be running OpenCL/CUDA workloads on the system, and technologies like RTX/ray-tracing, DLSS, mesh shaders, and the Turing tensor cores are an added benefit.
In the OpenCL tests where the RX Vega 64 worked properly with the new ROCm 1.9 compute stack, it delivered better performance-per-dollar than all three NVIDIA cards.
That’s it for this first look at GeForce RTX 2080 Ti compute performance on Ubuntu Linux with CUDA 10.0 and the NVIDIA 410.57 driver, compared against older generations of NVIDIA flagship cards and a Radeon GPU.