Deep Learning

RTX 2080 Ti Deep Learning Performance Benchmarks for TensorFlow

February 20, 2019

NOTE: View our latest 2080 Ti Benchmark Blog with FP16 & XLA Numbers here.

In this post, we benchmark the NVIDIA GeForce RTX 2080 Ti GPU with the TensorFlow deep learning framework. Our results show that the RTX 2080 Ti delivers strong value for its price. We ran the tests on one of our deep learning workstations (see system specs below) in 1-, 2-, and 4-GPU configurations. Given its entry-level price point, the training results of the Turing-powered RTX 2080 Ti are remarkable. This is especially evident when comparing it to the Volta-powered TITAN V (see our blog here), where performance is nearly on par.

Benchmark Snapshot: Nasnet, VGG16, Inception V3, ResNet-50

Overall4-TF-RTX2080Ti-Charts.jpg

Methodology

The TensorFlow configuration was unchanged from beginning to end, with the exception of the number of GPUs used in a given benchmark run.

NOTE: The image-retrain function within TensorFlow was used to import real data into the Nasnet model; the data consisted of JPEG images of flowers from the ImageNet dataset. The other models (ResNet-50, VGG16, and Inception V3) used synthetic data and were measured "as-is", without any modifications made throughout the testing cycle. Experiments were run using the python-pip package within the Anaconda runtime, as prescribed in the TensorFlow installation documentation.

The benchmark scripts were downloaded from the official TensorFlow GitHub, along with the pre-constructed models. In each case, the only variables that changed from run to run were num_gpus and model. All other parameters were unchanged throughout these experiments. The batch size was 64 for all training, running at the default single precision (FP32).

Exxact RTX 2080 Ti Workstation System Specs

CPU: 2 x Intel Xeon Gold 6148 2.4GHz
RAM: 192GB DDR4-2666
SSD: 500GB SSD
GPU: 1, 2, 4x NVIDIA GeForce RTX 2080 Ti (blower model)
OS: Ubuntu Server 16.04
Driver: NVIDIA version 410.48
CUDA: CUDA Toolkit 10.0
Python: v2.7, pip v8, Anaconda
TensorFlow: 1.12

NVIDIA GeForce RTX 2080 Ti Deep Learning Benchmarks for TensorFlow: 1, 2, 4 GPU Configuration

Nasnet Images/Sec (Real Data)

Nasnet_TF-RTX2080Ti-Chart.jpg

ResNet-50 Images/Sec (Synthetic Data)

Resnet-50_TF-RTX2080Ti-Chart.jpg

Inception V3 Images/Sec (Synthetic Data)

VGG16 Images/Sec (Synthetic Data)

VGG16_TF-RTX2080Ti-Chart.jpg

Nasnet Benchmark Commands & Outputs for TensorFlow

The specific command used for each scenario is listed below, directly above its benchmark results. To change the neural network model, simply change the model flag, e.g. --model=resnet50 to train the ResNet-50 model, with all other variables kept the same.
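Since only the model and GPU count change between runs, the command lines can be generated rather than typed by hand. The following is a minimal sketch, not the script we used: build_command is a hypothetical helper, the flags are copied from the Nasnet commands in this post, and it assumes that leaving out --data_dir makes tf_cnn_benchmarks fall back to synthetic data. It is written for Python 3.8+ (for shlex.join), whereas the benchmarks themselves ran under Python 2.7.

```python
import shlex

# Flags copied from the Nasnet commands in this post; they stay fixed
# across runs. Only --model and --num_gpus vary.
FIXED_FLAGS = [
    "--data_format=NCHW",
    "--batch_size=64",
    "--optimizer=momentum",
    "--variable_update=replicated",
    "--nodistortions",
    "--gradient_repacking=8",
    "--num_epochs=90",
    "--weight_decay=1e-4",
]

# Flags used for the real-data Nasnet runs (paths as shown in this post).
# Assumption: omitting them selects synthetic data.
REAL_DATA_FLAGS = [
    "--data_dir=/data",
    "--train_dir=/data/scratch",
    "--data_name=imagenet",
]


def build_command(model, num_gpus, real_data=False):
    """Return the argv list for one benchmark run (hypothetical helper)."""
    argv = [
        "python",
        "tf_cnn_benchmarks.py",
        "--model={}".format(model),
        "--num_gpus={}".format(num_gpus),
    ]
    argv += FIXED_FLAGS
    if real_data:
        argv += REAL_DATA_FLAGS
    return argv


if __name__ == "__main__":
    # Print one command line per (model, GPU count) combination.
    for model in ("nasnet", "resnet50"):
        for num_gpus in (1, 2, 4):
            argv = build_command(model, num_gpus, real_data=(model == "nasnet"))
            print(shlex.join(argv))
```

Each returned list can be passed directly to subprocess.run if you prefer to launch the runs from Python instead of copying the printed lines into a shell.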

Nasnet 1 GPU

python tf_cnn_benchmarks.py --data_format=NCHW --batch_size=64 --model=nasnet --optimizer=momentum --variable_update=replicated --nodistortions --gradient_repacking=8 --num_gpus=1 --num_epochs=90 --weight_decay=1e-4 --data_dir=/data --train_dir=/data/scratch --data_name=imagenet

Step Img/sec total_loss
1 images/sec: 156.8 +/- 0 (jitter = 0.0) 7.496
10 images/sec: 156.7 +/- 0.5 (jitter = 2.5) 7.345
20 images/sec: 156.5 +/- 0.4 (jitter = 1.9) 7.52
30 images/sec: 156 +/- 0.3 (jitter = 2.0) 7.41
40 images/sec: 156.6 +/- 0.3 (jitter = 2.1) 7.473
50 images/sec: 156.5 +/- 0.3 (jitter = 2.1) 7.504
60 images/sec: 156.2 +/- 0.3 (jitter = 2.0) 7.509
70 images/sec: 156.4 +/- 0.3 (jitter = 2.1) 7.54
80 images/sec: 156.3 +/- 0.2 (jitter = 2.0) 7.332
90 images/sec: 156.3 +/- 0.2 (jitter = 1.9) 7.583
100 images/sec: 156.3 +/- 0.2 (jitter = 2.1) 7.43
----------------------------------------------------------------
total images/sec: 156.2
----------------------------------------------------------------
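If you want to chart or compare runs, the output above is regular enough to parse with a couple of regular expressions. This is a rough sketch based only on the line layout shown in this post; parse_log and the SAMPLE excerpt are illustrative, and other versions of the benchmark script may print a different format.

```python
import re

# Matches per-step lines such as:
#   10 images/sec: 156.7 +/- 0.5 (jitter = 2.5) 7.345
STEP_RE = re.compile(
    r"^\s*(\d+)\s+images/sec:\s+([\d.]+)\s+\+/-\s+([\d.]+)"
    r"\s+\(jitter = ([\d.]+)\)\s+([\d.]+)\s*$"
)
# Matches the summary line: "total images/sec: 156.2"
TOTAL_RE = re.compile(r"^\s*total images/sec:\s+([\d.]+)\s*$")


def parse_log(text):
    """Return (steps, total) parsed from benchmark output (hypothetical helper).

    steps is a list of (step, images_per_sec, plus_minus, jitter, total_loss)
    tuples; total is the final "total images/sec" value, or None if absent.
    """
    steps = []
    total = None
    for line in text.splitlines():
        match = STEP_RE.match(line)
        if match:
            step, rate, err, jitter, loss = match.groups()
            steps.append(
                (int(step), float(rate), float(err), float(jitter), float(loss))
            )
            continue
        match = TOTAL_RE.match(line)
        if match:
            total = float(match.group(1))
    return steps, total


# Excerpt of the 1 GPU Nasnet output shown above.
SAMPLE = """\
Step Img/sec total_loss
1 images/sec: 156.8 +/- 0 (jitter = 0.0) 7.496
10 images/sec: 156.7 +/- 0.5 (jitter = 2.5) 7.345
----------------------------------------------------------------
total images/sec: 156.2
----------------------------------------------------------------
"""

if __name__ == "__main__":
    steps, total = parse_log(SAMPLE)
    print(len(steps), "step lines, total images/sec =", total)
```

Header and separator lines match neither pattern and are skipped, so the full console output can be passed in unchanged.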

Nasnet 2 GPU

python tf_cnn_benchmarks.py --data_format=NCHW --batch_size=64 --model=nasnet --optimizer=momentum --variable_update=replicated --nodistortions --gradient_repacking=8 --num_gpus=2 --num_epochs=90 --weight_decay=1e-4 --data_dir=/data --train_dir=/data/scratch --data_name=imagenet

Step Img/sec total_loss
1 images/sec: 285.8 +/- 0 (jitter = 0.0) 7.482
10 images/sec: 288.2 +/- 1.4 (jitter = 4.6) 7.537
20 images/sec: 281.7 +/- 2.5 (jitter = 7.7) 7.488
30 images/sec: 282.1 +/- 1.7 (jitter = 6.7) 7.508
40 images/sec: 282.3 +/- 1.4 (jitter = 7.7) 7.367
50 images/sec: 281.1 +/- 1.2 (jitter = 8.5) 7.498
60 images/sec: 281.4 +/- 1.1 (jitter = 8.4) 7.465
70 images/sec: 280.5 +/- 1.1 (jitter = 8.6) 7.387
80 images/sec: 280.3 +/- 1 (jitter = 9.0) 7.433
90 images/sec: 279.8 +/- 0.9 (jitter = 8.5) 7.442
100 images/sec: 279.9 +/- 0.8 (jitter = 8.0) 7.353
----------------------------------------------------------------
total images/sec: 279.85
----------------------------------------------------------------

Nasnet 4 GPU

python tf_cnn_benchmarks.py --data_format=NCHW --batch_size=64 --model=nasnet --optimizer=momentum --variable_update=replicated --nodistortions --gradient_repacking=8 --num_gpus=4 --num_epochs=90 --weight_decay=1e-4 --data_dir=/data --train_dir=/data/scratch --data_name=imagenet

Step Img/sec total_loss
1 images/sec: 507.3 +/- 0 (jitter = 0.0) 7.549
10 images/sec: 475.4 +/- 7.1 (jitter = 29.7) 7.487
20 images/sec: 467.7 +/- 7 (jitter = 16.8) 7.525
30 images/sec: 470.8 +/- 4.9 (jitter = 11.8) 7.459
40 images/sec: 473 +/- 4.1 (jitter = 18.5) 7.419
50 images/sec: 474.6 +/- 3.4 (jitter = 18.4) 7.458
60 images/sec: 477.1 +/- 3 (jitter = 17.1) 7.481
70 images/sec: 476.7 +/- 3.1 (jitter = 18.0) 7.454
80 images/sec: 477.3 +/- 2.8 (jitter = 15.7) 7.499
90 images/sec: 477.9 +/- 2.5 (jitter = 15.4) 7.411
100 images/sec: 478.8 +/- 2.3 (jitter = 15.6) 7.485
----------------------------------------------------------------
total images/sec: 478.68
----------------------------------------------------------------
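The three Nasnet totals above (156.2, 279.85, and 478.68 images/sec) also show how throughput scales with GPU count. A small sketch of that arithmetic follows; the scaling function is our own illustrative helper, and "efficiency" here simply means the measured speedup divided by the number of GPUs.

```python
# "total images/sec" values from the Nasnet runs in this post,
# keyed by the number of GPUs used.
NASNET_TOTALS = {1: 156.2, 2: 279.85, 4: 478.68}


def scaling(totals):
    """Return {num_gpus: (speedup, efficiency)} relative to the 1 GPU run.

    speedup    = throughput on N GPUs / throughput on 1 GPU
    efficiency = speedup / N (1.0 would be perfect linear scaling)
    """
    base = totals[1]
    return {
        n: (rate / base, rate / (base * n))
        for n, rate in sorted(totals.items())
    }


if __name__ == "__main__":
    for n, (speedup, efficiency) in scaling(NASNET_TOTALS).items():
        print("{} GPU(s): {:.2f}x speedup, {:.0%} efficiency".format(
            n, speedup, efficiency))
```

For these numbers the speedup works out to about 1.79x on 2 GPUs and 3.06x on 4 GPUs, or roughly 90% and 77% scaling efficiency.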

Other Benchmarks Coming Soon...

RTX 2080 Ti Deep Learning Benchmarks (with RTX Bridge)

RTX 2080 Deep Learning Benchmarks

TITAN RTX Deep Learning Benchmarks

Deep Learning Workstations from Exxact Starting at $7,999

Powered by NVIDIA GeForce RTX 2080 Ti GPUs, Exxact Deep Learning Workstations offer powerful computation at affordable prices.

RTX_DevBox.png

Have any questions? Contact us directly here.
