Difference between revisions of "GPUs comparison"

From KlayGE

Latest revision as of 13:53, 9 May 2016

These data are collected from Wikipedia and vendor websites.

AMD

Desktop

{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| Fusion APU 8670D || 32 || Integrated || 384:24:8 || 844 || 2133 || N/A || 34.1 || 128 || DDR3 || 648.2 || 0 || 100 (SoC) || 6.5 || 0 || D3D 11.0, OpenGL 4.3, OpenCL 1.2
|-
| Fusion APU Radeon R7 || 28 || Integrated || 512:32:8 || 720 || 2133 || N/A || 34.1 || 128 || DDR3 || 737.3 || 46.1 || 95 (SoC) || 7.8 || 0.5 || D3D 11.2, OpenGL 4.3, OpenCL 1.2, Mantle
|-
| Fusion APU Mobile Radeon R7 || 28 || Integrated || 512:32:8 || 600-686 || 2133 || N/A || 34.1 || 128 || DDR3 || 702.5 || 43.9 || 35 (SoC) || 20.0 || 1.3 || D3D 11.2, OpenGL 4.3, OpenCL 1.2, Mantle
|-
| Radeon HD 8970 || 28 || PCIe 3.0 x16 || 2048:128:32 || 1000-1050 || 6000 || 3G || 288 || 384 || GDDR5 || 4300 || 1075 || 250 || 17.2 || 4.3 || D3D 11.1, OpenGL 4.3, OpenCL 1.2
|-
| Radeon R9 290X || 28 || PCIe 3.0 x16 || 2816:176:64 || 800-1000 || 5000 || 4G || 320 || 512 || GDDR5 || 5632 || 704 || 290 || 19.4 || 2.4 || D3D 11.2, OpenGL 4.3, OpenCL 1.2, Mantle
|-
| Radeon R9 Fury X || 28 || PCIe 3.0 x16 || 4096:256:64 || 1050 || 5000 || 4G || 512 || 4096 || HBM || 8602 || 537 || 275 || 31.3 || 2.0 || D3D 12.0, OpenGL 4.5, OpenCL 2.1, Mantle
|}

Mobile

{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| Fusion APU Ultra-Mobile Radeon R6 || 28 || Integrated || 128:8:4 || 500 || 1333 || N/A || 10.7 || 64 || DDR3L || 128.0 || 8.0 || 4.5 (SoC) || 28.4 || 1.8 || D3D 11.2, OpenGL 4.3, OpenCL 1.2, Mantle
|}

Professional

{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| Radeon W9000 || 28 || PCIe 3.0 x16 || 2048:128:32 || 975 || 5500 || 6G || 264 || 384 || GDDR5 || 3993.6 || 998.4 || 274 || 14.6 || 3.6 || D3D 11.1, OpenGL 4.2, OpenCL 1.2
|-
| Radeon W9100 || 28 || PCIe 3.0 x16 || 2816:176:64 || 930 || 5000 || 16G || 320 || 512 || GDDR5 || 5237.8 || 2618.9 || 275 || 19.0 || 9.5 || D3D 11.1, OpenGL 4.3, OpenCL 2.0
|}

ARM

Mobile

{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| Mali-T604 MP4 || 32 (Exynos 5 Dual) || Integrated || 64 || 533 || 1600 || N/A || 12.8 || 64 || LPDDR3 || 68 || 0 || 4 (SoC) || 17.0 || 0 || D3D 9.1, OpenGL ES 3.0, OpenCL 1.1
|-
| Mali-T760 MP4 || 28 (MT6752) || Integrated || ??? || 700 || 1600 || N/A || 6.4 || 64 || LPDDR3 || ??? || ??? || ??? || ??? || ??? || D3D 11.1, OpenGL ES 3.0, OpenCL 1.1
|}

Imagination

Mobile

{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| PowerVR SGX554 MP4 || 32 (A6X) || Integrated || 128 || 300 || 1066 || N/A || 17.1 || 128 || LPDDR2 || 76.8 || 0 || 4 (SoC) || 19.2 || 0 || D3D 9.3, OpenGL 2.1, OpenGL ES 2.0, OpenCL 1.1
|-
| G6430 || 28 (A7) || Integrated || 256 || 450 || 1600 || N/A || 12.8 || 64 || LPDDR3 || 115.2 || 0 || 4 (SoC) || 28.8 || 0 || D3D 10.0, OpenGL 3.2, OpenGL ES 3.1
|-
| GX6450 || 20 (A8) || Integrated || 256 || ??? || 1600 || N/A || 12.8 || 64 || LPDDR3 || ??? || ??? || ??? || ??? || ??? || D3D 10.0, OpenGL 3.2, OpenGL ES 3.1
|-
| GXA6850 || 20 (A8X) || Integrated || 512 || 533 || 1600 || N/A || 25.6 || 128 || LPDDR3 || 272.9 || ??? || ??? || ??? || ??? || D3D 10.0, OpenGL 3.2, OpenGL ES 3.1
|-
| ??? || 16 (A9X) || Integrated || ??? || ??? || ??? || N/A || 51.2 || ??? || LPDDR4 || 545.8 || ??? || ??? || ??? || ??? || D3D 10.0, OpenGL 3.3, OpenGL ES 3.1
|}

Intel

Computing

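{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| Xeon Phi 3100P || 22 || PCIe 2.0 x16 || 228 x86 || 1100 || 3750 || 6G || 240 || 512 || GDDR5 || 2000 || 1000 || 300 || 6.7 || 3.3 || OpenMP, OpenCL, MKL
|-
| Xeon Phi 5110P || 22 || PCIe 2.0 x16 || 240 x86 || 1053 || 5000 || 8G || 320 || 512 || GDDR5 || 2022 || 1011 || 225 || 9.0 || 4.5 || OpenMP, OpenCL, MKL
|-
| Xeon Phi 7120P || 22 || PCIe 2.0 x16 || 244 x86 || 1238-1333 || 5500 || 16G || 352 || 512 || GDDR5 || 2416 || 1208 || 300 || 8.1 || 4.0 || OpenMP, OpenCL, MKL
|}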

Desktop

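{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| Iris Pro Graphics 5200 || 22 || Integrated || 160:8:4 || 200-1300 || 1600 || N/A || 25.6 || 128 || DDR3 || 832 || 208 || 65 (SoC) || 12.8 || 3.2 || D3D 11.1, OpenGL 4.2, OpenCL 1.2
|}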

Mobile

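{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| HD Graphics || 22 || Integrated || ??? || 311-667 || 1066 || N/A || 17.1 || 128 || LPDDR3 || ??? || ??? || 4 (SoC) || ??? || ??? || D3D 11.1, OpenGL 4.0, OpenCL 1.2
|}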

NVIDIA

Computing

{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| Tesla K20X || 28 || PCIe 3.0 x16 || 2688 FP32, 896 FP64 || 732 || 5200 || 6G || 250 || 384 || GDDR5 || 3935 || 1312 || 235 || 16.7 || 5.6 || D3D 11.0, OpenGL 4.4, CUDA 6.0, OpenCL 1.2
|-
| Tesla K40 || 28 || PCIe 3.0 x16 || 2880 FP32, 960 FP64 || 745 || 6000 || 12G || 288 || 384 || GDDR5 || 4291 || 1430 || 235 || 18.3 || 6.1 || D3D 11.0, OpenGL 4.4, CUDA 6.0, OpenCL 1.2
|-
| Tesla M40 || 28 || PCIe 3.0 x16 || 3072 FP32, ??? FP64 || 948-1114 || 6000 || 12G || 288 || 384 || GDDR5 || 6884 || 213.9 || 250 || 27.5 || 0.9 || D3D 12.0, OpenGL 4.5, CUDA 6.5, OpenCL 1.2
|-
| Tesla P100 || 16 || PCIe 3.0 x16 || 3584 FP32, ??? FP64 || 1328-1480 || ??? || 16G || 720 || 4096 || HBM2 || 10600 || 5300 || 300 || 35.3 || 17.7 || D3D 12.1, OpenGL 4.5, CUDA 7.0, OpenCL 1.2
|}

Desktop

{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| GeForce GTX 680 || 28 || PCIe 3.0 x16 || 1536:128:32 || 1006-1110 || 6008 || 2G || 192 || 256 || GDDR5 || 3090 || 128.8 || 195 || 15.8 || 0.7 || D3D 11.0, OpenGL 4.5, CUDA 6.5, OpenCL 1.2
|-
| GeForce GTX 780 || 28 || PCIe 3.0 x16 || 2304:192:48 || 863-1002 || 6008 || 3G || 288 || 384 || GDDR5 || 3977 || 165.7 || 250 || 15.9 || 0.7 || D3D 11.0, OpenGL 4.5, CUDA 6.5, OpenCL 1.2
|-
| GeForce GTX Titan || 28 || PCIe 3.0 x16 || 2688:224:48 || 836-993 || 6008 || 6G || 288 || 384 || GDDR5 || 4500 || 1300-1500 || 250 || 18.0 || 6.0 || D3D 11.0, OpenGL 4.5, CUDA 6.5, OpenCL 1.2
|-
| GeForce GTX Titan Black || 28 || PCIe 3.0 x16 || 2880:240:48 || 889-980 || 7000 || 6G || 336 || 384 || GDDR5 || 5120 || 1707 || 250 || 20.5 || 6.8 || D3D 11.0, OpenGL 4.5, CUDA 6.5, OpenCL 1.2
|-
| GeForce GTX 780 Ti || 28 || PCIe 3.0 x16 || 2880:240:48 || 876-928 || 7000 || 3G || 336 || 384 || GDDR5 || 5046 || 210 || 250 || 20.2 || 0.8 || D3D 11.0, OpenGL 4.5, CUDA 6.5, OpenCL 1.2
|-
| GeForce GTX 980 || 28 || PCIe 3.0 x16 || 2048:128:64 || 1126-1216 || 7000 || 4G || 224 || 256 || GDDR5 || 4612 || 144 || 165 || 28.0 || 0.9 || D3D 12.0, OpenGL 4.5, CUDA 6.5, OpenCL 1.2
|-
| GeForce GTX Titan X || 28 || PCIe 3.0 x16 || 3072:192:96 || 1000-1089 || 7010 || 12G || 336 || 384 || GDDR5 || 6144 || 192 || 250 || 24.6 || 0.8 || D3D 12.0, OpenGL 4.5, CUDA 6.5, OpenCL 1.2
|-
| GeForce GTX 1080 || 16 || PCIe 3.0 x16 || 2560:160:80 || 1607-1733 || 10240 || 8G || 320 || 256 || GDDR5X || 8228 || ??? || 180 || 45.7 || ??? || D3D 12.1, OpenGL 4.5, CUDA 7.0, OpenCL 1.2
|}

Mobile

{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| Tegra 3 || 40 || Integrated || 12 || 520 || 1600 || N/A || 6.4 || 32 || LPDDR3 || 12.5 || 0 || 2 (SoC) || 6.3 || 0 || D3D 9.1, OpenGL ES 2.0
|-
| Tegra 4 || 28 || Integrated || 72 || 672 || 1866 || N/A || 14.9 || 64 || LPDDR3 || 96.8 || 0 || 4 (SoC) || 24.2 || 0 || D3D 9.1, OpenGL ES 2.0
|-
| Tegra K1 || 28 || Integrated || 192:8:4 || 900 || 2133 || N/A || 17.1 || 64 || LPDDR3 || 365.0 || 15.2 || 5 (SoC) || 73.0 || 3.0 || D3D 11.2, OpenGL 4.4, OpenGL ES 3.1, CUDA 6.0, OpenCL 1.2
|-
| Tegra X1 || 20 || Integrated || 256:16:16 || 1000 || 3200 || N/A || 25.6 || 64 || LPDDR4 || 512.0 || ??? || 10 (SoC) || 51.2 || ??? || D3D 12.0, OpenGL 4.5, OpenGL ES 3.1, CUDA 6.0, OpenCL 1.2
|}

Professional

{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| Quadro K6000 || 28 || PCIe 3.0 x16 || 2880:240:48 || 901.5 || 6008 || 12G || 288 || 384 || GDDR5 || 5196 || 1732 || 225 || 23.1 || 7.7 || D3D 11.0, OpenGL 4.4, CUDA 6.0, OpenCL 1.2
|-
| Quadro M6000 || 28 || PCIe 3.0 x16 || 3072:192:96 || 988 || 6612 || 12G || 317 || 384 || GDDR5 || 6070 || ??? || 250 || 24.3 || ??? || D3D 12.0, OpenGL 4.5, CUDA 6.5, OpenCL 1.2
|}

Qualcomm

Mobile

{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| Adreno 330 || 28 (Snapdragon 800) || Integrated || 128 || 450 || 1600 || N/A || 12.8 || 64 || LPDDR3 || 129.6 || 0 || 4 (SoC) || 32.4 || 0 || D3D 9.3, OpenGL ES 3.0, OpenCL 1.2
|-
| Adreno 420 || 28 (Snapdragon 805) || Integrated || 128 || 600 || 1600 || N/A || 25.6 || 128 || LPDDR3 || 172.8 || ??? || ??? || ??? || ??? || D3D 11.2, OpenGL ES 3.1, OpenCL 1.2
|-
| Adreno 430 || 20 (Snapdragon 810) || Integrated || 192 || 600 || 3200 || N/A || 25.6 || 64 || LPDDR4 || 388.0 || ??? || ??? || ??? || ??? || D3D 11.2, OpenGL ES 3.1, OpenCL 1.2
|-
| Adreno 530 || 14 (Snapdragon 820) || Integrated || 256 || 650-736 || 3732 || N/A || 29.8 || 64 || LPDDR4 || ??? || ??? || ??? || ??? || ??? || D3D 12.1, OpenGL ES 3.2, OpenCL 2.0
|}

Vivante

Mobile

{|
! Model !! Fab (nm) !! Bus interface !! Core config !! Core clock (MHz) !! Memory clock (MHz) !! Memory size !! Memory bandwidth (GB/s) !! Memory bus (bit) !! Memory type !! GFLOPS (float) !! GFLOPS (double) !! TDP (watts) !! GFLOPS/W (float) !! GFLOPS/W (double) !! API
|-
| GC4000 || 40 (K3V2) || Integrated || 32 || 480 || 1000 || N/A || 8.0 || 64 || LPDDR2 || 30.7 || 0 || 4 (SoC) || 7.7 || 0 || D3D 9.3, OpenGL ES 3.0, OpenGL 3.0, OpenCL 1.2
|-
| GC5000 || ??? || Integrated || 32 || ??? || ??? || N/A || ??? || ??? || ??? || ??? || ??? || ??? || ??? || ??? || D3D 11.0, OpenGL ES 3.0, OpenGL 3.0, OpenCL 1.2
|-
| GC6000 || ??? || Integrated || 64 || ??? || ??? || N/A || ??? || ??? || ??? || ??? || ??? || ??? || ??? || ??? || D3D 11.0, OpenGL ES 3.0, OpenGL 3.0, OpenCL 1.2
|-
| GC7000 || ??? || Integrated || 128 || ??? || ??? || N/A || ??? || ??? || ??? || ??? || ??? || ??? || ??? || ??? || D3D 11.0, OpenGL ES 3.1, OpenGL 3.0, OpenCL 1.2
|}
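
The derived columns in the tables above (GFLOPS, memory bandwidth, GFLOPS/W) can be cross-checked against the raw columns. The page does not say how its figures were calculated, so the formulas in this Python sketch are assumptions: peak GFLOPS is taken as shader count x 2 x core clock (one fused multiply-add per shader per clock), bandwidth as bus width in bytes x effective memory clock, and efficiency as GFLOPS divided by TDP. The function names are hypothetical and only serve the illustration.

```python
def peak_gflops(shaders: int, core_mhz: float, flops_per_clock: int = 2) -> float:
    """Peak throughput in GFLOPS.

    Assumption: each shader retires one fused multiply-add
    (counted as 2 floating-point operations) per clock.
    """
    return shaders * flops_per_clock * core_mhz / 1000.0


def bandwidth_gbs(bus_bits: int, effective_mhz: float) -> float:
    """Peak memory bandwidth in GB/s from bus width and effective memory clock."""
    return bus_bits / 8 * effective_mhz / 1000.0


def gflops_per_watt(gflops: float, tdp_watts: float) -> float:
    """Efficiency figure: peak GFLOPS divided by TDP in watts."""
    return gflops / tdp_watts


if __name__ == "__main__":
    # Raw values taken from the Tesla K40 row: 2880 FP32 shaders, 745 MHz core,
    # 384-bit bus, 6000 MHz effective memory clock, 235 W TDP.
    print(peak_gflops(2880, 745))
    print(bandwidth_gbs(384, 6000))
    print(round(gflops_per_watt(4291, 235), 1))
```

Under these assumptions the Tesla K40 row is reproduced (about 4291 GFLOPS, 288 GB/s, 18.3 GFLOPS/W). Rows that list a clock range appear to use the lower clock for the float figure, for example 2560 shaders at 1607 MHz for the GeForce GTX 1080.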

See Also

GPU GFLOPS

GPUs Comparison: ARM Mali vs Vivante GCxxx vs PowerVR SGX vs Nvidia Geforce ULP