Difference between revisions of "GPUs comparison"

From KlayGE
(Separate into small tables)

Revision as of 09:55, 6 June 2014

These data are collected from [http://en.wikipedia.org Wikipedia] and vendor websites.

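== AMD ==

=== Desktop ===

{| class="wikitable sortable"
|-
! !! !! !! !! colspan="2" | Clock rate (MHz) !! colspan="4" | Memory !! colspan="2" | GFLOPS !! !! colspan="2" | GFLOPS/W !!
|-
! Model !! Fab (nm) !! Bus interface !! Core config !! Core !! Memory !! Size !! Bandwidth (GB/s) !! Bus (bit) !! Type !! Float !! Double !! TDP (watts) !! Float !! Double !! API
|-
| Fusion APU 8670D || 32 || Integrated || 384:24:8 || 844 || 2133 || N/A || 34.1 || 128 || DDR3 || 648.2 || 0 || 100 (SoC) || 6.5 || 0 || D3D 11.0, OpenGL 4.3, OpenCL 1.2
|-
| Fusion APU R7 || 28 || Integrated || 512:32:8 || 720 || 2133 || N/A || 34.1 || 128 || DDR3 || 737.3 || 46.1 || 95 (SoC) || 7.8 || 0.5 || D3D 11.2, OpenGL 4.3, OpenCL 1.2, Mantle
|-
| Radeon HD 8970 || 28 || PCIe 3.0 x16 || 2048:128:32 || 1000-1050 || 6000 || 3G || 288 || 384 || GDDR5 || 4300 || 1075 || 250 || 17.2 || 4.3 || D3D 11.1, OpenGL 4.3, OpenCL 1.2
|-
| Radeon R9 290X || 28 || PCIe 3.0 x16 || 2816:176:64 || 800-1000 || 5000 || 4G || 320 || 512 || GDDR5 || 5632 || 704 || 290 || 19.4 || 2.4 || D3D 11.2, OpenGL 4.3, OpenCL 1.2, Mantle
|}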
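
The Bandwidth and GFLOPS/W columns appear to be derived from other columns in the same row. The following is a minimal sketch of that arithmetic, assuming the Memory clock column is the effective data rate in MHz and that GFLOPS/W is simply peak GFLOPS divided by TDP. The function names are illustrative and not part of any KlayGE API. The inputs are taken from the Radeon R9 290X row.

```python
def bandwidth_gb_s(effective_memory_clock_mhz, bus_width_bits):
    """Peak memory bandwidth in GB/s.

    Assumes the clock is the effective transfer rate in MHz and that
    each transfer moves bus_width_bits / 8 bytes.
    """
    bytes_per_transfer = bus_width_bits / 8
    return effective_memory_clock_mhz * 1e6 * bytes_per_transfer / 1e9


def gflops_per_watt(gflops, tdp_watts):
    """Power efficiency, assuming peak GFLOPS divided by TDP in watts."""
    return gflops / tdp_watts


# Radeon R9 290X: 5000 MHz effective memory clock, 512-bit bus,
# 5632 float GFLOPS, 704 double GFLOPS, 290 W TDP.
print(round(bandwidth_gb_s(5000, 512), 1))   # 320.0
print(round(gflops_per_watt(5632, 290), 1))  # 19.4
print(round(gflops_per_watt(704, 290), 1))   # 2.4
```

These values match the Bandwidth and GFLOPS/W entries listed for that card, which suggests the columns were computed this way, although the page does not state it.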

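=== Professional ===

{| class="wikitable sortable"
|-
! !! !! !! !! colspan="2" | Clock rate (MHz) !! colspan="4" | Memory !! colspan="2" | GFLOPS !! !! colspan="2" | GFLOPS/W !!
|-
! Model !! Fab (nm) !! Bus interface !! Core config !! Core !! Memory !! Size !! Bandwidth (GB/s) !! Bus (bit) !! Type !! Float !! Double !! TDP (watts) !! Float !! Double !! API
|-
| Radeon W9000 || 28 || PCIe 3.0 x16 || 2048:128:32 || 975 || 5500 || 6G || 264 || 384 || GDDR5 || 3993.6 || 998.4 || 274 || 14.6 || 3.6 || D3D 11.1, OpenGL 4.2, OpenCL 1.2
|}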

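== ARM ==

=== Mobile ===

{| class="wikitable sortable"
|-
! !! !! !! !! colspan="2" | Clock rate (MHz) !! colspan="4" | Memory !! colspan="2" | GFLOPS !! !! colspan="2" | GFLOPS/W !!
|-
! Model !! Fab (nm) !! Bus interface !! Core config !! Core !! Memory !! Size !! Bandwidth (GB/s) !! Bus (bit) !! Type !! Float !! Double !! TDP (watts) !! Float !! Double !! API
|-
| Mali-T604 MP4 || 32 (Exynos 5 Dual) || Integrated || 64 || 533 || 1600 || N/A || 12.8 || 64 || LPDDR3 || 68 || 0 || 4 (SoC) || 17.0 || 0 || D3D 9.1, OpenGL ES 3.0, OpenCL 1.1
|-
| Mali-T760 MP16 || ??? || Integrated || ??? || 600 || ??? || N/A || ??? || ??? || LPDDR3 || 326.4 || 0 || ??? || ??? || 0 || D3D 11.1, OpenGL ES 3.0, OpenCL 1.1
|}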

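== Imagination ==

=== Mobile ===

{| class="wikitable sortable"
|-
! !! !! !! !! colspan="2" | Clock rate (MHz) !! colspan="4" | Memory !! colspan="2" | GFLOPS !! !! colspan="2" | GFLOPS/W !!
|-
! Model !! Fab (nm) !! Bus interface !! Core config !! Core !! Memory !! Size !! Bandwidth (GB/s) !! Bus (bit) !! Type !! Float !! Double !! TDP (watts) !! Float !! Double !! API
|-
| PowerVR SGX554 MP4 || 32 (A6X) || Integrated || 128 || 300 || 1066 || N/A || 17.1 || 128 || LPDDR2 || 76.8 || 0 || 4 (SoC) || 19.2 || 0 || D3D 9.3, OpenGL 2.1, OpenGL ES 2.0, OpenCL 1.1
|-
| G6430 || 28 (A7) || Integrated || 256 || 450 || 1333 || N/A || 10.7 || 64 || LPDDR3 || 115.2 || 0 || 4 (SoC) || 28.8 || 0 || D3D 10.0, OpenGL 3.2, OpenGL ES 3.0
|}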

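== Intel ==

=== Computing ===

{| class="wikitable sortable"
|-
! !! !! !! !! colspan="2" | Clock rate (MHz) !! colspan="4" | Memory !! colspan="2" | GFLOPS !! !! colspan="2" | GFLOPS/W !!
|-
! Model !! Fab (nm) !! Bus interface !! Core config !! Core !! Memory !! Size !! Bandwidth (GB/s) !! Bus (bit) !! Type !! Float !! Double !! TDP (watts) !! Float !! Double !! API
|-
| Xeon Phi 3100P || 22 || PCIe 2.0 x16 || 228 x86 || 1100 || 3750 || 6G || 240 || 512 || GDDR5 || 2000 || 1000 || 300 || 6.7 || 3.3 || OpenMP, OpenCL, MKL
|-
| Xeon Phi 5110P || 22 || PCIe 2.0 x16 || 240 x86 || 1053 || 5000 || 8G || 320 || 512 || GDDR5 || 2022 || 1011 || 225 || 9.0 || 4.5 || OpenMP, OpenCL, MKL
|-
| Xeon Phi 7120P || 22 || PCIe 2.0 x16 || 244 x86 || 1238-1333 || 5500 || 16G || 352 || 512 || GDDR5 || 2416 || 1208 || 300 || 8.1 || 4.0 || OpenMP, OpenCL, MKL
|}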

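=== Desktop ===

{| class="wikitable sortable"
|-
! !! !! !! !! colspan="2" | Clock rate (MHz) !! colspan="4" | Memory !! colspan="2" | GFLOPS !! !! colspan="2" | GFLOPS/W !!
|-
! Model !! Fab (nm) !! Bus interface !! Core config !! Core !! Memory !! Size !! Bandwidth (GB/s) !! Bus (bit) !! Type !! Float !! Double !! TDP (watts) !! Float !! Double !! API
|-
| Iris Pro Graphics 5200 || 22 || Integrated || 160:8:4 || 200-1300 || 1600 || N/A || 25.6 || 128 || DDR3 || 832 || 208 || 65 (SoC) || 12.8 || 3.2 || D3D 11.1, OpenGL 4.2, OpenCL 1.2
|}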

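== NVIDIA ==

=== Computing ===

{| class="wikitable sortable"
|-
! !! !! !! !! colspan="2" | Clock rate (MHz) !! colspan="4" | Memory !! colspan="2" | GFLOPS !! !! colspan="2" | GFLOPS/W !!
|-
! Model !! Fab (nm) !! Bus interface !! Core config !! Core !! Memory !! Size !! Bandwidth (GB/s) !! Bus (bit) !! Type !! Float !! Double !! TDP (watts) !! Float !! Double !! API
|-
| Tesla K20X || 28 || PCIe 3.0 x16 || 2688 FP32, 896 FP64 || 732 || 5200 || 6G || 250 || 384 || GDDR5 || 3935 || 1312 || 235 || 16.7 || 5.6 || D3D 11.0, OpenGL 4.4, CUDA 6.0, OpenCL 1.2
|}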

=== Desktop ===

{| class="wikitable sortable"
|-
! !! !! !! !! colspan="2" | Clock rate (MHz) !! colspan="4" | Memory !! colspan="2" | GFLOPS !! !! colspan="2" | GFLOPS/W !!
|-
! Model !! Fab (nm) !! Bus interface !! Core config !! Core !! Memory !! Size !! Bandwidth (GB/s) !! Bus (bit) !! Type !! Float !! Double !! TDP (watts) !! Float !! Double !! API
|-
| GeForce GTX 680 || 28 || PCIe 3.0 x16 || 1536:128:32 || 1006-1110 || 6008 || 2G || 192 || 256 || GDDR5 || 3090 || 128.8 || 195 || 15.8 || 0.7 || D3D 11.0, OpenGL 4.4, CUDA 6.0, OpenCL 1.2
|-
| GeForce GTX 780 || 28 || PCIe 3.0 x16 || 2304:192:48 || 863-1002 || 6008 || 3G || 288 || 384 || GDDR5 || 3977 || 165.7 || 250 || 15.9 || 0.7 || D3D 11.0, OpenGL 4.4, CUDA 6.0, OpenCL 1.2
|-
| GeForce GTX Titan || 28 || PCIe 3.0 x16 || 2688:224:48 || 836-993 || 6008 || 6G || 288 || 384 || GDDR5 || 4500 || 1300-1500 || 250 || 18.0 || 6.0 || D3D 11.0, OpenGL 4.4, CUDA 6.0, OpenCL 1.2
|-
| GeForce GTX Titan Black || 28 || PCIe 3.0 x16 || 2880:240:48 || 889-980 || 7000 || 6G || 336 || 384 || GDDR5 || 5120 || 1707 || 250 || 20.5 || 6.8 || D3D 11.0, OpenGL 4.4, CUDA 6.0, OpenCL 1.2
|-
| GeForce GTX 780 Ti || 28 || PCIe 3.0 x16 || 2880:240:48 || 876-928 || 7000 || 3G || 336 || 384 || GDDR5 || 5046 || 210 || 250 || 20.2 || 0.8 || D3D 11.0, OpenGL 4.4, CUDA 6.0, OpenCL 1.2
|}
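
For several of the desktop rows above, the Float GFLOPS figure is consistent with a simple peak-throughput estimate. The sketch below assumes that the first number in Core config is the shader count, that each shader retires one fused multiply-add (2 floating-point operations) per clock, and that the base core clock is used. These are assumptions rather than anything the page states, and the function name is illustrative. Not every row fits this model: the GTX Titan entry of 4500, for example, is a rounded figure.

```python
def peak_float_gflops(shader_count, core_clock_mhz, flops_per_shader_per_clock=2):
    """Estimated peak single-precision GFLOPS.

    Assumes every shader issues one fused multiply-add (2 FLOPs)
    per cycle at the given core clock.
    """
    return shader_count * flops_per_shader_per_clock * core_clock_mhz / 1000.0


# (shader count, base core clock in MHz) taken from the table rows.
rows = {
    "GeForce GTX 680": (1536, 1006),
    "GeForce GTX 780": (2304, 863),
    "GeForce GTX 780 Ti": (2880, 876),
}

for model, (shaders, clock_mhz) in rows.items():
    print(model, round(peak_float_gflops(shaders, clock_mhz)))
# GeForce GTX 680 3090
# GeForce GTX 780 3977
# GeForce GTX 780 Ti 5046
```

Real workloads rarely sustain these peaks, so the figures are best read as upper bounds for comparing architectures.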

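=== Mobile ===

{| class="wikitable sortable"
|-
! !! !! !! !! colspan="2" | Clock rate (MHz) !! colspan="4" | Memory !! colspan="2" | GFLOPS !! !! colspan="2" | GFLOPS/W !!
|-
! Model !! Fab (nm) !! Bus interface !! Core config !! Core !! Memory !! Size !! Bandwidth (GB/s) !! Bus (bit) !! Type !! Float !! Double !! TDP (watts) !! Float !! Double !! API
|-
| Tegra 3 || 40 || Integrated || 12 || 520 || 1600 || N/A || 6.4 || 32 || LPDDR3 || 12.5 || 0 || 2 (SoC) || 6.3 || 0 || D3D 9.1, OpenGL ES 2.0
|-
| Tegra 4 || 28 || Integrated || 72 || 672 || 1866 || N/A || 14.9 || 64 || LPDDR3 || 96.8 || 0 || 4 (SoC) || 24.2 || 0 || D3D 9.1, OpenGL ES 2.0
|-
| Tegra K1 || 28 || Integrated || 192:8:4 || 900 || 2133 || N/A || 17.1 || 64 || LPDDR3 || 365.0 || 15.2 || 5 (SoC) || 73.0 || 3.0 || D3D 11.2, OpenGL 4.4, OpenGL ES 3.1, CUDA 6.0, OpenCL 1.2
|}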

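=== Professional ===

{| class="wikitable sortable"
|-
! !! !! !! !! colspan="2" | Clock rate (MHz) !! colspan="4" | Memory !! colspan="2" | GFLOPS !! !! colspan="2" | GFLOPS/W !!
|-
! Model !! Fab (nm) !! Bus interface !! Core config !! Core !! Memory !! Size !! Bandwidth (GB/s) !! Bus (bit) !! Type !! Float !! Double !! TDP (watts) !! Float !! Double !! API
|-
| Quadro K6000 || 28 || PCIe 3.0 x16 || 2880:240:48 || 901.5 || 6008 || 12G || 288 || 384 || GDDR5 || 5196 || 1732 || 225 || 23.1 || 7.7 || D3D 11.0, OpenGL 4.4, CUDA 6.0, OpenCL 1.2
|}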

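== Qualcomm ==

=== Mobile ===

{| class="wikitable sortable"
|-
! !! !! !! !! colspan="2" | Clock rate (MHz) !! colspan="4" | Memory !! colspan="2" | GFLOPS !! !! colspan="2" | GFLOPS/W !!
|-
! Model !! Fab (nm) !! Bus interface !! Core config !! Core !! Memory !! Size !! Bandwidth (GB/s) !! Bus (bit) !! Type !! Float !! Double !! TDP (watts) !! Float !! Double !! API
|-
| Adreno 330 || 28 (Snapdragon 800) || Integrated || 128 || 450 || 1600 || N/A || 12.8 || 64 || LPDDR3 || 129.6 || 0 || 4 (SoC) || 32.4 || 0 || D3D 9.3, OpenGL ES 3.0, OpenCL 1.2
|-
| Adreno 420 || 28 (Snapdragon 805) || Integrated || ??? || 500 || 1600 || N/A || 25.6 || 128 || LPDDR3 || ??? || ??? || ??? || ??? || ??? || D3D 11.2, OpenGL ES 3.1, OpenCL 1.2
|-
| Adreno 430 || 20 (Snapdragon 810) || Integrated || ??? || ??? || 3200 || N/A || 25.6 || 64 || LPDDR4 || ??? || ??? || ??? || ??? || ??? || D3D 11.2, OpenGL ES 3.1, OpenCL 1.2
|}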

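== Vivante ==

=== Mobile ===

{| class="wikitable sortable"
|-
! !! !! !! !! colspan="2" | Clock rate (MHz) !! colspan="4" | Memory !! colspan="2" | GFLOPS !! !! colspan="2" | GFLOPS/W !!
|-
! Model !! Fab (nm) !! Bus interface !! Core config !! Core !! Memory !! Size !! Bandwidth (GB/s) !! Bus (bit) !! Type !! Float !! Double !! TDP (watts) !! Float !! Double !! API
|-
| GC4000 || 40 (K3V2) || Integrated || 32 || 480 || 1000 || N/A || 8.0 || 64 || LPDDR2 || 30.7 || 0 || 4 (SoC) || 7.7 || 0 || D3D 9.3, OpenGL ES 3.0, OpenGL 3.0, OpenCL 1.2
|-
| GC5000 || ??? || Integrated || 32 || ??? || ??? || N/A || ??? || ??? || ??? || ??? || ??? || ??? || ??? || ??? || D3D 11.0, OpenGL ES 3.0, OpenGL 3.0, OpenCL 1.2
|-
| GC6000 || ??? || Integrated || 64 || ??? || ??? || N/A || ??? || ??? || ??? || ??? || ??? || ??? || ??? || ??? || D3D 11.0, OpenGL ES 3.0, OpenGL 3.0, OpenCL 1.2
|-
| GC7000 || ??? || Integrated || 128 || ??? || ??? || N/A || ??? || ??? || ??? || ??? || ??? || ??? || ??? || ??? || D3D 11.0, OpenGL ES 3.1, OpenGL 3.0, OpenCL 1.2
|}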

See Also

GPU GFLOPS

GPUs Comparison: ARM Mali vs Vivante GCxxx vs PowerVR SGX vs Nvidia Geforce ULP