Hardware Museum

Over 20 years of PC history


The Ultimate GPU Benchmark (2006 - 2010)



Two years later, part 2 is ready. It adds new games and, of course, all the important video cards of the selected era, this time ending with the Radeon HD 5000 series and the GeForce 400 series.

Introduction

Welcome to the second round of the ultimate GPU benchmarks. This time I focused on the DX10 era and the first generation of DX11 GPUs. To make full use of the test system, many multi-GPU setups are included, even 3-way SLI and CrossFire. To add something special to this article, I decided to benchmark the Nvidia Tesla C2075 card. It doesn't exactly fit the set limit of first-generation Fermi, but it is too interesting to leave out. The default "Tesla mode" is very picky about drivers, so with some effort I managed to mod the BIOS to change it to a Quadro 7000. Of course, the GPU is still partially locked; nothing changes there.

The test system got a slight upgrade in the memory department: the old DDR3-1600 was swapped for DDR3-2133, still 4 × 16 GB. The performance impact is almost zero, but I mention it for the sake of completeness. :)





Test System

Test System - Hardware

Test System - OS and Drivers

Test System - Games




Nvidia Quadro FX 5600


Tested Video Cards

Card | Radeon HD 2900 XT | FireGL V8650 OC | Radeon HD 3850 | Radeon HD 3870 | 2 × Radeon HD 3870 | 3 × Radeon HD 3870 | Radeon HD 4670 | Radeon HD 4770
GPU | R600 | R600 | RV670 | RV670 | 2 × RV670 | 3 × RV670 | RV730 | RV740
Architecture | Terascale | Terascale | Terascale | Terascale | Terascale | Terascale | Terascale | Terascale
Technology | 80 nm | 80 nm | 55 nm | 55 nm | 55 nm | 55 nm | 55 nm | 40 nm
Die Size | 420 mm² | 420 mm² | 192 mm² | 192 mm² | 2 × 192 mm² | 3 × 192 mm² | 145 mm² | 137 mm²
Transistor Count | 720 mil. | 720 mil. | 666 mil. | 666 mil. | 2 × 666 mil. | 3 × 666 mil. | 514 mil. | 826 mil.
Transistor Density | 1.71 mil. / mm² | 1.71 mil. / mm² | 3.47 mil. / mm² | 3.47 mil. / mm² | 3.47 mil. / mm² | 3.47 mil. / mm² | 3.54 mil. / mm² | 6.03 mil. / mm²
GPU Clock | 743 MHz | 850 MHz | 670 MHz | 775 MHz | 775 MHz | 775 MHz | 750 MHz | 750 MHz
Shader Clock | 743 MHz | 850 MHz | 670 MHz | 775 MHz | 775 MHz | 775 MHz | 750 MHz | 750 MHz
ROPs | 16 | 16 | 16 | 16 | 2 × 16 | 3 × 16 | 8 | 16
TMUs | 16 | 16 | 16 | 16 | 2 × 16 | 3 × 16 | 32 | 32
Compute Units | 4 | 4 | 4 | 4 | 2 × 4 | 3 × 4 | 4 | 8
Shaders | 320 Unified | 320 Unified | 320 Unified | 320 Unified | 2 × 320 Unified | 3 × 320 Unified | 320 Unified | 640 Unified
L1 Cache | 16 × 32 kB (Tex.) | 16 × 32 kB (Tex.) | 16 × 32 kB (Tex.) | 16 × 32 kB (Tex.) | 2 × 16 × 32 kB (Tex.) | 3 × 16 × 32 kB (Tex.) | 4 × 16 kB | 8 × 16 kB
L2 Cache | 256 kB (Tex.) | 256 kB (Tex.) | 256 kB (Tex.) | 256 kB (Tex.) | 2 × 256 kB (Tex.) | 3 × 256 kB (Tex.) | 128 kB | 128 kB
Memory | 512 MB GDDR3 | 2048 MB GDDR4 | 512 MB GDDR3 | 512 MB GDDR4 | 512 MB GDDR4 | 512 MB GDDR4 | 512 MB GDDR3 | 512 MB GDDR5
Memory Clock | 1660 MHz | 2000 MHz | 1660 MHz | 2250 MHz | 2250 MHz | 2250 MHz | 2000 MHz | 3600 MHz
Bus Width | 512 bit | 512 bit | 256 bit | 256 bit | 256 bit | 256 bit | 128 bit | 128 bit
Memory Bandwidth | 106.2 GB/s | 128 GB/s | 53.1 GB/s | 72 GB/s | 2 × 72 GB/s | 3 × 72 GB/s | 32 GB/s | 57.6 GB/s
Fillrate (Pixel) | 11840 MP/s | 13600 MP/s | 10720 MP/s | 12400 MP/s | 2 × 12400 MP/s | 3 × 12400 MP/s | 6000 MP/s | 12000 MP/s
Fillrate (Texel) | 11840 MT/s | 13600 MT/s | 10720 MT/s | 12400 MT/s | 2 × 12400 MT/s | 3 × 12400 MT/s | 24000 MT/s | 24000 MT/s
Compute Power (FP32) | 476 GFLOPS | 544 GFLOPS | 428 GFLOPS | 496 GFLOPS | 2 × 496 GFLOPS | 3 × 496 GFLOPS | 480 GFLOPS | 960 GFLOPS
Compute Power (FP64) | - | - | 85.6 GFLOPS | 99 GFLOPS | 2 × 99 GFLOPS | 3 × 99 GFLOPS | - | 192 GFLOPS
Bus Type | PCI-E 1.0 | PCI-E 1.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0
TDP | 215 W | ~300 W | 75 W | 106 W | 2 × 106 W | 3 × 106 W | 59 W | 80 W
DirectX | 10 | 10 | 10.1 | 10.1 | 10.1 | 10.1 | 10.1 | 10.1
OpenGL | 3.3 | 3.3 | 3.3 | 3.3 | 3.3 | 3.3 | 3.3 | 3.3
Launch Year | 2007 | 2007 | 2007 | 2007 | 2007 | 2007 | 2008 | 2009


Card | Radeon HD 4830 | Radeon HD 4850 | Radeon HD 4850 X2 | Radeon HD 4870 (512 MB) @ 625 MHz | Radeon HD 4870 (512 MB) | Radeon HD 4870 (1 GB) | Radeon HD 4890 OC | 2 × Radeon HD 4890 OC
GPU | RV770 | RV770 | 2 × RV770 | RV770 | RV770 | RV770 | RV790 | 2 × RV790
Architecture | Terascale | Terascale | Terascale | Terascale | Terascale | Terascale | Terascale | Terascale
Technology | 55 nm | 55 nm | 55 nm | 55 nm | 55 nm | 55 nm | 55 nm | 55 nm
Die Size | 256 mm² | 256 mm² | 2 × 256 mm² | 256 mm² | 256 mm² | 256 mm² | 282 mm² | 2 × 282 mm²
Transistor Count | 956 mil. | 956 mil. | 2 × 956 mil. | 956 mil. | 956 mil. | 956 mil. | 959 mil. | 2 × 959 mil.
Transistor Density | 3.73 mil. / mm² | 3.73 mil. / mm² | 3.73 mil. / mm² | 3.73 mil. / mm² | 3.73 mil. / mm² | 3.73 mil. / mm² | 3.4 mil. / mm² | 3.4 mil. / mm²
GPU Clock | 575 MHz | 625 MHz | 625 MHz | 625 MHz | 750 MHz | 750 MHz | 900 MHz | 900 MHz
Shader Clock | 575 MHz | 625 MHz | 625 MHz | 625 MHz | 750 MHz | 750 MHz | 900 MHz | 900 MHz
ROPs | 16 | 16 | 2 × 16 | 16 | 16 | 16 | 16 | 2 × 16
TMUs | 32 | 40 | 2 × 40 | 40 | 40 | 40 | 40 | 2 × 40
Compute Units | 8 | 10 | 2 × 10 | 10 | 10 | 10 | 10 | 2 × 10
Shaders | 640 Unified | 800 Unified | 2 × 800 Unified | 800 Unified | 800 Unified | 800 Unified | 800 Unified | 2 × 800 Unified
L1 Cache | 8 × 16 kB | 10 × 16 kB | 2 × 10 × 16 kB | 10 × 16 kB | 10 × 16 kB | 10 × 16 kB | 10 × 16 kB | 2 × 10 × 16 kB
L2 Cache | 256 kB | 256 kB | 256 kB | 256 kB | 256 kB | 256 kB | 256 kB | 256 kB
Memory | 512 MB GDDR3 | 512 MB GDDR3 | 1024 MB GDDR3 | 512 MB GDDR5 | 512 MB GDDR5 | 1024 MB GDDR5 | 1024 MB GDDR5 | 1024 MB GDDR5
Memory Clock | 1800 MHz | 2000 MHz | 2000 MHz | 3600 MHz | 3600 MHz | 3600 MHz | 4000 MHz | 4000 MHz
Bus Width | 256 bit | 256 bit | 2 × 256 bit | 256 bit | 256 bit | 256 bit | 256 bit | 2 × 256 bit
Memory Bandwidth | 57.6 GB/s | 64 GB/s | 2 × 64 GB/s | 115.2 GB/s | 115.2 GB/s | 115.2 GB/s | 128 GB/s | 2 × 128 GB/s
Fillrate (Pixel) | 9200 MP/s | 10000 MP/s | 2 × 10000 MP/s | 10000 MP/s | 12000 MP/s | 12000 MP/s | 14400 MP/s | 2 × 14400 MP/s
Fillrate (Texel) | 18400 MT/s | 25000 MT/s | 2 × 25000 MT/s | 25000 MT/s | 30000 MT/s | 30000 MT/s | 36000 MT/s | 2 × 36000 MT/s
Compute Power (FP32) | 736 GFLOPS | 1000 GFLOPS | 2 × 1000 GFLOPS | 1000 GFLOPS | 1200 GFLOPS | 1200 GFLOPS | 1440 GFLOPS | 2 × 1440 GFLOPS
Compute Power (FP64) | 147 GFLOPS | 200 GFLOPS | 2 × 200 GFLOPS | 200 GFLOPS | 240 GFLOPS | 240 GFLOPS | 288 GFLOPS | 2 × 288 GFLOPS
Bus Type | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0
TDP | 95 W | 110 W | 250 W | 150 W | 150 W | 150 W | 190 W | 2 × 190 W
DirectX | 10.1 | 10.1 | 10.1 | 10.1 | 10.1 | 10.1 | 10.1 | 10.1
OpenGL | 3.3 | 3.3 | 3.3 | 3.3 | 3.3 | 3.3 | 3.3 | 3.3
Launch Year | 2008 | 2008 | 2008 | 2008 | 2008 | 2008 | 2009 | 2009


Card | Radeon HD 5550 | Radeon HD 5670 | Radeon HD 5770 | Radeon HD 5770 OC | Radeon HD 5850 | Radeon HD 5850 @ 850 MHz | Radeon HD 5870 | Radeon HD 5870 OC
GPU | Redwood | Redwood | Juniper | Juniper | Cypress | Cypress | Cypress | Cypress
Architecture | Terascale 2 | Terascale 2 | Terascale 2 | Terascale 2 | Terascale 2 | Terascale 2 | Terascale 2 | Terascale 2
Technology | 40 nm | 40 nm | 40 nm | 40 nm | 40 nm | 40 nm | 40 nm | 40 nm
Die Size | 104 mm² | 104 mm² | 170 mm² | 170 mm² | 334 mm² | 334 mm² | 334 mm² | 334 mm²
Transistor Count | 627 mil. | 627 mil. | 1040 mil. | 1040 mil. | 2154 mil. | 2154 mil. | 2154 mil. | 2154 mil.
Transistor Density | 6.03 mil. / mm² | 6.03 mil. / mm² | 6.12 mil. / mm² | 6.12 mil. / mm² | 6.45 mil. / mm² | 6.45 mil. / mm² | 6.45 mil. / mm² | 6.45 mil. / mm²
GPU Clock | 550 MHz | 775 MHz | 860 MHz | 1050 MHz | 725 MHz | 850 MHz | 850 MHz | 980 MHz
Shader Clock | 550 MHz | 775 MHz | 860 MHz | 1050 MHz | 725 MHz | 850 MHz | 850 MHz | 980 MHz
ROPs | 8 | 8 | 16 | 16 | 32 | 32 | 32 | 32
TMUs | 16 | 20 | 40 | 40 | 72 | 72 | 80 | 80
Compute Units | 4 | 5 | 10 | 10 | 18 | 18 | 20 | 20
Shaders | 320 Unified | 400 Unified | 800 Unified | 800 Unified | 1440 Unified | 1440 Unified | 1600 Unified | 1600 Unified
L1 Cache | 4 × 8 kB | 5 × 8 kB | 10 × 8 kB | 10 × 8 kB | 18 × 8 kB | 18 × 8 kB | 20 × 8 kB | 20 × 8 kB
L2 Cache | 256 kB | 256 kB | 256 kB | 256 kB | 512 kB | 512 kB | 512 kB | 512 kB
Memory | 512 MB GDDR5 | 1024 MB GDDR5 | 1024 MB GDDR5 | 1024 MB GDDR5 | 1024 MB GDDR5 | 1024 MB GDDR5 | 1024 MB GDDR5 | 1024 MB GDDR5
Memory Clock | 4000 MHz | 4000 MHz | 4800 MHz | 5800 MHz | 4000 MHz | 4800 MHz | 4800 MHz | 5200 MHz
Bus Width | 128 bit | 128 bit | 128 bit | 128 bit | 256 bit | 256 bit | 256 bit | 256 bit
Memory Bandwidth | 64 GB/s | 64 GB/s | 76.8 GB/s | 92.8 GB/s | 128 GB/s | 153.6 GB/s | 153.6 GB/s | 166.4 GB/s
Fillrate (Pixel) | 4400 MP/s | 6200 MP/s | 13760 MP/s | 16800 MP/s | 23200 MP/s | 27200 MP/s | 27200 MP/s | 31360 MP/s
Fillrate (Texel) | 8800 MT/s | 15500 MT/s | 34400 MT/s | 42000 MT/s | 52200 MT/s | 61200 MT/s | 68000 MT/s | 78400 MT/s
Compute Power (FP32) | 352 GFLOPS | 620 GFLOPS | 1376 GFLOPS | 1680 GFLOPS | 2088 GFLOPS | 2448 GFLOPS | 2720 GFLOPS | 3136 GFLOPS
Compute Power (FP64) | - | - | - | - | 418 GFLOPS | 490 GFLOPS | 544 GFLOPS | 627 GFLOPS
Bus Type | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0
TDP | 39 W | 64 W | 108 W | ~180 W | 151 W | ~180 W | 188 W | ~250 W
DirectX | 11 | 11 | 11 | 11 | 11 | 11 | 11 | 11
OpenGL | 4.5 | 4.5 | 4.5 | 4.5 | 4.5 | 4.5 | 4.5 | 4.5
Launch Year | 2010 | 2010 | 2009 | 2009 | 2009 | 2009 | 2009 | 2009


Card | Radeon HD 5970 | GeForce 8800 GTS (640 MB) | GeForce 8800 GTX | 2 × GeForce 8800 GTX | 3 × GeForce 8800 GTX | GeForce 8800 Ultra OC | Quadro FX 5600 OC | GeForce 8800 GT
GPU | 2 × Cypress | G80 | G80 | 2 × G80 | 3 × G80 | G80 | G80 | G92
Architecture | Terascale 2 | Tesla | Tesla | Tesla | Tesla | Tesla | Tesla | Tesla
Technology | 40 nm | 90 nm | 90 nm | 90 nm | 90 nm | 90 nm | 90 nm | 65 nm
Die Size | 2 × 334 mm² | 484 mm² | 484 mm² | 2 × 484 mm² | 3 × 484 mm² | 484 mm² | 484 mm² | 324 mm²
Transistor Count | 2 × 2154 mil. | 681 mil. | 681 mil. | 2 × 681 mil. | 3 × 681 mil. | 681 mil. | 681 mil. | 754 mil.
Transistor Density | 6.45 mil. / mm² | 1.41 mil. / mm² | 1.41 mil. / mm² | 1.41 mil. / mm² | 1.41 mil. / mm² | 1.41 mil. / mm² | 1.41 mil. / mm² | 2.33 mil. / mm²
GPU Clock | 725 MHz | 513 MHz | 575 MHz | 575 MHz | 575 MHz | 650 MHz | 600 MHz | 600 MHz
Shader Clock | 725 MHz | 1188 MHz | 1350 MHz | 1350 MHz | 1350 MHz | 1620 MHz | 1350 MHz | 1500 MHz
ROPs | 2 × 32 | 20 | 24 | 2 × 24 | 3 × 24 | 24 | 24 | 16
TMUs | 2 × 80 | 24 | 32 | 2 × 32 | 3 × 32 | 32 | 32 | 56
Compute Units | 2 × 20 | 12 | 16 | 2 × 16 | 3 × 16 | 16 | 16 | 14
Shaders | 2 × 1600 Unified | 96 Unified | 128 Unified | 2 × 128 Unified | 3 × 128 Unified | 128 Unified | 128 Unified | 112 Unified
L1 Cache | 2 × 20 × 8 kB | 6 × 16 kB | 8 × 16 kB | 2 × 8 × 16 kB | 3 × 8 × 16 kB | 8 × 16 kB | 8 × 16 kB | 7 × 16 kB
L2 Cache | 2 × 512 kB | 96 kB | 96 kB | 2 × 96 kB | 3 × 96 kB | 96 kB | 96 kB | 64 kB
Memory | 1024 MB GDDR5 | 640 MB GDDR3 | 768 MB GDDR3 | 768 MB GDDR3 | 768 MB GDDR3 | 768 MB GDDR3 | 1536 MB GDDR3 | 512 MB GDDR3
Memory Clock | 4000 MHz | 1600 MHz | 1800 MHz | 1800 MHz | 1800 MHz | 2200 MHz | 1900 MHz | 1800 MHz
Bus Width | 2 × 256 bit | 320 bit | 384 bit | 2 × 384 bit | 3 × 384 bit | 384 bit | 384 bit | 256 bit
Memory Bandwidth | 256 GB/s | 64 GB/s | 86.4 GB/s | 2 × 86.4 GB/s | 3 × 86.4 GB/s | 105.6 GB/s | 91.2 GB/s | 57.6 GB/s
Fillrate (Pixel) | 2 × 23200 MP/s | 10260 MP/s | 13800 MP/s | 2 × 13800 MP/s | 3 × 13800 MP/s | 15600 MP/s | 14400 MP/s | 9600 MP/s
Fillrate (Texel) | 2 × 58000 MT/s | 12312 MT/s | 18400 MT/s | 2 × 18400 MT/s | 3 × 18400 MT/s | 20800 MT/s | 19200 MT/s | 33600 MT/s
Compute Power (FP32) | 2 × 2320 GFLOPS | 228 GFLOPS | 346 GFLOPS | 2 × 346 GFLOPS | 3 × 346 GFLOPS | 415 GFLOPS | 346 GFLOPS | 336 GFLOPS
Compute Power (FP64) | 2 × 464 GFLOPS | - | - | - | - | - | - | -
Bus Type | PCI-E 2.0 | PCI-E 1.0 | PCI-E 1.0 | PCI-E 1.0 | PCI-E 1.0 | PCI-E 1.0 | PCI-E 1.0 | PCI-E 2.0
TDP | 294 W | 143 W | 155 W | 2 × 155 W | 3 × 155 W | 171 W | ~171 W | 125 W
DirectX | 11 | 10 | 10 | 10 | 10 | 10 | 10 | 10
OpenGL | 4.5 | 3.3 | 3.3 | 3.3 | 3.3 | 3.3 | 3.3 | 3.3
Launch Year | 2009 | 2006 | 2006 | 2006 | 2006 | 2007 | 2007 | 2007


Card | 2 × GeForce 8800 GT | GeForce 8800 GTS 512 OC | GeForce 9600 GT | GeForce GT 240 | GeForce GTS 250 | GeForce GTX 260 OEM | GeForce GTX 260 216 | 2 × GeForce GTX 260 216
GPU | 2 × G92 | G92 | G94 | GT215 | G92 | GT200 | GT200 | 2 × GT200
Architecture | Tesla | Tesla | Tesla | Tesla | Tesla | Tesla | Tesla | Tesla
Technology | 65 nm | 65 nm | 65 nm | 40 nm | 65 nm | 65 nm | 65 nm | 65 nm
Die Size | 2 × 324 mm² | 324 mm² | 240 mm² | 144 mm² | 324 mm² | 576 mm² | 576 mm² | 2 × 576 mm²
Transistor Count | 2 × 754 mil. | 754 mil. | 505 mil. | 727 mil. | 754 mil. | 1400 mil. | 1400 mil. | 2 × 1400 mil.
Transistor Density | 2.33 mil. / mm² | 2.33 mil. / mm² | 2.1 mil. / mm² | 5.05 mil. / mm² | 2.33 mil. / mm² | 2.43 mil. / mm² | 2.43 mil. / mm² | 2.43 mil. / mm²
GPU Clock | 600 MHz | 740 MHz | 650 MHz | 550 MHz | 740 MHz | 518 MHz | 576 MHz | 576 MHz
Shader Clock | 1500 MHz | 1800 MHz | 1625 MHz | 1340 MHz | 1836 MHz | 1080 MHz | 1242 MHz | 1242 MHz
ROPs | 2 × 16 | 16 | 16 | 8 | 16 | 28 | 28 | 2 × 28
TMUs | 2 × 56 | 64 | 32 | 32 | 64 | 64 | 72 | 2 × 72
Compute Units | 2 × 14 | 16 | 8 | 12 | 16 | 24 | 27 | 2 × 27
Shaders | 2 × 112 Unified | 128 Unified | 64 Unified | 96 Unified | 128 Unified | 192 Unified | 216 Unified | 2 × 216 Unified
L1 Cache | 2 × 7 × 16 kB | 8 × 16 kB | 4 × 16 kB | 6 × 16 kB | 8 × 16 kB | 8 × 24 kB | 9 × 24 kB | 2 × 9 × 24 kB
L2 Cache | 2 × 64 kB | 64 kB | 64 kB | 64 kB | 64 kB | 256 kB | 256 kB | 2 × 256 kB
Memory | 512 MB GDDR3 | 512 MB GDDR3 | 512 MB GDDR3 | 512 MB GDDR5 | 1024 MB GDDR3 | 1792 MB GDDR3 | 896 MB GDDR3 | 896 MB GDDR3
Memory Clock | 1800 MHz | 2070 MHz | 1800 MHz | 3400 MHz | 2000 MHz | 2000 MHz | 2000 MHz | 2000 MHz
Bus Width | 2 × 256 bit | 256 bit | 256 bit | 128 bit | 256 bit | 448 bit | 448 bit | 2 × 448 bit
Memory Bandwidth | 2 × 57.6 GB/s | 66.2 GB/s | 57.6 GB/s | 54.4 GB/s | 64 GB/s | 112 GB/s | 112 GB/s | 2 × 112 GB/s
Fillrate (Pixel) | 2 × 9600 MP/s | 11840 MP/s | 10400 MP/s | 4400 MP/s | 11840 MP/s | 14504 MP/s | 16128 MP/s | 2 × 16128 MP/s
Fillrate (Texel) | 2 × 33600 MT/s | 47360 MT/s | 20800 MT/s | 17600 MT/s | 47360 MT/s | 33152 MT/s | 41472 MT/s | 2 × 41472 MT/s
Compute Power (FP32) | 2 × 336 GFLOPS | 461 GFLOPS | 208 GFLOPS | 257 GFLOPS | 470 GFLOPS | 415 GFLOPS | 537 GFLOPS | 2 × 537 GFLOPS
Compute Power (FP64) | - | - | - | - | - | 52 GFLOPS | 67 GFLOPS | 2 × 67 GFLOPS
Bus Type | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0
TDP | 2 × 125 W | ~150 W | 95 W | 69 W | 150 W | 150 W | 182 W | 2 × 182 W
DirectX | 10 | 10 | 10 | 10.1 | 10 | 10 | 10 | 10
OpenGL | 3.3 | 3.3 | 3.3 | 3.3 | 3.3 | 3.3 | 3.3 | 3.3
Launch Year | 2007 | 2007 | 2008 | 2009 | 2009 | 2008 | 2008 | 2008


Card | 3 × GeForce GTX 260 216 | GeForce GTX 285 OC | GeForce GTX 295 | GeForce GTS 450 | GeForce GTX 460 OEM | GeForce GTX 460 OC | GeForce GTX 470 | GeForce GTX 480
GPU | 3 × GT200 | GT200B | 2 × GT200B | GF106 | GF104 | GF104 | GF100 | GF100
Architecture | Tesla | Tesla | Tesla | Fermi | Fermi | Fermi | Fermi | Fermi
Technology | 65 nm | 55 nm | 55 nm | 40 nm | 40 nm | 40 nm | 40 nm | 40 nm
Die Size | 3 × 576 mm² | 470 mm² | 2 × 470 mm² | 238 mm² | 332 mm² | 332 mm² | 526 mm² | 526 mm²
Transistor Count | 3 × 1400 mil. | 1400 mil. | 2 × 1400 mil. | 1170 mil. | 1950 mil. | 1950 mil. | 3200 mil. | 3200 mil.
Transistor Density | 2.43 mil. / mm² | 2.98 mil. / mm² | 2.98 mil. / mm² | 4.92 mil. / mm² | 5.87 mil. / mm² | 5.87 mil. / mm² | 6.08 mil. / mm² | 6.08 mil. / mm²
GPU Clock | 576 MHz | 700 MHz | 576 MHz | 810 MHz | 650 MHz | 925 MHz | 608 MHz | 700 MHz
Shader Clock | 1242 MHz | 1476 MHz | 1240 MHz | 1620 MHz | 1300 MHz | 1850 MHz | 1216 MHz | 1400 MHz
ROPs | 3 × 28 | 32 | 28 | 16 | 32 | 32 | 40 | 48
TMUs | 3 × 72 | 80 | 80 | 32 | 56 | 56 | 56 | 60
Compute Units | 3 × 27 | 30 | 2 × 30 | 4 | 7 | 7 | 14 | 15
Shaders | 3 × 216 Unified | 240 Unified | 2 × 240 Unified | 192 Unified | 336 Unified | 336 Unified | 448 Unified | 480 Unified
L1 Cache | 3 × 9 × 24 kB | 10 × 24 kB | 2 × 10 × 24 kB | 4 × 64 kB | 7 × 64 kB | 7 × 64 kB | 14 × 64 kB | 15 × 64 kB
L2 Cache | 3 × 256 kB | 256 kB | 2 × 256 kB | 256 kB | 512 kB | 512 kB | 768 kB | 768 kB
Memory | 896 MB GDDR3 | 1024 MB GDDR3 | 896 MB GDDR3 | 1024 MB GDDR5 | 2048 MB GDDR5 | 1024 MB GDDR5 | 1280 MB GDDR5 | 1536 MB GDDR5
Memory Clock | 2000 MHz | 2800 MHz | 2000 MHz | 3600 MHz | 3400 MHz | 4200 MHz | 3350 MHz | 3700 MHz
Bus Width | 3 × 448 bit | 512 bit | 2 × 448 bit | 128 bit | 256 bit | 256 bit | 320 bit | 384 bit
Memory Bandwidth | 3 × 112 GB/s | 179.2 GB/s | 2 × 112 GB/s | 57.6 GB/s | 108.8 GB/s | 134.4 GB/s | 134 GB/s | 177.6 GB/s
Fillrate (Pixel) | 3 × 16128 MP/s | 22400 MP/s | 2 × 16128 MP/s | 12960 MP/s | 20800 MP/s | 29600 MP/s | 24320 MP/s | 33600 MP/s
Fillrate (Texel) | 3 × 41472 MT/s | 56000 MT/s | 2 × 46080 MT/s | 25920 MT/s | 36400 MT/s | 51800 MT/s | 34048 MT/s | 42000 MT/s
Compute Power (FP32) | 3 × 537 GFLOPS | 708 GFLOPS | 2 × 595 GFLOPS | 622 GFLOPS | 874 GFLOPS | 1243 GFLOPS | 1089 GFLOPS | 1344 GFLOPS
Compute Power (FP64) | 3 × 67 GFLOPS | 88.5 GFLOPS | 2 × 74 GFLOPS | 52 GFLOPS | 73 GFLOPS | 104 GFLOPS | 136 GFLOPS | 168 GFLOPS
Bus Type | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0
TDP | 3 × 182 W | ~225 W | 289 W | 106 W | 150 W | ~225 W | 215 W | 250 W
DirectX | 10 | 10 | 10 | 11 | 11 | 11 | 11 | 11
OpenGL | 3.3 | 3.3 | 3.3 | 4.6 | 4.6 | 4.6 | 4.6 | 4.6
Launch Year | 2008 | 2009 | 2009 | 2010 | 2010 | 2010 | 2010 | 2010


Card | GeForce GTX 480 OC | Quadro 4000 | Tesla C2075 Mod
GPU | GF100 | GF100 | GF110
Architecture | Fermi | Fermi | Fermi
Technology | 40 nm | 40 nm | 40 nm
Die Size | 526 mm² | 526 mm² | 520 mm²
Transistor Count | 3200 mil. | 3200 mil. | 3000 mil.
Transistor Density | 6.08 mil. / mm² | 6.08 mil. / mm² | 5.77 mil. / mm²
GPU Clock | 825 MHz | 475 MHz | 700 MHz
Shader Clock | 1650 MHz | 950 MHz | 1400 MHz
ROPs | 48 | 32 | 48
TMUs | 60 | 32 | 56
Compute Units | 15 | 8 | 14
Shaders | 480 Unified | 256 Unified | 448 Unified
L1 Cache | 15 × 64 kB | 8 × 64 kB | 14 × 64 kB
L2 Cache | 768 kB | 768 kB | 768 kB
Memory | 1536 MB GDDR5 | 2048 MB GDDR5 | 6144 MB GDDR5
Memory Clock | 4000 MHz | 2800 MHz | 3700 MHz
Bus Width | 384 bit | 256 bit | 384 bit
Memory Bandwidth | 192 GB/s | 89.6 GB/s | 177.6 GB/s
Fillrate (Pixel) | 39600 MP/s | 15200 MP/s | 33600 MP/s
Fillrate (Texel) | 49500 MT/s | 15200 MT/s | 39200 MT/s
Compute Power (FP32) | 1584 GFLOPS | 486 GFLOPS | 1254 GFLOPS
Compute Power (FP64) | 198 GFLOPS | 243 GFLOPS | 627 GFLOPS
Bus Type | PCI-E 2.0 | PCI-E 2.0 | PCI-E 2.0
TDP | ~350 W | 142 W | 225 W
DirectX | 11 | 11 | 11
OpenGL | 4.6 | 4.6 | 4.6
Launch Year | 2010 | 2010 | 2011
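
Several rows in the tables above (transistor density, fillrates, FP32 compute power, memory bandwidth) appear to be derived from the base specifications. The Python sketch below shows one way to reproduce them for two listed cards. The formulas are my assumptions, not something the article states: the memory clock is taken as the effective data rate, and FP32 throughput counts one multiply-add (2 FLOPs) per shader per shader clock. The names `GpuSpec` and `derived` are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class GpuSpec:
    """Base specifications of a single GPU, copied from the tables."""
    name: str
    core_mhz: float         # "GPU Clock" row
    shader_mhz: float       # "Shader Clock" row
    rops: int
    tmus: int
    shaders: int
    mem_mhz: float          # "Memory Clock" row, assumed to be the effective rate
    bus_bits: int
    transistors_mil: float
    die_mm2: float


def derived(spec: GpuSpec) -> dict:
    """Recompute the derived rows under the assumptions named in the lead-in."""
    return {
        "pixel_fillrate_mps": spec.rops * spec.core_mhz,
        "texel_fillrate_mts": spec.tmus * spec.core_mhz,
        # Assumption: 2 FLOPs (one multiply-add) per shader per shader clock.
        "fp32_gflops": spec.shaders * 2 * spec.shader_mhz / 1000,
        # Bytes per transfer (bus width / 8) times effective memory clock.
        "bandwidth_gbs": spec.bus_bits / 8 * spec.mem_mhz / 1000,
        "density_mil_per_mm2": spec.transistors_mil / spec.die_mm2,
    }


hd3870 = GpuSpec("Radeon HD 3870", 775, 775, 16, 16, 320, 2250, 256, 666, 192)
gtx480 = GpuSpec("GeForce GTX 480", 700, 1400, 48, 60, 480, 3700, 384, 3200, 526)

for card in (hd3870, gtx480):
    values = {key: round(value, 2) for key, value in derived(card).items()}
    print(card.name, values)
```

For these two cards the results agree with the listed values (for example 12400 MP/s and 496 GFLOPS for the Radeon HD 3870, 33600 MP/s and 1344 GFLOPS for the GeForce GTX 480). Not every table entry necessarily follows these formulas exactly, so treat the sketch as a reading aid rather than a description of how the tables were produced.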
