Other benchmarks, such as HPCtech.com's GPCBenchmarkOCL, which unfortunately can no longer be found on their site, show a more detailed picture. As far as raw performance and throughput are concerned, everything stays within the usual margin of error. That means GFLOPS for single and double precision math as well as bandwidth are not affected. But especially in the image processing section of the benchmark there are large differences in performance compared to what I measured only minutes earlier with the Geforce 275.20 release on the same rig. A drop to about 18 percent of the original performance in the NLM denoise subtest is the largest I observed. Other tests, such as the bicubic scaling filter with a global access pattern, dropped to 20 percent. Image access fared comparatively well and only slowed down to half speed. The histogram function, though, appears to run about 52 percent faster.
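To put those percentages into perspective, here is a minimal Python sketch that converts "percent of the original performance" into a runtime factor. It assumes the benchmark scores are throughput-like (higher is better), so that runtime scales inversely with the score, and it reads "52 percent faster" as 152 percent of the original throughput. The function name `slowdown_factor` is made up for this illustration and is not part of the benchmark.

```python
def slowdown_factor(percent_of_original: float) -> float:
    """Return the runtime relative to the original run (1.0 = unchanged),
    assuming runtime is inversely proportional to the measured score."""
    if percent_of_original <= 0:
        raise ValueError("percentage must be positive")
    return 100.0 / percent_of_original


# Figures quoted in the text, as percent of the Geforce 275.20 result.
results = {
    "NLM denoise": 18.0,
    "bicubic scaling (global access)": 20.0,
    "image access": 50.0,
    "histogram": 152.0,  # assumption: "52 percent faster" = 152 percent
}

for name, pct in results.items():
    print(f"{name}: {slowdown_factor(pct):.2f}x the original runtime")
```

Under these assumptions, a drop to 18 percent means the NLM denoise subtest takes roughly five and a half times as long as before.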
Speaking of image access - that one seems to have gotten the most love from Nvidia's driver team, as it has been expanded to support larger images as well as having its performance slashed badly. More precisely, the values the driver reports for the OpenCL device queries CL_DEVICE_IMAGE2D_MAX_WIDTH and CL_DEVICE_IMAGE2D_MAX_HEIGHT have been adapted to DX11's standard of 16384 pixel wide and high images; previously, the driver had reported 4096 x 32768 (!) for width and height respectively.
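What the changed limits mean in practice can be sketched in a few lines of Python. The numbers are simply the ones quoted above, hard-coded rather than queried from a live driver; in a real application they would come from the OpenCL call clGetDeviceInfo. The names `ImageLimits` and `fits` are hypothetical helpers for this illustration only.

```python
from typing import NamedTuple


class ImageLimits(NamedTuple):
    max_width: int
    max_height: int


# Limits as reported in the text, not read from a real device.
OLD_LIMITS = ImageLimits(max_width=4096, max_height=32768)
NEW_LIMITS = ImageLimits(max_width=16384, max_height=16384)


def fits(width: int, height: int, limits: ImageLimits) -> bool:
    """True if a 2D image of the given size is within the device limits."""
    return 0 < width <= limits.max_width and 0 < height <= limits.max_height


for size in [(8192, 8192), (4096, 20000), (4096, 4096)]:
    print(size, "old:", fits(*size, OLD_LIMITS), "new:", fits(*size, NEW_LIMITS))
```

Note that the change cuts both ways: wide images beyond 4096 pixels become possible, while very tall images between 16384 and 32768 pixels in height no longer fit.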
But there is good news too! Nvidia has revised some of the integrated 3D Vision profiles and added a ton of new ones. And the driver now enables SLI on AMD chipsets such as the 970, 990X and 990FX for the upcoming Bulldozer CPUs (as well as current ones). More about these very important fixes can be found in Nvidia's release notes PDF for Geforce 280.19 Beta - I've also attached the file to this article. Apart from catering to diminishingly small target audiences like Quad-SLI and 3D Vision users, the driver actually fixes some stuttering in the free online MMO World of Tanks and some rendering glitches in Crysis 2 for Geforce 500 owners. Officially undocumented is the apparent fix for Bioware's RPG Neverwinter Nights, which no longer slows down to a crawl as soon as you put on cloaks, as reported in the respective forums.
The downloads of the respective driver packages are as follows:
- Geforce 280.19 Beta (Windows XP, 32 bit)
- Geforce 280.19 Beta (Windows XP, 64 bit)
- Geforce 280.19 Beta (Windows 7, Vista, 32 bit)
- Geforce 280.19 Beta (Windows 7, Vista, 64 bit)
Fortunately, the driver supports all current GPUs - starting from Geforce 6 and the motherboard chipsets with integrated graphics based on that technology, up to the new and shiny Geforce GTX 590 dual-GPU cards, and including the newly added Geforce GT 530, Geforce 510, Geforce GTX 460 SE v2, Geforce GTX 580M and Geforce GT 520MX. OpenCL, though, only concerns those using a Geforce 8 or newer.
The full list for desktop cards looks as follows:
- Geforce 500: GTX 590, GTX 580, GTX 570, GTX 560 Ti, GTX 560, GTX 550 Ti, GT 545, GT 530, GT 520, 510
- Geforce 400: GTX 480, GTX 470, GTX 465, GTX 460 SE v2, GTX 460 SE, GTX 460, GTS 450, GT 440, GT 430, GT 420, 405
- Geforce 300: GT 340, GT 330, GT 320, 315, 310
- Geforce 200: GTX 295, GTX 285, GTX 280, GTX 275, GTX 260, GTS 250, GTS 240, GT 240, GT 230, GT 220, G210, 210, 205
- Geforce 100: GT 140, GT 130, GT 120, G 100
- Geforce 9: 9800 GX2, 9800 GTX/GTX+, 9800 GT, 9600 GT, 9600 GSO, 9600 GS, 9500 GT, 9500 GS, 9400 GT, 9400, 9300 GS, 9300 GE, 9300, 9200, 9100
- Geforce 8: 8800 Ultra, 8800 GTX, 8800 GTS 512, 8800 GTS, 8800 GT, 8800 GS, 8600 GTS, 8600 GT, 8600 GS, 8500 GT, 8400 SE, 8400 GS, 8400, 8300 GS, 8300, 8200 / nForce 730a, 8200, 8100 / nForce 720a
- Geforce 7: 7950 GX2, 7950 GT, 7900 GTX, 7900 GT/GTO, 7900 GS, 7800 SLI, 7800 GTX, 7800 GT, 7800 GS, 7650 GS, 7600 LE, 7600 GT, 7600 GS, 7550 LE, 7500 LE, 7350 LE, 7300 SE / 7200 GS, 7300 LE, 7300 GT, 7300 GS, 7150 / NVIDIA nForce 630i, 7100 GS, 7100 / NVIDIA nForce 630i, 7100 / NVIDIA nForce 620i, 7050 PV / NVIDIA nForce 630a, 7050 / NVIDIA nForce 630i, 7050 / NVIDIA nForce 610i, 7025 / NVIDIA nForce 630a
- Geforce 6: 6800 XT, 6800 XE, 6800 Ultra, 6800 LE, 6800 GT, 6800 GS/XT, 6800 GS, 6800, 6700 XL, 6610 XL, 6600 VE, 6600 LE, 6600 GT, 6600, 6500, 6250, 6200 TurboCache, 6200SE TurboCache, 6200 LE, 6200 A-LE, 6200, 6150SE nForce 430, 6150LE / Quadro NVS 210S, 6150 LE, 6150, 6100 nForce 420, 6100 nForce 405, 6100 nForce 400, 6100
- Ion: Ion LE, Ion
According to the Khronos Group (Khronos.org), the consortium behind the open computing language OpenCL, the following are the main points that differentiate the 1.1 revision from version 1.0 of the specification:
- Host-thread safety, enabling OpenCL commands to be enqueued from multiple host threads;
- Sub-buffer objects to distribute regions of a buffer across multiple OpenCL devices;
- User events to enable enqueued OpenCL commands to wait on external events;
- Event callbacks that can be used to enqueue new OpenCL commands based on event state changes in a non-blocking manner;
- 3-component vector data types;
- Global work-offset, which enables kernels to operate on different portions of the NDRange;
- Memory object destructor callback;
- Read, write and copy a 1D, 2D or 3D rectangular region of a buffer object;
- Mirrored repeat addressing mode and additional image formats;
- New OpenCL C built-in functions such as integer clamp, shuffle and asynchronous strided copies;
- Improved OpenGL interoperability through efficient sharing of images and buffers by linking OpenCL event objects to OpenGL fence sync objects;
- Optional features in OpenCL 1.0 have been brought into core OpenCL 1.1 including: writes to a pointer of bytes or shorts from a kernel, and conversion of atomics to 32-bit integers in local or global memory.
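To make a few of these points more tangible, the following Python sketch emulates three of them on the host: the global work-offset, the integer clamp built-in and the mirrored repeat addressing mode. This is plain Python, not OpenCL code, and all function names (`run_ndrange`, `clamp`, `mirrored_repeat_index`) are invented for this illustration. The mirrored repeat arithmetic follows my reading of the formula for nearest filtering with normalized coordinates in the OpenCL 1.1 specification, so treat it as an approximation of the behaviour rather than a reference.

```python
import math


def run_ndrange(kernel, global_size, global_offset=0):
    """Emulate a 1D NDRange by calling kernel(gid) once per work-item.
    With a non-zero offset, global IDs start at global_offset, which is
    what the global work-offset in OpenCL 1.1 allows."""
    for gid in range(global_offset, global_offset + global_size):
        kernel(gid)


def clamp(x, lo, hi):
    """Integer clamp: restrict x to the range [lo, hi]."""
    return min(max(x, lo), hi)


def mirrored_repeat_index(s, width):
    """Map a normalized coordinate s to a texel index, flipping the
    image at every integer boundary (assumed nearest filtering).
    Python's round() rounds halves to even, like rint() in OpenCL C."""
    mirrored = abs(s - 2.0 * round(0.5 * s))
    return min(int(math.floor(mirrored * width)), width - 1)


# Global work-offset: process one buffer in two halves, for example
# one half per device, without the kernel having to add an offset.
data = list(range(8))
out = [0] * len(data)


def double(gid):
    out[gid] = 2 * data[gid]


run_ndrange(double, global_size=4, global_offset=0)
run_ndrange(double, global_size=4, global_offset=4)

print(out)
print(clamp(300, 0, 255), clamp(-5, 0, 255))
print([mirrored_repeat_index(s, 4) for s in (0.25, 0.75, 1.25, 1.75)])
```

In OpenCL 1.0 the offset argument of clEnqueueNDRangeKernel had to be NULL, so splitting work like this meant passing the start index to the kernel as an extra argument.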