• Hardware porn: GF100's naked die exposed

    In order to measure the real die size, I took the heatspreader off a GF100 (Fermi) graphics processor.

    Ever since the first DirectX 10 GPU, the G80, Nvidia has chosen to shroud the actual chip in its highest-end products under a metal hood called a heatspreader. So no pictures of the bare die can be taken without a severe risk of mortally wounding the chip or its surroundings, and thus possibly breaking a graphics card worth a couple of hundred euros or US dollars.

    The heatspreader also prevents curious journalists and users from effectively determining the real die size, a number Nvidia is often reluctant to give. Unlike, say, the number of transistors in a given chip, the die size directly influences production costs and could thus be deemed a relevant number for analysts comparing it to those of the competition. That is a potential problem for Nvidia for two reasons:
    • First, their toughest competitor, AMD, has had an obvious advantage in transistor density over the last few generations. In 55 nm manufacturing, for example, Nvidia was able to pack around 2.8 million transistors into a square millimeter of silicon for both G92b and GT200b, whereas AMD easily broke the 3 million mark, stuffing as many as 3.48 million transistors into the same area in RV770. Both numbers obviously depend on the die size as well as the number of transistors. For the former I relied on my own measurements using a digital caliper, while for the latter I had to take the numbers given by the companies.
    • Second, ever since G80 Nvidia has taken a different approach to its graphics processors in terms of ALU arrangement, scheduling and texturing, and, last but not least, in features that make the chip more suitable for GPU computing. Shared memory and caches were added, and the ALU lanes were sped up in order to achieve faster serial throughput, which would otherwise limit workloads that are not 100 percent parallel (say hello to Amdahl here).
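
    To put the density gap from the first point into a single number, here is a minimal sketch in Python. It uses only the two densities quoted above; the helper name relative_advantage is made up for illustration, and the result is only as good as the caliper measurements and vendor transistor counts behind those densities.

```python
def relative_advantage(higher: float, lower: float) -> float:
    """Return how much larger `higher` is than `lower`, as a fraction."""
    return higher / lower - 1.0


# Densities quoted in the text, in millions of transistors per mm^2 (55 nm parts).
NVIDIA_55NM_DENSITY = 2.8   # "around 2.8" for both G92b and GT200b
AMD_RV770_DENSITY = 3.48    # RV770

advantage = relative_advantage(AMD_RV770_DENSITY, NVIDIA_55NM_DENSITY)
print(f"RV770 density advantage over G92b/GT200b: {advantage:.1%}")
```

    With these inputs, RV770 comes out at roughly a quarter more transistors per square millimeter.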

    With Fermi, they have taken this approach to new heights by designing the chip almost exclusively around the needs of the HPC market: single precision runs at only twice the double-precision rate of the 256 parallel processors (or, put differently, half-rate DP for the officially communicated 512 ALUs), and there is an almost-coherent two-level caching system, optional ECC capability for the main memory and very fast atomic memory access. Quite possibly this was done in preparation for Intel's Larrabee many-core processor, which the CPU giant basically cancelled as a commercial product because it could not manufacture it economically enough.

    All of this (among other things) made GF100 grow to 3 billion transistors (3000 million for old-school Europeans), around 39.5% more than AMD has integrated into its Cypress chip, which by my measurement is 349.7 mm² in size (the official number is 337 mm², I believe).

    In terms of the seemingly undying fps bars in traditional benchmarks around the net (think 3DMark Vantage, Crysis and the like), this translates into very poor so-called efficiency, both in fps per watt and in fps per die area. Thus it does not come as a complete surprise that Nvidia is quite reluctant to give official numbers on its high-end chips' die sizes. Smaller ones like the GF106 are readily exposed, so anyone can hold a ruler to them and arrive at approximately 15 x 15 mm.

    With this in mind, enter the gigantic Fermi chip. Using the aforementioned digital caliper, I measured 23.4 x 23.5 mm, arriving at ever so slightly under 550 mm²: 549.9 mm² would be "my official" number for GF100's die size. If you want some more pictures, take a look over at PCGH.de.
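
    For anyone who wants to redo the arithmetic, the following Python sketch reproduces the die area from the caliper readings and compares GF100 with Cypress. All inputs are taken from this article. The Cypress transistor count is not stated here, so the sketch derives it from the "39.5% more" figure, which is an assumption of the example rather than an official number. The function names are made up for illustration.

```python
def die_area_mm2(width_mm: float, height_mm: float) -> float:
    """Area of a rectangular die in square millimeters."""
    return width_mm * height_mm


def density_m_per_mm2(transistors_millions: float, area_mm2: float) -> float:
    """Transistor density in millions of transistors per square millimeter."""
    return transistors_millions / area_mm2


# GF100: caliper measurement and transistor count from the article.
gf100_area = die_area_mm2(23.4, 23.5)
gf100_transistors = 3000.0  # millions
gf100_density = density_m_per_mm2(gf100_transistors, gf100_area)

# Cypress: measured area from the article; the transistor count is derived
# from "GF100 has around 39.5% more" (an assumption of this sketch).
cypress_area = 349.7
cypress_transistors = gf100_transistors / 1.395
cypress_density = density_m_per_mm2(cypress_transistors, cypress_area)

area_ratio = gf100_area / cypress_area

print(f"GF100 die area:    {gf100_area:.1f} mm^2")
print(f"GF100 density:     {gf100_density:.2f} M transistors/mm^2")
print(f"Cypress density:   {cypress_density:.2f} M transistors/mm^2")
print(f"GF100 / Cypress area ratio: {area_ratio:.2f}")
```

    On these numbers GF100 needs a bit over one and a half times the die area of Cypress for about 39.5% more transistors, which is the density gap discussed earlier showing up again at 40 nm.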