To find the maximum overclock of our card we used a combination of Rivatuner, ATITool and our benchmarking suite. To enable overclocking of G94 cards in Rivatuner, you simply edit rivatuner.cfg and add the device ID, in our case "0622h", to the G92 section.
An interesting thing I noticed when testing the card was that on my GIGABYTE motherboard the BIOS screen would not show up on my analog CRT; the display only turned on once the card reached the Windows desktop. This happened on both DVI outputs. When using the card on an ASUS P5K3 the problem did not occur, so I would say it is an issue with the GIGABYTE motherboard.
The final overclocks of our card are 740 MHz core (2% overclock) and 1096 MHz memory (10% overclock). These gains are relative to the stock speeds of the Zotac AMP! Edition, which is already overclocked compared to the NVIDIA reference design. Compared to a standard GeForce 9600 GT, the overclocks work out to 14% on the core and 22% on the memory.
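To show how those percentages are derived, here is a quick sketch; the stock clocks of 725/1000 MHz for the AMP! Edition and 650/900 MHz for the reference design are assumptions for the purpose of the calculation, not values quoted above:

```python
# Quick check of the overclock percentages quoted above.
# Assumed stock clocks: 725/1000 MHz (Zotac AMP! Edition),
# 650/900 MHz (NVIDIA reference GeForce 9600 GT).

def oc_percent(final_mhz, stock_mhz):
    """Overclock over stock, expressed as a percentage."""
    return (final_mhz / stock_mhz - 1.0) * 100.0

final_core, final_mem = 740, 1096

print(f"vs. AMP! Edition: core +{oc_percent(final_core, 725):.0f}%, "
      f"memory +{oc_percent(final_mem, 1000):.0f}%")   # ~2% / ~10%
print(f"vs. reference:    core +{oc_percent(final_core, 650):.0f}%, "
      f"memory +{oc_percent(final_mem, 900):.0f}%")    # ~14% / ~22%
```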
Wrong Core Clock?
During testing I noticed that the core clocks reported by Rivatuner Monitoring do not match the clocks returned by Rivatuner Overclocking and GPU-Z:
RT Overclocking and GPU-Z read the clocks from the NVIDIA driver, displaying whatever the driver returns as core clock. Rivatuner Monitoring, however, accesses the clock generator inside the GPU directly and gets its information from there. A clock-generating PLL works as follows: it is fed a base frequency from a crystal oscillator, typically in the 13..27 MHz range, and then multiplies and divides this frequency by integer values to reach the final clock speed. For example: 630 MHz = 27 MHz * 70 / 3.
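As a minimal sketch of that relationship (the multiplier and divider here are just the values from the example above, not register names from any NVIDIA documentation):

```python
# Minimal model of a PLL-derived clock:
# output = crystal * multiplier / divider, with integer multiplier and divider.

def pll_output_mhz(crystal_mhz, multiplier, divider):
    """Final clock speed generated from the crystal base frequency."""
    return crystal_mhz * multiplier / divider

# The example from the text: 27 MHz crystal, multiplier 70, divider 3.
print(pll_output_mhz(27, 70, 3))  # 630.0 MHz
```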
The information about which crystal is used is stored in the GPU's strap registers, which are initialized from a resistor configuration on the PCB and from the BIOS. In the case of the GeForce 9600 GT the strap says "27 MHz" crystal frequency, and Rivatuner Monitoring applies that in its clock reading code, resulting in a frequency of 783 MHz = 27 MHz * 29 / 1. The NVIDIA driver, however, uses 25 MHz for its calculation: 725 MHz = 25 MHz * 29 / 1. This explains the clock difference, which can only be seen on the core frequency (the memory PLL runs at 27 MHz). Now the big question is: who is wrong? When I asked NVIDIA about this phenomenon they replied: "The crystal frequency is 25MHz on 9600GT. Clock is 650MHz". So far so good. But why would you use a 25 MHz crystal for the core and a 27 MHz one for the memory? And why is the only crystal I can see on the PCB a 27 MHz one?
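To make the discrepancy concrete, here is the same calculation done with both crystal assumptions; the multiplier of 29 and divider of 1 are the values mentioned above:

```python
# The same PLL settings interpreted against the two crystal frequencies.
multiplier, divider = 29, 1

monitoring_mhz = 27 * multiplier / divider  # strap says 27 MHz    -> 783 MHz
driver_mhz     = 25 * multiplier / divider  # driver assumes 25 MHz -> 725 MHz

print(monitoring_mhz, driver_mhz)  # 783.0 725.0
print(f"{27 / 25:.2f}")            # 1.08, i.e. an 8% difference
```

This ratio of 27/25 is also where the 8% figure below comes from.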
Unwinder has modified Rivatuner's monitoring code to ignore the 27 MHz information from the straps and use a hardcoded 25 MHz value instead, in order to return "correct" values. Unless NVIDIA sheds some light on this issue, it remains unclear what correct actually is. It is quite possible that NVIDIA planned to use a 25 MHz crystal, yet production placed a 27 MHz crystal on the board, causing all GeForce 9600 GT cards to run at an 8% (= 27 / 25) higher core clock speed than intended.
I tried running fillrate performance tests to determine which core frequency the card is really running at, but those were inconclusive. I also do not know of a way to physically measure the core frequency using an oscilloscope or frequency counter.
Temperatures
For a single-slot cooler, these are very nice temperatures. Unfortunately, as mentioned before, the cooler tends to be noisier than necessary. I would have preferred slightly higher temperatures with less noise.