I believe this is what he said
I was answering your post, not anyone else's. The fact that I quoted your post should have been a dead giveaway. Why you answered a post concerning double precision with a supposed need for ECC I have no idea; they aren't inextricably linked.
As for whatever vague point you're making, there are plenty of instances where FP64 could be useful to a prosumer (mixed single + double precision workloads such as 3D modelling).
Nvidia Titan Z 12 GB
single precision = 8.0 Tflops
double precision = 2.6 Tflops
$2,999
I'll save you some money on Nvidia at the same site
Nvidia Tesla K40 12 GB
single precision = 4.29 Tflops
double precision = 1.43 Tflops
$4,245
Nvidia Quadro K6000 12 GB
single precision = 5.2 Tflops
double precision = 1.7 Tflops
$4,235
AMD FirePro W9100 16 GB
single precision = 5.24 Tflops
double precision = 2.62 Tflops
$3,499
So, judging by the bolding and price inclusion, you're saying double precision:
Titan Z...0.87 GFlop/$
W9100..0.75 GFlop/$
K6000...0.45 GFlop/$ (the card is available for $3800)
K40.......0.34 GFlop/$
Not sure how the Tesla, Quadro, or FirePro are supposed to be "saving some money".
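For anyone who wants to check the GFlop/$ figures, here is a minimal Python sketch of the arithmetic. It uses only the double precision numbers and prices quoted in this thread; the function and variable names are my own, and the K6000 is priced at the $3800 street price rather than the $4,235 listing.

```python
# Double precision throughput (TFlops) and price (USD), as quoted in this thread.
cards = {
    "Titan Z": (2.6, 2999),
    "W9100": (2.62, 3499),
    "K6000": (1.7, 3800),   # street price, not the $4,235 listing
    "K40": (1.43, 4245),
}


def gflops_per_dollar(tflops, price_usd):
    """Convert TFlops to GFlops and divide by the price."""
    return tflops * 1000 / price_usd


# Rank the cards from best to worst FP64 value.
ranking = sorted(
    ((name, gflops_per_dollar(*spec)) for name, spec in cards.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, value in ranking:
    print(f"{name}: {value:.2f} GFlop/$")
```

Swapping the K6000 price back to 4235 drops it to roughly 0.40 GFlop/$, which doesn't change the ordering.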
Of course, it's still an apples vs oranges scenario. Professional drivers, software (Nvidia's OptiX, SceniX, CompleX etc.), support, binning, and a larger frame buffer (the Titan Z isn't a 12GB card, it's a 2 x 6GB card) should all add value to the pro boards regardless of vendor.
A further point to note is that Nvidia's FLOP numbers are calculated on base clock (which is correct for double precision, since boost is disabled), not on boost (either guaranteed minimum or maximum sustained) for single precision. The FLOPS for AMD's cards are calculated on maximum boost, whether it is attainable/sustainable or not.
Case in point: the GTX 780 is quoted as having a 3977 GFlops FP32 rate (863 MHz base clock * 2304 cores * 2 ops/clock). But GPGPU apps can be as intensive as games. For the GTX 780 I have here at the moment, the usual calculation should be 967 * 2304 * 2 = 4456 GFlops. In reality the card sustains a boost of 1085 MHz at stock settings (no extra voltage, no OC above factory, no change in stock fan profile). The actual FP32 rate would be 1085 * 2304 * 2 = 5000 GFlops.
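The same clock * cores * ops/clock calculation as a small Python sketch, in case anyone wants to plug in their own sustained clock. The function name is just illustrative, and the 2 ops/clock default is the figure used in the calculation above. Clock in MHz times cores times ops gives MFlops, hence the division by 1000.

```python
def peak_fp32_gflops(clock_mhz, cores, ops_per_clock=2):
    """Theoretical peak FP32 rate in GFlops for a given clock in MHz."""
    return clock_mhz * cores * ops_per_clock / 1000


GTX_780_CORES = 2304

# Clocks quoted above: spec base clock, the 967 MHz figure, and the observed sustained boost.
for label, clock_mhz in [("base", 863), ("967 MHz", 967), ("sustained", 1085)]:
    rate = peak_fp32_gflops(clock_mhz, GTX_780_CORES)
    print(f"{label}: {rate:.0f} GFlops")
```

Going from the 863 MHz base figure to the 1085 MHz sustained clock is roughly a 26% difference in the headline number for the same card.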
A quick Heaven run to show how meaningless the base clock (and its associated numbers) are, and why they generally aren't worth the time to record.
You don't build supercomputers with a card from a gaming stack.
Jesus, how many times are you going to edit a post?
It probably depends upon your definition of a supercomputer. If it's an HPC cluster, then no, you wouldn't... but that's a very narrow association used by people with little technical knowledge of the range of compute solutions.
Other examples:
The Fastra II is a desktop supercomputer designed for tomography.
Rackmount GPU servers also generally come under the same heading, since big iron generally tends to be made up of the same hardware... just add more racks to a cabinet... and more cabinets to a cluster... etc.
I'd also note that they aren't "one offs" as you opined once before, as explained here: "We build and ship at least a few like this every month".
Go nuts, configure away.