# Home PC outperforms a supercomputer in complex calculations



## CAPSLOCKSTUCK (Jul 25, 2016)

Russian physicists have put a computer running a consumer-level Nvidia GPU to work on equations that are normally performed using a powerful supercomputer, and found that the home PC solved them in 15 minutes – far faster than the supercomputer's time of two or three days.

A GPU is designed with many parallel processing threads, which allows it to perform far more simultaneous calculations than a CPU. The researchers from Lomonosov Moscow State University wanted to take advantage of that, and to test whether consumer-level tech would make an accessible alternative to supercomputers in situations where many equations have to be run in parallel.

The GPU tackled few-body scattering equations, which describe how multiple quantum particles interact with each other. Where three or more of these bodies are involved, the equations become extremely difficult to calculate, involving a table containing tens or even hundreds of thousands of rows and columns of data. Running on Nvidia software as well as custom programs written by the researchers, the GPU performed better than expected.

"We reached a speed we couldn't even dream of," says team leader Vladimir Kukulin. "The program computes 260 million complex double integrals on a desktop computer within three seconds. No comparison with supercomputers! My colleague from the University of Bochum in Germany carried out the calculations using one of the largest supercomputers in Germany with the famous Blue Gene architecture, which is actually very expensive. And what took his group two or three days we do in 15 minutes without spending a dime."

http://phys.org/news/2016-06-scientists-pc-complex-problems-tens.html


----------



## JalleR (Jul 25, 2016)

So Russia IS years behind....


----------



## basco (Jul 25, 2016)

Quote from the text:

> Using the software developed by Nvidia and having written their own programs, the researchers split their calculations on the many thousands of streams and were able to solve the problem.

Which software are they talking about? Or did I miss that?


----------



## CAPSLOCKSTUCK (Jul 25, 2016)

basco said:


> Which software are they talking about? Or did I miss that?




Full text of the paper is $35.95:

http://www.sciencedirect.com/science/article/pii/S0010465516300765


----------



## qubit (Jul 25, 2016)

lol, how embarrassing for the supercomputer. If they hired time on one of those NVIDIA Tesla-powered supercomputers, the speed would go up by another several orders of magnitude.


----------



## okidna (Jul 25, 2016)

basco said:


> quote from text:
> Using the software developed by Nvidia and having written their own programs, the researchers split their calculations on the many thousands of streams and were able to solve the problem.
> 
> Which software are they talking about? Or did I miss that?



CUDA. Based on the paper, they're using CUDA (with the PGI Fortran compiler / OpenACC), and the hardware is an i5-3570K and a GTX 670.


----------



## basco (Jul 25, 2016)

Wow, really, a GTX 670??

Thanks very much for your answers, okidna + caps


----------



## Caring1 (Jul 25, 2016)

Of course Nvidia used CUDA, because AMD cards don't have it.
If they used C.U.s then AMD would possibly be faster, although I fail to see how one computer can beat a supercomputer.


----------



## R-T-B (Jul 26, 2016)

Caring1 said:


> although I fail to see how one computer can beat a Super Computer



Answer:  old supercomputer.


----------



## Steevo (Jul 26, 2016)

Nah, the answer is implementation of the code. One of the exercises in a programming class I took was to write a simple compound interest calculator with early-payoff interest savings, and to be able to show the difference. The core part of the program was easy, but you could make it run super slow by allocating huge pools of resources to hold all your daily interest calculations (for a 30-year loan) and pre-calculating all the answers and steps along the way. Same for some random number generators: move a few lines of code around and it ran at half speed.

I'm sure CUDA, plus the parallel ability of a GPU (F@H knew this a long time ago), plus not dumping a 1 GB file of garbage to disk for reference, probably helped a ton.


----------



## R-T-B (Jul 26, 2016)

Super computers tend to be massively parallel machines too, though.


----------



## Steevo (Jul 26, 2016)

R-T-B said:


> Super computers tend to be massively parallel machines too, though.




No doubt, but many CPU cores are hard to manage in software, while a GPU's threads are hardware-managed.
Why do in software what hardware can do?


----------



## SaltyFish (Jul 27, 2016)

The Way It's Meant to be Computed™


----------



## ViperXTR (Jul 27, 2016)

Reminds me of this:
http://fastra.ua.ac.be/en/index.html


----------



## FordGT90Concept (Jul 27, 2016)

If they were truly complex, the supercomputer would win. GPUs aren't good for complex workloads; they're good for vast, simple workloads. The equation described is definitely in the latter category.

An article at Top 500 on the same topic:
https://www.top500.org/news/with-the-right-software-pcs-can-outdistance-supercomputers/


> The supercomputer in question is a bit of a mystery, since the story implies it was JUGENE, the 220-teraflop Blue Gene/P system that was decommissioned four years ago at Forschungszentrum Jülich. The system that replaced it, JUQUEEN, is a 5.8 petaflop Blue Gene/Q.


Pretty bad that they don't name the system.  If they did run it on JUGENE, no wonder it made supercomputers look bad.  It's a 2009 dinosaur.


----------



## Recon-UK (Jul 27, 2016)

Meanwhile I bought four GTX 280s to outrun the Russian Federation.


----------



## FordGT90Concept (Jul 27, 2016)

It was conducted by the Moscow University but the article strongly suggests it was run on a German supercomputer.


----------



## silentbogo (Jul 27, 2016)

FordGT90Concept said:


> It was conducted by the Moscow University but the article strongly suggests it was run on a German supercomputer.


It also strongly suggests that the original paper was written quite a long time ago, since JUGENE was shut down in 2012 and replaced with the significantly more powerful JUQUEEN.
The paper was submitted in 2015, so it was actually written some time between 2012 and 2014, when Kepler was still the shit.


----------



## FordGT90Concept (Jul 27, 2016)

Source?  It's dated July 28, 2016:
http://www.msu.ru/science/main_them...-v-desyatki-raz-bystree-superkompyuterov.html


----------



## Nordic (Jul 27, 2016)

The TPU folding team has known this for a long time ;P


----------



## mroofie (Jul 27, 2016)

FordGT90Concept said:


> *Source*?  It's dated July 28, 2016:
> http://www.msu.ru/science/main_them...-v-desyatki-raz-bystree-superkompyuterov.html


*sauce *


----------



## silentbogo (Jul 27, 2016)

FordGT90Concept said:


> Source?  It's dated July 28, 2016:
> http://www.msu.ru/science/main_them...-v-desyatki-raz-bystree-superkompyuterov.html


http://www.sciencedirect.com/science/article/pii/S0010465516300765

Click "show more" under author's name list. 



> Received 1 December 2015, Accepted 30 March 2016, Available online 8 April 2016


Given that this is a foreign publication of a Russian paper, it was almost certainly published first in some scientific periodical sponsored by the RAS before being submitted overseas.
I'm pretty sure the paper is somewhere in the ether, but I'm too lazy to look for it.
Meanwhile, here is a short intro on the topic dated March 2014:
http://www.msu.ru/news/superkompyuter_ili_personalnyy_kompyuter.html?sphrase_id=737865


----------



## FireFox (Jul 27, 2016)

CAPSLOCKSTUCK said:


> and found that the home PC solved them in 15 minutes – far faster than the supercomputer's time of two or three days.


That's because the supercomputer does the work more accurately and takes more time, while the home PC doesn't.


----------



## silentbogo (Jul 27, 2016)

FireFox said:


> That's because the supercomputer does the work more accurately and takes more time, while the home PC doesn't.


Both are computers - they are both accurate. It's not like a home PC will eventually get tired and leave for a beer and a nap ))

JUGENE had much higher combined performance (initially ~167 TFLOPS, later upgraded to ~1 PFLOPS), even compared to an upcoming GP100-based DGX-1 rack; it was simply bound by circumstances. The task was relatively small, so management overhead ate into its relative performance. It was also memory-bound, so the whole run probably got bottlenecked by slow DDR2 memory and the even slower bandwidth for data exchange between nodes. Plus, I'm 100% sure that at the time this supercomputer was shared among many entities, so it's no surprise that a small team from a Russian university, with a project in the early stages of development, was not high on the priority list for compute resource allocation.


----------



## FireFox (Jul 27, 2016)

silentbogo said:


> Both are computers - they are both accurate. It's not like a home PC will eventually get tired and leave for a beer and a nap ))


A home PC is a home PC and a supercomputer is a supercomputer; a supercomputer can't be like a home PC, otherwise it wouldn't be a supercomputer.

Does that make any sense?

For some reason there is the word SUPER before Computer.


----------






## Caring1 (Jul 27, 2016)

How old does a Super Computer have to be, before it is no longer Super, and merely a large computer?


----------






## FireFox (Jul 27, 2016)

Caring1 said:


> How old does a Super Computer have to be, before it is no longer Super, and merely a large computer?


*World’s first petaflop supercomputer is obsolete after just five years, will be shut down*

*http://www.extremetech.com/computing/152191-worlds-first-petaflop-supercomputer-is-obsolete-after-just-five-years-will-be-shut-down*


----------



## Disparia (Jul 27, 2016)

As a point of reference, the $129,000 Nvidia DGX-1 with eight Tesla P100s claims 42.5 TFLOPS (FP64). A petaflop is still costly, but much better than those $100-million supercomputers.


----------



## FordGT90Concept (Jul 27, 2016)

It is not better for heavy logic problems. That's why there are CISC, RISC, and GPGPU supercomputers out there; each has a preferred workload.

The problem in the article is perfectly suited to GPGPU, which is why it performed so much better than the RISC Blue Gene supercomputer.


----------

