Tuesday, June 6th 2017

AMD's Entry-Level 16-core, 32-thread Threadripper to Reportedly Cost $849

Jun 6th, 2017 08:48 Discuss (128 Comments)

AMD has recently announced that at least nine models are in preparation for their new HEDT line-up, which will, for now, feature processors with up to 16 cores and 32 threads. The entry-level 16-core chip, the Threadripper 1998, will come in at 3.20 GHz with 3.60 GHz boost, 155 W TDP, and is absent of XFR.

If recent reports hold true, this entry-level Threadripper 1998 will come in at $849. Now, let's be honest - this seems like an immensely optimistic value, undercutting even Intel's 10-core 7900X, which has been announced at $999 (in tray quantities.) That's over 6 more cores and 12 more threads for $150 less. And let's just say that AMD's IPC isn't that much lower than Intel's to justify such an aggressive undercutting, a high-volume approach to the market.

Source: ETeknix

Add your own comment

128 Comments on AMD's Entry-Level 16-core, 32-thread Threadripper to Reportedly Cost $849

#76

Aquinus

Resident Wat-man

EarthDogWell, most are worried about that same fabric and more ccx modules and how that effects performance too. Its easy to add, but weve already seen latency issues from the two ccx/8 core...i can imagine 4 ccx will have more latency?

I suspect that it will be similar to latency to talk to L3 cache since that's usually where all the cores can eventually talk to each other, on AMD and Intel CPUs (most of them anyways,) but, the CCX is even kind of one level out from that so, it's almost like what the latency would be for an on-die L4 cache depending on how everything is laid out. I suspect that they'll be quick so long as the OS isn't switching threads to a different CCX often because the memory it's using might not be already resident in that CCX' cache and more communication between the different functional parts means more latency. I'm not going to lie, I think that single-threaded performance is going to be worse but, it's coming with the trade-off of having more cores. I say this to people at work and I'll say it here too. Use the right tool for the right job. Not everyone needs a ton of cores but, there are situations where people need that. I know that a lot of people here are mainly concerned about gaming and that's fine but, we all have to remember that there is more to computers than just games and that there are a lot of different tasks that devs and scientists are using them for.

#77

infrared

I assume AMD have been putting aside all the best 8c dies to use in threadripper. Some of the better 1800X's can do 4.1+ on all cores, my one is running WCG at 4.1ghz 1.406v at the moment, under a lighter load I can push it to 4.2ghz. On the same note power consumption should be reduced with higher quality dies. So I think their target is plausible but probably not easy for them to achieve.

I'm very impressed with their pricing if that's accurate.. I believe they're going as low as they can on price, not because performance is bad, but because they want as many people as possible to buy it and spread the word of how good their new generation of products is. They've hurt their reputation so badly this last decade that they need to price stuff low for IT companies to begin using them again.

#78

HTC

infraredI assume AMD have been putting aside all the best 8c dies to use in threadripper. Some of the better 1800X's can do 4.1+ on all cores, my one is running WCG at 4.1ghz 1.406v at the moment, under a lighter load I can push it to 4.2ghz. On the same note power consumption should be reduced with higher quality dies. So I think their target is plausible but probably not easy for them to achieve.

I'm very impressed with their pricing if that's accurate.. I believe they're going as low as they can on price, not because performance is bad, but because they want as many people as possible to buy it and spread the word of how good their new generation of products is. They've hurt their reputation so badly this last decade that they need to price stuff low for IT companies to begin using them again.

Even if it is plausible, i doubt they'd make it so because it will be more susceptible to instability @ higher speeds (assuming you are talking about clock speed of 4 GHz for a 16 c / 32 t chip). Stability in server environment is far more crucial so i'd expect @ most 3.5 GHz base clock: even this much is pretty much unheard of in chips with this many cores.

We already know how the fabric works in an 1x00(X) chip: what we don't know is how it works linking different chips. Does it have a higher penalty then the latency between different CCXs of the same chip? The infinity fabric's performance when linking different chips is what is going to make or brake threadripper as well as epyc: we shall see ...

Regarding the pricing, i totally agree with what you said.

#79

thesmokingman

ratirtMaybe it's not being a fanboy but it's an unconditional love :)

Nah, the one in question is really a full-on self professed fanboy.

#80

pantherx12

EarthDogWell, most are worried about that same fabric and more ccx modules and how that effects performance too. Its easy to add, but weve already seen latency issues from the two ccx/8 core...i can imagine 4 ccx will have more latency?

Inter chip communication is supposed to be the same speed as inter ccx communication, certainly when the chips are in the same package anyway. AMDs new processors /gpus are okay but its the fabric thats the most impressive bit of engineering.

According to AMD they're getting close to 100% scaling with this modular design, and likely would be if they could keep clock speeds the same. Amd did a presentation recently where thread ripper done a cinebench done a run in 13 seconds ( approx) where as if actually 100% scaling it be 12 or so. That still aint bad at all.

It looks like if you want IPC single core performance intel are still king, but when it comes to multithreaded work loads I think AMD have outdone themselves.

#81

Melvis

Thats way cheaper then I thought they would be. Remember the old FX-57/60 priced more then this and your getting alot more CPU now then you did back then for the Money!

#82

ratirt

pantherx12Inter chip communication is supposed to be the same speed as inter ccx communication, certainly when the chips are in the same package anyway. AMDs new processors /gpus are okay but its the fabric thats the most impressive bit of engineering.

According to AMD they're getting close to 100% scaling with this modular design, and likely would be if they could keep clock speeds the same. Amd did a presentation recently where thread ripper done a cinebench done a run in 13 seconds ( approx) where as if actually 100% scaling it be 12 or so. That still aint bad at all.

It looks like if you want IPC single core performance intel are still king, but when it comes to multithreaded work loads I think AMD have outdone themselves.

I'd really like to see the IPC for one core with intel high core count platforms too. They were not that great. (maybe it's due to lower frequency and OC potential?)

#83

EarthDog

Frequency.. has nothing to do with interconnectivity or IPC.

#84

msroadkill612

pantherx12A lot of people seem to be getting bent out of shape at these being cheap and therefore expecting poor performance Vs Intel.

This is nothing like that it's just the scaleble modular design means pricing follows a very simple format.

About twice the power costs about twice as much. Each time.

This is because AMD are making 8 core parts and them sticking them in the infinity fabric.

Intel with are straight up making 20 core monolithic CPUs, meaning their pricing scales down from the top rather than building from the bottom.

So Amd chips maybe cost 50 dollars each then they stick 4 of them on a PCB then charge you 1000 for it. But you still get loads of performance.

Intel 20 core maybe costs 500 dollars straight up thanks to complex monolithic CPUs having exponentially more chances of having errors. Vs a quad core based on same architecture and process.

So an 18-16-14 core are all that same 20 core chip that costs Intel 500 or so.

AMD have engineered a game changer here folks assuming they don't balls it right up.

It means dual GPU cards should In theory not be shitty as they'll be using the infinity fabric as well.

Sorry writing isn't my forte but hopefully you get the jist.

AMD have made god damn processor Legos,that's what Vega and Ryzen are, building blocks that fit together perfectly.

So you could build a "little house", or"Lego land "

AMD could could put 64 cores in a single package,128 etc etc.

The scalebility on depends on how much the customer wants to spend and power delivery/cooling etc.

Agree, but its even more modular than u say - the "leggoBlock" wafer is 4 core - an 8 core ryzen is 2 x on the fabric/mcm.

Doubling up to yield 16 core TR, & again for 32 core Epyc, clearly has been plain sailing, given the latters imminent release.

Next big trick is getting the cpu & vega gpu leggo blocks to team up nicely on their "fabric", and few amd savvy folk would doubt thats a done deal too - its the same as an apu, which amd has done for many years.

we will see zen/vega apuS b4 xmas, in the more lucrative mobile market first.

Beyond that point, i personally cant see intel/nvidea getting much of a look in, if AMD can grow fast enough to meet demand. But again, the simplicity of their formula, makes me confident they have this covered.

The world will be their oyster.

Take the ~existing Epyc MCM. Its a fabric that can team up to 8 of the above leggo blocks. a server (which it sure is w/ 128 lanes), and could be purchased with just the right balance of cpu/gpu cores to suit projected c/gpu workloads.

#85

msroadkill612

A question for the mavens here please?

What do u think of the notion that it pays to have symmetrical arrangements of cores/cache on these AMD Multi Core Modules? e.g. 8 core vs 6 core ryzen.

Conversely, the notion that the asymmetrically populated MCMs may yield more bang for buck, as they nip some cores, but leave the cache alone, but dont really price it in?

#86

pantherx12

msroadkill612Agree, but its even more modular than u say - the "leggoBlock" wafer is 4 core - an 8 core ryzen is 2 x on the fabric/mcm.

Doubling up to yield 16 core TR, & again for 32 core Epyc, clearly has been plain sailing, given the latters imminent release.

Next big trick is getting the cpu & vega gpu leggo blocks to team up nicely on their "fabric", and few amd savvy folk would doubt thats a done deal too - its the same as an apu, which amd has done for many years.

we will see zen/vega apuS b4 xmas, in the more lucrative mobile market first.

Beyond that point, i personally cant see intel/nvidea getting much of a look in, if AMD can grow fast enough to meet demand. But again, the simplicity of their formula, makes me confident they have this covered.

The world will be their oyster.

Take the ~existing Epyc MCM. Its a fabric that can team up to 8 of the above leggo blocks. a server (which it sure is w/ 128 lanes), and could be purchased with just the right balance of cpu/gpu cores to suit projected c/gpu workloads.

I know they've described the CCX as the "base" but I've not seen any dedicated quad cores based on this design yet hence my assumption 8 is the smallest "block" We'll see what happens with APU's though as I suspect you'll get a single ccx there plus graphics linked with the fabric.

Either way still clever stuff!

@At the guy who asked about intels high core count stuff being lower clocks.

Basically with a monolithic design you've got two problems, first of all power delivery and TDP start to effect what high clocks you can get as the chips will start having stability issues earlier on. That and as they are sold for the server market not consumer lower clocks that guarantee stability and TDP are better. Also monolithic design means all x cores need to hit the same target core speed. If 1 doesn't go to 3.5ghz for example then they'll fuse off that 1 off plus its pair and sell it as an 18 core for example ( if it were a 20 core chip)

Where as AMD are making these 8 core chips, ones that arnt right end up in the 1600/1500/1400 series cpus. All the good ones get matched up ( best ones in server, followed by thread ripper and then ryzen) so if you have two chips that can 3.5 ghz on all cores then they'll stick them in a thread ripper package, they'll also have the benefit of the two chips actually being apart and so not heating each other up to such a degree which is why Ryzen/Thread ripper are clocked about the same. A monolithic CPU with that many cores would be toasty to say the least.
(Large chips are more costly to design as you get less chips per wafer, you also increase the chances of having a flaw the larger the chip as well)

Basically if all this works as it should, AMD have a system where they can maximise profit for each product sku whilst minimising losses on bad chips.
Its just chip binning as usual cept the infinity fabric allows them to be a lot more flexible.

#87

Aquinus

Resident Wat-man

pantherx12I know they've described the CCX as the "base" but I've not seen any dedicated quad cores based on this design yet hence my assumption 8 is the smallest "block" We'll see what happens with APU's though as I suspect you'll get a single ccx there plus graphics linked with the fabric.

Isn't Ryzen 5 1400 a single CCX CPU?
Edit: I see that they just disabled two cores and half of the L3 on each CCX in 1400. At that point, why didn't they just do a full CCX? :kookoo:

#88

msroadkill612

pantherx12I know they've described the CCX as the "base" but I've not seen any dedicated quad cores based on this design yet hence my assumption 8 is the smallest "block" We'll see what happens with APU's though as I suspect you'll get a single ccx there plus graphics linked with the fabric.

I ponder that too.

It seems clear that what we historically know as an apu from amd (ryzen mobile nee raven ridge), will ~comply with ryzen basic layout MCM. Instead of 2 x 4 core cpu leggo blocks on the MCM, it will be 1 x 4 core cpu & 1 vega gpu.

If linking these 2 core units on the fabric works as well as it should in amdS uniquely experienced hands, then the concept is very extendable to historically UNFAMILIAR APUs, like 16 core and 4 x gpu on one amd ~epyc MCM.

I agree that their step down from ~ryzen, the expected soon, R3 cpuS, which are 4 core or less, i.e only one leggo block for which the MCM (& maybe chipset) is ~superflous expense, will be revealing.

FYI, I will post an amd slide separately as behoves its interesting nature :).

#89

msroadkill612

FYI

woo - this slide shows a fully populated epyc mcm, with the fabric branching yet again to further cpu/gpu cores.

I am curious how this works physically.

What it may refer to, is that the links are via a 2 socket server? - i.e. the fabric is extended to include a second epyc MCM on the other socket. These 2 socket servers allow double the leggo blocks, but no increase in pcie lanes over a 128 lane 1 socket server, as the extra 128 lanes are used for interconnecting the two ~epyc mcm/fabrics.

Still, its not how it looks. It shows a branch to a cpu/gpu on both ends of the epyc mcm.

www.overclock.net/t/1620103/lightbox/post/26112320/id/3040021

or

www.google.com.au/imgres?imgurl=http://www.overclock.net/content/type/61/id/3040021/width/350/height/700/flags/LL&imgrefurl=http://www.overclock.net/t/1620103/various-exclusive-amd-vega-presentation/920&docid=Wz5LiR-5iN6G0M&tbnid=a0ukJ3NOv60CjM:&vet=10ahUKEwjmxZHSzLrUAhWCWbwKHayrCcIQMwg0KA0wDQ..i&w=350&h=197&bih=880&biw=1776&q=amd epyc slides&ved=0ahUKEwjmxZHSzLrUAhWCWbwKHayrCcIQMwg0KA0wDQ&iact=mrc&uact=8

an exciting (imo) but obscure feature of vega is provision for dedicated raid nvm ssd storage directly connected to the gpu, laying the foundations for unheard of storage speeds for use as ~unlimited (512TB) vram sizes using virtual memory.

Perhaps even ~500GBps HBM2 vram can be pooled among GPUs and even CPUs, tho i doubt the latter (cpus cannot use vram from what i hear).

#90

msroadkill612

The price remains speculative, but yeah, have long thought TR/epyc (same socket & hence mobo) would be surprisingly cheap, rather than dear. As in, 1.5 x an 1800 ryzen, rather than 2x or more as intel may have done.

Fundamental to the biz model, is its a ~fixed cost biz. sure there are variable costs payable to fabs etc., but the unit variable costs pale (and amd are getting pleasing yields as a bonus) beside the money irretrievably sunk in getting that first chip produced and supported.

Nor do i think ryzen price cuts are unconnected with the imminent release of epyc products. Its normal to gouge early adopters a bit when supply is short anyway. early adopters accept there is a premium & dont begrudge it. The new reality is ryzen has competition from sibling products, and must adjust its price point.

This is contrary to the popular view. TR is a consumer product, not for the cores, but for the lanes.

Hi bandwidth devices, NVMEs ssdS especially, have rather blindsided an industry with 5 year product development cycles.

Every such ssd, must have 4 pcie3 lanes (both due to the specification, and because they will soon saturate that bandwidth, and now come close to doing so). It would take 8x sata ssdS to saturate 4 lanes. multiple 16 lane gpuS have gained utility recently. The new norm for lans will be 5-10x the bandwidth of the current 1Gb/120MB lan, ...

Intel even have the nerve to sell current products with a ludicrous 16 lanes total. even the higher lane count models i distrust. - they seem to be shared via the southbridge.

Amd got a bit lucky - they just happened to be aiming for servers via their mcm/fabric, which they may as well flog to consumers as well, and all that server type grunt, serendipitously, is now very desirable on even consumer desktops, and certainly on hedtS. (FYI, TR =64 lanes & epyc 128 lanes. Ryzen x350 - 24 lanes, x370 chipset = 28 lanes).

the apu could be interesting, as it will probably forgo the need for the usual 16 lanes for cpu/gpu interconnect, by simply using the fabric rather than the system bus.

This is how their apuS have worked for many years (A10-7850k e.g.).
As others have ~said here, amdS focus should not be on profits now, but seeding the; vast, lucrative, virgin and cautious server market.

The ecosystem is ~as important as the product. W/o sales volume, developers wont do the tweaks that can make big differences to productivity.

I forget re which product, but lisa su mentioned 5000 "seeder packs" sent pre-release to industry folks. Surely this excellent investment is a serious hit to the current account books, yet alleged stock analysts pilloried AMD for losing a lousy $70m in the quarter ending a few weeks post ryzen release. They are scoundrels and/or fools.

#91

Captain_Tom

infraredI assume AMD have been putting aside all the best 8c dies to use in threadripper. Some of the better 1800X's can do 4.1+ on all cores, my one is running WCG at 4.1ghz 1.406v at the moment, under a lighter load I can push it to 4.2ghz. On the same note power consumption should be reduced with higher quality dies. So I think their target is plausible but probably not easy for them to achieve.

Pretty much this. AMD's CCX design will allow them to launch the entry models at ridiculously low prices while also allowing for the highest core counts we have EVER seen in chips with this many cores.

There will be a 16-core for ~$850, but this model will likely max out at ~3.5-3.8GHz. However with some binning it wouldn't be insane to think they could have 16-cores clocked at 4.0 - 4.2GHz, but at a higher $1200 - $1500 price point.

#92

thesmokingman

I am ready for this. Bring it on!

#93

EarthDog

Captain_TomHowever with some binning it would be insane to think they could have 16-cores clocked at 4.0 - 4.2GHz, but at a higher $1200 - $1500 price point.

Now you are reaching i think. At least not with zen... zen2... with you..

#94

thesmokingman

EarthDogNow you are reaching i think. At least not with zen... zen2... with you..

He's not far off if boost is at 3.9ghz for the X, that is its plausible.

#95

EarthDog

thesmokingmanHe's not far off if boost is at 3.9ghz for the X, that is its plausible.

For all 16 cores like he said??? No..not from the factory.

xfr is one core and these 8 core ccx cant get past their own boost/xfr with all cores for the most part. Not sure why adding another ccx using the same exact architecture can suddenly reach past what others cannot...

That logic leap escapes me (cause its not logical). :)

Seriously...what suddenly makes these all 4-4.2ghz cpus when most 8 cores cant get past 4?

#96

thesmokingman

EarthDogFor all 16 cores like he said??? No..not from the factory.

xfr is one core and these 8 core ccx cant get past their own boost/xfr with all cores for the most part. Not sure why adding another ccx using the same exact architecture can suddenly reach past what others cannot...

That logic leap escapes me (cause its not logical). :)

Seriously...what suddenly makes these all 4-4.2ghz cpus when most 8 cores cant get past 4?

Relax dude. One its certainly possible for them to improve their process, and two the yields on their process have improved to 80%. That's all good news so there's more positive than not. And judging from the clock limitations on Ryzen gen1, it is not cooling related leading to the assumption that it is process limited. Iirc clocks are one of the things they focused on. And lastly, if their 16 core part boosts to 3.9, that's a pretty damn good outlook.

And he was imagining the fantastical probability not stating fact, "However with some binning it would be insane to think they could have 16-cores clocked at 4.0 - 4.2GHz."

www.guru3d.com/news-story/amd-ryzen-14nm-wafer-yields-pass-80-threadripper-cpus-on-track.html

#97

EarthDog

Quite relaxed. :pimp:

Indeed it is possible to improve the process. This is why i conceded zen2 for that type of 'insane to think of' clock.

Threadripper is nothing more than 2 zen ccx's, right? Where is the process improvement? I can see amd binning these cpus, but to fit within a power envelope more so than higher clocks. The more cores, generally, the less stock clockspeed.

I dont find it impressive that one core boosts to 3.9...its no different than one core boosting to 4ghz xfr on one ccx. In my feeble head, the fact that it boosts less with merely one more ccx speaks volumes to me about the 16 core ever clocking that high. :)

And cheers for amd hitting 80% yields...but that has nothing to do with clockspeed on the same exact process and architecture.

I also took the '...insane to think....' part different than you. In a positive light. I mean, why throw the binning part in there if its so outlandish?? When i see sick tricks i call it insane.... but again, in a positive way. A little disbelief, a lot of awesome. ;)

#98

Aquinus

Resident Wat-man

I don't think clocks are the problem. Sure, we've seen a ceiling on what has been release so far but, that only sets the bar. Lets say for a moment that a 16c, 4 CCX CPU comes out. The real question should be, even if clocks are the same, will IPC be maintained and will the interconnect become a limiting factor? So, we know that Xen runs better with faster memory because memory clock is tied to the clock of "the IMC" which communicates with all of the CCXs. The real question should be is the IMC/Infinity Fabric bandwidth sensitive or is it latency sensitive? If it's latency sensitive, it means that quad channel memory might not improve performance as much as we think it will (like how my 3820 can drive my memory at 1600Mhz or 2400Mhz and it is really makes no difference,) which I think is more likely. If it's bandwidth sensitive, it means that the sucker is going to fly and could scale incredibly well in purely parallel workloads which seems too good to be true.

If AMD is smart, they will decouple the memory clock from the IMC/Fabric clock and let them be independently adjusted if it's latency sensitive. That would drive it home all together.

#99

EarthDog

I think its going to remain the same, ipc. I believe there will be worse performance when threads jump ccx's...so it depends on the programming, no? With more ccx's, there is a potential for more thread jumping just by sheer, well, math, lol. But again it depends on how the application is programmed(?)...

Im betting, lets be honest...guessing... quad channel wont matter much. Increased bandwidth, but also increased latency... so, it will depend. Im curious to see how some empirical testing on that goes. :)

#100

msroadkill612

Re clocks, just saying, but as AMD have shifted the desktop/hedt goal posts to seriously multicore+, intel are at a serious TDP/clock/IPC disadvantage with their monolithic chip approach. This has been their traditional strong suite.

Intel 12-16 core may well be technically superior in parts, but have a cost structure of a different order - fab AND cooling.

Add your own comment

AMD's Entry-Level 16-core, 32-thread Threadripper to Reportedly Cost $849

128 Comments on AMD's Entry-Level 16-core, 32-thread Threadripper to Reportedly Cost $849

Latest GPU Drivers

New Forum Posts

Popular Reviews

TPU on YouTube

Controversial News Posts

AMD's Entry-Level 16-core, 32-thread Threadripper to Reportedly Cost $849

Related News

128 Comments on AMD's Entry-Level 16-core, 32-thread Threadripper to Reportedly Cost $849

Latest GPU Drivers

New Forum Posts

Popular Reviews

TPU on YouTube

Controversial News Posts