# AMD Zen Doubles Per-core Number-crunching Machinery Compared to Its Predecessor



## btarunr (Oct 5, 2015)

AMD "Zen" CPU micro-architecture has a design focus on significantly increasing per-core performance, particularly per-core number-crunching performance, according to a 3DCenter.org report. It sees a near doubling of the number of decoder, ALU, and floating-point units per-core, compared to its predecessor. In essence, the a Zen core is AMD's idea of "what if a Steamroller module of two cores was just one big core, and supported SMT instead."

In the micro-architectures following "Bulldozer," which debuted with the company's first FX-series socket AM3+ processors, and running up to "Excavator," which will debut with the company's "Carrizo" APUs, AMD's approach to CPU cores involved modules, which packed two physical cores, with a combination of dedicated and shared resources between them. It was intended to take Intel's Core 2 idea of combining two cores into an indivisible unit further. 






AMD's approach was less than stellar, and was hit by implementation problems, where software loaded the cores of a multi-module processor sequentially, resulting in a worse scenario than if it were to load one core per module first, and then load additional cores across modules. AMD's workaround tricked software (particularly OS schedulers) into thinking that a "module" was a "core" with two "threads" (e.g., an eight-core FX-8350 would be seen by software as a 4-core processor with 8 threads). 

In AMD's latest approach with "Zen," the company did away with the barriers that separated the two cores within a module. It's one big monolithic core, with 4 decoders (the parts that tell the core what to do), 4 ALUs ("Bulldozer" had two per core), and four 128-bit wide floating-point units, grouped into two 256-bit FMACs. This approach nearly doubles the per-core number-crunching muscle. AMD also implemented an Intel-like SMT technology, which works very similarly to Hyper-Threading.
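If the rumored configuration holds, the per-core arithmetic is easy to sanity-check. Here is a rough sketch, under my own assumption (not stated in the report) that every unit retires one fused multiply-add per 32-bit lane per cycle:

```python
# Back-of-the-envelope peak FP32 throughput per core, per cycle.
# Assumption: each FMAC retires one fused multiply-add (FMA) per
# 32-bit lane per cycle; an FMA counts as 2 FLOPs.
def peak_flops_per_cycle(fmac_width_bits: int, fmac_count: int) -> int:
    lanes = fmac_width_bits // 32      # FP32 lanes per FMAC
    return lanes * 2 * fmac_count      # 2 FLOPs per lane (mul + add)

# "Bulldozer" module: a 256-bit-capable FPU shared by two cores.
bd_solo   = peak_flops_per_cycle(256, 1)  # one core has the FPU to itself
bd_shared = peak_flops_per_cycle(128, 1)  # both cores loaded: 128 bits each

# Rumored "Zen": four 128-bit units grouped as two 256-bit FMACs per core.
zen = peak_flops_per_cycle(256, 2)

print(zen / bd_solo, zen / bd_shared)  # 2.0 4.0
```

This reproduces both figures floating around the thread: roughly 2x versus a Bulldozer core that has the shared FPU to itself, and 4x when both cores of a module contend for it.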

*View at TechPowerUp Main Site*


----------



## NC37 (Oct 5, 2015)

Just hope AMD isn't going to try to charge a premium for it. Course if they'll finally have CPUs that will go toe to toe with Intel then I'm sure they will.


----------



## bubbly1724 (Oct 5, 2015)

They better deliver this time or they won't have anything left. And the "what if a Steamroller module of two cores was just one big core, and supported SMT instead." sounds like reverse hyperthreading or something, which a lot of people were speculating.


----------



## geon2k2 (Oct 5, 2015)

If they fail, they fail for good.
Apple with A9 just proved that ARM is indeed a solid competitor for Intel, so there will be nobody left to support AMD for competition's sake, and they can just die in peace.

Considering, though, that Intel has brought nothing to the table since Sandy Bridge, they might have a chance. (Lower lithography gives better power and very slightly better performance, which will be moot by the time Zen comes; CPU graphics is irrelevant for performance machines, and the rest of the performance increase over Sandy is mostly due to higher stock clocks.)


----------



## cyneater (Oct 5, 2015)

Too little, too late?


----------



## hellowalkman (Oct 5, 2015)

Zen seems to have success written all over it which is good news for everyone ..


----------



## john_ (Oct 5, 2015)

I wonder how a Zen core will compare to a Thuban core. That way we will have a real idea about what performance increase we have from AMD after 5 years. Because Bulldozer was one or more steps backwards.


----------



## hellowalkman (Oct 5, 2015)

john_ said:


> I wonder how a Zen core will compare to a Thuban core. That way we will have a real idea about what performance increase we have from AMD after 5 years. Because Bulldozer was one or more steps backwards.



Thuban IPC is in between Steamroller and Excavator I believe ..


----------



## Assimilator (Oct 5, 2015)

geon2k2 said:


> Apple with A9 just proved that ARM is indeed a solid competitor for Intel



In the mobile space. Apple has no intention of competing with Intel on desktop, which is the whole point of AMD.


----------



## Ebo (Oct 5, 2015)

#6

1. Not really; the problem with Bulldozer was/is too long a pipeline to run 2 cycles at the same time.

2. They (AMD) hadn't any more power than the i5-2500K, especially when that was OC'ed.

3. The industry didn't go the way AMD had chosen to focus on; just accept that Bulldozer actually was/is a fine server CPU for that environment at the time it came out. It wasn't intended 110% for gaming. The faults the design had from the start were partially solved with the Vishera core, but that's too old now.

4. *If* the Zen design works, and offers better performance than I get from my system today, it will be changed in a heartbeat.


----------



## lilhasselhoffer (Oct 5, 2015)

Thuban was a 45 nm process.  While not too bad for its day, AMD is working with the 14 nm process now, correct?

If Zen was just a shrunk down Thuban they'd be working with somewhere between 7 and 9 times as many transistors squashed into the same approximate space (yeah, not exactly accurate, but 90 nm between features and 28 nm is just a ballpark).


What I'd compare Zen to is Sandy Bridge.  Hear me out, because off hand that is a low bar.  What I'd conjecture is needed is good overclocking, great pricing, DDR4, SATA III, and an ejection of the iGPU theory.  Points 1 and 2 are generally where AMD focuses, so we're good there.  Points 3 and 4 are what AMD promised with the ejection of the AM3+ socket.  The final point is AMD utilizing all of the die space they can to overcome R&D shortcomings.  If AMD can release a desktop CPU that genuinely does all of that, I would gladly go to it rather than a similarly priced Intel offering.  Everything since SB has been either a compromise in overclocking, a compromise in performance (FIVR, sigh), or a compromise in cost (DDR4 really isn't yet performing well enough to justify the upgrade cost). 

Zen could be the first step in AMD getting back to work on good CPUs.  It could also be too little too late.  Let's wait and see, before passing judgement.


Edit:
I have made a mistake.  As per TeNor's correction, the 12 nm process has been changed to a 14 nm process.  Much obliged for the correction.


----------



## micropage7 (Oct 5, 2015)

Nice, they're working on per-core performance.
I'm kinda sick of their many cores and high GHz that still can't challenge Intel's processors.
Just make a mid-range processor with better per-core performance and lower power consumption; I guess that would help them a lot in the market.


----------



## bug (Oct 5, 2015)

Number crunching? That's a little suspect.
We already know AMD is using one FPU for every two CPU cores. I hope adding an FPU for each core is NOT the best feature Zen has to offer.


----------



## TeNor (Oct 5, 2015)

*#11*

As far as is known, AMD will release Zen on 14 nm (GloFo) or 16 nm (TSMC) FinFET technology.

By the way, you are right to compare Zen to SB. If Zen reaches SB's performance level, I would say well done!

Based on my own Cinebench R15 single-thread result calculations, SB has approx. 45-50% more IPC than Piledriver/Steamroller and ~30% more than K10. (See how bad the Bulldozer family is?) So reaching SB's performance level would be a great leap forward.

The flip side is that it'd still be behind Intel's actual performance level.
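The two ratios above can be cross-checked against each other. Normalizing everything to Sandy Bridge implies K10 (Thuban) sits roughly 13% ahead of Piledriver per clock, which lines up with the earlier Thuban comment in the thread. A quick sketch, using 47.5% as the midpoint of the quoted 45-50% range (my choice, not TeNor's):

```python
# Normalize per-clock performance to Sandy Bridge = 1.0, using the
# ratios quoted above (47.5% is the midpoint of the 45-50% range).
sb = 1.0
piledriver = sb / 1.475   # SB ~47.5% faster per clock than Piledriver
k10 = sb / 1.30           # SB ~30% faster per clock than K10 (Thuban)

# Implied K10 (Thuban) advantage over Piledriver per clock:
print(f"{k10 / piledriver - 1:.1%}")  # 13.5%
```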


----------



## Chaitanya (Oct 5, 2015)

I will believe it when I see reviews from an independent authority.


----------



## bpgt64 (Oct 5, 2015)

I have all the hope in the world for Zen/AMD, but I will definitely be waiting for a review.  However, if Zen gives us a 16-core desktop processor that's within 80% of Haswell single-threaded performance, I'll be switching... Having a 16-core monster sounds awesome, especially considering how Intel has relegated its 8+ cores to Servers/Xeons for the most part.


----------



## geon2k2 (Oct 5, 2015)

Assimilator said:


> In the mobile space. Apple has no intention of competing with Intel on desktop, which is the whole point of AMD.



Well, they could if they wanted to.
They have a 2500 Geekbench single-thread score at 1.8 GHz, and in a very power-restricted environment.

http://cdn.arstechnica.net/wp-content/uploads/2015/09/charts.0011.png

An i5 4440 at 3.1 GHz has ~2900 in the same test.

http://browser.primatelabs.com/geekbench3/search?utf8=✓&q=i5+4440

And the FX8350 is around 2400 

http://browser.primatelabs.com/geekbench3/search?utf8=✓&q=fx+8350

They are definitely competitive, and that is for sure a desktop-class CPU. If Apple could push ARM this far, I'm sure others will soon follow, and there are big heavy names there: Qualcomm, Samsung, nVidia...
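Normalizing those scores per clock makes the point starker. The scores are the ones cited in the post; the FX-8350's 4.0 GHz base clock is my addition:

```python
# Geekbench 3 single-thread points per GHz, from the scores quoted above.
# Assumption (not in the post): the FX-8350 runs at its 4.0 GHz base clock.
chips = {
    "Apple A9": (2500, 1.8),   # (score, GHz)
    "i5-4440":  (2900, 3.1),
    "FX-8350":  (2400, 4.0),
}
for name, (score, ghz) in chips.items():
    print(f"{name}: {score / ghz:.0f} points/GHz")
# Apple A9: 1389, i5-4440: 935, FX-8350: 600
```

Per clock, the A9 comes out well ahead of both desktop chips, which is the "power-restricted environment" point the post is making.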


----------



## mastrdrver (Oct 5, 2015)

Original source

It might be worth noting that Jim Keller worked at DEC in the late '90s, when DEC first developed the idea of SMT.

It's believed that the processor that would have come out after the first one with SMT would have gone from 2 threads per core to 4. Some have suggested that one of the changes coming to Zen+ (the successor to Zen) will take it to 4 threads per core.


----------



## dj-electric (Oct 5, 2015)

Did nobody ask:

If Zen is so promising, why did Keller leave after he finished the project?


----------



## happita (Oct 5, 2015)

Dj-ElectriC said:


> Did nobody ask:
> 
> If Zen is so promising, why did Keller leave after he finished the project?



The question has been tackled 100 times; I'll make it 101. It's because he finished his job (contract) and now has nothing else to do, and on top of that, it seems AMD can't afford to keep him on for future projects.


----------



## Random Murderer (Oct 5, 2015)

Dj-ElectriC said:


> Did nobody ask:
> 
> If Zen is so promising, why did Keller leave after he finished the project?


Because that's what Keller does; he finishes an architecture and then jumps ship to work on something different. It's not just AMD he's done this to (though this makes the third time he's done it to AMD); he did it to Apple, as well as IBM, IIRC.


----------



## librin.so.1 (Oct 5, 2015)

Looks like it actually has *FOUR TIMES the floating point units*.
In Bulldozer and later, in full config, there are four 2x128-bit FPUs; each can either act as one 256-bit / 2x128-bit unit for a single core, or get split into a single 128-bit unit per core on workloads where both cores access the shared FPU.
So, by having 4x128-bit units per core, in a way, Zen has _four times_ the floating-point units of Bulldozer and later.


----------



## AVXX (Oct 5, 2015)

If the Greenland 16-core comes to pass...

... and can clock at a respectable 3GHz+ without melting

... and is priced comparably to Intel's high end desktop / low end workstation offerings

... and packs 16 SMT cores with four SSE FMACs each

.. then AMD are well and truly back in the game. At least until such time as Cannonlake arrives.

(If Cannonlake on desktop has 6-8 cores with AVX512 FMACs, AMD's victory may be rather short lived...)


----------



## lilhasselhoffer (Oct 5, 2015)

TeNor said:


> *#11*
> 
> As far as it can be known AMD will release Zen on 14nm (GloFo) or 16nm (TSMC) FinFET technology.
> 
> ...



Much obliged for the correction.  Don't know why 12 nm popped into my head, but it was in error.

If Zen performs as well as SB, per core, it'll knock the ball out of the park.  IB was a joke, because of that cheap thermal paste.  Haswell brought better paste, but FIVR.  Skylake looks to be a genuine upgrade, but DDR4 just isn't worth the extra cost.

By the time DDR4 drops in price, and speeds up, we'll see Zen.  If it follows other AMD offerings, we'll have a competent PCH, a focus on being unlocked, and a boat load of cores.  SB was locked to 4 cores.  Even SB-e topped out at 6 cores.  SB-e's PCH was terrible (speaking as an owner, it just didn't have enough of anything without expansion cards).  SB overclocked very well, but it suffered the Intel lockdown unless you spent the tax on a K processor.

I'm expecting SB level performance, with more cores, running cooler.  With that kind of a base, the overclocking will more than make up the ground for IB and Haswell.  It still might be behind Skylake, but those extra cores would make all the difference.




Dj-ElectriC said:


> Did nobody ask:
> 
> If Zen is so promising, why did Keller leave after he finished the project?



Every time.

Do you ask why the pediatrician isn't your doctor for life?  Do you ask why the assembly line worker does only one job, and never actually finishes a car?  Do you ask why everyone doesn't cross the finish line in a marathon?  If the answer was yes to any of these you might need to seek medical help, due to damaged cognitive functions.

Keller left because his part was over, and he's functionally a mercenary.  You hire him, set a goal, put money on the table, and negotiate the contract.  Keller doesn't get involved in production, marketing, or support.  He designs, then leaves.  His career speaks to that tendency, and conflating his leaving with some issue is foolish.


----------



## AVXX (Oct 5, 2015)

Not entirely true, Gorbaz: SSE4.x & AVX2 both support vector integer computation, but the hardware that crunches it still gets referred to as FMACs. It depends on whether or not the integer code in question can be vectorized.


----------



## GorbazTheDragon (Oct 5, 2015)

Title is misleading... It only doubles floating point, not integer performance.


----------



## nem (Oct 5, 2015)

Intel fanboys after reading this news.


----------



## theeldest (Oct 5, 2015)

NC37 said:


> Just hope AMD isn't going to try to charge a premium for it. Course if they'll finally have CPUs that will go toe to toe with Intel then I'm sure they will.



If it goes toe-to-toe with Intel they had BETTER charge a premium for it. Otherwise how will they regain financial stability?


----------



## Solidstate89 (Oct 5, 2015)

NC37 said:


> Just hope AMD isn't going to try to charge a premium for it. Course if they'll finally have CPUs that will go toe to toe with Intel then I'm sure they will.


If it actually does somehow manage to go toe-to-toe with Intel (or at least close to it), why shouldn't they charge more for it than their current CPU offerings?

AMD is a company out to make a profit, not be your best friend.


----------



## cdawall (Oct 5, 2015)

AMD has to rebuild their name. Even if this competes with Intel's offerings (outside of the server market), AMD still carries the negative connotation of hot, power-hungry, underperforming processors. This will force AMD to sell their CPUs at a lower price to regain marketshare.


----------



## ZeDestructor (Oct 5, 2015)

geon2k2 said:


> Considering, though, that Intel has brought nothing to the table since Sandy Bridge, they might have a chance. (Lower lithography gives better power and very slightly better performance, which will be moot by the time Zen comes; CPU graphics is irrelevant for performance machines, and the rest of the performance increase over Sandy is mostly due to higher stock clocks.)



Not true. Intel has made steady 5-10% IPC improvements from SNB to SKL (I expand on it in fair detail here and here, complete with sources!)



cdawall said:


> AMD has to rebuild their name. Even if this competes with Intel's offerings (outside of the server market), AMD still carries the negative connotation of hot, power-hungry, underperforming processors. This will force AMD to sell their CPUs at a lower price to regain marketshare.



They have to compete at the server level if they want to actually earn real money, and with the success Intel has had with SNB-EP/IVB-EX and newer chips, it's very clear that you need lots of cores, lots of bandwidth, lots of IPC, and a decent amount of vector processing on the CPU, even on very GPU-centric boxes. That's where AMD is heading back to after the utter failure of their HSA "bet" (Bulldozer cores with strong integer perf, offloading FP to GPU(s)).


----------



## FordGT90Concept (Oct 5, 2015)

GorbazTheDragon said:


> Title is misleading... It only doubles floating point, not integer performance.


FPU is where AMD is weakest because, presently, it is shared across two cores.  AMD assumed most FPU work was going to move to the GPU which is false.  They took the Achilles' Heel of the CPU and broke it.


----------



## Casecutter (Oct 5, 2015)

Did anyone notice the picture actually says (with little speculation) it's from Mark Waldhauer? Well, it seems "Dresdenboy" aka Matthias Waldhauer took inspiration from this patch and hypothesizes more here as to what the patch might reveal.

I think it's all speculation, and it's just really too early to start beating any Zen drum.

While it's fun to think what Jim Keller might have come up with, I think the premise of...


btarunr said:


> "what if a Steamroller module of two cores was just one big core, and supported SMT instead."


Does anyone really believe it was that straightforward of a revamp?

I say temper the enthusiasm, as this is "speculation", and nothing based on fact about how AMD/Keller went about laying it out.


----------



## RejZoR (Oct 5, 2015)

While it's nice to keep us in the loop, I kinda wish AMD would just drop Zen out of the blue and shock Intel a bit. Giving them a heads-up like this only means Intel will stop resting on its laurels and start doing shit right now. Which is bad for AMD considering the resources (in)balance...


----------



## FordGT90Concept (Oct 5, 2015)

If AMD comes out swinging, they'll have a window to reclaim marketshare before Intel can push out processors with more cores.  I'm not entirely sure LGA 15## can even handle a mainstream AMD monster.  Intel could get caught with its pants down with only LGA 2011 being able to respond.  My hope is that AMD will force Intel to move LGA 2011 to mainstream/enthusiast and LGA 15## to budget.


----------



## librin.so.1 (Oct 5, 2015)

RejZoR said:


> Giving them a heads-up like this only means Intel will stop resting on its laurels and start doing shit right now. Which is bad for AMD considering the resources (in)balance...


Yeah, because corporate espionage doesn't exist and intel finds this stuff out the same time we normal people get to know it.


----------



## ZeDestructor (Oct 5, 2015)

Casecutter said:


> Did anyone notice the picture actually says (with little speculation) it's from Mark Waldhauer? Well, it seems "Dresdenboy" aka Matthias Waldhauer took inspiration from this patch and hypothesizes more here as to what the patch might reveal.
> 
> I think it's all speculation, and it's just really too early to start beating any Zen drum.
> 
> ...



If the source data is real, then the diagram is pretty logical and straightforward as far as overall speculation goes. And from where I'm sitting, it looks very much like a clone of the SNB-HSW lineup as far as the overall design goes (EU counts, port counts, number of each EU type).



FordGT90Concept said:


> If AMD comes out swinging, they'll have a window to reclaim marketshare before Intel can push out processors with more cores.  I'm not entirely sure LGA 15## can even handle a mainstream AMD monster.  Intel could get caught with its pants down with only LGA 2011 being able to respond.  My hope is that AMD will force Intel to move LGA 2011 to mainstream/enthusiast and LGA 15## to budget.



LGA1155 can easily handle 8, maybe even 12 cores. The reason it can lies in the PCIe and memory controllers being on the same die as the CPU core, and as a result, the socket just acts as a unified PCIe/RAM/DMI socket. Hell, if they wanted to, they could probably fit their 18-core monster Xeon on LGA1155, but at that point, dual-channel RAM would be an actual limiting factor.

EDIT: I talked about this in an old thread a bit, and my expectation hasn't changed: if Zen outperforms SKL/KBL, Intel will just change core counts at the different price points and match AMD again, the obvious move being enabling HT on the i5 and moving the i7 to 6-core HT.


----------



## FordGT90Concept (Oct 6, 2015)

It's the power/VRM that's the problem.  As far as we know, LGA 1151 tops out at the 95 W that the i7-6700K uses.  If they add two more cores, something is going to suffer, be it cache, removal of the GPU, or a significant drop in clockspeed.


----------



## ZeDestructor (Oct 6, 2015)

FordGT90Concept said:


> It's the power/VRM that's the problem.  As far as we know, LGA 1151 tops out at the 95 W that the i7-6700K uses.  If they add two more cores, something is going to suffer, be it cache, removal of the GPU, or a significant drop in clockspeed.



Minor drop in clockspeed (heat/power scales linearly with clock and quadratically with voltage, and voltage scales with clockspeed... effectively something between linear and cubic scaling overall), and it would require the beefier end of mobos (the average Z170 should be good, with the average 25+% OC headroom considered the bare minimum). Possibly more restrained Turbo as well, for good measure.
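The scaling rule being invoked here (dynamic power proportional to f*V^2, with voltage tracking clock to some degree) can be sketched as follows. The linear voltage-frequency assumption is mine, chosen to show the cubic end of the range:

```python
# Dynamic power scaling: P is proportional to f * V^2.
# volt_exponent controls how voltage tracks frequency:
#   0 -> voltage held flat        (power scales linearly with clock)
#   1 -> voltage scales with clock (power scales cubically) -- an
#        illustrative assumption; real V/f curves sit in between.
def relative_power(freq_ratio: float, volt_exponent: float) -> float:
    voltage_ratio = freq_ratio ** volt_exponent
    return freq_ratio * voltage_ratio ** 2

print(relative_power(1.2, 0))  # +20% clock, flat voltage -> 1.2x power
print(relative_power(1.2, 1))  # +20% clock and voltage   -> ~1.73x power
```

So two extra cores' worth of power headroom can plausibly be clawed back with a modest clock/voltage reduction, which is the post's point.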


----------



## librin.so.1 (Oct 6, 2015)

FordGT90Concept said:


> removal of the GPU


When was the last time Intel did this to a non-Xeon CPU?


----------



## Scrizz (Oct 6, 2015)

cdawall said:


> AMD has to rebuild their name. Even if this competes with Intel's offerings (outside of the server market), AMD still carries the negative connotation of hot, power-hungry, underperforming processors. This will force AMD to sell their CPUs at a lower price to regain marketshare.



That sounds like Intel in the P4-Pentium D era.


----------



## FordGT90Concept (Oct 6, 2015)

Vinska said:


> When was the last time Intel did this to a non-Xeon CPU?


It would have made sense on 6700K.  Even so, we don't see 6 core Xeons on LGA 15## either.


----------



## Corey (Oct 6, 2015)

cyneater said:


> Too little, too late?


 
Nah, they ain't on their death bed yet, but they are close. I think a lot of people have been waiting for this chip for a long time. I have personally been waiting to upgrade my FX 8350 and my Sandy Bridge i7. It might be better than the FX, but I don't think it's going to do much over the Sandy (it should be a lil faster, but prob not enough to upgrade). What is going to matter most is whether they bring Zen out at the cheaper price point of the i5s; I'm hoping it sits between the i5 and i7 in performance.


----------



## RejZoR (Oct 6, 2015)

FordGT90Concept said:


> It's the power/VRM that's the problem.  As far as we know, LGA 1151 tops out at the 95 W that the i7-6700K uses.  If they add two more cores, something is going to suffer, be it cache, removal of the GPU, or a significant drop in clockspeed.



No it's not. They already have the 5820K, and boards can deliver way over 95W for LGA1151. They just need to sack the stupid GPU part in the 6700K and replace it with extra cores. But they're not going to do that, because the 5820K already exists and they don't want to spit in their own enthusiast bowl...


----------



## FordGT90Concept (Oct 6, 2015)

5820K is on LGA 2011.  See this post for context.


----------



## Steevo (Oct 6, 2015)

I look at it all this way.


We are close to the end of the line with IPC improvements for Intel or AMD, the rest will come through process, cache, and instruction set/hardware support. We are close to the end of the line with Silicon in the high performance categories. We can slap more cores in, more sockets, more memory. But the next big thing is either going to be quantum computing, or photon based. 


Either way until I see hard numbers from a source I trust, AMD is a zombie.


----------



## AsRock (Oct 6, 2015)

RejZoR said:


> While it's nice to keep us in the loop, I kinda wish AMD would just drop Zen out of the blue and shock Intel a bit. Giving them a heads-up like this only means Intel will stop resting on its laurels and start doing shit right now. Which is bad for AMD considering the resources (in)balance...



I am sure Intel has so many ideas by now that even if AMD did release a totally awesome CPU, Intel would have something better; as we all know, Intel's been holding back, and it's had a hell of a long time to plan shit.


----------



## RejZoR (Oct 6, 2015)

FordGT90Concept said:


> 5820K is on LGA 2011.  See this post for context.



You know, considering I own one, I'd most likely know that, don't you think? What I was saying is that they have the tech in retail form. They can easily produce something similar for LGA1151 in no time.

The only thing AMD will mess up is the way mainstream and enthusiast are now separated. Because if Intel has to bump up mainstream quickly, it means they'll have to bump up enthusiast as well; otherwise they'll make them equal, which means they'll sell less of the more expensive enthusiast platforms and CPUs. But that's good as far as consumers go, assuming the prices stay low...


----------



## FordGT90Concept (Oct 6, 2015)

It takes at least a year to produce a new processor, from design off an existing microarchitecture through prototyping to mass production.  If AMD put out a processor that beats the 6700K (in today's terms) for $300, Intel couldn't respond with a competitive product for a year.  Even assuming they already did prototyping and have a design ready for mass production, AMD could still get a several-month-long window to grab market share.

Intel's immediate response would probably be cutting prices across the board, which likely moves LGA 2011 into mainstream prices.  As I said in the post, a shocker from AMD could turn LGA 1151 into a budget socket overnight and LGA 2011 into mainstream.


----------



## geon2k2 (Oct 6, 2015)

ZeDestructor said:


> Not true. Intel has made steady 5-10% IPC improvements from SNB to SKL (I expand on it in fair detail here and here, complete with sources!)



You're a veteran here, and for sure you know what you're saying.
Obviously there were improvements and new instruction sets, but they were underwhelming in current applications, to say the least. 

Of course Intel will still have the performance crown for a long time from now on, as they have strong Enthusiast parts, but that is not the mainstream market. The mainstream market is formed by the i3, i5, and i7 non-E series.

If anyone can produce a competitive product here, then they are in business, and Intel in this area has pretty much been sleeping over the years. You said it yourself in those long posts: Ivy was a die shrink; Haswell brought very good power consumption; Broadwell is a different beast, but I'd exclude it as it has expensive eDRAM and is mostly a very expensive GPU with CPU cores, good for laptops... but not very viable, in my opinion, from a cost perspective for desktop. Skylake, mostly a die shrink... again, a bit underwhelming.

I found an article which compares different-generation performance at the same clock as well.
It doesn't include the latest architecture, though, but it is focused on gaming, which is basically the reason most people buy these processors; otherwise, for browsing or office, even an Atom is good. If you think that is too extreme, fine, go with an i3.

See here: http://wccftech.com/intel-sandy-bridge-ivy-bridge-haswell-graphics-compared-10-difference-average/

I put them in a table also, and compared similar products:

| Lowest details | i7 2600K @ 4.5 | i7 4770K @ 4.5 | % increase |
| --- | --- | --- | --- |
| Crysis 3 | 91 | 93 | 2% |
| Black Ops 2 | 355.4 | 382.8 | 8% |
| Bioshock | 243.2 | 265.9 | 9% |
| Battlefield 3 | 199.8 | 200 | 0% |
| Unigine Heaven | 4243 | 4280 | 1% |
| Firestrike | 7292 | 7466 | 2% |
| **Average** | | | **4%** |
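The per-title deltas in the table can be recomputed from the raw figures; the exact average comes out at about 3.8%, which rounds to the 4% quoted:

```python
# Recompute the % increases from the raw (2600K, 4770K) pairs above.
results = {
    "Crysis 3":       (91,    93),
    "Black Ops 2":    (355.4, 382.8),
    "Bioshock":       (243.2, 265.9),
    "Battlefield 3":  (199.8, 200),
    "Unigine Heaven": (4243,  4280),
    "Firestrike":     (7292,  7466),
}
deltas = {name: new / old - 1 for name, (old, new) in results.items()}
for name, d in deltas.items():
    print(f"{name}: {d:+.1%}")
print(f"Average: {sum(deltas.values()) / len(deltas):+.1%}")  # +3.8%
```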

This increase you would probably get by just shrinking the original Sandy without any architecture change. Maybe things will change once the new instruction sets start to be used... but if they are not supported by most of the computers in use, they will mostly be scattered optimizations in one app or another. 

So yes, if anybody can get sandy level performance, good clocks and reasonable power consumption they are back in business.


----------



## bug (Oct 6, 2015)

geon2k2 said:


> You're a veteran here, and for sure you know what you're saying.
> Obviously there were improvements and new instruction sets, but they were underwhelming in current applications, to say the least.
> 
> Of course Intel will still have the performance crown for a long time from now on, as they have strong Enthusiast parts, but that is not the mainstream market. The mainstream market is formed by the i3, i5, and i7 non-E series.
> ...



Those tests are only looking at graphics. Here's a more complete set: http://www.hardocp.com/article/2015/08/05/intel_skylake_core_i76700k_ipc_overclocking_review/4
You'll see gains in synthetic performance and mostly encoding and rendering. Other things are in the same ballpark as Sandy Bridge.


----------



## AsRock (Oct 6, 2015)

FordGT90Concept said:


> It takes at least a year to produce a new processor, from design off an existing microarchitecture through prototyping to mass production.  If AMD put out a processor that beats the 6700K (in today's terms) for $300, Intel couldn't respond with a competitive product for a year.  Even assuming they already did prototyping and have a design ready for mass production, AMD could still get a several-month-long window to grab market share.
> 
> Intel's immediate response would probably be cutting prices across the board, which likely moves LGA 2011 into mainstream prices.  As I said in the post, a shocker from AMD could turn LGA 1151 into a budget socket overnight and LGA 2011 into mainstream.



Not sure if it would take a year for Intel, but if it did, I bet all current chips would drop in price to discourage people from buying anything other than theirs.


----------



## medi01 (Oct 6, 2015)

Guys, back in the Prescott times Intel had no problem selling more power-hungry, slower, yet more expensive P4s over Athlon 64s; why would it have any problems with Zen?


----------



## GorbazTheDragon (Oct 6, 2015)

geon2k2 said:


> So yes, if anybody can get sandy level performance, good clocks and reasonable power consumption they are back in business.


The thing is, you can't just consider gaming performance. SKL is a good 20-30% faster in CPU-bound applications... In fact, the FX CPUs still make decent gaming machines, as long as you are not pushing for very high framerates in certain games, but the fact that they are now lagging behind Intel in overall performance is the killer, especially considering they are 100-200 W parts. 

SB-level performance on a per-core basis will be acceptable, assuming the power envelope for a quad core is around the same as or lower than SKL's, but they will need to be pushing at least 8-core parts and should consider 24-32 PCIe lanes for higher-end GPU setups.

Also, they need to be considering at least triple-channel memory, especially if they want to put the APUs in the same socket; otherwise there will be zero reason not to go with an Intel part and a dGPU. That said, Intel should also be considering the same if they want to push their iGPU market forward. And this is all on top of the things Skylake already brings to the table as far as chipset connectivity goes.


----------



## ZeDestructor (Oct 6, 2015)

Scrizz said:


> That sounds like Intel in the P4-Pentium D era.



Bulldozer has been AMD's NetBurst.



FordGT90Concept said:


> It would have made sense on 6700K.  Even so, we don't see 6 core Xeons on LGA 15## either.



Never gonna happen on the mainstream chip. It's just too good a place to crush the low-end nVidia/AMD cards. Remember, the 6700K is just an unlocked 6700/E3-1275v5 (whenever that comes out), which are both quite popular in business settings (partly because of the quite decent iGPU). It's also the same core shared with all other quad-core CPUs, with some cache and HT sliced off for the i5 variants.



Steevo said:


> We are close to the end of the line with IPC improvements for Intel or AMD, the rest will come through process, cache, and instruction set/hardware support. We are close to the end of the line with Silicon in the high performance categories. We can slap more cores in, more sockets, more memory. But the next big thing is either going to be quantum computing, or photon based.



Pretty much



RejZoR said:


> Only thing that AMD will mess up is the way how mainstream and enthusiast is now separated. Because if Intel will have to bump mainstream up quickly, it means they'll have to bump up enthusiast as well, otherwise they'll make them equal which means they'll sell less of the more expensive enthusiast platforms and CPU's. But that's good as far as consumers go, assuming the prices stay low...



Been calling that since the Broadwell rumour days. KBL or CNL is when that will happen I think.



FordGT90Concept said:


> It takes at least a year to produce a new processor, including design from an existing microarchitecture to prototyping to mass production.  If AMD put out a processor that beats the 6700K (in today's terms) for $300, Intel couldn't respond with a competitive product for a year.  Even assuming they already did prototyping and have a design ready for mass production, AMD could still get a several-month-long window to grab market share.
> 
> Intel's immediate response would probably be cutting prices across the board which likely moves LGA 2011 into mainstream prices.  As I said in the post, a shocker from AMD could turn LGA 1151 into a budget socket overnight and LGA 2011 into mainstream.



They probably already have 6-core LGA115x variants internally validated (or nearing validation) and ready to go (I know I would, with the rumours currently flying). Based on how close the E launch has been to the EP launch, it looks like most of the extra time needed for E/EP/EX chips goes into multi-CPU and PCIe validation (everything from quads to 18-core chips launches at the same time), neither of which is needed on LGA115x.



AsRock said:


> Not sure if it would take a year for Intel, but if it did I bet all current chips would drop in price to discourage people from buying anything other than theirs.



Some reshuffling of the lineup: increased core counts on i7, HT on i5.



medi01 said:


> Guys, back in the Prescott days Intel had no problem selling more power-hungry, slower, yet more expensive P4s over Athlon64s, so why would it have any problems with Zen?



Intel had to resort to some seriously scummy anti-competitive tactics back then, got sued by AMD, AMD won, and AMD proceeded to wreck Intel in the server space with their way better Opterons. Intel seems determined not to get back into a similar situation ever again.



GorbazTheDragon said:


> The thing is, you can't just consider gaming performance. SKL is a good 20-30% faster in CPU-bound applications... In fact, the FX CPUs still make decent gaming machines, as long as you are not pushing for very high framerates in certain games, but the fact that they are now lagging behind Intel in overall performance is the killer, especially considering they are 100-200 W parts.
> 
> SB-level performance on a per-core basis will be acceptable, assuming the power envelope for a quad-core is around the same as or lower than SKL's, but they will need to be pushing at least 8-core parts and should consider 24-32 PCIe lanes for higher-end GPU setups.
> 
> Also, they need to be considering at least triple-channel memory, especially if they want to put the APUs in the same socket; otherwise there will be zero reason not to go with an Intel part and a dGPU. That said, Intel should also be considering the same if they want to push their iGPU market forward. And this is all on top of the things Skylake already brings to the table as far as chipset connectivity goes.



AMD already has quad-channel memory on their high-end Opterons. What they really need to do is move the PCIe controller into the CPU and turn HyperTransport up to 11, to make cross-CPU PCIe access not crap and match what Intel is doing on LGA2011.


----------



## GorbazTheDragon (Oct 6, 2015)

ZeDestructor said:


> AMD already has quad-channel memory on their high-end Opterons. What they really need to do is move the PCIe controller into the CPU and turn HyperTransport up to 11, to make cross-CPU PCIe access not crap and match what Intel is doing on LGA2011.


I'm pretty sure they will have an integrated PCIe controller, especially since they want to have the same socket for both the normal CPUs and the APUs. Opterons have no influence on the consumer market, even less than Intel's high-end Xeons.

HyperTransport is essentially the equivalent of DMI once AMD moves to an integrated northbridge (memory/PCIe controller), so to accommodate the faster chipset interfaces (PCIe 3.0, SATA Express, M.2, high SATA 6G port counts, USB 3.1, etc.) they will NEED to make improvements there.


----------



## RichF (Oct 7, 2015)

TeNor said:


> *#11*
> 
> As far as it can be known AMD will release Zen on 14nm (GloFo) or 16nm (TSMC) FinFET technology.
> 
> ...


Cinebench is an Intel-centric benchmark as far as I know because it emphasizes FP.


----------



## ZeDestructor (Oct 7, 2015)

GorbazTheDragon said:


> I'm pretty sure they will have an integrated PCIe controller, especially since they want to have the same socket for both the normal CPUs and the APUs. Opterons have no influence on the consumer market, even less than Intel's high-end Xeons.



Err... wat? Do you not remember the good old S940 Athlon 64 FX chips? Those were straight-up binned Opteron chips, down to requiring registered ECC DIMMs on consumer boards. Single-CPU Opteron platforms also share the same socket as desktop chips (AM3+ right now). You need to go up to multi-socket servers before Socket C32 and Socket G34 come into play. They also reuse Socket FT3 (a BGA relative of AM1) for the APU-based server chips.



GorbazTheDragon said:


> HyperTransport is essentially the equivalent of DMI once AMD moves to an integrated northbridge (memory/PCIe controller), so to accommodate the faster chipset interfaces (PCIe 3.0, SATA Express, M.2, high SATA 6G port counts, USB 3.1, etc.) they will NEED to make improvements there.



No. AMD so far has only moved the memory controller into the CPU on the big cores, much like LGA1366 Nehalem/Westmere from Intel. This leaves HyperTransport doing inter-CPU communication and CPU-northbridge communication, with a separate southbridge. HyperTransport is really much closer to QPI than to DMI.

Of course, the APUs have been updated much more frequently, and have over time had the PCIe controller integrated into the CPU. Rumours say that they want to integrate the SouthBridge too at some point, which would be quite fun to see.

The DMI equivalent on AMD's side would be A-Link Express or UMI, basically a tweaked PCIe x4 interface.



RichF said:


> Cinebench is an Intel-centric benchmark as far as I know because it emphasizes FP.



There's no Intel preference; it's just that AMD didn't put as much FP hardware into Bulldozer (based on the bet that basically everyone would port FP code to GPUs). If you compare a 6-core Thuban to a 6-core Bulldozer, the older Thuban beats the Bulldozer in Cinebench MT.


----------



## medi01 (Oct 7, 2015)

ZeDestructor said:


> Intel had to resort to some seriously scummy anti-competitive tactics back then, got sued by AMD, AMD won, and proceeded to wreck Intel the server space with their way better Opterons. Intel seems determined not to get back into a similar situation ever again.



Opteron never went beyond 12% market share, despite being cheaper, cooler, and faster.
Intel was not defeated in court; they merely settled for $1 billion, a laughable sum compared to the damage done.

There was hardly anything new as far as anti-competitive practices go in the Prescott era; AMD had the same problems before that.


Neither did NVIDIA lose much market share, despite the Fermi fiasco.


PS
On a side note: does any AMD CPU beat the i5-750 (a 45 nm product) IPC-wise? Or offer something with comparable performance but better perf/watt?


----------



## ZeDestructor (Oct 7, 2015)

AMD went from 5% to over 25% within a couple of years of K8. By 2007, though, 65 nm Core (Conroe) was ready, so that's where it stalled and Intel took the crown back.

Fermi succeeded despite its issues because of CUDA, even though NVIDIA lost a solid 10% market share (thanks to dbz over at Beyond3D for that figure). In comparison, AMD's Stream API was hopeless, and OpenCL was just a pipe dream.

On the i5-750: the FX-8150 beats it in some places and loses out in others. An FX-8350, in comparison, is either neck and neck or faster.


----------



## medi01 (Oct 8, 2015)

AMD went from 12% to 25% (more than I thought), but that's still a rather modest change, considering we are talking about a product that was better on all fronts.

The CUDA argument is laughable.

As a bottom line: slower, more expensive, more power-hungry products from AMD's competitors only mildly affect their market share (a 10%-ish swing, if that), while AMD slipping on just one of those points can see its market share cut in half. This looks particularly bad in the graphics-card market, where AMD has more than competitive products.


----------



## R-T-B (Oct 9, 2015)

ZeDestructor said:


> Fermi succeeded despite its issues because of CUDA, even though NVIDIA lost a solid 10% market share (thanks to dbz over at Beyond3D for that figure). In comparison, AMD's Stream API was hopeless, and OpenCL was just a pipe dream.



Agreed on Stream (it was destined to die), but I think you are massively underestimating the market penetration of OpenCL.


----------



## progste (Oct 9, 2015)

I hope we finally get a great AMD CPU lineup.


----------



## Aquinus (Oct 9, 2015)

ZeDestructor said:


> AMD's Stream API was hopeless, and OpenCL was just a pipe dream


OpenCL is not a pipe dream; it's a tool for a very particular class of problems. The issue is that OpenCL is only helpful in the handful of situations where massively parallel processing on large data sets is advantageous. You think making applications multi-threaded is hard? Imagine having only parallel compute at your disposal, because that's basically what OpenCL is. One doesn't tend to write an entire application in OpenCL; it tends to complement applications that aren't OpenCL rather than stand on its own. Just wanted to point that out. You don't write real-time applications in PHP, and you don't write web applications in C (cgi-bin). You use the right tool for the right job.
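That "only parallel compute at your disposal" shape can be sketched without any real OpenCL: a kernel is one function applied independently to every element of a large buffer. This is a pure-Python stand-in for illustration only; no actual OpenCL API or GPU is involved, and the names are made up.

```python
# Pure-Python stand-in for the data-parallel model described above.
# In real OpenCL the kernel would be compiled C and each work item would
# run on its own GPU lane; here we only mimic the shape of the computation.

def kernel(x):
    # One "work item": it sees a single element and no shared mutable state.
    return x * x + 1.0

def launch(buffer):
    # The host "launches" one work item per element of the buffer.
    return [kernel(x) for x in buffer]

print(launch([0.0, 1.0, 2.0]))  # [1.0, 2.0, 5.0]
```

The restriction is the point: because every work item is independent, the hardware is free to run millions of them in parallel, which is exactly why only certain problems fit.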

What is a pipe dream is AMD thinking that they could make a pipeline as long as NetBurst's but not encounter the same problems as NetBurst. That's what I call a pipe dream.


----------



## Arjai (Oct 10, 2015)

That's it. I am seriously going to talk with my broker.

Zen could make or break my investment future. Or not. I figure a one- to two-hundred-dollar investment now could be a big deal or a minor loss.

Putting it on my Monday calendar. I am gonna bank on the new Zen chip. There's just too much to lose by not doing it!


----------



## bpgt64 (Oct 10, 2015)

I would love AMD to take a good look at a comparable ShadowPlay-esque feature. It's been the only way (outside of a $1,000 investment in a 4K external capture system) to record in 4K for most games.


----------



## librin.so.1 (Oct 10, 2015)

I believe they already have that.


----------



## 64K (Oct 10, 2015)

AMD has two battles to fight. The biggest one isn't whether Zen is a good performer compared to whatever Intel has in the war chest to counter Zen if it's strong. The biggest battle is getting their chips into the computer manufacturers' PCs at a decent profit. It does little good for AMD to have great chips if they can't sell them at a decent profit.

http://www.investopedia.com/stock-a...oblems-facing-advanced-micro-devices-amd.aspx

"AMD competes with Intel in PC and server processors but has been bleeding share in both markets. In servers, AMD won as high as 25% market share in 2006, making it a major player in the industry. Today, AMD is essentially nonexistent in the segment, claiming a low single-digit share. Meanwhile, Intel has built a near-monopoly, allowing it to charge high prices and extract extremely high margins.

In PCs, the story is much the same. Intel has continued to steal market share from AMD, especially at the low end with its Atom chips, despite already controlling most of the segment. During the second quarter of 2014, Intel generated nearly 95% of PC processor revenue, shipping 84% of all desktop processors and 88% of all laptop processors."

The PC market continues to shrink as well. I read a report that this may change around 2019 as emerging markets come online, but that is just speculation for now.


----------



## RichF (Oct 11, 2015)

ZeDestructor said:


> There's no Intel preference, just that AMD didn't put in as much FP hardware in Bulldozer...


There are other benchmarks that don't emphasize FP so much. Just using Cinebench to compare processors is questionable.


----------



## alucasa (Oct 11, 2015)

I am old enough to know that whatever numbers they are throwing around must be taken with a mountain of salt, especially given their past track record.

I will hold my judgement until I actually see this thing rolling.


----------



## cadaveca (Oct 12, 2015)

alucasa said:


> I am old enough to know that whatever numbers they are throwing must be taken with a mountain of salt, especially given their past track record.
> 
> I will hold my judgement until I actually see this thing rolling.


One might question why this even made it into the public space. Given rumors of a sale/merger/breakup, this sort of info is really only useful to investors or those looking to purchase AMD.


----------



## Israsuke (Oct 14, 2015)

Good for AMD! So does that mean Zen won't be an eight-core? More like a quad-core with large modules? I feel slightly let down, but again, good for AMD; if not, they will go broke.

I was hoping for an eight-core that performs like what Bulldozer should have been from the start: a real eight-core at a decent price. If Zen only gets close to the performance of Skylake or Cannonlake, I won't be upgrading my 3770K, especially with the prices of DDR4. It feels like CPU technology has been stuck on performance and speed for the last 5 years.

I do a lot of editing, and buying an eight-core from Intel feels like a rip-off. Really, a thousand dollars for a chip? What is it made of, pixie dust? I hope AMD stabilizes the market on price, but that also means I'll be waiting at least 2 or 3 years to get an eight-core at a good price, like the rest of us.


----------



## ZeDestructor (Oct 16, 2015)

R-T-B said:


> Agreed on Stream (it was destined to die), but I think you are massively underestimating the market penetration of OpenCL.





Aquinus said:


> OpenCL is not a pipe dream; it's a tool for a very particular class of problems. The issue is that OpenCL is only helpful in the handful of situations where massively parallel processing on large data sets is advantageous. You think making applications multi-threaded is hard? Imagine having only parallel compute at your disposal, because that's basically what OpenCL is. One doesn't tend to write an entire application in OpenCL; it tends to complement applications that aren't OpenCL rather than stand on its own. Just wanted to point that out. You don't write real-time applications in PHP, and you don't write web applications in C (cgi-bin). You use the right tool for the right job.
> 
> What is a pipe dream is AMD thinking that they could make a pipeline as long as NetBurst's but not encounter the same problems as NetBurst. That's what I call a pipe dream.



At the time OpenCL was a pipe dream. Nowadays the story is very different.



bpgt64 said:


> I would love AMD to take a good look at a comparable ShadowPlay-esque feature. It's been the only way (outside of a $1,000 investment in a 4K external capture system) to record in 4K for most games.



Gaming Evolved



RichF said:


> There are other benchmarks that don't emphasize FP so much. Just using Cinebench to compare processors is questionable.



Of course not; that's why I linked two comparisons covering a range of benchmarks, where we find that AMD loses where FP is involved (video/image processing, Monte Carlo, for example) but wins at integer (7-Zip file compression, for example).



Israsuke said:


> Good for AMD! So does that mean Zen won't be an eight-core? More like a quad-core with large modules? I feel slightly let down, but again, good for AMD; if not, they will go broke.
> 
> I was hoping for an eight-core that performs like what Bulldozer should have been from the start: a real eight-core at a decent price. If Zen only gets close to the performance of Skylake or Cannonlake, I won't be upgrading my 3770K, especially with the prices of DDR4. It feels like CPU technology has been stuck on performance and speed for the last 5 years.
> 
> I do a lot of editing, and buying an eight-core from Intel feels like a rip-off. Really, a thousand dollars for a chip? What is it made of, pixie dust? I hope AMD stabilizes the market on price, but that also means I'll be waiting at least 2 or 3 years to get an eight-core at a good price, like the rest of us.



Haha, thinking $1,000 is expensive... how cute. A mid-range Xeon is closer to $2,500 nowadays, with the top-end ones cracking $6,000.

Intel charges what they want because they can, and thanks to the mainstream desktop not needing more than 4 cores without HT, there just isn't enough demand for more cores on LGA115x for Intel to move its slow ass and raise core counts. When they increase the core count again, I have a strong feeling that, unlike 1->2 cores or 2->4 cores, the performance boost of 4->6 or 4->8 will take a lot longer to materialize for most people, if it does at all.

On the other end of the spectrum, more and more software wants a good GPU, and the popularity of phones and tablets demands ever more GPU power to keep animations smooth, so the iGPU keeps getting improved.


----------



## bpgt64 (Oct 16, 2015)

Gaming Evolved SUCKS.


----------



## ZeDestructor (Oct 16, 2015)

bpgt64 said:


> Gaming evolved SUCKS.



Funny.. I rather like GFE on my end...


----------



## TheGuruStud (Oct 22, 2015)

geon2k2 said:


> Well they could if they want.
> They have 2500 geekbench single thread score at 1.8 Ghz and in a very power restricted environment.
> 
> http://cdn.arstechnica.net/wp-content/uploads/2015/09/charts.0011.png
> ...



You're high if you think ARM can do general processing with the power that x86 has.

Go ahead and run a real app on one x86 core and on that pitiful ARM chip. Guess what is going to happen? Synthetic benchmarks are more useless than ever.


----------



## Aquinus (Oct 22, 2015)

TheGuruStud said:


> You're high if you think ARM can do general processing with the power that x86 has.
> 
> Go ahead and run a real app on one core of x86 and that pitiful arm chip. Guess what is going to happen? Synthetic benchmarks are more useless than ever.


Depends on the task. If most of the application's instructions are simple integer math operations on data and addresses, an ARM CPU will do pretty well, because both x86 and ARM will execute instructions like that in a single cycle. Where it starts to differ is with the more complex instructions offered by CISC instruction sets like x86. Extensions like SSE were introduced to take what would normally need several clock cycles and reduce it to a handful, if not a single cycle. However, that comes at a cost: it requires more circuitry and transistors to execute these complex instructions quickly. The result is higher manufacturing cost and higher power consumption, but on the other hand you can get significantly improved performance, depending on the application.
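A rough way to see the SSE point is to count instructions: a scalar loop issues one add per element, while a 128-bit packed add handles four single-precision lanes per instruction. The toy model below is illustrative only — instruction counts, not cycle-accurate timing, and no real ISA.

```python
# Toy instruction-count model: scalar adds vs 128-bit packed (SSE-style) adds.

def scalar_add_count(n):
    # One add instruction per array element.
    return n

def packed_add_count(n, lanes=4):
    # One packed add covers `lanes` elements; any leftovers fall back to scalar.
    return n // lanes + n % lanes

print(scalar_add_count(1000))  # 1000 instructions
print(packed_add_count(1000))  # 250 instructions
```

Roughly a 4x reduction in instructions issued, which is exactly the trade the extra decode and execution circuitry buys.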

So I won't say that ARM is crap compared to x86, because it depends on what you're doing. If you're using a browser, reading/writing email, or playing a simple game like Angry Birds, an ARM CPU is more than enough. However, if you're doing video encoding, physics processing, or really any floating-point-heavy application, you're better off with something that can do a little more in a little less time, but that only helps if you have the power to spare.

I just thought a more balanced perspective on the matter was needed, because neither architecture is bad; they were just designed with different things in mind, under different philosophies.


----------



## Titus Joseph (Oct 23, 2015)

To add to that, x86 is designed for larger platforms like servers, desktops, and laptops.

ARM is mobile-platform-specific as of now, so the requirements, and therefore the capabilities, vary. Intel's x86 mobile processors, used in the ASUS ZenFone series I'm using, are great performers but at the cost of higher power consumption. I had to implement tonnes of tweaks to get a steady 10 hours of continuous usage from 100% to 0%.

That itself says a lot.


----------



## ZeDestructor (Oct 25, 2015)

TheGuruStud said:


> You're high if you think ARM can do general processing with the power that x86 has.
> 
> Go ahead and run a real app on one core of x86 and that pitiful arm chip. Guess what is going to happen? Synthetic benchmarks are more useless than ever.





Aquinus said:


> Depends on the task. If most of the application's instructions are simple integer math operations on data and addresses, an ARM CPU will do pretty well, because both x86 and ARM will execute instructions like that in a single cycle. Where it starts to differ is with the more complex instructions offered by CISC instruction sets like x86. Extensions like SSE were introduced to take what would normally need several clock cycles and reduce it to a handful, if not a single cycle. However, that comes at a cost: it requires more circuitry and transistors to execute these complex instructions quickly. The result is higher manufacturing cost and higher power consumption, but on the other hand you can get significantly improved performance, depending on the application.
> 
> So I won't say that ARM is crap compared to x86, because it depends on what you're doing. If you're using a browser, reading/writing email, or playing a simple game like Angry Birds, an ARM CPU is more than enough. However, if you're doing video encoding, physics processing, or really any floating-point-heavy application, you're better off with something that can do a little more in a little less time, but that only helps if you have the power to spare.
> 
> I just thought a more balanced perspective on the matter was needed, because neither architecture is bad; they were just designed with different things in mind, under different philosophies.





Titus Joseph said:


> To add to that, x86 is designed for larger platforms like servers, desktops, and laptops.
> 
> ARM is mobile-platform-specific as of now, so the requirements, and therefore the capabilities, vary. Intel's x86 mobile processors, used in the ASUS ZenFone series I'm using, are great performers but at the cost of higher power consumption. I had to implement tonnes of tweaks to get a steady 10 hours of continuous usage from 100% to 0%.
> 
> That itself says a lot.



Not exactly. Comparing ARM to x86 is hard because of how easy it is to build a custom ARM SoC vs an x86 SoC, so really, you need to compare them in each segment they compete in.

Server:

In serverland, you're mostly limited by whether or not you can scale horizontally (as far as CPUs go). If you can, the ARM chips are competitive with x86 in terms of total cost: what you give up in per-CPU performance, you regain by fitting more machines in the same space (a high-end ARM SoC fits in about the space a RasPi takes, while Atom, for all its low-power-ness, still needs about twice as much).

The result is that some companies, like Linode, are using ARM chips for their low-power use cases, and others, like Google and Facebook, are considering ARM alongside POWER.

Desktop:

ARM hasn't had a desktop chip since the original Acorn RISC machines. Still, as far as basic browser/productivity/media use goes, something like the SHIELD microconsole, or a RasPi 2 running a better OS than Android, does OK. Not amazing (mostly because of limited RAM), but OK.

Laptop:

ARM Chromebooks do about as well as low-end x86 Chromebooks, but very few use high-end SoCs like Tegra K1/X1, so they lose out to their x86 brethren. Linux users also have fun with them, but in the end, nothing really beats a proper high-end Windows laptop (an XPS 13, for example) running Linux.

Phones & Tablets:

In the phone and tablet arena, x86 has mostly been hampered by overall platform power and complexity, not performance. If you look at the landscape, one company stands above all the others: Qualcomm. Qualcomm has that position because of its ability to pack the CPU, GPU, DSPs, ISP, modem, Wi-Fi, Bluetooth, and GPS all into the same die, then strap the RAM on top of the same package. This makes the board design really, really simple: all you have to do is wire up the sensors (camera, accelerometer, gyro, barometer, temperature), radio front-ends, PMIC, screen, and digitizer, and you're done. On x86, as of right now, you have to put down the CPU/RAM package, then wire up the various wireless interfaces (cellular, GPS, and Wi-Fi/BT) to the SoC separately. With three extra, fairly hefty chips to put on the board, things get expensive and idle battery usage rises a fair chunk. This is why the ZenFone 2 is the only phone using x86, and it shows compared to Qualcomm-based phones. Other companies (MediaTek) also offer more integrated chips, similar to Qualcomm.

As for the CISC/RISC argument... that ship sailed a loooong time ago, around when IBM (POWER), Intel (Pentium Pro/II), and ARM (ARM8/ARMv4 I think, because of their use of speculative execution... arguably even ARM7T/ARMv4T because of its pipelining) went out-of-order/speculative/superscalar: they all decode instructions into micro-ops, which is what actually runs, rather than executing the instructions directly. The myth kinda lived on for a while, though, because Intel was, well, kinda crap at platform design compared to IBM: most of their CPUs, up until the P4 and Nehalem jumps really, were memory-starved by the FSB and had higher power consumption than the POWER cores, which are RISC designs. That all changed when Apple replaced the hot-running PowerPC 970/G5 (a POWER4 derivative) with cool-running Core chips, so much so that nobody cares about RISC vs CISC anymore outside of people writing raw assembly, and even then only for ease-of-use arguments, not performance.

EDIT: On the subject of pipelining: longer pipelines (i.e. more stages; iirc SNB-SKL is 14-19 depending on instruction, while NetBurst-era Prescott was 31) let you achieve higher clock speeds, but the longer the pipeline, the worse the penalty of a pipeline stall (from a branch mispredict causing a flush, for example). The problem with a long pipeline is that the stall penalty grows with the depth, as Intel learnt with NetBurst, and AMD with Bulldozer (though pipeline length isn't as dominant an issue there as it was with NetBurst...).
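The stall-penalty point can be put in numbers with a back-of-envelope model: charge roughly one cycle per pipeline stage for every mispredicted branch. The depths and the 2% mispredict rate below are illustrative assumptions, not measured figures for any real CPU.

```python
# Back-of-envelope model: average cycles per instruction (CPI) once
# branch-mispredict flushes are counted, assuming a flush costs about
# one cycle per pipeline stage.

def effective_cpi(base_cpi, mispredict_rate, pipeline_depth):
    # base_cpi: CPI with perfect prediction
    # mispredict_rate: mispredicted branches per instruction
    # pipeline_depth: stages flushed on a mispredict
    return base_cpi + mispredict_rate * pipeline_depth

shallow = effective_cpi(1.0, 0.02, 14)  # shallower, Sandy Bridge-ish depth
deep = effective_cpi(1.0, 0.02, 31)     # NetBurst (Prescott)-class depth

print(round(shallow, 2), round(deep, 2))
```

Even at the same mispredict rate, the deeper pipeline pays a noticeably higher average cost per instruction, which is why the clock-speed headroom of a long pipeline can fail to translate into real performance.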


----------



## Aquinus (Oct 25, 2015)

ZeDestructor said:


> As for the CISC/RISC argument.. that argument sailed a loooong time ago, around when IBM (POWER), Intel (Pentium Pro/II) and ARM (ARM8/ARMv4 I think, because of their use of speculative processing.. arguably even ARM7T/ARMv4T because of it's pipelining) went Out-of-Order/speculative/superscalar because they all decode the instructions into micro-ops and are no longer run directly. The myth kinda lived on for a while though, because Intel was, well, kinda crap at platform design compared to IBM - most of their CPUs, up until the P4 and Nehalem jumps really, were memory-starved by the FSB, and had higher power consumption than the POWER cores, which is a RISC core. That all changed when Apple replaced the hot-running PowerPC 970/G5 (POWER4) with cool-running Core chips, so much that nobody cares about RISC vs CISC anymore outside of people writing raw assembly, and even then, only for ease of use arguments, not performance.


You misunderstand what I'm saying. RISC CPUs tend to want instructions that all execute quickly, and to use those instructions for everything; that means you're not going to have instructions that are bulky or relatively slow compared to the others. The expectation is that the fast instructions will be composed to do the same kind of operation. The point is that things like SSE exist to accelerate exactly those workloads, where a heavier-weight instruction that does multiple things at once can save clock cycles.

What I can tell you is that while RISC designs can have complex instruction sets, that's not as true for ARM-based CPUs as it may be for others like SPARC. ARM was intended to be low-power and cheap, not fast and performant. That's why you don't often see ARM CPUs in clusters the way you do SPARCs. So while your argument about RISC in general is true, it doesn't hold for ARM specifically. Not all RISCs are made equal, and I can tell you that a modern ARM core is much simpler than a modern x86 core.

Edit: Also, a side note: load/store architectures tend to require more instructions to do the same thing. So, performance aside, this results in larger binaries, since you must do all memory operations with explicit loads and stores, whereas even on the 68K you could operate on values in memory without explicitly loading them into registers first. That is more indicative of the RISC/CISC debate than of the ARM/x86 one, which is quite a bit different.
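The load/store point can be sketched by counting the instructions needed to increment one value in memory. The mnemonics below are illustrative pseudo-assembly generated as strings, not any real ISA.

```python
# Toy illustration: a CISC-style machine can use a memory operand directly,
# while a load/store (RISC-style) machine must load, modify, then store.

def cisc_increment(addr):
    # One instruction; the add reads and writes memory itself.
    return [f"add [{addr}], 1"]

def risc_increment(addr):
    return [
        f"ldr r0, [{addr}]",  # load the value into a register first
        "add r0, r0, 1",      # arithmetic operates on registers only
        f"str r0, [{addr}]",  # write the result back to memory
    ]

print(len(cisc_increment("counter")), len(risc_increment("counter")))  # 1 3
```

Three instructions instead of one for the same effect is exactly the code-size pressure described above, which is also part of why compressed encodings like Thumb exist on ARM.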


----------



## FordGT90Concept (Oct 25, 2015)

CISC is much faster at specialized workloads (like decryption, encoding, and decoding) than RISC; however, RISC can reasonably compensate for that shortcoming through cheap parallelism, though doing so greatly increases the complexity of compilers/code. At the end of the day, RISC lands somewhere between CISC and the heavily parallel workloads modern GPUs are champions of, which begs the question: why not make ARM add-in cards for x86 machines? Let the CISC of the x86 handle specialized workloads, hand off simple but heavy logic workloads to ARM, and hand off limited-logic workloads to GPUs.


----------



## profoundWHALE (Oct 30, 2015)

FordGT90Concept said:


> CISC is much faster at specialized workloads (like decryption, encoding, and decoding) than RISC; however, RISC can reasonably compensate for that shortcoming through cheap parallelism, though doing so greatly increases the complexity of compilers/code. At the end of the day, RISC lands somewhere between CISC and the heavily parallel workloads modern GPUs are champions of, which begs the question: why not make ARM add-in cards for x86 machines? Let the CISC of the x86 handle specialized workloads, hand off simple but heavy logic workloads to ARM, and hand off limited-logic workloads to GPUs.



So basically, take the big.LITTLE design of ARM, but instead, bigCISC.LITTLEarm + GPU?

2-4 low power ARM
4-8 high power x86
???? GPU clusters

So I'm assuming it would run similar to how I have my OS on an SSD and games on a large HDD. The ARM processors would handle the basic system functions (filesystem, networking, sound?), and the x86 processor would handle processes that need grunt and have no GPU acceleration. And which would handle the draw calls? Or would that be the ARM?

I'm way in over my head. I feel like I was on the right track and then flew off the rails.


----------



## Super XP (Mar 31, 2016)

I am not against AMD charging a bit more for the ZEN desktop CPUs if and when they match or outperform the competition. But looking at AMD's past pricing, the only processors they actually overcharged for were the Quad-FX compatible CPUs. Those were a complete ripoff.

I trust AMD will price ZEN in a fair manner.


----------



## ZeDestructor (Mar 31, 2016)

profoundWHALE said:


> So basically, take the big.LITTLE design of ARM, but instead, bigCISC.LITTLEarm + GPU?
> 
> 2-4 low power ARM
> 4-8 high power x86
> ...



The difference between CISC and RISC is largely academic in these days of superscalar processing, internal microcode, and big, expensive instruction decode stages: each instruction is translated into internal micro-ops rather than run directly, making the actual execution style essentially identical for both types. For high-performance chips, at least.

As for straight performance, it's largely gated by power and die size these days, and in those arenas, Intel holds the crown across the board. In fact, current x86 performance is so good that in terms of performance/W, Intel is ahead of all the various ARM cores, but at the cost of having higher total power consumption and heat. It is for that reason that you don't see ARM in most scale-out server deployments.

As a result, there's simply no point in having a heterogeneous architecture that mixes x86 and ARM, and AMD knows that, as do Intel, Samsung, Qualcomm, etc.


----------



## Aquinus (Apr 1, 2016)

ZeDestructor said:


> The difference between CISC and RISC is largely academic in these days of superscalar processing, internal microcode, and big, expensive instruction decode stages: each instruction is translated into internal micro-ops rather than run directly, making the actual execution style essentially identical for both types. For high-performance chips, at least.
> 
> As for straight performance, it's largely gated by power and die size these days, and in those arenas, Intel holds the crown across the board. In fact, current x86 performance is so good that in terms of performance/W, Intel is ahead of all the various ARM cores, but at the cost of having higher total power consumption and heat. It is for that reason that you don't see ARM in most scale-out server deployments.
> 
> As a result, there's simply no point in having a heterogeneous architecture that mixes x86 and ARM, and AMD knows that, as do Intel, Samsung, Qualcomm, etc.


Actually, one of the differences that still exists to this day is that most RISC CPUs don't combine regular instructions and memory operations (load/store). For example, in x86 you may have an instruction that takes two operands (say ADD), but the last operand can be either a register or a memory location, which means the output of the instruction can be stored directly into memory. RISC CPUs aren't like this; you have to explicitly load a memory location into register n, or store register n to a memory location. There are advantages to both methods. Separate load/store instructions allow for a simpler pipeline, because no instruction ever has to perform a memory operation as part of the same instruction. RISC CPUs also tend to have a lot of general-purpose registers, which makes this even more feasible. Depending on the application, keeping variables in registers until a full computation is done means less pressure on the memory controller and cache, as well as faster turnaround time, since CPU registers are the fastest storage in a CPU.

I wanted to point this out because while a lot of the differences between CISC and RISC CPUs have evaporated, there are some things, like the load/store split, that still tend to hold true. IIRC, RISC CPUs also tend to be much more rigid in the number of operands any given instruction can take, whereas some x86 instructions are essentially multi-arity. These small things tend to result in a shorter pipeline on ARM and other RISC CPUs compared to their x86 counterparts.


----------



## ZeDestructor (Apr 1, 2016)

Aquinus said:


> Actually, one of the differences that still exists to this day is that most RISC CPUs don't combine regular instructions and memory operations (load/store). For example, in x86 you may have an instruction that takes two operands (say ADD), but the last operand can be either a register or a memory location, which means the output of the instruction can be stored directly into memory. RISC CPUs aren't like this; you have to explicitly load a memory location into register n, or store register n to a memory location. There are advantages to both methods. Separate load/store instructions allow for a simpler pipeline, because no instruction ever has to perform a memory operation as part of the same instruction. RISC CPUs also tend to have a lot of general-purpose registers, which makes this even more feasible. Depending on the application, keeping variables in registers until a full computation is done means less pressure on the memory controller and cache, as well as faster turnaround time, since CPU registers are the fastest storage in a CPU.
> 
> I wanted to point this out because while a lot of the differences between CISC and RISC CPUs have evaporated, there are some things, like the load/store split, that still tend to hold true. IIRC, RISC CPUs also tend to be much more rigid in the number of operands any given instruction can take, whereas some x86 instructions are essentially multi-arity. These small things tend to result in a shorter pipeline on ARM and other RISC CPUs compared to their x86 counterparts.



As I understand it, those are architectural preferences of the designers rather than actual traits of RISC vs CISC, which is why I tend to ignore them in favour of directly comparing the number of instructions available on each and the internal implementations. I mean, sure, a tiny ARM Cortex-M4 is essentially a direct implementation of the ISA, but a high-performance POWER8 design is much closer to a modern OoO CISC design like x86, and it shows when you compare dies and power consumption to performance...


----------

