Tuesday, October 25th 2011

AMD Trinity Detailed Further, Compatible with A75 Chipset

AMD detailed its upcoming "Virgo" PC platform, which consists of the next-generation "Trinity" APU (accelerated processing unit) and the current-generation AMD A75 "Hudson-D" chipset. A notable revelation here is that the next-gen APUs will be compatible with AMD A75, although they will be designed for a new socket called FM2. It remains to be seen if FM1 and FM2 are pin-compatible.

"Trinity" packs four x86-64 cores based on the next-generation "Piledriver" architecture, arranged in two Piledriver modules. A module is a closely-knit group of two cores, with certain shared and dedicated resources. Each Piledriver module has 2 MB of L2 cache shared between the two cores. In all, Trinity, with its two modules, has 4 MB of L2 cache without any L3 cache.
AMD is talking about a 20% performance improvement over current-generation "Llano" APUs, which use K10 "Stars" architecture cores. Trinity will feature 3rd-generation TurboCore technology that adds a few new power-management and selective overclocking features.

The integrated memory controller will get an overhaul, too. Unlike K10-based processors, which have two independent 64-bit wide memory interfaces that can be configured to work ganged or unganged, Trinity will have a single 128-bit memory interface. The controller will support dual-channel DDR3-2133 memory, with DRAM voltages of under 1.5V. Trinity will also include a 24-lane PCI-Express root complex that supports 2-way multi-GPU configurations.
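
As a rough illustration of what that memory interface could deliver, the sketch below computes theoretical peak bandwidth from the transfer rate and the bus width. It assumes DDR3-2133 means 2133 million transfers per second and that the full 128-bit width is usable on every transfer; the function name is made up for this example, and sustained real-world throughput will be lower.

```python
def peak_bandwidth_gb_s(mega_transfers_per_sec: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in decimal GB/s.

    Assumption: one full bus-width transfer happens on every transfer cycle.
    """
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer / 1e9


# Figures from the article: DDR3-2133 on a 128-bit interface.
full_interface = peak_bandwidth_gb_s(2133, 128)
# For comparison: a single 64-bit channel at the same speed.
single_channel = peak_bandwidth_gb_s(2133, 64)

print(f"128-bit @ DDR3-2133: {full_interface:.1f} GB/s")
print(f" 64-bit @ DDR3-2133: {single_channel:.1f} GB/s")
```

This is an upper bound only; the integrated GPU shares the same memory, so the CPU cores will see less than this figure.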

Moving on to the integrated GPU component, AMD promises a 30% performance improvement over Llano's iGPU. The GPU component is DirectX 11 compliant, and features UVD 3 hardware HD video acceleration, with SAMU and native VCE. Featuring AMD Eyefinity technology, this integrated GPU will support up to three displays without needing a discrete graphics card. Eyefinity can be used to step up productivity.
Source: DonanimHaber

41 Comments on AMD Trinity Detailed Further, Compatible with A75 Chipset

#26
Inceptor
Perhaps I should clarify...
When I say the same performance per core, I mean raw single-threaded cpu performance.
How quickly does the core process and execute instructions?
Not anything that utilizes other types of functionality, such as a full blown application or a benchmark that uses code based on common desktop applications or based on a game (such as the various ****Mark programs). And not something that utilizes all cores and then derives a single threaded score, such as the 7-zip benchmark, as an example, since this would run into the scheduling snafu on the FX.

Scroll down to 'Cinebench R10 Single threaded'
www.anandtech.com/bench/Product/434?vs=102

Cinebench 11.5 Single thread would be just as good, and would give analogous results.

My Phenom II x4 945 @ 3.6 Ghz achieves single threaded benchmarks:
Cinebench R10 : ~4100 +/-~50
Cinebench 11.5 : 1.06

The FX-8150 is:
Cinebench R10 : ~3940
Cinebench 11.5 : 1.02

A Phenom II x4 @3.6ghz is ~4% faster than a Phenom II x4 @3.4ghz in the Cinebench single threaded cpu benchmarks. A Phenom II x4 @3.4ghz (965) is equivalent to an FX-8150 in those same Cinebench single threaded cpu benchmarks. Plus or minus 5% is nothing, in terms of noticeable performance, which means that any Phenom II (x2/x3/x4) operating in the 3.2 to 3.7ghz range is roughly equivalent to an FX-8150 in raw single threaded performance. For a Thuban, since the single core raw performance is a bit better than Deneb clock for clock, slide that range down a bit.
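
To make the "plus or minus 5%" point concrete, here is a minimal sketch that computes the relative gap between the single-threaded scores quoted in this post. The scores are the approximate figures given above; the function name and dictionary layout are only for illustration.

```python
def percent_faster(score_a: float, score_b: float) -> float:
    """How much faster score_a is than score_b, in percent."""
    return (score_a / score_b - 1.0) * 100.0


# Approximate single-threaded scores quoted in the post:
# (Phenom II x4 945 @ 3.6 GHz, FX-8150)
scores = {
    "Cinebench R10": (4100, 3940),
    "Cinebench 11.5": (1.06, 1.02),
}

for benchmark, (phenom, fx) in scores.items():
    gap = percent_faster(phenom, fx)
    print(f"{benchmark}: Phenom II @ 3.6 GHz is {gap:.1f}% ahead of FX-8150")
```

Both gaps come out at about 4%, inside the 5% band treated as not noticeable here.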

Raw-single-threaded-performance. Raw, naked, single threaded performance.
#27
Inceptor
Dent1: I would like to see AMD push out an 8 core Thuban and up the L2 to 2MB/core and L3 cache to 8MB shared. We are already seeing the Thubans on the heels of the i5 and i7 in certain benchmarks; wouldn't it be more logical just to increase the cache and core count until they revise the Bulldozer?
Problem: the i5s and i7s have massive memory bandwidth that the Thubans can't ever match. Like TheLaughingMan said, larger memory bandwidth increases performance. And obviously something about the K10 architecture does not allow for more bandwidth, otherwise they would have done it, it would have cost them much less to revise an architecture than create a new one. So, there must have been a problem they couldn't get around.
Do or die.
#28
Benetanegia
Inceptor: Problem: the i5s and i7s have massive memory bandwidth that the Thubans can't ever match. Like TheLaughingMan said, larger memory bandwidth increases performance. And obviously something about the K10 architecture does not allow for more bandwidth, otherwise they would have done it, it would have cost them much less to revise an architecture than create a new one. So, there must have been a problem they couldn't get around.
Do or die.
OR they were just so busy trying to fix Bulldozer that they didn't look at K10 derivatives, nor did anything to try and really improve it beyond some simple tweaking. Intel (with 10x more R&D money to spend) did the same with P4, until they realized that the laptop CPUs which were based on P3 were largely better clock for clock, despite not having many of the "pluses" that they had been adding to P4's (better cache, better fetch & decode, etc.), and that they could achieve almost the same clock anyway if you gave them some more voltage. So they went the route "what if we add all those features to P3 and allow a higher maximum TDP?" and they gave birth to Conroe.

So yeah, it's very possible that AMD, with its limited R&D funding, has not really tried to improve bandwidth in an architecture they had long deemed dead. It turned out to be a huge mistake, but that's easy to say now, not so much when the limited funding had to be granted to the different working groups*.

* Still, once they got first silicon back, IMHO AMD should have made a U turn a long time ago regarding BD, either canceling it or delaying it even further, and improved K10 further instead, with many of the improvements that are supposed/advertised on Bulldozer. E.g. AMD advertised BD's fetch and decode unit to be much more advanced than on Thuban. More or less it's in these units where Intel has obtained the performance improvements between Nehalem and Sandy Bridge. My point being, K10 + BD's fetch & decode may have easily offered a 10-20% improvement per-core-per-clock over Thuban, not to mention over BD, which is lower.
seronx: 28.media.tumblr.com/tumblr_lsbeohjv041r1i9ueo1_500.jpg

Phenom II X8 about 2.7GHz

www.phoronix.com/scan.php?page=article&item=amd_fx8150_bulldozer&num=1

2 x Opteron 2384 should be your Phenom II example

Bulldozer Family architecture was a do or die scenario K8 derived architectures are over bros
3.6 Ghz (+turbo) versus 2.7 Ghz, not a fair comparison.

3.6 / 2.7 = 1.33

Add 33% to Opterons' results and FX8150 is pwned in most valid tests. I said valid because tests where even the A8 3850 is faster than the Opterons can not be taken seriously (and there are far too many of them btw)
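
A minimal sketch of that normalization, assuming performance scales perfectly linearly with clock speed (an optimistic assumption, since memory-bound tests will not scale that well) and ignoring turbo. The function name is invented for this example, and the score of 100 is a placeholder, not a real benchmark result.

```python
def scale_score(score: float, from_ghz: float, to_ghz: float) -> float:
    """Project a benchmark score to another clock speed.

    Assumption: the score scales linearly with clock frequency.
    """
    return score * (to_ghz / from_ghz)


opteron_ghz = 2.7  # 2 x Opteron 2384, per the quoted post
fx8150_ghz = 3.6   # FX-8150 base clock, turbo ignored

clock_ratio = fx8150_ghz / opteron_ghz
print(f"Clock ratio: {clock_ratio:.2f}")

# Placeholder score purely for illustration.
placeholder_score = 100.0
projected = scale_score(placeholder_score, opteron_ghz, fx8150_ghz)
print(f"Opteron score {placeholder_score:.0f} projected to 3.6 GHz: {projected:.1f}")
```

This only applies to tests where a higher score is better; for timed tests the result would be divided by the ratio instead.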
#29
seronx
Benetanegia: I said valid because tests where even the A8 3850 is faster than the Opterons can not be taken seriously (and there are far too many of them btw)
The ones with the A8-3850 being faster are single threaded mostly
#30
xBruce88x
honestly i don't see why they wouldn't do the same thing with the FMx sockets that they did with the AMx sockets. with their limited R&D budget it would be a waste to dev another socket already
#31
Benetanegia
seronx: The ones with the A8-3850 being faster are single threaded mostly
But that is kind of the point. The Llano chip is faster because of higher clocks and maybe some tuning to cache, IMC... Thing is, not even in those single threaded tests does BD score a win (where in theory BD could execute 4 instructions versus 3 on K10). Just like the 3850 is faster than the Opteron, BD is only faster because of higher stock clocks.

8x K10.x cores made on 32nm + front-end improvements + TurboCore 2.0 would have probably destroyed what Bulldozer has turned out to be, and with a much smaller die size.
#32
Atom_Anti
Ok, I've also checked out the guru3d CPU test of 4 core Phenom II vs 4 core Bulldozer. Phenom II cores seem much faster than Bulldozer, therefore I'm confused with Trinity. Might it get faster graphics but slower CPU cores?
#33
Inceptor
Atom_Anti: Ok, I've also checked out the guru3d CPU test of 4 core Phenom II vs 4 core Bulldozer. Phenom II cores seem much faster than Bulldozer, therefore I'm confused with Trinity. Might it get faster graphics but slower CPU cores?
What do you mean by '4 core Bulldozer' ?
Do you mean, 2 module/4 integer core, OR 4 module/8 integer core.
In terms of Logical usage, rather than getting mixed up in the semantics of the word 'core', the "4 core" Bulldozer is a dual core with a kind of hyperthreading, and the "8 core" is a quad.

Trinity will have Piledriver modules. IF they manage to bump performance 10%, fix a lot of the little problems with the architecture, and get the scheduling improvements of Windows 8, THEN a two module (4 integer core) Piledriver APU should be somewhere in the Phenom II x4 stock performance range. And will have a GPU based on a 6xxx architecture, probably with a few improvements and marketed as a lower numbered 7xxx.
#34
Inceptor
xBruce88x: honestly i don't see why they wouldn't do the same thing with the FMx sockets that they did with the AMx sockets. with their limited R&D budget it would be a waste to dev another socket already
Well, I suppose going from Athlon II cores to Piledriver modules would require a new socket.
So, maybe the FM2 socket configuration will be engineered with the future in mind. What I mean is that maybe it will be a similar situation to the AM2 -> AM2+ -> AM3 sockets, with just chipset changes, and the ability to upgrade an APU or board up one level and still retain your existing board or APU, respectively.
#35
Benetanegia
Inceptor: What do you mean by '4 core Bulldozer' ?
Do you mean, 2 module/4 integer core, OR 4 module/8 integer core.
In terms of Logical usage, rather than getting mixed up in the semantics of the word 'core', the "4 core" Bulldozer is a dual core with a kind of hyperthreading, and the "8 core" is a quad.
Of course he is referring to a 2 module/4 integer core, which is what Trinity will use. AMD markets those as full featured cores, at least advertising that performance is equal to 2 full featured cores.
Inceptor: Trinity will have Piledriver modules. IF they manage to bump performance 10%, fix a lot of the little problems with the architecture, and get the scheduling improvements of Windows 8, THEN a two module (4 integer core) Piledriver APU should be somewhere in the Phenom II x4 stock performance range. And will have a GPU based on a 6xxx architecture, probably with a few improvements and marketed as a lower numbered 7xxx.
Yeah, but those are a lot of "IF"s, only to match or very slightly exceed current Llano performance, which is already lackluster in the CPU department, compared to Intel offerings. People do have reason to worry when the new one could turn out to be even weaker.
#36
Atom_Anti
Inceptor: What do you mean by '4 core Bulldozer' ?
Do you mean, 2 module/4 integer core, OR 4 module/8 integer core.
In terms of Logical usage, rather than getting mixed up in the semantics of the word 'core', the "4 core" Bulldozer is a dual core with a kind of hyperthreading, and the "8 core" is a quad.
Yeah I guess, I meant the AMD FX4100 vs Phenom II 9xx. Well if this stat means Trinity won't get 4 real cores, that is going to be catastrophic. I'll probably keep my A8-3530MX or move to Ivy Bridge. Any news of the graphics performance of Ivy Bridge?
#37
KooKKiK
Atom_Anti: Yeah I guess, I meant the AMD FX4100 vs Phenom II 9xx. Well if this stat means Trinity won't get 4 real cores, that is going to be catastrophic. I'll probably keep my A8-3530MX or move to Ivy Bridge. Any news of the graphics performance of Ivy Bridge?
Nothing new for the graphics until Haswell is out.
#39
Inceptor
KooKKiK: Nothing new for the graphics until Haswell is out.
No, Ivy Bridge has a new iGPU.
The Anand article doesn't mention it, but it's rumoured to have Quad HD capability (3840x2160).
Probably only as monitor support for everyday workloads, rather than for gaming quality graphics. Intel can't catch up that fast to AMD and Nvidia.
#40
Super XP
So what about this so-called B3 stepping? This was posted on semiaccurate by somebody who calls themselves ATInsider. If what he says is true, how much performance can a B3 give Bulldozer before Piledriver gets released with another added 10% increase?
AMD FX – Series B3 revision is more than just a basic stepping:
I have direct knowledge of a possible B3 revision for the AMD FX line of CPUs. I cannot disclose performance projections at this time, but be assured AMDs processor division is working vigorously on a (B3) stepping revision with minor architectural tweaks. The base architecture will not be changed at this time.
Within the B3 stepping revision, expect minor tweaks to the following:
1) L1, L2 and L3 latencies
2) Cache Thrashing Issues
3) Modified Algorithms for Branch Prediction
4) Healthy Bump in Processor Frequency
5) Slight Frequency increase via NB Controller
6) “Total Intelligent Control”
For example, programs and applications should look at the module design approach and the ability for the processor to intelligently turn off and/or turn on specific cores that it believes are hindering performance, for maximum performance. (May be for Socket FM2, not sure at this time).
7) Power will be improved but not by much. We will have to wait for Socket FM2 or a future B4 revision for the AM3+ platform for better power efficiency, especially when overclocked.
ATInsider
semiaccurate.com/2011/10/17/bulldozer-doesnt-have-just-a-single-problem/comment-page-1/#comment-11399
#41
Inceptor
That's too much of nothing from a supposed insider...
And nothing anyone with a brain couldn't already figure out for themselves.
It's only logical that they would prioritize the fixes that would improve application/benchmark performance, even if they had to put power efficiency improvements on the shelf for the next iteration...

That really doesn't say anything new, at all.

I can't wait for the dozens and dozens of reports, rumours, and whisperings like this that will cause so many unnecessary spot fires in forum threads everywhere...