Tuesday, October 25th 2011
AMD Trinity Detailed Further, Compatible with A75 Chipset
AMD detailed its upcoming "Virgo" PC platform, which pairs the next-generation "Trinity" APU (accelerated processing unit) with the current-generation AMD A75 "Hudson-D" chipset. A notable revelation here is that the next-gen APUs will be compatible with AMD A75, although they will be designed for a new socket called FM2. It remains to be seen if FM1 and FM2 are pin-compatible.
"Trinity" packs four x86-64 cores based on the next-generation "Piledriver" architecture, arranged in two Piledriver modules. A module is a closely knit group of two cores, with certain shared and dedicated resources. Each Piledriver module has 2 MB of L2 cache shared between the two cores. In all, Trinity, with its two modules, has 4 MB of L2 cache and no L3 cache. AMD is talking about a 20% performance improvement over current-generation "Llano" APUs, which use K10 "Stars" architecture cores. Trinity will feature 3rd-generation TurboCore technology that adds a few new power-management and selective overclocking features.
The integrated memory controller will get an overhaul, too. Unlike K10-based processors, which have two independent 64-bit wide memory interfaces that can be configured to work ganged or unganged, Trinity will have a single 128-bit memory interface. The controller will support dual-channel DDR3-2133 memory, with DRAM voltages of under 1.5V. Trinity will also include a 24-lane PCI-Express root complex that supports 2-way multi-GPU configurations.
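As a rough sanity check on what a 128-bit DDR3-2133 interface implies, the peak theoretical bandwidth can be estimated from the transfer rate and the bus width. The sketch below assumes DDR3-2133 means 2133 mega-transfers per second and uses decimal gigabytes; the function name is illustrative only, and real-world throughput will be lower than this ceiling.

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    transfer_rate_mts: transfers per second, in millions (assumed 2133 for DDR3-2133)
    bus_width_bits:    total width of the memory interface in bits
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9


# Full 128-bit interface as described for Trinity.
print(f"128-bit @ DDR3-2133: {peak_bandwidth_gbs(2133, 128):.1f} GB/s")

# A single 64-bit channel, for comparison.
print(f" 64-bit @ DDR3-2133: {peak_bandwidth_gbs(2133, 64):.1f} GB/s")
```

Under these assumptions the 128-bit interface tops out at roughly 34 GB/s, which the CPU cores and the integrated GPU would share.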
Moving on to the integrated GPU component, AMD promises a 30% performance improvement over Llano's iGPU. The GPU component is DirectX 11 compliant, and features UVD 3 hardware HD video acceleration, with SAMU and native VCE. Featuring AMD Eyefinity technology, this integrated GPU will support up to three displays without needing a discrete graphics card. A multi-display Eyefinity setup can be used to step up productivity.
Source:
DonanimHaber
41 Comments on AMD Trinity Detailed Further, Compatible with A75 Chipset
When I say the same performance per core, I mean raw single-threaded cpu performance.
How quickly does the core process and execute instructions?
Not anything that utilizes other types of functionality, such as a full blown application or a benchmark that uses code based on common desktop applications or based on a game (such as the various ****Mark programs). And not something that utilizes all cores and then derives a single threaded score, such as the 7-zip benchmark, as an example, since this would run into the scheduling snafu on the FX.
Scroll down to 'Cinebench R10 Single threaded'
www.anandtech.com/bench/Product/434?vs=102
Cinebench 11.5 Single thread would be just as good, and would give analogous results.
My Phenom II x4 945 @ 3.6 GHz gets these single-threaded scores:
Cinebench R10 : ~4100 +/-~50
Cinebench 11.5 : 1.06
The FX-8150 is:
Cinebench R10 : ~3940
Cinebench 11.5 : 1.02
A Phenom II x4 @ 3.6 GHz is ~4% faster than a Phenom II x4 @ 3.4 GHz in the Cinebench single-threaded CPU benchmarks. A Phenom II x4 @ 3.4 GHz (965) is equivalent to an FX-8150 in those same Cinebench single-threaded CPU benchmarks. Plus or minus 5% is nothing in terms of noticeable performance, which means that any Phenom II (x2/x3/x4) operating in the 3.2 to 3.7 GHz range is roughly equivalent to an FX-8150 in raw single-threaded performance. For a Thuban, since the single-core raw performance is a bit better than Deneb clock for clock, slide that range down a bit.
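For anyone who wants to check the arithmetic, here is a small sketch that compares the scores quoted above. The numbers are the ones from this post; the dictionary keys, the helper names, and the 5% "not noticeable" band passed as a parameter are just illustrative choices.

```python
# Single-threaded scores quoted in this post.
scores = {
    "Cinebench R10": {"phenom_ii_3_6ghz": 4100, "fx_8150": 3940},
    "Cinebench 11.5": {"phenom_ii_3_6ghz": 1.06, "fx_8150": 1.02},
}


def percent_faster(a: float, b: float) -> float:
    """How much faster score a is than score b, in percent."""
    return (a / b - 1.0) * 100.0


def within_band(a: float, b: float, band_pct: float = 5.0) -> bool:
    """True if the two scores differ by no more than band_pct percent."""
    return abs(percent_faster(a, b)) <= band_pct


for bench, s in scores.items():
    gap = percent_faster(s["phenom_ii_3_6ghz"], s["fx_8150"])
    same = within_band(s["phenom_ii_3_6ghz"], s["fx_8150"])
    print(f"{bench}: Phenom II @ 3.6 GHz is {gap:.1f}% faster, within 5%: {same}")
```

Both benchmarks put the gap at about 4%, which is inside the plus-or-minus 5% band.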
Raw-single-threaded-performance. Raw, naked, single threaded performance.
Do or die.
So yeah, it's very possible that AMD, with its limited R&D funding, has not really tried to improve bandwidth in an architecture it had long deemed dead. It turned out to be a huge mistake, but that's easy to say now, not so much when the limited funding had to be granted to the different working groups*.
* Still, once they got first silicon back, IMHO AMD should have made a U-turn a long time ago regarding BD, either canceling it or delaying it even further, and improved K10 instead with many of the improvements that are supposed/advertised on Bulldozer. For example, AMD advertised BD's fetch and decode unit to be much more advanced than Thuban's. More or less, it's in these units that Intel obtained the performance improvements between Nehalem and Sandy Bridge. My point being, K10 + BD's fetch & decode may easily have offered a 10-20% improvement per core per clock over Thuban, not to mention over BD, which is lower.
3.6 GHz (+turbo) versus 2.7 GHz, not a fair comparison.
3.6 / 2.7 = 1.33
Add 33% to the Opterons' results and the FX-8150 is pwned in most valid tests. I say valid because tests where even the A8-3850 is faster than the Opterons cannot be taken seriously (and there are far too many of them, btw).
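The clock-for-clock adjustment above can be sketched like this. It assumes performance scales linearly with clock speed, which is optimistic, and the score of 100 is only a placeholder, not a measured Opteron result. The function name is made up for the example.

```python
def scale_to_clock(score: float, from_ghz: float, to_ghz: float) -> float:
    """Scale a benchmark score from one clock speed to another.

    Assumes perfectly linear scaling with clock, which is an upper bound
    in practice (memory and cache speeds do not scale with the core).
    """
    return score * (to_ghz / from_ghz)


ratio = 3.6 / 2.7
print(f"clock ratio: {ratio:.2f}")

# Placeholder score, only to show the adjustment.
placeholder_score = 100.0
print(f"scaled score: {scale_to_clock(placeholder_score, 2.7, 3.6):.1f}")
```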
8x K10.x cores made on 32nm + front-end improvements + TurboCore 2.0 would have probably destroyed what Bulldozer has turned out to be, and with a much smaller die size.
Do you mean 2 module/4 integer core, or 4 module/8 integer core?
In terms of Logical usage, rather than getting mixed up in the semantics of the word 'core', the "4 core" Bulldozer is a dual core with a kind of hyperthreading, and the "8 core" is a quad.
Trinity will have Piledriver modules. IF they manage to bump performance 10%, fix a lot of the little problems with the architecture, and get the scheduling improvements of Windows 8, THEN a two module (4 integer core) Piledriver APU should be somewhere in the Phenom II x4 stock performance range. And will have a GPU based on a 6xxx architecture, probably with a few improvements and marketed as a lower numbered 7xxx.
So, maybe the FM2 socket configuration will be engineered with the future in mind. What I mean is that maybe it will be a similar situation to the AM2 -> AM2+ -> AM3 sockets, with just chipset changes, and the ability to upgrade an APU or board up one level and still retain your existing board or APU, respectively.
www.anandtech.com/show/4830/intels-ivy-bridge-architecture-exposed
The Anand article doesn't mention it, but it's rumoured to have Quad HD capability (3840x2160).
Probably only as an everyday workload monitor support, rather than for gaming quality graphics. Intel can't catch up that fast to AMD and Nvidia.
And nothing anyone with a brain couldn't already figure out for themselves.
It's only logical that they would prioritize the fixes that would improve application/benchmark performance, even if they had to put power efficiency improvements on the shelf for the next iteration...
That really doesn't say anything new, at all.
I can't wait for the dozens and dozens of reports, rumours, and whisperings like this that will cause so many unnecessary spot fires in forum threads everywhere...