Friday, September 24th 2010
AMD Orochi "Bulldozer" Die Holds 16 MB Cache
Documents related to "Orochi", AMD's 8-core processor based on its next-generation Bulldozer architecture, reveal a cache hierarchy that comes as a bit of a surprise. Earlier this month, at a conference hosted by GlobalFoundries, AMD displayed the first die-shot of Orochi, which legibly showed key features including the four Bulldozer modules, each holding two cores, and large L2 caches. On coarse visual inspection, the L2 cache of each module seems to cover 35% of the module's area. The L3 cache is located along the center of the die. The documents seen by X-bit Labs reveal that each Bulldozer module has its own 2 MB L2 cache shared between its two cores, and that an 8 MB L3 cache is shared between all four modules (8 cores).
This takes the total cache on Orochi all the way up to 16 MB: four 2 MB L2 caches plus the 8 MB L3. This hierarchy suggests that AMD wants to give individual cores access to a large amount of faster cache (a whopping 2048 KB, compared to 512 KB per core on Phenom and 256 KB per core on Core i7), which facilitates faster inter-core communication within a module. Inter-module communication is helped by the 8 MB L3 cache. Compared to the current "Istanbul" six-core K10-based die, that's a 77% increase in total cache for a 33% increase in core count, and a 300% increase in the L2 cache available to each core. Since Orochi is built on a 32 nm GlobalFoundries process, it is sure to have a very high transistor count.
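The arithmetic behind these figures can be checked with a short Python sketch. The Orochi numbers come from the article; the Istanbul configuration (512 KB of L2 per core and a 6 MB L3) is an assumption not stated in the article, chosen because it reproduces the 77% figure. The function and variable names are illustrative only.

```python
KB_PER_MB = 1024


def total_cache_mb(l2_blocks, l2_kb_each, l3_mb):
    """Total on-die L2 + L3 cache in MB."""
    return l2_blocks * l2_kb_each / KB_PER_MB + l3_mb


def pct_increase(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100


# Orochi (from the article): 4 modules x 2048 KB L2, plus 8 MB shared L3.
orochi_total = total_cache_mb(l2_blocks=4, l2_kb_each=2048, l3_mb=8)

# Istanbul (assumed here): 6 cores x 512 KB L2, plus 6 MB shared L3.
istanbul_total = total_cache_mb(l2_blocks=6, l2_kb_each=512, l3_mb=6)

cache_gain = pct_increase(istanbul_total, orochi_total)  # total cache
core_gain = pct_increase(6, 8)                           # core count
l2_gain = pct_increase(512, 2048)                        # L2 reachable per core

print(f"Orochi total cache:   {orochi_total:.0f} MB")
print(f"Istanbul total cache: {istanbul_total:.0f} MB")
print(f"Total cache increase: {cache_gain:.1f}%")
print(f"Core count increase:  {core_gain:.1f}%")
print(f"L2 per core increase: {l2_gain:.0f}%")
```

Note that the 300% figure treats the whole 2048 KB module cache as available to each core. Divided evenly between a module's two cores, it would be 1024 KB per core, a 100% increase over 512 KB.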
Source:
Xbit Labs
152 Comments on AMD Orochi "Bulldozer" Die Holds 16 MB Cache
I really think the next year is going to be very exciting for pc hardware, sandy bridge and bulldozer and of course amd's 6xxx and 7xxx cards and nvidia's kepler :rockout:
I was kinda actually hoping for a reduction in both power and thermals as the process matures, as currently, my 965BE is overheating on stock cooling(65c+ load). It's a horrible sample though.
Any info you can give that, of course, is more than welcome. ;)
I really hope amd brings a new hsf with the bulldozer
*hint, hint* JF-AMD :p (although yes i admit i don't have a clue on temps or cooling needs on the new core so the current hsf may be plenty for bulldozer)
And what would you not say, the expensive bit? I hope they are not...
blogs.amd.com/work/author/jfruehe/
What I'm actually really interested in, of course, is 3D performance. What key areas are targeted to improve this?