
AMD FX-8130P Processor Benchmarks Surface

Joined
May 7, 2009
Messages
5,392 (0.95/day)
Location
Carrollton, GA
System Name ODIN
Processor AMD Ryzen 7 5800X
Motherboard Gigabyte B550 Aorus Elite AX V2
Cooling Dark Rock 4
Memory G Skill RipjawsV F4 3600 Mhz C16
Video Card(s) MSI GeForce RTX 3080 Ventus 3X OC LHR
Storage Crucial 2 TB M.2 SSD :: WD Blue M.2 1TB SSD :: 1 TB WD Black VelociRaptor
Display(s) Dell S2716DG 27" 144 Hz G-SYNC
Case Fractal Meshify C
Audio Device(s) Onboard Audio
Power Supply Antec HCP 850 80+ Gold
Mouse Corsair M65
Keyboard Corsair K70 RGB Lux
Software Windows 10 Pro 64-bit
Benchmark Scores I don't benchmark.
Don't forget SSE5 and 128Bit AVX

I don't get this one

It actually goes up in steps of 10:
8100 = 3.5GHz, 8110 = 3.6GHz, 8120 = 3.7GHz, and so on
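
For anyone trying to follow the pattern in the quote, here is a rough Python sketch of it. It assumes the scheme is strictly linear (3.5GHz at 8100, plus 0.1GHz for every step of 10 in the model number), which nobody at AMD has confirmed. The function name and constants are made up for illustration only.

```python
# Hypothetical helper: maps an FX-81x0 model number to a base clock,
# following the unconfirmed pattern quoted above (8100 = 3.5GHz, +0.1GHz per 10).
BASE_MODEL = 8100
BASE_CLOCK_MHZ = 3500
STEP_MODEL = 10
STEP_CLOCK_MHZ = 100


def rumored_clock_mhz(model: int) -> int:
    """Return the base clock in MHz implied by the rumored scheme."""
    offset = model - BASE_MODEL
    if offset < 0 or offset % STEP_MODEL != 0:
        raise ValueError(f"model {model} does not fit the rumored pattern")
    return BASE_CLOCK_MHZ + (offset // STEP_MODEL) * STEP_CLOCK_MHZ


if __name__ == "__main__":
    for model in (8100, 8110, 8120, 8130):
        print(f"FX-{model}: {rumored_clock_mhz(model) / 1000:.1f}GHz")
```

Working in whole MHz keeps the arithmetic exact; the division by 1000 happens only when printing.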

SSE5 was replaced with several smaller instruction sets that were redesigned to work with AVX better. This happened right after the AMD/Intel contract was renegotiated. So SSE5 as far as the name is concerned will not be on Bulldozer.

A while back (years ago), some reviews showed that if you changed the CPU name reported to some benchmark programs, you would magically get better numbers. A VIA C7 that was reported to the software as either an AMD or an Intel processor improved its memory and per-clock performance. The per-clock gain could be justified: the VIA C7 got SSE3 rather late, and the software needed a patch to use it. The memory performance change was just BS.

And there has been no confirmation of the naming scheme to my knowledge.
 

Benetanegia

New Member
Joined
Sep 11, 2009
Messages
2,680 (0.48/day)
Location
Reaching your left retina.
4 fetch/decode/store per cycle for 32bit

2 fetch/decode/store per cycle for 64bit

and in theory if there were registers for it

1 fetch/decode store per cycle for 128bit

Still a single unit. More than one fetch/decode/store operations per cycle happens in every architecture since, I can't even remember when. Previous AMD chips did up to 3, now it's 4. I see the improvement but it's still only 1 fetch/decode unit per module nonetheless.

The line is definitely blurred, but you can call a BD module 1 core as easily as you can call it 2 cores. Because of the single fetch unit I'm more inclined to call it 1 core.
 
Joined
Jul 10, 2010
Messages
1,233 (0.23/day)
Location
USA, Arizona
System Name SolarwindMobile
Processor AMD FX-9800P RADEON R7, 12 COMPUTE CORES 4C+8G
Motherboard Acer Wasp_BR
Cooling It's Copper.
Memory 2 x 8GB SK Hynix/HMA41GS6AFR8N-TF
Video Card(s) ATI/AMD Radeon R7 Series (Bristol Ridge FP4) [ACER]
Storage TOSHIBA MQ01ABD100 1TB + KINGSTON RBU-SNS8152S3128GG2 128 GB
Display(s) ViewSonic XG2401 SERIES
Case Acer Aspire E5-553G
Audio Device(s) Realtek ALC255
Power Supply PANASONIC AS16A5K
Mouse SteelSeries Rival
Keyboard Ducky Channel Shine 3
Software Windows 10 Home 64-bit (Version 1607, Build 14393.969)
SSE5 was replaced with several smaller instruction sets that were redesigned to work with AVX better. This happened right after the AMD/Intel contract was renegotiated. So SSE5 as far as the name is concerned will not be on Bulldozer.

A while back (years ago), some reviews showed that if you changed the CPU name reported to some benchmark programs, you would magically get better numbers. A VIA C7 that was reported to the software as either an AMD or an Intel processor improved its memory and per-clock performance. The per-clock gain could be justified: the VIA C7 got SSE3 rather late, and the software needed a patch to use it. The memory performance change was just BS.

And there has been no confirmation of the naming scheme to my knowledge.

For what I use, SSE5 (XOP, CVT16, FMA4) is highly important

http://www.xbitlabs.com/news/cpu/di...ulldozer_Chips_Incoming_Details_Revealed.html

Still a single unit. More than one fetch/decode/store operations per cycle happens in every architecture since, I can't even remember when. Previous AMD chips did up to 3, now it's 4. I see the improvement but it's still only 1 fetch/decode unit per module nonetheless.

The line is definitely blurred, but you can call a BD module 1 core as easily as you can call it 2 cores. Because of the single fetch unit I'm more inclined to call it 1 core.

It's an 8 core

It has 4 fetch/decode/store units not one per module

Phenom II could only do 3 fetch/decodes per clock or 3 stores per clock
 
Joined
Apr 4, 2008
Messages
4,686 (0.77/day)
System Name Obelisc
Processor i7 3770k @ 4.8 GHz
Motherboard Asus P8Z77-V
Cooling H110
Memory 16GB(4x4) @ 2400 MHz 9-11-11-31
Video Card(s) GTX 780 Ti
Storage 850 EVO 1TB, 2x 5TB Toshiba
Case T81
Audio Device(s) X-Fi Titanium HD
Power Supply EVGA 850 T2 80+ TITANIUM
Software Win10 64bit
Cache speed and memory performance are very tightly linked together. We are talking about the combination of BOTH. Increase L3 speed, and memory bandwidth goes up with it. I can show this very simply with both AMD and Intel chips.

You can say that SPi favors Intel chips...but then again, if you want to go down that road, so do the majority of applications out there...any app favors the faster performance on 1155. Like I posted above, I don't care, really, if an app favors one over the other...what matters is that the end user gets better performance on 1155, not how Intel got there.

Both? Alright so we already know AMD has better ram bandwidth, let's look at the cache. Btw 775 doesn't even have L3.




Note the DDR3 on phenom only gives it a .2 latency boost on L2. It would win either way.

Biggest difference I see is the read and copy are switched. Overall it appears AMD is faster on cache as well. Yet super pi still does better on 775 despite all that. So really what purpose does super pi have here in comparing AMD and Intel chips if the architecture is making a bigger impact than the memory speeds?
 
Joined
Jul 10, 2010
Messages
1,233 (0.23/day)
Location
USA, Arizona
System Name SolarwindMobile
Processor AMD FX-9800P RADEON R7, 12 COMPUTE CORES 4C+8G
Motherboard Acer Wasp_BR
Cooling It's Copper.
Memory 2 x 8GB SK Hynix/HMA41GS6AFR8N-TF
Video Card(s) ATI/AMD Radeon R7 Series (Bristol Ridge FP4) [ACER]
Storage TOSHIBA MQ01ABD100 1TB + KINGSTON RBU-SNS8152S3128GG2 128 GB
Display(s) ViewSonic XG2401 SERIES
Case Acer Aspire E5-553G
Audio Device(s) Realtek ALC255
Power Supply PANASONIC AS16A5K
Mouse SteelSeries Rival
Keyboard Ducky Channel Shine 3
Software Windows 10 Home 64-bit (Version 1607, Build 14393.969)
Biggest difference I see is the read and copy are switched. Overall it appears AMD is faster on cache as well. Yet super pi still does better on 775 despite all that. So really what purpose does super pi have here in comparing AMD and Intel chips if the architecture is making a bigger impact than the memory speeds?

None, Super Pi doesn't use a living architecture like wPrime does

x87 vs SSE
SSE wins

Name applications that came out this year that use x87
 
Joined
May 7, 2009
Messages
5,392 (0.95/day)
Location
Carrollton, GA
System Name ODIN
Processor AMD Ryzen 7 5800X
Motherboard Gigabyte B550 Aorus Elite AX V2
Cooling Dark Rock 4
Memory G Skill RipjawsV F4 3600 Mhz C16
Video Card(s) MSI GeForce RTX 3080 Ventus 3X OC LHR
Storage Crucial 2 TB M.2 SSD :: WD Blue M.2 1TB SSD :: 1 TB WD Black VelociRaptor
Display(s) Dell S2716DG 27" 144 Hz G-SYNC
Case Fractal Meshify C
Audio Device(s) Onboard Audio
Power Supply Antec HCP 850 80+ Gold
Mouse Corsair M65
Keyboard Corsair K70 RGB Lux
Software Windows 10 Pro 64-bit
Benchmark Scores I don't benchmark.
None, Super Pi doesn't use a living architecture like wPrime does

x87 vs SSE
SSE wins

Name applications that came out this year that use x87

x87 is like 5 years old and completely obsolete. The Phenom II is running Super Pi with the SSE Instruction sets up to SSE3. Intel gets the benefit of the full SSE4, SSE4.1 and SSE4.2.

So of course there are no x87 programs. Why would anyone do that?
 

Benetanegia

New Member
Joined
Sep 11, 2009
Messages
2,680 (0.48/day)
Location
Reaching your left retina.
It has 4 fetch/decode/store units not one per module

False.

Shared Instruction Fetch

Sharing between cores is a key element of Bulldozer’s architecture, and it starts with the front end. The front-end has been entirely overhauled and is now responsible for feeding both cores within a module. Bulldozer’s front end includes branch prediction, instruction fetching, instruction decoding and macro-op dispatch. These stages are effectively multi-threaded with single cycle switching between threads. The arbitration between the two cores is determined by a number of factors including fairness, pipeline occupancy and stalling events.

Basically each module has one fetch/decode unit capable of issuing 4 macro-ops per cycle (same as Intel does since Nehalem, or earlier, I'm not sure actually). So while a Phenom X6 had in total six units capable of issuing 3 Mops each, the 8 "core" BD has 4 units capable of issuing 4 Mops each.
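
To put numbers on that comparison, here is a small Python sketch that uses only the figures from this post. The class and field names are made up, and the per-core figure assumes the unit's issue slots are split evenly when both cores in a module are busy. The real arbitration depends on fairness, pipeline occupancy and stalls, as the quoted article says, so treat it as a rough bound rather than a measurement.

```python
# Front-end issue width comparison using only the figures from the post above.
# The dataclass and field names are made up for illustration.
from dataclasses import dataclass


@dataclass(frozen=True)
class FrontEnd:
    name: str
    decode_units: int    # fetch/decode units on the whole chip
    mops_per_unit: int   # macro-ops each unit can issue per cycle
    cores_per_unit: int  # cores fed by one fetch/decode unit

    @property
    def total_mops(self) -> int:
        """Chip-wide macro-ops issued per cycle."""
        return self.decode_units * self.mops_per_unit

    @property
    def mops_per_core_all_busy(self) -> float:
        """Per-core share, assuming an even split when every core is busy."""
        return self.mops_per_unit / self.cores_per_unit


phenom_x6 = FrontEnd("Phenom II X6", decode_units=6, mops_per_unit=3, cores_per_unit=1)
bulldozer = FrontEnd('8 "core" BD', decode_units=4, mops_per_unit=4, cores_per_unit=2)

if __name__ == "__main__":
    for chip in (phenom_x6, bulldozer):
        print(f"{chip.name}: {chip.total_mops} Mops/cycle total, "
              f"{chip.mops_per_core_all_busy:.1f} per core with all cores busy")
```

On these figures the X6 front ends add up to 18 Mops per cycle against 16 for BD, while a single BD thread running alone in a module could in principle use all 4 slots.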
 
Joined
Apr 7, 2011
Messages
1,380 (0.28/day)
System Name Desktop
Processor Intel Xeon E5-1680v2
Motherboard ASUS Sabertooth X79
Cooling Intel AIO
Memory 8x4GB DDR3 1866MHz
Video Card(s) EVGA GTX 970 SC
Storage Crucial MX500 1TB + 2x WD RE 4TB HDD
Display(s) HP ZR24w
Case Fractal Define XL Black
Audio Device(s) Schiit Modi Uber/Sony CDP-XA20ES/Pioneer CT-656>Sony TA-F630ESD>Sennheiser HD600
Power Supply Corsair HX850
Mouse Logitech G603
Keyboard Logitech G613
Software Windows 10 Pro x64
More stuff for you guys to discuss :laugh:

http://support.amd.com/us/Processor_TechDocs/47414.pdf

The following performance caveats apply when using streaming stores on AMD Family 15h cores.
• When writing out a single stream of data sequentially, performance of AMD Family 15h processors is comparable to previous generations of AMD processors.
• When writing out two streams of data, AMD Family 15h version 1 processors can be up to three times slower than previous-generation AMD processors. AMD Family 15h version 2 processor performance is approximately 1.5 times slower than previous AMD processors.
• When writing out four non-temporal streams, AMD Family 15h version 1 can be up to three times slower than previous AMD processors. AMD Family 15h version 2 processor performance is comparable to previous AMD processors.
• Using non-temporal stores but not writing out an entire cacheline may cause performance to be up to six times slower than previous AMD processors.
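
The caveats above are easier to compare as a lookup table. This Python sketch just transcribes the quoted factors; the table layout and function name are hypothetical, and it assumes the partial-cacheline penalty is the upper bound regardless of stream count, which the quoted text does not actually spell out.

```python
# Worst-case streaming-store slowdown factors, transcribed from the AMD text
# quoted above. 1.0 means "comparable to previous AMD processors".
# The table layout and function name are hypothetical.
SLOWDOWN = {
    # (Family 15h version, number of streams): factor vs. previous generation
    (1, 1): 1.0,
    (2, 1): 1.0,
    (1, 2): 3.0,
    (2, 2): 1.5,
    (1, 4): 3.0,
    (2, 4): 1.0,
}
PARTIAL_CACHELINE_SLOWDOWN = 6.0  # "up to six times slower"


def worst_case_slowdown(version: int, streams: int, full_cacheline: bool = True) -> float:
    """Return the quoted worst-case slowdown for a streaming-store pattern."""
    if (version, streams) not in SLOWDOWN:
        raise ValueError("the quoted text only covers 1, 2 or 4 streams on version 1 or 2")
    if not full_cacheline:
        # Assumption: the partial-cacheline figure dominates every other case.
        return PARTIAL_CACHELINE_SLOWDOWN
    return SLOWDOWN[(version, streams)]


if __name__ == "__main__":
    for (version, streams), factor in sorted(SLOWDOWN.items()):
        print(f"version {version}, {streams} stream(s): up to {factor}x slower")
```

Read this way, the version 2 parts only keep a penalty in the two-stream case, and partial cacheline writes are the thing to avoid on either version.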

*goes away to get more popcorn*
 

cadaveca

My name is Dave
Joined
Apr 10, 2006
Messages
17,232 (2.53/day)
Both? Alright so we already know AMD has better ram bandwidth, let's look at the cache. Btw 775 doesn't even have L3.

http://i46.tinypic.com/ta1se8.png
http://img219.imageshack.us/img219/3314/cachemem2m.png

Note the DDR3 on phenom only gives it a .2 latency boost on L2. It would win either way.

Biggest difference I see is the read and copy are switched. Overall it appears AMD is faster on cache as well. Yet super pi still does better on 775 despite all that. So really what purpose does super pi have here in comparing AMD and Intel chips if the architecture is making a bigger impact than the memory speeds?

I'm sorry, but your comparison here is inaccurate. You've got AMD with DDR3, and Intel with DDR2. I dunno about you, but I ran my 775 on DDR3 as soon as DDR3 boards came out. In fact, my old 775 board, a Foxconn BlackOps, that supports DDR3, is on its way to EasyRhino right now.


Anyway, the point was that SuperPi can directly relate to SOME APPs and how they can perform, and is in no way meant to be used as a comparison for all performance scenarios.


And I do have screenshots from that platform. I'll not fall for the obvious problems in your comparison; your troll failed, sry. :laugh:
 

faramir

New Member
Joined
May 20, 2011
Messages
203 (0.04/day)
Unsurprisingly, as you've been prone to do today, seronx, you didn't actually read what you were responding to. I'm starting to think it's a reading comprehension deficiency.

That, or a bit of this. That's another link for him.
 
Joined
Apr 21, 2010
Messages
146 (0.03/day)
Location
Perth, Australia
Processor 5800x3d
Motherboard Asus B550 Gaming-F
Cooling Ek 240 Aio
Memory Gskill Trident Neo 4000 18-22-22-42 @3800 fclk 1900
Video Card(s) 2080ti
Storage 1 TB Nvme
Power Supply Seasonic 750w
Software Win 11
AMD is counting integer clusters as "cores". There are 8 integer clusters on BD, so they say 8 cores.
@cadaveca
Isn't it the cache on AMD chips that is significantly lower performing, and not so much the memory (RAM) bandwidth? From what I've seen, memory bandwidth on AMD isn't that far behind Intel. Also, Super Pi tests that fit at or below the chip's cache size should be limited only by cache bandwidth/latency. The larger tests should show combined effects from cache and memory.
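
A rough way to picture that argument in Python: given a working-set size, list which levels of the memory hierarchy it has to touch. The cache sizes below are placeholders, not figures for any chip in this thread, and the model is deliberately naive (it ignores exclusive vs. inclusive caches, associativity and prefetching). How big the working set of each Super Pi test actually is would have to be measured separately.

```python
# Hypothetical cache sizes in bytes; swap in the real figures for the chip under test.
CACHE_SIZES = {"L1D": 64 * 1024, "L2": 512 * 1024, "L3": 6 * 1024 * 1024}


def limiting_levels(working_set_bytes: int, cache_sizes: dict = CACHE_SIZES) -> list:
    """Return the memory levels a working set of this size would spill into.

    Naive model: the data fills the smallest cache first and overflows into
    the next larger level, ending in RAM if nothing is big enough.
    """
    levels = []
    for name, size in sorted(cache_sizes.items(), key=lambda item: item[1]):
        levels.append(name)
        if working_set_bytes <= size:
            return levels
    levels.append("RAM")
    return levels


if __name__ == "__main__":
    for size in (32 * 1024, 1024 * 1024, 64 * 1024 * 1024):
        print(f"{size // 1024} KB working set touches: {', '.join(limiting_levels(size))}")
```

If a test's working set never reaches the "RAM" entry, its time should mostly reflect cache bandwidth/latency, which is the point being made above.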

From my understanding, if AMD were to go out of business then Intel would get carved up into bite-sized chunks that would have to compete with each other. Anyway, why would you want the competition to fold? It just leads to higher prices. Ideally you want at least 3 major players in a market, each controlling roughly equal market share. That way you get lots of competition and good prices.
 
Joined
Jul 10, 2010
Messages
1,233 (0.23/day)
Location
USA, Arizona
System Name SolarwindMobile
Processor AMD FX-9800P RADEON R7, 12 COMPUTE CORES 4C+8G
Motherboard Acer Wasp_BR
Cooling It's Copper.
Memory 2 x 8GB SK Hynix/HMA41GS6AFR8N-TF
Video Card(s) ATI/AMD Radeon R7 Series (Bristol Ridge FP4) [ACER]
Storage TOSHIBA MQ01ABD100 1TB + KINGSTON RBU-SNS8152S3128GG2 128 GB
Display(s) ViewSonic XG2401 SERIES
Case Acer Aspire E5-553G
Audio Device(s) Realtek ALC255
Power Supply PANASONIC AS16A5K
Mouse SteelSeries Rival
Keyboard Ducky Channel Shine 3
Software Windows 10 Home 64-bit (Version 1607, Build 14393.969)
False.

Basically each module has one fetch/decode unit capable of issuing 4 macrops per cycle (same as Intel does since Nehalem, or sooner I'm not sure actually). So while a Phenom X6 had in total six units capable of issuing 3 Mops each, 8 "core" BD has 4 units capable of issuing 4 Mops each.

I don't care anymore



I got confused with this picture

You were right but my mind remembered something else

The 64KB L1I is divided by 2, so each core gets 32KB of L1I, just like Intel
 

GenTarkin

New Member
Joined
Oct 16, 2008
Messages
24 (0.00/day)
x87 is like 5 years old and completely obsolete. The Phenom II is running Super Pi with the SSE Instruction sets up to SSE3. Intel gets the benefit of the full SSE4, SSE4.1 and SSE4.2.

So of course there are no x87 programs. Why would anyone do that?

Um dude, Phenom II doesn't use SSE anything for SuperPI... neither does SB.
SuperPI only uses x87 in its codebase, so that's what's run on both processors in that benchmark.
It makes no sense for any modern uarch to strive for x87 prowess... so I'm pretty sure SuperPI is the last thing on AMD's mind... if it ever was to begin with =P
It just so happens SB is better at x87 stuff... who cares!
I wish people would drop SuperPI altogether, it's meaningless nowadays... yet people use it to leave a good or bad taste in their mouth about an upcoming uarch... freakin' ridiculous way to make first impressions of a new uarch!!!
 
Joined
Mar 26, 2008
Messages
1,877 (0.31/day)
Location
Cobourg,Ontario
System Name RyZen FX
Processor AMD Ryzen 9 5900x
Motherboard Gigabyte B550 Aorus Elite AX V2
Cooling DeepCool AK400 Zero Dark Plus
Memory Corsair CMK32GX4M2E3200C16 X2 32gig dual channel
Video Card(s) ASUS RX 7700XT TUF OC
Storage x2 Lexar SSD NM710 2TB 2XSeagate 1Terrabyte 1x Seagate 2 Terrabyte
Display(s) 40 Inch Samsung HDTV (monitor)
Case HAF-X:)
Audio Device(s) AMD/HDMI to Onkyo HT-R508 Receiver
Power Supply EVGA SuperNOVA 1000 G2 Power Supply
Software Windows 10 Pro X64
How do you figure? Preliminary pricing has the 8 core BD at 330 dollars and the 990FX boards are priced around the same as P67 boards.

As for overclocking. Sandy Bridge processors are 95W TDP's, BD 8 core is 140W. Which do you think is going to have an easier time overclocking?



By 1% to 2% ser. There will be no miracle 10% gains.

There are 125 and 95 watt parts for Bulldozer. The 186 is an eng sample, so it leaks more than a B2 chip.
 
Joined
Jul 10, 2010
Messages
1,233 (0.23/day)
Location
USA, Arizona
System Name SolarwindMobile
Processor AMD FX-9800P RADEON R7, 12 COMPUTE CORES 4C+8G
Motherboard Acer Wasp_BR
Cooling It's Copper.
Memory 2 x 8GB SK Hynix/HMA41GS6AFR8N-TF
Video Card(s) ATI/AMD Radeon R7 Series (Bristol Ridge FP4) [ACER]
Storage TOSHIBA MQ01ABD100 1TB + KINGSTON RBU-SNS8152S3128GG2 128 GB
Display(s) ViewSonic XG2401 SERIES
Case Acer Aspire E5-553G
Audio Device(s) Realtek ALC255
Power Supply PANASONIC AS16A5K
Mouse SteelSeries Rival
Keyboard Ducky Channel Shine 3
Software Windows 10 Home 64-bit (Version 1607, Build 14393.969)
There are 125 and 95 watt parts for Bulldozer. The 186 is an eng sample, so it leaks more than a B2 chip.

I didn't think about that :banghead: and it doesn't help it leaking that much at 3.2GHz lol

All FX Chips are overclockable

95 Watts FX-X110, 125 Watts FX-8130P

Um dude, Phenom II doesn't use SSE anything for SuperPI... neither does SB.
SuperPI only uses x87 in its codebase, so that's what's run on both processors in that benchmark.
It makes no sense for any modern uarch to strive for x87 prowess... so I'm pretty sure SuperPI is the last thing on AMD's mind... if it ever was to begin with =P
It just so happens SB is better at x87 stuff... who cares!
I wish people would drop SuperPI altogether, it's meaningless nowadays... yet people use it to leave a good or bad taste in their mouth about an upcoming uarch... freakin' ridiculous way to make first impressions of a new uarch!!!

exactly
 
Joined
Apr 4, 2008
Messages
4,686 (0.77/day)
System Name Obelisc
Processor i7 3770k @ 4.8 GHz
Motherboard Asus P8Z77-V
Cooling H110
Memory 16GB(4x4) @ 2400 MHz 9-11-11-31
Video Card(s) GTX 780 Ti
Storage 850 EVO 1TB, 2x 5TB Toshiba
Case T81
Audio Device(s) X-Fi Titanium HD
Power Supply EVGA 850 T2 80+ TITANIUM
Software Win10 64bit
I'm sorry, but your comparison here is inaccurate. You've got AMD with DDR3, and Intel with DDR2. I dunno about you, but I ran my 775 on DDR3 as soon as DDR3 boards came out. In fact, my old 775 board, a Foxconn BlackOps, that supports DDR3, is on its way to EasyRhino right now.


Anyway, the point was that SuperPi can directly relate to SOME APPs and how they can perform, and is in no way meant to be used as a comparison for all performance scenarios.


And I do have screenshots from that platform. I'll not fall for the obvious problems in your comparison; your troll failed, sry. :laugh:

Ok. I've had enough of this crap from you. Every time you get pushed into a corner with one of your assumptions you clamp down into this "lalalala I can't hear you mode." That wasn't even remotely trollish to anyone but you. I explained the extent of the effect of the DDR3, which I had confirmed before posting. Here, see for yourself. http://www.legitreviews.com/article/902/6/

It's ok to have some confidence in your assumptions but you take it too far. Thinking I'm trolling you? Wth man.
 

cadaveca

My name is Dave
Joined
Apr 10, 2006
Messages
17,232 (2.53/day)
Yes, you are trolling, because although SuperPi is not indicative of real-world performance, it does correlate to overall memory performance. As seen in F1 2010.

You started off saying AMD had better RAM performance, but it does not; it only looks that way in your screenshots because you've got DDR2 vs DDR3. That's using skewed results that emulate what you want, rather than the truth. Start with factual comments, and I'll not call you a troll.

I've been doing cache speed comparisons since SKT754. If you search other forums for my posts, you'll find I even compared 1MB vs 2MB CPUs. You're not informing me (or anyone else) of anything.
 
Joined
Jan 24, 2011
Messages
180 (0.04/day)
seronx, I won't deny that BD has a balanced amount of resources so there won't be a bottleneck, and one of the links I provided was from AMD, not just some smart-ass guy, even if it was quite an old presentation.
The thing is, for me it would be 4 cores with 8 integer clusters, not 8 cores. For me a dual core is CMP: 2 identical cores that share at most the L3 cache (for data sharing between cores), HyperTransport, the Integrated Memory Controller and, in some cases, an IGP as in Llano or SB.
That's why I think they would be better off calling it 4 cores with AMD-threading or something like that, and not 8 cores just because a small part of the core die, just 12%, is doubled. What is doubled is not a core but an integer unit (cluster), just a part of one. Intel SB with HT also has an increase in die size thanks to HT, meaning something was doubled, though not as much as in an AMD module, yet no one calls it an 8 core even if it can virtually work with 8 threads. Why does doubling integer units mean double the number of cores, while doubling registers and some other things means just 4 cores?
(Sorry, I couldn't find what was actually doubled except some registers in the P4, but HT has improved a lot since then, even if I still think the module is the right choice and not HT.)

devguy, you wrote that the L3 cache, the Integrated Memory Controller, and the HyperTransport link are shared, and that because of this Deneb should count as just one core if BD isn't an 8 core, or something to that effect. That's a bad comparison in my opinion.
The L3 cache is there specifically so each core can access data from the others. What other reason would there be for it? L2 cache is faster, so making the L2 larger would be better for performance than creating a new, slower cache.
The IMC is for the CPU to communicate with the memory modules, so why should each core have its own IMC?
HyperTransport, or the Intel equivalent, is like the IMC: just a communication link between the CPU and the northbridge, southbridge or another CPU.
As far as I can recall, none of them was ever included in a core, at least not the IMC and HT.
It's enough to look at a BD module and a Deneb core to see that the difference is just twice the number of integer clusters, but integer clusters alone were never called cores, so why should they be now?
 
Joined
May 7, 2009
Messages
5,392 (0.95/day)
Location
Carrollton, GA
System Name ODIN
Processor AMD Ryzen 7 5800X
Motherboard Gigabyte B550 Aorus Elite AX V2
Cooling Dark Rock 4
Memory G Skill RipjawsV F4 3600 Mhz C16
Video Card(s) MSI GeForce RTX 3080 Ventus 3X OC LHR
Storage Crucial 2 TB M.2 SSD :: WD Blue M.2 1TB SSD :: 1 TB WD Black VelociRaptor
Display(s) Dell S2716DG 27" 144 Hz G-SYNC
Case Fractal Meshify C
Audio Device(s) Onboard Audio
Power Supply Antec HCP 850 80+ Gold
Mouse Corsair M65
Keyboard Corsair K70 RGB Lux
Software Windows 10 Pro 64-bit
Benchmark Scores I don't benchmark.
Technically your current title doesn't have "mod" in it. Your mod status is implied. lol
 
Joined
Jul 10, 2010
Messages
1,233 (0.23/day)
Location
USA, Arizona
System Name SolarwindMobile
Processor AMD FX-9800P RADEON R7, 12 COMPUTE CORES 4C+8G
Motherboard Acer Wasp_BR
Cooling It's Copper.
Memory 2 x 8GB SK Hynix/HMA41GS6AFR8N-TF
Video Card(s) ATI/AMD Radeon R7 Series (Bristol Ridge FP4) [ACER]
Storage TOSHIBA MQ01ABD100 1TB + KINGSTON RBU-SNS8152S3128GG2 128 GB
Display(s) ViewSonic XG2401 SERIES
Case Acer Aspire E5-553G
Audio Device(s) Realtek ALC255
Power Supply PANASONIC AS16A5K
Mouse SteelSeries Rival
Keyboard Ducky Channel Shine 3
Software Windows 10 Home 64-bit (Version 1607, Build 14393.969)
seronx, I won't deny that BD has a balanced amount of resources so there won't be a bottleneck, and one of the links I provided was from AMD, not just some smart-ass guy, even if it was quite an old presentation.
The thing is, for me it would be 4 cores with 8 integer clusters, not 8 cores. For me a dual core is CMP: 2 identical cores that share at most the L3 cache (for data sharing between cores), HyperTransport, the Integrated Memory Controller and, in some cases, an IGP as in Llano or SB.
That's why I think they would be better off calling it 4 cores with AMD-threading or something like that, and not 8 cores just because a small part of the core die, just 12%, is doubled. What is doubled is not a core but an integer unit (cluster), just a part of one. Intel SB with HT also has an increase in die size thanks to HT, meaning something was doubled, though not as much as in an AMD module, yet no one calls it an 8 core even if it can virtually work with 8 threads. Why does doubling integer units mean double the number of cores, while doubling registers and some other things means just 4 cores?
(Sorry, I couldn't find what was actually doubled except some registers in the P4, but HT has improved a lot since then, even if I still think the module is the right choice and not HT.)

Everything was doubled
2 x 128-bit SSE (1 x 256-bit AVX Add+Multiply)
2 x 16KB L1D
1 x 64KB L1I instead of 1 x 32KB L1I
64+64 and 32+32+32+32 registers instead of 64 and 32+32 registers
512KB (Phenom II) to 1MB L2 (Regor/Llano) to 2MB L2 (Zambezi)

Too lazy to look up what else was doubled

The formula has changed a bit
Two Identical cores now use L2(For Zambezi)
Several Modules now use L3(For Zambezi)



Rather old dissection
 
Joined
Feb 17, 2007
Messages
1,238 (0.19/day)
Location
SoCal
Processor AMD Phenom II 1055T @ 3.6ghz 1.3V
Motherboard Asus M5A97 EVO
Cooling Xigmatek SD1284
Memory 2x4GB Patriot Sector 5 PC3-12800 @ 7-8-7-24-1T 1.7V
Video Card(s) XFX Radeon HD 7950 DD @ 1100/1350 1.185V
Storage OCZ Agility 3 120GB + 2x7200.12 500GB Raid1
Display(s) QNIX QX2710 27" LCD 1440p @ 120hz
Case Cooler Master 690M
Audio Device(s) Realtek ALC892
Power Supply Enermax Liberty 620W Eco Edition
Software Windows 7 Professional x64 / Ubuntu 12.04 x64
devguy, you wrote that the L3 cache, the Integrated Memory Controller, and the HyperTransport link are shared, and that because of this Deneb should count as just one core if BD isn't an 8 core, or something to that effect. That's a bad comparison in my opinion.
The L3 cache is there specifically so each core can access data from the others. What other reason would there be for it? L2 cache is faster, so making the L2 larger would be better for performance than creating a new, slower cache.
The IMC is for the CPU to communicate with the memory modules, so why should each core have its own IMC?
HyperTransport, or the Intel equivalent, is like the IMC: just a communication link between the CPU and the northbridge, southbridge or another CPU.
As far as I can recall, none of them was ever included in a core, at least not the IMC and HT.
It's enough to look at a BD module and a Deneb core to see that the difference is just twice the number of integer clusters, but integer clusters alone were never called cores, so why should they be now?

Here's a quote from JF-AMD that you should read:
Um, old school processors had the FPU in a separate socket and few ever populated it. Are you telling me that everything prior to Pentium was a "zero core" or "half core" processor?

Processors are full of shared and discrete components. Memory controller, L2 cache, L3 cache, Northbridge, HT links, etc. All of that stuff can be shared. Why don't you give each core a memory controller? When we went from single core with a single memory controller to dual core with a single memory controller, where was the outrage? You can't really call that a "dual core" with only a single memory controller....

The world is apparently never going to completely agree on what a core is. MOST of the world looks at integer execution clusters as the "core".

Here is something that we can all agree on: They will have a performance level, they will have a power consumption and they will have a price. And those will be the things that people compare to today. I am actually happy to be living in a world that does not force me to make my processors exactly like my competitors.
 
Joined
Apr 21, 2010
Messages
146 (0.03/day)
Location
Perth, Australia
Processor 5800x3d
Motherboard Asus B550 Gaming-F
Cooling Ek 240 Aio
Memory Gskill Trident Neo 4000 18-22-22-42 @3800 fclk 1900
Video Card(s) 2080ti
Storage 1 TB Nvme
Power Supply Seasonic 750w
Software Win 11
I've been doing cache speed comparisons since SKT754. If you search other forums for my posts, you'll find I even compared 1MB vs 2MB CPUs. You're not informing me (or anyone else) of anything.

Is that at me?...

OK, can you clear something up for me? Doesn't Super Pi mostly stress cache bandwidth/latency, especially in the smaller tests, like below 8MB? I thought it was AMD's cache that was slower than Intel's and not so much the IMC, or does QPI significantly outpace it?

The question must be raised, is such detailed analysis even necessary? Are we comparing cache, the CPU memory subsystem (which for me, is caches, controller, and system memory), the system memory subsystem, or just overall performance?

I raised this point earlier...I care about game performance, so until I get game performance comparisons, none of this really matters to me. Bulldozer could be the slowest CPU ever, but if in some magical way it makes my games play better, then it's a win, for me. So, what's really important for you? Games, or something else?
Just curious, you were discussing Phenom mem perf vs Sandy or something earlier. I thought it might have some relevance to this. Also, below 8M the cache is what's being tested, and after that both the cache and the rest of the mem sub-system. I'm not sure what I'm saying anymore... too tired.
Games... look at my rig. I spent $70 on the CPU and $180 on the GFX. :D
 

cadaveca

My name is Dave
Joined
Apr 10, 2006
Messages
17,232 (2.53/day)
Is that at me?...

No, not at all! ;) It's to those posting comments like "x87 is now useless". No kidding x87 is useless, as is SuperPi. However, the way the runtime works creates high memory traffic, and that's what we are analyzing, not how fast the CPU does x87, nor how long it takes to calculate so many digits of Pi. It's not a "real-world" performance benchmark, it's a "simulated" performance benchmark, which means, by the nature of those definitions, that it must not be accepted as fact without special considerations. Raising any points about the validity of those benchmarks is stating the obvious, and as such, I consider it a troll posting.

OK, can you clear something up for me? Doesn't Super Pi mostly stress cache bandwidth/latency, especially in the smaller tests, like below 8MB? I thought it was AMD's cache that was slower than Intel's and not so much the IMC.

The question must be raised, is such detailed analysis even necessary? Are we comparing cache, the CPU memory subsystem (which for me, is caches, controller, and system memory), the system memory subsystem, or just overall performance?

I raised this point earlier...I care about game performance, so until I get game performance comparisons, none of this really matters to me. Bulldozer could be the slowest CPU ever, but if in some magical way it makes my games play better, then it's a win, for me. So, what's really important for you? Games, or something else?
 
Joined
Oct 29, 2010
Messages
2,972 (0.58/day)
System Name Old Fart / Young Dude
Processor 2500K / 6600K
Motherboard ASRock P67Extreme4 / Gigabyte GA-Z170-HD3 DDR3
Cooling CM Hyper TX3 / CM Hyper 212 EVO
Memory 16 GB Kingston HyperX / 16 GB G.Skill Ripjaws X
Video Card(s) Gigabyte GTX 1050 Ti / INNO3D RTX 2060
Storage SSD, some WD and lots of Samsungs
Display(s) BenQ GW2470 / LG UHD 43" TV
Case Cooler Master CM690 II Advanced / Thermaltake Core v31
Audio Device(s) Asus Xonar D1/Denon PMA500AE/Wharfedale D 10.1/ FiiO D03K/ JBL LSR 305
Power Supply Corsair TX650 / Corsair TX650M
Mouse Steelseries Rival 100 / Rival 110
Keyboard Sidewinder/ Steelseries Apex 150
Software Windows 10 / Windows 10 Pro
So going back to what we see in the screens posted at the beginning of this thread which sparked enthusiasm from some and skepticism from others, we can conclude that:

The Aida cache and memory benchmark is a disaster for BD, SuperPi the same and the other benchmarks are done at unknown clocks therefore we don’t have a true comparison with SB.

We’ll have to wait a little longer to really compare BD and SB.
 