• Welcome to TechPowerUp Forums, Guest! Please check out our forum guidelines for info related to our community.

Trinity (Piledriver) Integer/FP Performance Higher Than Bulldozer, Clock-for-Clock

OneMoar

There is Always Moar
Joined
Apr 9, 2010
Messages
8,795 (1.64/day)
Location
Rochester area
System Name RPC MK2.5
Processor Ryzen 5800x
Motherboard Gigabyte Aorus Pro V2
Cooling Thermalright Phantom Spirit SE
Memory CL16 BL2K16G36C16U4RL 3600 1:1 micron e-die
Video Card(s) GIGABYTE RTX 3070 Ti GAMING OC
Storage Nextorage NE1N 2TB ADATA SX8200PRO NVME 512GB, Intel 545s 500GBSSD, ADATA SU800 SSD, 3TB Spinner
Display(s) LG Ultra Gear 32 1440p 165hz Dell 1440p 75hz
Case Phanteks P300 /w 300A front panel conversion
Audio Device(s) onboard
Power Supply SeaSonic Focus+ Platinum 750W
Mouse Kone burst Pro
Keyboard SteelSeries Apex 7
Software Windows 11 +startisallback
k I am unsubbing from this thread until a mod hands out bans to the flamers here
 

eidairaman1

The Exiled Airman
Joined
Jul 2, 2007
Messages
42,222 (6.64/day)
Location
Republic of Texas (True Patriot)
System Name PCGOD
Processor AMD FX 8350@ 5.0GHz
Motherboard Asus TUF 990FX Sabertooth R2 2901 Bios
Cooling Scythe Ashura, 2×BitFenix 230mm Spectre Pro LED (Blue,Green), 2x BitFenix 140mm Spectre Pro LED
Memory 16 GB Gskill Ripjaws X 2133 (2400 OC, 10-10-12-20-20, 1T, 1.65V)
Video Card(s) AMD Radeon 290 Sapphire Vapor-X
Storage Samsung 840 Pro 256GB, WD Velociraptor 1TB
Display(s) NEC Multisync LCD 1700V (Display Port Adapter)
Case AeroCool Xpredator Evil Blue Edition
Audio Device(s) Creative Labs Sound Blaster ZxR
Power Supply Seasonic 1250 XM2 Series (XP3)
Mouse Roccat Kone XTD
Keyboard Roccat Ryos MK Pro
Software Windows 7 Pro 64
Joined
Mar 10, 2010
Messages
11,878 (2.21/day)
Location
Manchester uk
System Name RyzenGtEvo/ Asus strix scar II
Processor Amd R5 5900X/ Intel 8750H
Motherboard Crosshair hero8 impact/Asus
Cooling 360EK extreme rad+ 360$EK slim all push, cpu ek suprim Gpu full cover all EK
Memory Corsair Vengeance Rgb pro 3600cas14 16Gb in four sticks./16Gb/16GB
Video Card(s) Powercolour RX7900XT Reference/Rtx 2060
Storage Silicon power 2TB nvme/8Tb external/1Tb samsung Evo nvme 2Tb sata ssd/1Tb nvme
Display(s) Samsung UAE28"850R 4k freesync.dell shiter
Case Lianli 011 dynamic/strix scar2
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset
Power Supply corsair 1200Hxi/Asus stock
Mouse Roccat Kova/ Logitech G wireless
Keyboard Roccat Aimo 120
VR HMD Oculus rift
Software Win 10 Pro
Benchmark Scores 8726 vega 3dmark timespy/ laptop Timespy 6506
I actually thought it had not gone too Bad, mayhap because i posted once before this:)

i cant wait for the ins and outs of whats going on to be known in a few / year ,what will the ps3 and nextbox have etc.

im looking forward to a promising Vishera, not an awe inspireing one but id imagine with the new clock mesh tech these Apu's will OC a fair bit, given a reasonable Vreg which is only going to be on the Fm2 Platform not soo much laptops but in that form i can well imagine a modern day console type gameing experience on most pc games, maybe better/ deff better then an xbox360 game , and i can see it running well with an OC.

and with L3 ,more modules/cores and a (important)later possible stepping vishera could end up doing well, I only back Amd at any point due to the fact some are OT given i get 60-80Fps in any game on ultra(addmitedly with hybrid physx for the nv favoured games) with my main rig , an intel system may well do much better but My experience isnt as bad as some of you are making out i dont notice any wait times and these chips are going to perform better then my main rig does at min or should
 
Joined
Feb 13, 2012
Messages
523 (0.11/day)
I actually thought it had not gone too Bad, mayhap because i posted once before this:)

i cant wait for the ins and outs of whats going on to be known in a few / year ,what will the ps3 and nextbox have etc.

im looking forward to a promising Vishera, not an awe inspireing one but id imagine with the new clock mesh tech these Apu's will OC a fair bit, given a reasonable Vreg which is only going to be on the Fm2 Platform not soo much laptops but in that form i can well imagine a modern day console type gameing experience on most pc games, maybe better/ deff better then an xbox360 game , and i can see it running well with an OC.

and with L3 ,more modules/cores and a (important)later possible stepping vishera could end up doing well, I only back Amd at any point due to the fact some are OT given i get 60-80Fps in any game on ultra(addmitedly with hybrid physx for the nv favoured games) with my main rig , an intel system may well do much better but My experience isnt as bad as some of you are making out i dont notice any wait times and these chips are going to perform better then my main rig does at min or should

yes thats very true about the clocks, just imagine if a quad core piledriver can do 100watt tdp at 4.2ghz with half of the chip being a gpu clocked at 800mhz, just imagine how far can the quad core piledriver cores go without the gpu, or atleast how efficient they would be
from leaks it seems piledriver is 20% than bulldozer clock-clock so it almost finaly matches phenom II ipc, but offcourse clocks much higher.
once again the phenom/phenom II story going on, with piledriver being what bulldozer was meant to be(and some... or hopefully atleast)
 
Joined
May 18, 2010
Messages
3,427 (0.65/day)
System Name My baby
Processor Athlon II X4 620 @ 3.5GHz, 1.45v, NB @ 2700Mhz, HT @ 2700Mhz - 24hr prime95 stable
Motherboard Asus M4A785TD-V EVO
Cooling Sonic Tower Rev 2 with 120mm Akasa attached, Akasa @ Front, Xilence Red Wing 120mm @ Rear
Memory 8 GB G.Skills 1600Mhz
Video Card(s) ATI ASUS Crossfire 5850
Storage Crucial MX100 SATA 2.5 SSD
Display(s) Lenovo ThinkVision 27" (LEN P27h-10)
Case Antec VSK 2000 Black Tower Case
Audio Device(s) Onkyo TX-SR309 Receiver, 2x Kef Cresta 1, 1x Kef Center 20c
Power Supply OCZ StealthXstream II 600w, 4x12v/18A, 80% efficiency.
Software Windows 10 Professional 64-bit
from leaks it seems piledriver is 20% than bulldozer clock-clock so it almost finaly matches phenom II ipc, but offcourse clocks much higher.

OK stop spreading false information - I know you didn't do it deliberately but all this false information needs to be nipped in the bud.

Phenom II is not 20% faster than Bulldozer clock for clock. It's ridiculous to think that. Obviously its application dependant, but on a good day when an application favours Phenom II's architecture we are talking about maybe 5% or less or within margin for error. Overall the Bulldozer is faster.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,171 (2.81/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
OK stop spreading false information - I know you didn't do it deliberately but all this false information needs to be nipped in the bud.

Phenom II is not 20% faster than Bulldozer clock for clock. It's ridiculous to think that. Obviously its application dependant, but on a good day when an application favours Phenom II's architecture we are talking about maybe 5% or less or within margin for error. Overall the Bulldozer is faster.

He said IPC, buddy. Just because BD has a lower IPC it doesn't mean that it doesn't run faster. Once BD's IPC is down to where it was with the Phenom II, there will be a lot more performance because bulldozer clocks that much higher than the Phenom IIs did.

Also all of those tasks that do better on the P2 are single threaded tasks and unoptimized floating point applications and even in both of these cases, the performance is acceptable.
 
Joined
May 18, 2010
Messages
3,427 (0.65/day)
System Name My baby
Processor Athlon II X4 620 @ 3.5GHz, 1.45v, NB @ 2700Mhz, HT @ 2700Mhz - 24hr prime95 stable
Motherboard Asus M4A785TD-V EVO
Cooling Sonic Tower Rev 2 with 120mm Akasa attached, Akasa @ Front, Xilence Red Wing 120mm @ Rear
Memory 8 GB G.Skills 1600Mhz
Video Card(s) ATI ASUS Crossfire 5850
Storage Crucial MX100 SATA 2.5 SSD
Display(s) Lenovo ThinkVision 27" (LEN P27h-10)
Case Antec VSK 2000 Black Tower Case
Audio Device(s) Onkyo TX-SR309 Receiver, 2x Kef Cresta 1, 1x Kef Center 20c
Power Supply OCZ StealthXstream II 600w, 4x12v/18A, 80% efficiency.
Software Windows 10 Professional 64-bit
He said IPC, buddy. Just because BD has a lower IPC it doesn't mean that it doesn't run faster. Once BD's IPC is down to where it was with the Phenom II, there will be a lot more performance because bulldozer clocks that much higher than the Phenom IIs did.

Also all of those tasks that do better on the P2 are single threaded tasks and unoptimized floating point applications and even in both of these cases, the performance is acceptable.

Yes, but those single threaded tasks don't do 20% better as sergionography implied.
 

eidairaman1

The Exiled Airman
Joined
Jul 2, 2007
Messages
42,222 (6.64/day)
Location
Republic of Texas (True Patriot)
System Name PCGOD
Processor AMD FX 8350@ 5.0GHz
Motherboard Asus TUF 990FX Sabertooth R2 2901 Bios
Cooling Scythe Ashura, 2×BitFenix 230mm Spectre Pro LED (Blue,Green), 2x BitFenix 140mm Spectre Pro LED
Memory 16 GB Gskill Ripjaws X 2133 (2400 OC, 10-10-12-20-20, 1T, 1.65V)
Video Card(s) AMD Radeon 290 Sapphire Vapor-X
Storage Samsung 840 Pro 256GB, WD Velociraptor 1TB
Display(s) NEC Multisync LCD 1700V (Display Port Adapter)
Case AeroCool Xpredator Evil Blue Edition
Audio Device(s) Creative Labs Sound Blaster ZxR
Power Supply Seasonic 1250 XM2 Series (XP3)
Mouse Roccat Kone XTD
Keyboard Roccat Ryos MK Pro
Software Windows 7 Pro 64
how about we wait for trinity to be in the hands of the reviewers and users here. Same with AM3+ Piledriver
 
Joined
Jan 20, 2010
Messages
868 (0.16/day)
Location
Toronto, ON. Canada
System Name Gamers PC
Processor AMD Phenom II X4 965 BE @ 3.80 GHz
Motherboard MSI 790FX-GD70 AM3
Cooling Corsair H50 Cooler
Memory Corsair XMS3 4GB (2x2GB) DDR3-1333
Video Card(s) XFX Radeon HD 5770 1GB GDDR5
Storage 2 x WD Caviar Green 1TB SATA300 w/64MB Buffer (RAID 0)
Display(s) Samsung 2494SW 1080p 24" WS LCD HD
Case CM HAF 932 Full Tower Case
Audio Device(s) Creative SB X-FI TITANIUM -PCIE x 1
Power Supply Corsair TX Series CMPSU-650TX (650W)
Software Windows 7 Ultimate 64-bit
On a rather disturbing note, the performance-per-GHz figures of Piledriver are trailing far behind K12 architecture (Llano, A8-3850), let alone competitive architectures from Intel.
Each and every design is different. Piledriver/Bulldozer is design for higher clock speed. Llano and K12 is not. Just like the Athlon 64 of past it needed less clock speed to beat out Pentium 4 that needed at least an extra 1000MHz to stay competative.
 
Last edited:
Joined
Mar 13, 2012
Messages
396 (0.09/day)
Location
USA
Quite true, both Llano and Trinity are performing @ about 2400 integer score/ GHZ, which is 20% lower than the i5 SB score.
So depending on pricing and the clocks for the lower end chips, Trinity may be fully competitive thanks to it's higher clocks and superior iGPU, especially with IB i3/pentium not coming out till Q3/Q4 and trinity coming out late Q1, early Q2.
And if the earlier rumor of power efficiency is true as well, with it being 15% more power efficient than llano, and given BD OC'd rather decently, the unlocked parts I feel based on what information we currently have available to us at the moment show a great budget part.

Although it is still merely speculation until Wiz gets to do a review.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,171 (2.81/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
Yes, but those single threaded tasks don't do 20% better as sergionography implied.

At the same clock speed, I bet they did. Can we get a review of a Phenom II and BD at the same clock, HTT speeds, and memory speeds? It would answer this question very quickly.

This statement complete rubbish.

Stop trolling and only post if you have something useful to contribute.
 
Joined
Nov 4, 2005
Messages
11,984 (1.72/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
This statement complete rubbish.

Apparently you missed the chart, or lack the understanding to read it.


They directly compare the FP performance per clock, and the A8 series is raping the A10 and bulldozer if the chart is real.

In other words, AMD may have done nothing but tweaked the core a bit to conserve energy and increase the speed. All joking aside this new architecture is the P4 from AMD.


I keep thinking and saying their only saving grace will be GCN added to a quad core and software enhancement to offload the work to the much faster GPU, however I think they lack the manpower and drive to do it. So I am expecting mediocrity from their next chip after this too. Once they push for it, or pull back to a tweaked for efficency design they have a chance to gain the performance edge.


AMD, please, make software to support your hardware, Intel did it for years, programs would see a Intel chip and optimize performance, you can do it too. I would rather have two cores dedicated to serving data to a on bard GPU with enough stream processors and cache to run full tilt than have 8 cores total. Or do it in hardware, surely 10% die area is worth a exponential increase in performance.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,171 (2.81/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
surely 10% die area is worth a exponential increase in performance.

Bulldozer modules scales almost linearly, hyper-threading does not. I keep telling people that single-threaded applications aren't the future, they're the past. I think people are pandering about things that won't matter in the future since nothing is optimized for BD. (Applications that use FMA3 on BD actually have sizable floating point speed improvements.)

AMD, please, make software to support your hardware, Intel did it for years, programs would see a Intel chip and optimize performance, you can do it too. I would rather have two cores dedicated to serving data to a on bard GPU with enough stream processors and cache to run full tilt than have 8 cores total.

Clearly you don't know how the development model works. Why would you prepare software for cutting edge hardware that the majority of people don't have. People use technologies when it benefits them, and it benefits software companies when people have hardware that can run their software. That means requiring something like FMA3 puts people who don't have SB or BD at a loss which only hurts the consumer and the software developer.
 
Joined
Mar 24, 2011
Messages
2,356 (0.47/day)
Location
VT
Processor Intel i7-10700k
Motherboard Gigabyte Aurorus Ultra z490
Cooling Corsair H100i RGB
Memory 32GB (4x8GB) Corsair Vengeance DDR4-3200MHz
Video Card(s) MSI Gaming Trio X 3070 LHR
Display(s) ASUS MG278Q / AOC G2590FX
Case Corsair X4000 iCue
Audio Device(s) Onboard
Power Supply Corsair RM650x 650W Fully Modular
Software Windows 10
Bulldozer modules scales almost linearly, hyper-threading does not.

That's because HTing isn't intended to serve the same function. It's just there so the CPU can use previously unused resources to get some work done instead of idling. Bulldozer modules do scale well, but the problem is shit scaling linearly is still just shit. Plus it's not as though having 1,000 Cores is better than just 4 good ones for most people.

I keep telling people that single-threaded applications aren't the future, they're the past. I think people are pandering about things that won't matter in the future since nothing is optimized for BD. (Applications that use FMA3 on BD actually have sizable floating point speed improvements.)

I don't think anyone want's Single-Threaded applications, they are more like an unfortunate reality. This isn't like the Athlon X2 era when people were saying it didn't matter because only a handful of people even had Multi-Core CPU's, at this point just about everything comes with at LEAST a Dual-Core. The issue is that Bulldozer CPU's only get an edge when there are more than 4 Threads and you're using a BD CPU with more than 4 cores. Even then, having such low per-core performance usually results in the Intel CPU's winning out.
 
Joined
Mar 13, 2012
Messages
396 (0.09/day)
Location
USA
Apparently you missed the chart, or lack the understanding to read it.


They directly compare the FP performance per clock, and the A8 series is raping the A10 and bulldozer if the chart is real.

In other words, AMD may have done nothing but tweaked the core a bit to conserve energy and increase the speed. All joking aside this new architecture is the P4 from AMD.


I keep thinking and saying their only saving grace will be GCN added to a quad core and software enhancement to offload the work to the much faster GPU, however I think they lack the manpower and drive to do it. So I am expecting mediocrity from their next chip after this too. Once they push for it, or pull back to a tweaked for efficency design they have a chance to gain the performance edge.


AMD, please, make software to support your hardware, Intel did it for years, programs would see a Intel chip and optimize performance, you can do it too. I would rather have two cores dedicated to serving data to a on bard GPU with enough stream processors and cache to run full tilt than have 8 cores total. Or do it in hardware, surely 10% die area is worth a exponential increase in performance.
Apparently you don't see the point of BD/PD architecture. The very idea behind it is going to sacrifice FP performance by sharing the FP unit between two cores. Given that it is integer performance that matters to what the architecture is being geared for, thus is what is being improved. Note that the integer performance is the same as Llano but clocks 33% higher. Giving the quad core unlocked trinity only a 10% lower integer performance than the i5-2500k while having a superior iGPU. Meaning that it -should- outperform a SB i3, and we won't see IB i3's until Q3/Q4 most likely, and therefore Trinity should hold a very good value spot for up to 6 months, and quite possibly remain competitive with the IB i3's thanks to it's unlocked variants and it's superior iGPU.

Why do I keep mentioning the iGPU? Because AMD's long term plan is HSA, and dumping floating point math onto the iGPU. And HSA functions -should- be available next year, with 22nm steamroller + GCN (expected to be possibly a 7750-equivalent) on die.

Oh, and earlier possible leaks show Trinity to be more power efficient than Llano as well.



Also @ genocide, 20% lower performance / clock and 10% lower performance / watt = shit? I see it as being less efficient and powerful, but it's not like it is only half as powerful. (and I'm saying that based on the slide btw. Given the A10 is a 100w part that is really a 95w CPU + GPU + 5w bridge chip and all. Bulldozer was something of a fail, Piledriver isn't looking to be quite as bad.

I have to agree on the appearance of a Phenom I / II again here.
 
Joined
Nov 4, 2005
Messages
11,984 (1.72/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
Bulldozer modules scales almost linearly, hyper-threading does not. I keep telling people that single-threaded applications aren't the future, they're the past. I think people are pandering about things that won't matter in the future since nothing is optimized for BD. (Applications that use FMA3 on BD actually have sizable floating point speed improvements.)



Clearly you don't know how the development model works. Why would you prepare software for cutting edge hardware that the majority of people don't have. People use technologies when it benefits them, and it benefits software companies when people have hardware that can run their software. That means requiring something like FMA3 puts people who don't have SB or BD at a loss which only hurts the consumer and the software developer.

At a linear rate of what? Adding more cores to processors doesn't directly improve performance as many threaded items are/have dependancies, so core 0 may be still processing a thread that core 1 needs the result of to start work. IPC is still extremely important, thinking otherwise is naive.

So first you say the way is multi-core, and now you say they shouldn't prepare for multi-core systems? This makes no sense, if we are moving to a multi-core standard (we are) we need to have hardware/software resources to support it, and if developers aren't going to do it, AMD needs to.
 
Joined
Mar 13, 2012
Messages
396 (0.09/day)
Location
USA
At a linear rate of what? Adding more cores to processors doesn't directly improve performance as many threaded items are/have dependancies, so core 0 may be still processing a thread that core 1 needs the result of to start work. IPC is still extremely important, thinking otherwise is naive.

So first you say the way is multi-core, and now you say they shouldn't prepare for multi-core systems? This makes no sense, if we are moving to a multi-core standard (we are) we need to have hardware/software resources to support it, and if developers aren't going to do it, AMD needs to.

So given that it has 80-90% of the single core performance while scaling better, i would say there is an advantage. Technically it's single thread performance / watt that is important than pure IPC. Although they -usually- tend to go hand in hand vs clocks.
 
Joined
May 18, 2010
Messages
3,427 (0.65/day)
System Name My baby
Processor Athlon II X4 620 @ 3.5GHz, 1.45v, NB @ 2700Mhz, HT @ 2700Mhz - 24hr prime95 stable
Motherboard Asus M4A785TD-V EVO
Cooling Sonic Tower Rev 2 with 120mm Akasa attached, Akasa @ Front, Xilence Red Wing 120mm @ Rear
Memory 8 GB G.Skills 1600Mhz
Video Card(s) ATI ASUS Crossfire 5850
Storage Crucial MX100 SATA 2.5 SSD
Display(s) Lenovo ThinkVision 27" (LEN P27h-10)
Case Antec VSK 2000 Black Tower Case
Audio Device(s) Onkyo TX-SR309 Receiver, 2x Kef Cresta 1, 1x Kef Center 20c
Power Supply OCZ StealthXstream II 600w, 4x12v/18A, 80% efficiency.
Software Windows 10 Professional 64-bit
At the same clock speed, I bet they did. Can we get a review of a Phenom II and BD at the same clock, HTT speeds, and memory speeds? It would answer this question very quickly.

I've seen many comparison reviews of the two CPUs in question, at the same clock speed, and none was anything close to 20% average increase IPC over the Bulldozer.

I'd be happy to read a review which shows that claim. If anyone has external reading material feel free to post it.
 
Joined
Feb 13, 2012
Messages
523 (0.11/day)
Yes, but those single threaded tasks don't do 20% better as sergionography implied.

they actualy do dude, just go and look at an fx4100 review and compare it to a phenom II 980BE, you are looking an an fx8150 which clocks up to 4.2 and has 8 cores thats why it does better than a typical quad core phenom II, but comparing a quad core bulldozer to a quad core phenom II it fails miserably
only in situations were new instructions sets are supported does bulldozer hold ground,p but in typical use its way behind clock-clock, and yes by 20% if not more
phenom II does 3ipc while bulldozer does 4ipc shared between 2 cores, and because it has such a long pipeline each cycle takes a longer time(which isnt bad because its kinda designed that way so the resources can feed the second core in the module while the first one is munching on data)but things didnt go so well and the latency is worse than expected


http://www.legitreviews.com/article/1766/17/

heres some of conclusion from legitreview, i wasnt talking out of my ass just so you know

"When it comes to performance we were shocked to see the AMD A8-3850 'Llano' processor and the Socket FM1 platform performing better than the AMD FX-4100 'Bulldozer' processor and the Socket AM3+ platform. We quickly found out that the FX-4100 was priced this low as it needed to be. The performance of the FX-4100 wasn't awful, but we didn't expect to see the AMD A6-3650 running at 2.6GHz to beat the AMD FX-4100 running at 3.6GHz in benchmarks like POV-Ray and Cinebench! "
 
Joined
Nov 4, 2005
Messages
11,984 (1.72/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
So given that it has 80-90% of the single core performance while scaling better, i would say there is an advantage. Technically it's single thread performance / watt that is important than pure IPC. Although they -usually- tend to go hand in hand vs clocks.

Try between 50-90% better.

http://www.google.com/url?sa=t&rct=...sg=AFQjCNHF-2kq4OaqEGHMZBN3z0kXh3zM8A&cad=rja


Intel's own research shows a lack of increase when needing to tie up additional resources to schedule and track data between cores, and result dependency. I agree the initial result of two to four cores is a significant increase as we can offload other threads from the core running our primary worker, or assign different processes to different cores, however the overhead cost starts degrading the performance with more cores.


I was merely asking for a hardware thread handler, and if like Nvidias "hot clocks" it can run at 2 or 4 times the core speed it could easily dispatch and track resources, even handling the offload of work to the GPU cores for faster processing. I understand the unified memory and number of threads/different type of work makes it difficult, but compared to making mediocre processors blazingly fast, what downside is there? If it added 25W of heat but was only used on enthusiast grade processors I would still buy it, as would many.
 
Joined
Mar 13, 2012
Messages
396 (0.09/day)
Location
USA
Umm BD is not 1/10th the processing power of SB, try 70-80% of the power.
That said, I'm fairly sure. Many of us would buy higher clocked chips up to 250w, and it would still sell among us. Given it's still able to be air cooled and all and most mid tower cases supporting 120mm tower coolers. Just leave a cooler out of the box, or give an option for a 155-165mm tall cooler with push-pull fans and a good design and we're set.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,171 (2.81/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
That's because HTing isn't intended to serve the same function. It's just there so the CPU can use previously unused resources to get some work done instead of idling. Bulldozer modules do scale well, but the problem is shit scaling linearly is still just shit. Plus it's not as though having 1,000 Cores is better than just 4 good ones for most people.

I feel like all of my posts in this thread were just posted back to me...

I was merely asking for a hardware thread handler, and if like Nvidias "hot clocks" it can run at 2 or 4 times the core speed it could easily dispatch and track resources, even handling the offload of work to the GPU cores for faster processing. I understand the unified memory and number of threads/different type of work makes it difficult, but compared to making mediocre processors blazingly fast, what downside is there? If it added 25W of heat but was only used on enthusiast grade processors I would still buy it, as would many.

I think you're confusing how GPUs and CPUs work.

nVidia can do what they do because GPUs dispatch large workloads and runs a calculation on every shader that has data. CPUs don't work like this because you're not bulk processing the same instruction across a ton of data. You have different instructions being run, therefore what you're describing for a CPU is essentially a pipeline, which CPUs already have, but "dispatching" anything will result in less performance in single-threaded instances.

Do you know the basic 4 operations that almost any general purpose CPU does? Not to over-simplify how long a pipeline is, but basically you: LOAD, DECODE, EXECUTE, AND STORE, in that order. At this level, there is no parallelism, is's very step by step in the sense that you can't decode an instruction before you load it, you can't execute an instruction until it has been decoded, and you can't store the result after the instruction has been executed.
 
Joined
Nov 4, 2005
Messages
11,984 (1.72/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
Yes I am aware, as I am in the process of getting my degree in computer science. C++, Networking, and other classes.

A single thread on a CPU might run the four, but if we have a hardware scheduler that reads ahead and prefetches data "branching" and then performs the decode at twice the rate, programs shaders to do the work, and then they execute it and store it in the contiguous memory pool what difference does it make if the CPU transistors do it, or if the same instruction is run 5,000 times in the program, the GPU transistors do it.

Pretty simple actually, GPU's already do 90% of this work to keep up with demand. The hardest part would be resource tracking, but again, if they solve it and the performance increase is only 25% better on average, they win.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,171 (2.81/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
A single thread on a CPU might run the four, but if we have a hardware scheduler that reads ahead and prefetches data "branching" and then performs the decode at twice the rate, programs shaders to do the work, and then they execute it and store it in the contiguous memory pool what difference does it make if the CPU transistors do it, or if the same instruction is run 5,000 times in the program, the GPU transistors do it.

Except you can't process a regular application through a pipeline like a GPU has because GPU data is all the same where a computer program has multiple different instructions per clock cycle. A GPU is given a large set of data and told to do a single task to all of it, so it does it the same way. A CPU is instruction after instruction, there isn't a whole lot that represents what the GPU can do.

A shader is small because it has a limited number of instructions it can perform and has no control mechanism, no write back. There is no concept of threads in a GPU, it is an array of one or more sets of data that will have the same operation performed on the entire set. A shader is also SIMD, not MIMD as you're describing.

Where a CPU can carry out instructions like "move 10 bytes from memory location A to memory location B," A GPU does something more like "multiply every item in the array by 1.43."

Pretty simple actually, GPU's already do 90% of this work to keep up with demand. The hardest part would be resource tracking, but again, if they solve it and the performance increase is only 25% better on average, they win.

If it is so simple, why hasn't anyone else figured it out, I'm sill convinced that you don't quite know what you're talking about.

Yes I am aware, as I am in the process of getting my degree in computer science. C++, Networking, and other classes.

I do have a bachelors degree in computer science not to mention I'm employed as a systems admin and a developer.
 
Top