System Name | SolarwindMobile |
---|---|
Processor | AMD FX-9800P RADEON R7, 12 COMPUTE CORES 4C+8G |
Motherboard | Acer Wasp_BR |
Cooling | It's Copper. |
Memory | 2 x 8GB SK Hynix/HMA41GS6AFR8N-TF |
Video Card(s) | ATI/AMD Radeon R7 Series (Bristol Ridge FP4) [ACER] |
Storage | TOSHIBA MQ01ABD100 1TB + KINGSTON RBU-SNS8152S3128GG2 128 GB |
Display(s) | ViewSonic XG2401 SERIES |
Case | Acer Aspire E5-553G |
Audio Device(s) | Realtek ALC255 |
Power Supply | PANASONIC AS16A5K |
Mouse | SteelSeries Rival |
Keyboard | Ducky Channel Shine 3 |
Software | Windows 10 Home 64-bit (Version 1607, Build 14393.969) |
LuLz. The problem is the shared L2 of a module is not fast enough to feed two threads. /end story. There's no big mystery as to why BD is slow. I knew it before the CPU was out.
System Name | SolarwindMobile |
---|---|
Processor | AMD FX-9800P RADEON R7, 12 COMPUTE CORES 4C+8G |
Motherboard | Acer Wasp_BR |
Cooling | It's Copper. |
Memory | 2 x 8GB SK Hynix/HMA41GS6AFR8N-TF |
Video Card(s) | ATI/AMD Radeon R7 Series (Bristol Ridge FP4) [ACER] |
Storage | TOSHIBA MQ01ABD100 1TB + KINGSTON RBU-SNS8152S3128GG2 128 GB |
Display(s) | ViewSonic XG2401 SERIES |
Case | Acer Aspire E5-553G |
Audio Device(s) | Realtek ALC255 |
Power Supply | PANASONIC AS16A5K |
Mouse | SteelSeries Rival |
Keyboard | Ducky Channel Shine 3 |
Software | Windows 10 Home 64-bit (Version 1607, Build 14393.969) |
OK. Please explain to me how, then, moving two threads from running on two cores within a module, to one thread per module, is faster?
System Name | Quantumville™ |
---|---|
Processor | Intel Core i7-2700K @ 4GHz |
Motherboard | Asus P8Z68-V PRO/GEN3 |
Cooling | Noctua NH-D14 |
Memory | 16GB (2 x 8GB Corsair Vengeance Black DDR3 PC3-12800 C9 1600MHz) |
Video Card(s) | MSI RTX 2080 SUPER Gaming X Trio |
Storage | Samsung 850 Pro 256GB | WD Black 4TB | WD Blue 6TB |
Display(s) | ASUS ROG Strix XG27UQR (4K, 144Hz, G-SYNC compatible) | Asus MG28UQ (4K, 60Hz, FreeSync compatible) |
Case | Cooler Master HAF 922 |
Audio Device(s) | Creative Sound Blaster X-Fi Fatal1ty PCIe |
Power Supply | Corsair AX1600i |
Mouse | Microsoft Intellimouse Pro - Black Shadow |
Keyboard | Yes |
Software | Windows 10 Pro 64-bit |
LuLz. The problem is the shared L2 of a module is not fast enough to feed two threads. /end story. There's no big mystery as to why BD is slow. I knew it before the CPU was out.
Isn't it also slow because there are only 4 FPUs in the 8-core model? I was aghast when I first saw this.
System Name | SolarwindMobile |
---|---|
Processor | AMD FX-9800P RADEON R7, 12 COMPUTE CORES 4C+8G |
Motherboard | Acer Wasp_BR |
Cooling | It's Copper. |
Memory | 2 x 8GB SK Hynix/HMA41GS6AFR8N-TF |
Video Card(s) | ATI/AMD Radeon R7 Series (Bristol Ridge FP4) [ACER] |
Storage | TOSHIBA MQ01ABD100 1TB + KINGSTON RBU-SNS8152S3128GG2 128 GB |
Display(s) | ViewSonic XG2401 SERIES |
Case | Acer Aspire E5-553G |
Audio Device(s) | Realtek ALC255 |
Power Supply | PANASONIC AS16A5K |
Mouse | SteelSeries Rival |
Keyboard | Ducky Channel Shine 3 |
Software | Windows 10 Home 64-bit (Version 1607, Build 14393.969) |
Isn't it also slow because there are only 4 FPUs in the 8-core model? I was aghast when I first saw this.
System Name | Quantumville™ |
---|---|
Processor | Intel Core i7-2700K @ 4GHz |
Motherboard | Asus P8Z68-V PRO/GEN3 |
Cooling | Noctua NH-D14 |
Memory | 16GB (2 x 8GB Corsair Vengeance Black DDR3 PC3-12800 C9 1600MHz) |
Video Card(s) | MSI RTX 2080 SUPER Gaming X Trio |
Storage | Samsung 850 Pro 256GB | WD Black 4TB | WD Blue 6TB |
Display(s) | ASUS ROG Strix XG27UQR (4K, 144Hz, G-SYNC compatible) | Asus MG28UQ (4K, 60Hz, FreeSync compatible) |
Case | Cooler Master HAF 922 |
Audio Device(s) | Creative Sound Blaster X-Fi Fatal1ty PCIe |
Power Supply | Corsair AX1600i |
Mouse | Microsoft Intellimouse Pro - Black Shadow |
Keyboard | Yes |
Software | Windows 10 Pro 64-bit |
There are not just 4 FPUs. There are 4 256-bit FPUs, which can each handle dual 128-bit operations. Nearly nothing currently uses the 256-bit capability.
It's 4 floating-point units, but you have two units for each core. If it was 256-bit units, you wouldn't be complaining.
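To put rough numbers on the FPU point, here is a back-of-the-envelope sketch in Python. It takes the description above at face value (4 shared 256-bit FPUs, each able to split into two 128-bit operations) and ignores everything else in the pipeline, so treat it as an illustration of the counting argument, not as a model of real Bulldozer throughput. The function name and its parameters are made up for this example.

```python
def fp_ops_per_cycle(modules: int = 4,
                     fpu_width_bits: int = 256,
                     op_width_bits: int = 128) -> int:
    """Peak FP operations per cycle across all modules (idealised).

    Assumption: one shared FPU per module, which can be split into
    fpu_width_bits // op_width_bits independent operations.
    """
    if op_width_bits <= 0 or op_width_bits > fpu_width_bits:
        raise ValueError("operation width must be between 1 and the FPU width")
    return modules * (fpu_width_bits // op_width_bits)


if __name__ == "__main__":
    # 128-bit work: each FPU splits in two, so an 8-core part gets one op per core.
    print("128-bit ops/cycle:", fp_ops_per_cycle(op_width_bits=128))
    # 256-bit work: the whole FPU is taken by one op, so two cores share one slot.
    print("256-bit ops/cycle:", fp_ops_per_cycle(op_width_bits=256))
```

Under these assumptions the "only 4 FPUs" concern applies to 256-bit code only; for 128-bit code the count matches the number of integer cores.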
System Name | SolarwindMobile |
---|---|
Processor | AMD FX-9800P RADEON R7, 12 COMPUTE CORES 4C+8G |
Motherboard | Acer Wasp_BR |
Cooling | It's Copper. |
Memory | 2 x 8GB SK Hynix/HMA41GS6AFR8N-TF |
Video Card(s) | ATI/AMD Radeon R7 Series (Bristol Ridge FP4) [ACER] |
Storage | TOSHIBA MQ01ABD100 1TB + KINGSTON RBU-SNS8152S3128GG2 128 GB |
Display(s) | ViewSonic XG2401 SERIES |
Case | Acer Aspire E5-553G |
Audio Device(s) | Realtek ALC255 |
Power Supply | PANASONIC AS16A5K |
Mouse | SteelSeries Rival |
Keyboard | Ducky Channel Shine 3 |
Software | Windows 10 Home 64-bit (Version 1607, Build 14393.969) |
Now that I think about it, what is the word size of an FPU on previous 64-bit processors? It should be the same for AMD and Intel, I'd expect. (Yes, I know I could google it, but I'd rather you guys just explain it to me.)
System Name | Quantumville™ |
---|---|
Processor | Intel Core i7-2700K @ 4GHz |
Motherboard | Asus P8Z68-V PRO/GEN3 |
Cooling | Noctua NH-D14 |
Memory | 16GB (2 x 8GB Corsair Vengeance Black DDR3 PC3-12800 C9 1600MHz) |
Video Card(s) | MSI RTX 2080 SUPER Gaming X Trio |
Storage | Samsung 850 Pro 256GB | WD Black 4TB | WD Blue 6TB |
Display(s) | ASUS ROG Strix XG27UQR (4K, 144Hz, G-SYNC compatible) | Asus MG28UQ (4K, 60Hz, FreeSync compatible) |
Case | Cooler Master HAF 922 |
Audio Device(s) | Creative Sound Blaster X-Fi Fatal1ty PCIe |
Power Supply | Corsair AX1600i |
Mouse | Microsoft Intellimouse Pro - Black Shadow |
Keyboard | Yes |
Software | Windows 10 Pro 64-bit |
System Name | SolarwindMobile |
---|---|
Processor | AMD FX-9800P RADEON R7, 12 COMPUTE CORES 4C+8G |
Motherboard | Acer Wasp_BR |
Cooling | It's Copper. |
Memory | 2 x 8GB SK Hynix/HMA41GS6AFR8N-TF |
Video Card(s) | ATI/AMD Radeon R7 Series (Bristol Ridge FP4) [ACER] |
Storage | TOSHIBA MQ01ABD100 1TB + KINGSTON RBU-SNS8152S3128GG2 128 GB |
Display(s) | ViewSonic XG2401 SERIES |
Case | Acer Aspire E5-553G |
Audio Device(s) | Realtek ALC255 |
Power Supply | PANASONIC AS16A5K |
Mouse | SteelSeries Rival |
Keyboard | Ducky Channel Shine 3 |
Software | Windows 10 Home 64-bit (Version 1607, Build 14393.969) |
When someone can tell us why the L2 cache seems to be slow, it might be more clear why BD "sucks".
System Name | Lailalo |
---|---|
Processor | Ryzen 9 5900X Boosts to 4.95Ghz |
Motherboard | Asus TUF Gaming X570-Plus (WIFI) |
Cooling | Noctua |
Memory | 32GB DDR4 3200 Corsair Vengeance |
Video Card(s) | XFX 7900XT 20GB |
Storage | Samsung 970 Pro Plus 1TB, Crucial 1TB MX500 SSD, Seagate 3TB |
Display(s) | LG Ultrawide 29in @ 2560x1080 |
Case | Coolermaster Storm Sniper |
Power Supply | XPG 1000W |
Mouse | G602 |
Keyboard | G510s |
Software | Windows 10 Pro / Windows 10 Home |
We all know AMD's in Microsoft's pocket. They will do whatever they can software/driver-wise to help them.
L3 is shared between ALL cores. The problem isn't multithreaded workloads. The problem with BD is that single-threaded performance is lower than even Thuban. The most obvious difference, to me, between the two, is cache design and speed.
The L2 has to handle writes from two L1Ds, and it is handled by the WCC unit (the WCC can combine 4 x 8192Kb or send 4KB to the L2, write through).
The problem with multithreading once again won't be there
Memory Sub System isn't really the problem other than the L3 but the L3 problem only starts with more than one module being used
System Name | SolarwindMobile |
---|---|
Processor | AMD FX-9800P RADEON R7, 12 COMPUTE CORES 4C+8G |
Motherboard | Acer Wasp_BR |
Cooling | It's Copper. |
Memory | 2 x 8GB SK Hynix/HMA41GS6AFR8N-TF |
Video Card(s) | ATI/AMD Radeon R7 Series (Bristol Ridge FP4) [ACER] |
Storage | TOSHIBA MQ01ABD100 1TB + KINGSTON RBU-SNS8152S3128GG2 128 GB |
Display(s) | ViewSonic XG2401 SERIES |
Case | Acer Aspire E5-553G |
Audio Device(s) | Realtek ALC255 |
Power Supply | PANASONIC AS16A5K |
Mouse | SteelSeries Rival |
Keyboard | Ducky Channel Shine 3 |
Software | Windows 10 Home 64-bit (Version 1607, Build 14393.969) |
L3 is shared between ALL cores. The problem isn't multithreaded workloads. The problem with BD is that single-threaded performance is lower than even Thuban. The most obvious difference, to me, between the two, is cache design and speed.
Nobody cares about BD's multi-threaded performance. I'm not sure we're on the same topic here.
System Name | My pc |
---|---|
Processor | Ryzen 5 3600 |
Motherboard | Asus Rog b450-f |
Cooling | Cooler master 120mm aio |
Memory | 16gb ddr4 3200mhz |
Video Card(s) | MSI Ventus 3x 3070 |
Storage | 2tb intel nvme and 2tb generic ssd |
Display(s) | Generic dell 1080p overclocked to 75hz |
Case | Phanteks enthoo |
Power Supply | 650w of borderline fire hazard |
Mouse | Some wierd Chinese vertical mouse |
Keyboard | Generic mechanical keyboard |
Software | Windows ten |
OK. Please explain to me how, then, moving two threads from running on two cores within a module, to one thread per module, is faster, on workloads that don't require that the shared resources are used exclusively per thread?
Processor | Intel i7-10700k |
---|---|
Motherboard | Gigabyte Aurorus Ultra z490 |
Cooling | Corsair H100i RGB |
Memory | 32GB (4x8GB) Corsair Vengeance DDR4-3200MHz |
Video Card(s) | MSI Gaming Trio X 3070 LHR |
Display(s) | ASUS MG278Q / AOC G2590FX |
Case | Corsair X4000 iCue |
Audio Device(s) | Onboard |
Power Supply | Corsair RM650x 650W Fully Modular |
Software | Windows 10 |
System Name | Gamers PC |
---|---|
Processor | AMD Phenom II X4 965 BE @ 3.80 GHz |
Motherboard | MSI 790FX-GD70 AM3 |
Cooling | Corsair H50 Cooler |
Memory | Corsair XMS3 4GB (2x2GB) DDR3-1333 |
Video Card(s) | XFX Radeon HD 5770 1GB GDDR5 |
Storage | 2 x WD Caviar Green 1TB SATA300 w/64MB Buffer (RAID 0) |
Display(s) | Samsung 2494SW 1080p 24" WS LCD HD |
Case | CM HAF 932 Full Tower Case |
Audio Device(s) | Creative SB X-FI TITANIUM -PCIE x 1 |
Power Supply | Corsair TX Series CMPSU-650TX (650W) |
Software | Windows 7 Ultimate 64-bit |
Something's wrong there. I've seen tests done that show a 4C4M setup beats out a 4C2M setup in almost all tests done. And the higher you scale the CPU clock, the better the 4C4M becomes versus the 4C2M. This sharing within the Bulldozer design needs some real fine tuning IMO.
On my system, running a 4-threaded program on 4 cores (2 modules) is the same speed as running a 4-threaded program on 4 cores (4 modules).
On Cinebench anyway, not sure about anything else as I've not tested it.
But Cinebench should be a program that would highlight this right?
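For anyone who wants to repeat the 4C2M vs 4C4M comparison, here is a small Python sketch that works out the affinity masks for the two placements. It assumes logical cores 2m and 2m+1 belong to module m, which is an assumption about how the OS enumerates the cores and should be checked on the actual system. The function names are made up for this example.

```python
def placement(threads: int, modules: int = 4,
              cores_per_module: int = 2, spread: bool = True) -> list[int]:
    """Pick logical core indices for `threads` threads.

    spread=True  -> one thread per module (e.g. 4C4M)
    spread=False -> fill each module before moving on (e.g. 4C2M)
    Assumption: cores m*cores_per_module .. +cores_per_module-1 share module m.
    """
    if spread:
        if threads > modules:
            raise ValueError("not enough modules for one thread per module")
        return [m * cores_per_module for m in range(threads)]
    if threads > modules * cores_per_module:
        raise ValueError("not enough cores")
    return list(range(threads))


def affinity_mask(cores: list[int]) -> int:
    """Turn a list of logical core indices into a bitmask (bit n = core n)."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


if __name__ == "__main__":
    for label, spread in (("4C4M", True), ("4C2M", False)):
        cores = placement(4, spread=spread)
        print(f"{label}: cores {cores}, mask {affinity_mask(cores):#x}")
```

The resulting hex value is the kind of mask that `start /affinity` on Windows takes, so the same benchmark can be launched once with each mask and the scores compared.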
System Name | MY PC |
---|---|
Processor | E8400 @ 3.80Ghz > Q9650 3.60Ghz |
Motherboard | Maximus Formula |
Cooling | D5, 7/16" ID Tubing, Maze4 with Fuzion CPU WB |
Memory | XMS 8500C5D @ 1066MHz |
Video Card(s) | HD 2900 XT 858/900 to 4870 to 5870 (Keep Vreg area clean) |
Storage | 2 |
Display(s) | 24" |
Case | P180 |
Audio Device(s) | X-fi Plantinum |
Power Supply | Silencer 750 |
Software | XP Pro SP3 to Windows 7 |
Benchmark Scores | This varies from one driver to another. |
System Name | My pc |
---|---|
Processor | Ryzen 5 3600 |
Motherboard | Asus Rog b450-f |
Cooling | Cooler master 120mm aio |
Memory | 16gb ddr4 3200mhz |
Video Card(s) | MSI Ventus 3x 3070 |
Storage | 2tb intel nvme and 2tb generic ssd |
Display(s) | Generic dell 1080p overclocked to 75hz |
Case | Phanteks enthoo |
Power Supply | 650w of borderline fire hazard |
Mouse | Some wierd Chinese vertical mouse |
Keyboard | Generic mechanical keyboard |
Software | Windows ten |
There are 3 problems with Bulldozer, in order. If AMD can fix these in time for Piledriver, then they would have the ability to compete with Intel much better.
1 - It lacks hand-tuned optimisation (somebody already mentioned this)
2 - Dispatch Unit needs major tweaking
3 - L1 and L2 cache need a speed boost.
Something's wrong there. I've seen tests done that show a 4C4M setup beats out a 4C2M setup in almost all tests done. And the higher you scale the CPU clock, the better the 4C4M becomes versus the 4C2M. This sharing within the Bulldozer design needs some real fine tuning IMO.
Processor | 5800x(2)/5700g/5600x/5600g/2700x/1700x/1700 |
---|---|
Motherboard | MSI B550 Carbon (2)/ MSI z490 Unify/Asus Strix B550-F/MSI B450 Tomahawk (3) |
Cooling | EK AIO 360 (2)/EK AIO 240, Arctic Cooling Freezer II 280/EVGA CLC 280/Noctua D15/Cryorig M9(2) |
Memory | 32 GB Ballistix Elite/32 GB TridentZ/16GB Mushkin Redline Black/16 GB Dominator |
Video Card(s) | Asus Strix RTX3060/EVGA 970(2)/Asus 750 ti/Old Quadros |
Storage | Samsung 970 EVO M.2 NVMe 500GB/WD Black M.2 NVMe 500GB/Adata 500gb NVMe |
Display(s) | Acer 1080p 22"/ (3) Samsung 22" 1080p |
Case | (2) Lian Li Lancool II Mesh/Corsair 4000D /Phanteks Eclipse 500a/Be Quiet Pure Base 500/Bones of HAF |
Power Supply | EVGA Supernova 850G(2)/EVGA Supernova GT 650w/Phantek Amps 750w/Seasonic Focus 750w |
Mouse | Generic Black wireless (5) |
Keyboard | Generic Black wireless (5) |
Software | Win 10/Ubuntu |
As far as I can tell, it's more than having workloads balanced.
Now, there's a difference between Windows 8 and Windows 7 in how workloads are managed on a CPU, due to Windows 8 allowing what is called "core parking". This is basically fully shutting off a core when it's not in use, for power savings. Naturally, such control needs to be finely tuned so that threads do not stall, and bringing similar functionality to Windows 7 is what this patch is supposed to be all about. The ability to dynamically move threads from one core to the next without stalling the thread is not really a big thing, and if it really was an issue with the FP scheduler, there'd be much more than just a 10% boost possible...sometimes it would be a doubling of speed.
That said, no, I do not think there is any "saving grace" for BD in this. I really feel the L2 cache is too slow, and the numbers seem to agree. When someone can tell us why the L2 cache seems to be slow, it might be more clear why BD "sucks".
Price the 8150 @ $200, and it's killer. There's really nothing wrong with BD's design. The only thing that makes it look wrong is the pricing, and that's because everyone considers BD to compete with SB (rightly so).
System Name | RyzenGtEvo/ Asus strix scar II |
---|---|
Processor | Amd R5 5900X/ Intel 8750H |
Motherboard | Crosshair hero8 impact/Asus |
Cooling | 360EK extreme rad+ 360$EK slim all push, cpu ek suprim Gpu full cover all EK |
Memory | Corsair Vengeance Rgb pro 3600cas14 16Gb in four sticks./16Gb/16GB |
Video Card(s) | Powercolour RX7900XT Reference/Rtx 2060 |
Storage | Silicon power 2TB nvme/8Tb external/1Tb samsung Evo nvme 2Tb sata ssd/1Tb nvme |
Display(s) | Samsung UAE28"850R 4k freesync.dell shiter |
Case | Lianli 011 dynamic/strix scar2 |
Audio Device(s) | Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset |
Power Supply | corsair 1200Hxi/Asus stock |
Mouse | Roccat Kova/ Logitech G wireless |
Keyboard | Roccat Aimo 120 |
VR HMD | Oculus rift |
Software | Win 10 Pro |
Benchmark Scores | 8726 vega 3dmark timespy/ laptop Timespy 6506 |