
AMD Demonstrates 7nm Radeon Vega Instinct HPC Accelerator

Joined
Sep 17, 2014
Messages
22,452 (6.03/day)
Location
The Washing Machine
Processor 7800X3D
Motherboard MSI MAG Mortar b650m wifi
Cooling Thermalright Peerless Assassin
Memory 32GB Corsair Vengeance 30CL6000
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Lexar NM790 4TB + Samsung 850 EVO 1TB + Samsung 980 1TB + Crucial BX100 250GB
Display(s) Gigabyte G34QWC (3440x1440)
Case Lian Li A3 mATX White
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse Steelseries Aerox 5
Keyboard Lenovo Thinkpad Trackpoint II
Software W11 IoT Enterprise LTSC
Benchmark Scores Over 9000
Yes, I think Hawaii was the wake-up call for AMD because it showed an architecture that really was no longer up to snuff for gaming purposes (too hot, too hungry). Fury X was the HBM test-case project, a double-edged blade used in the high-end gaming segment one last time, and Vega represents the completed U-turn to new marketplaces and segments (IGP as well); the 56 and 64 gaming versions of it are just a bonus.

'Refinements'... that's just a rebrand on a smaller node, right? :p
 

T4C Fantasy

CPU & GPU DB Maintainer
Staff member
Joined
May 7, 2012
Messages
2,566 (0.56/day)
Location
Rhode Island
System Name Whaaaat Kiiiiiiid!
Processor Intel Core i9-12900K @ Default
Motherboard Gigabyte Z690 AORUS Elite AX
Cooling Corsair H150i AIO Cooler
Memory Corsair Dominator Platinum 32GB DDR4-3200
Video Card(s) EVGA GeForce RTX 3080 FTW3 ULTRA @ Default
Storage Samsung 970 PRO 512GB + Crucial MX500 2TB x3 + Crucial MX500 4TB + Samsung 980 PRO 1TB
Display(s) 27" LG 27MU67-B 4K, + 27" Acer Predator XB271HU 1440P
Case Thermaltake Core X9 Snow
Audio Device(s) Logitech G935 Headset
Power Supply SeaSonic Platinum 1050W Snow Silent
Mouse Logitech G903 Lightspeed
Keyboard Logitech G915
Software Windows 11 Pro
Benchmark Scores FFXV: 19329
Yes, I think Hawaii was the wake-up call for AMD because it showed an architecture that really was no longer up to snuff for gaming purposes (too hot, too hungry). Fury X was the HBM test-case project, a double-edged blade used in the high-end gaming segment one last time, and Vega represents the completed U-turn to new marketplaces and segments (IGP as well); the 56 and 64 gaming versions of it are just a bonus.

'Refinements'... that's just a rebrand on a smaller node, right? :p
Yeah, well, AMD's wording can be deceiving, but hey, that's definitely the case for Nvidia too. xD
 
Joined
Dec 28, 2012
Messages
3,884 (0.89/day)
System Name Skunkworks 3.0
Processor 5800x3d
Motherboard x570 unify
Cooling Noctua NH-U12A
Memory 32GB 3600 mhz
Video Card(s) asrock 6800xt challenger D
Storage Sabrent Rocket 4.0 2TB, MX500 2TB
Display(s) Asus 1440p144 27"
Case Old arse cooler master 932
Power Supply Corsair 1200w platinum
Mouse *squeak*
Keyboard Some old office thing
Software Manjaro
Considering the clock speed is 1200 MHz on a 4096-bit bus, it actually does matter; that's, I think, 300 MHz higher than Vega 10, with twice the bus width.

And Fury was like 500 MHz with 4 GB of HBM, a huge difference.
Like I said, hardware numbers don't matter, performance does.

If this chip clocks at 1200 MHz but only hits 1080 Ti performance after the 1100 series is out, it will be another flop, because having 1200 MHz doesn't matter if the chip can't deliver. And after the 300, 400, 500, and Vega series, I'm not holding my breath that AMD is going to compete well.

This chip looks promising, but so did Vega 64, and we all saw how that went. If this new chip is only 1080 Ti level, my 480's replacement will most likely have to be an Nvidia chip instead of an AMD chip, and I like my 480, so I'd prefer another AMD GPU.
 

bug

Joined
May 22, 2015
Messages
13,775 (3.96/day)
Processor Intel i5-12600k
Motherboard Asus H670 TUF
Cooling Arctic Freezer 34
Memory 2x16GB DDR4 3600 G.Skill Ripjaws V
Video Card(s) EVGA GTX 1060 SC
Storage 500GB Samsung 970 EVO, 500GB Samsung 850 EVO, 1TB Crucial MX300 and 2TB Crucial MX500
Display(s) Dell U3219Q + HP ZR24w
Case Raijintek Thetis
Audio Device(s) Audioquest Dragonfly Red :D
Power Supply Seasonic 620W M12
Mouse Logitech G502 Proteus Core
Keyboard G.Skill KM780R
Software Arch Linux + Win10
Like I said, hardware numbers don't matter, performance does.

If this chip clocks at 1200 MHz but only hits 1080 Ti performance after the 1100 series is out, it will be another flop, because having 1200 MHz doesn't matter if the chip can't deliver. And after the 300, 400, 500, and Vega series, I'm not holding my breath that AMD is going to compete well.

This chip looks promising, but so did Vega 64, and we all saw how that went. If this new chip is only 1080 Ti level, my 480's replacement will most likely have to be an Nvidia chip instead of an AMD chip, and I like my 480, so I'd prefer another AMD GPU.
Actually, the initial Vega looked rather lame, but then people couldn't stress enough how that was just the "professional" SKU and the gaming-oriented one would be tweaked and come with magic drivers.
To this day I can't figure out why, if Vega is so great, it hasn't been turned into a mid-range SKU to crush both the GTX 1060 and the RX 480/580.
 
robb
Joined
Apr 29, 2018
Messages
129 (0.05/day)
Yeah, and what creates this bandwidth? The speed of the RAM. It plays a very important role for graphics cards. As I said, you can have 900 GB/s running at 100 MHz or 400 GB/s running at 800 MHz. The difference is that the 800 MHz setup, with much lower total bandwidth, can and will be faster than the 900 GB/s one. It just depends on what you are using it for. Compare a highway with 2 lanes and a limit of 180 mph to a highway with 6 lanes and a limit of 80 mph. Which one do you think is faster for quick operations?

It was a design trade-off. They design primarily for the professional market and create a second division from the same chip just for gaming. Nvidia, beforehand, cuts FP64 performance to 1/64 to prevent people from buying gaming cards that are intended for the professional market. AMD does something similar. But they both were professional chips at some point before going into gaming.
You are an idiot. The effective bandwidth is what matters, and it's irrelevant how it's achieved.

2048 cores
128 TMUs
64 ROPs
128-bit bus
16000 MHz memory speed

2048 cores
128 TMUs
64 ROPs
256-bit bus
8000 MHz memory speed

Both of those would perform exactly the same given the same architecture and the same clock speeds. The fact that one uses a 128-bit bus and the other uses a 256-bit bus is 100% irrelevant if the effective memory bandwidth is the same and everything else is equal.
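The arithmetic behind that: peak bandwidth is bus width times effective data rate, divided by 8 bits per byte. A minimal Python sketch, assuming the "memory speed" figures above are effective transfer rates in MT/s:

```python
# Peak bandwidth (GB/s) = bus width (bits) * data rate (MT/s) / 8 (bits/byte) / 1000.

def bandwidth_gb_s(bus_width_bits: int, data_rate_mt_s: int) -> float:
    """Peak memory bandwidth in GB/s."""
    return bus_width_bits * data_rate_mt_s / 8 / 1000

card_a = bandwidth_gb_s(bus_width_bits=128, data_rate_mt_s=16000)  # 256.0 GB/s
card_b = bandwidth_gb_s(bus_width_bits=256, data_rate_mt_s=8000)   # 256.0 GB/s
assert card_a == card_b  # identical effective bandwidth, different routes to it
```

Half the bus at twice the rate lands on exactly the same number, which is the point.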
 

bug

Joined
May 22, 2015
Messages
13,775 (3.96/day)
You are an idiot. The effective bandwidth is what matters, and it's irrelevant how it's achieved.

2048 cores
128 TMUs
64 ROPs
128-bit bus
16000 MHz memory speed

2048 cores
128 TMUs
64 ROPs
256-bit bus
8000 MHz memory speed

Both of those would perform exactly the same given the same architecture and the same clock speeds. The fact that one uses a 128-bit bus and the other uses a 256-bit bus is 100% irrelevant if the effective memory bandwidth is the same and everything else is equal.
Technically you're right. What others were trying to say (I think) is that running at a higher frequency tends to mean less latency, which can influence some apps. But that's just one, very specific instance.
 
Joined
Feb 3, 2017
Messages
3,756 (1.32/day)
Processor Ryzen 7800X3D
Motherboard ROG STRIX B650E-F GAMING WIFI
Memory 2x16GB G.Skill Flare X5 DDR5-6000 CL36 (F5-6000J3636F16GX2-FX5)
Video Card(s) INNO3D GeForce RTX™ 4070 Ti SUPER TWIN X2
Storage 2TB Samsung 980 PRO, 4TB WD Black SN850X
Display(s) 42" LG C2 OLED, 27" ASUS PG279Q
Case Thermaltake Core P5
Power Supply Fractal Design Ion+ Platinum 760W
Mouse Corsair Dark Core RGB Pro SE
Keyboard Corsair K100 RGB
VR HMD HTC Vive Cosmos
Bandwidth is roughly bus width multiplied by memory speed (transfer rate). HBM so far is much slower but sits on a much wider bus.
Video RAM is sensitive to bandwidth much more than it is to latency.

The choice between HBM2, GDDR5 or GDDR5X is not about performance. Given the right combination of speed and bus width they can all get similar enough results.
Considerations are about cost, space on PCB/interposer, power consumption etc.

Technically you're right. What others were trying to say (I think) is that running at a higher frequency tends to mean less latency, which can influence some apps. But that's just one, very specific instance.
That depends on what frequency we are talking about. The frequency at which you can request data from GDDR5(X) and HBM is actually (roughly) the same. GDDR5(X) can simply transfer the data back a lot (8-16 times) faster than it can be requested (per pin, or effectively per same width of memory bus).
 

bug

Joined
May 22, 2015
Messages
13,775 (3.96/day)
Again, this is all theoretical. If the same bandwidth is achieved by two cards, but one runs the memory at 100 MHz while the other runs at 1,000 MHz, the latter can have a tenth of the former's latency (assuming that the data is already available to read on the next clock cycle - it usually isn't).
At least that's my understanding/guess about what previous posters were trying to say.
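To put rough numbers on that: cycle time is the reciprocal of clock frequency, so for a latency fixed in clock cycles, a tenfold clock increase cuts the wall-clock latency tenfold. A sketch; the 15-cycle access is a made-up illustration, not a measured timing:

```python
# Cycle time in nanoseconds from a clock in MHz: t = 1000 / f.

def cycle_time_ns(freq_mhz: float) -> float:
    return 1000.0 / freq_mhz

slow = cycle_time_ns(100.0)   # 10.0 ns per cycle
fast = cycle_time_ns(1000.0)  #  1.0 ns per cycle

# A hypothetical access taking 15 cycles on both parts:
print(15 * slow, "ns")  # 150.0 ns
print(15 * fast, "ns")  # 15.0 ns
```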
 
Joined
Feb 3, 2017
Messages
3,756 (1.32/day)
Memory frequency can be directly compared on the same type of memory.
The spec used across different types is data transfer rate per pin (per each bit of memory bus).

When we are talking about bandwidth, that is simple. When we are talking about latency, it gets more complicated. While the actual memory reads are in the same range (actual DRAM clock speeds should remain around 1 GHz), word sizes differ - 16b/32b/64b for GDDR5(X) and 128b/256b for HBM2 - as do addressing and read delays. I do not remember reading anything in-depth about that, and testing this is not too easy.

Take these three cards for example:
- Vega 64: https://gaming.radeon.com/en/product/vega/radeon-rx-vega-64/
- GTX 1080Ti: https://www.nvidia.com/en-us/geforce/products/10series/geforce-gtx-1080-ti/
- RX 580: https://www.amd.com/en/products/graphics/radeon-rx-580

Vega 64 (HBM2):
Memory Data Rate: 1.89Gbps
Memory Speed: 945MHz
Memory Interface: 2048-bit
Memory Bandwidth: Up to 484GB/s


945 MHz memory on a 2048-bit memory bus. The memory data rate is 1.89 Gbps: 945 MHz at dual data rate.
Memory bandwidth is up to 484 GB/s = 1.89 Gbps x 2048 bits of bus / 8 bits per byte

GTX 1080Ti (GDDR5X):
Memory Speed: 11 Gbps
Memory Interface Width: 352-bit
Memory Bandwidth (GB/sec): 484


Memory bandwidth is up to 484 GB/s = 11 Gbps x 352 bits of bus / 8 bits per byte

Note that they do not list the memory clock in the specs. Memory speed is actually the data rate spec. This is because there are multiple clocks for GDDR5(X) and none of them is exactly descriptive in a simple way. Knowing what the memory does, let's see what we can deduce about the speeds.
GDDR5X is either dual or quad data rate, so at maximum the I/O bus is running at a quarter of the data rate: 11 Gbps / 4 = 2.75 GHz
Data is requested and read at a quarter of the I/O bus frequency (thanks to 16n prefetch): 2.75 GHz / 4 = 0.687 GHz

Prefetch on GDDR5 means that where the memory needs to transfer out 32 bits of data at a time, it can internally load 8 (or 16 for GDDR5X) times as much from DRAM into prefetch buffers, and for the next 8 (or 16) transfers send data from the buffer instead of loading up more data.

RX580 (GDDR5):
Memory speed (effective): 8 Gbps
Memory Interface: 256-bit
Max. Memory Bandwidth: 256 GB/s


Memory bandwidth is up to 256 GB/s = 8 Gbps x 256 bits of bus / 8 bits per byte

Again, memory speed is actually the data rate spec.
GDDR5 is also dual data rate, so the I/O bus is running at half the data rate: 8 Gbps / 2 = 4 GHz
Data is requested and read at a quarter of the I/O bus frequency (thanks to 8n prefetch): 4 GHz / 4 = 1 GHz
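Collecting that arithmetic in one place, here is a minimal Python sketch; the data rates and bus widths come from the spec pages quoted above, while the I/O and command-clock divisors follow this post's reasoning, so treat the derived clocks as estimates rather than datasheet values:

```python
# Reproduces the bandwidth and clock estimates worked out above.
# data_rate: Gbps per pin; io_div: data rate -> I/O clock divisor
# (2 for dual data rate, 4 for quad); cmd_div: I/O clock -> command
# clock divisor implied by the prefetch depth.

def derive(name, bus_bits, data_rate, io_div, cmd_div):
    bandwidth = data_rate * bus_bits / 8  # GB/s
    io_clock = data_rate / io_div         # GHz
    cmd_clock = io_clock / cmd_div        # GHz
    print(f"{name}: {bandwidth:.0f} GB/s, I/O {io_clock:.3f} GHz, "
          f"command {cmd_clock:.3f} GHz")

derive("Vega 64 (HBM2)",       2048, 1.89, io_div=2, cmd_div=1)  # ~484 GB/s
derive("GTX 1080 Ti (GDDR5X)",  352, 11.0, io_div=4, cmd_div=4)  # ~484 GB/s
derive("RX 580 (GDDR5)",        256,  8.0, io_div=2, cmd_div=4)  # 256 GB/s
```

The HBM2 line treats the command clock as equal to the I/O clock, matching the point above that HBM can be addressed at roughly the same rate as GDDR5(X).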
 

Joined
Jun 10, 2014
Messages
2,987 (0.78/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
You are an idiot. The effective bandwidth is what matters, and it's irrelevant how it's achieved.
<snip>
Both of those would perform exactly the same given the same architecture and the same clock speeds. The fact that one uses a 128-bit bus and the other uses a 256-bit bus is 100% irrelevant if the effective memory bandwidth is the same and everything else is equal.
I just want to add: they will theoretically perform similarly, but not necessarily exactly the same; it all depends on the GPU architecture. GPU memory controllers are currently structured as multiple 64-bit controllers, each of which can only communicate with one cluster/GPC at a time. Having fewer, faster memory controllers would require faster scheduling to keep up, while having too many controllers increases the risk of congestion on one of them. So it all comes down to a balancing act: how the clusters, memory controllers, and the scheduler work together. Simply making a major change to one of them without redesigning the others will create bottlenecks.

It might be wise to distinguish between theoretical specs and actual performance. Just look at:
Vega 64: 4096 cores, 10215 GFlop/s, 483.8 GB/s.
GTX 1080: 2560 cores, 8228 GFlop/s, 320.2 GB/s.
I wonder which one performs better…
Compared to:
GTX 1080 Ti: 3584 cores, 10609 GFlop/s, 484.3 GB/s.
As we all know, Vega 64 has resources comparable to the GTX 1080 Ti, so it's not the lack of resources, as many AMD fans claim, but the lack of proper resource management.
In conclusion, theoretical specs might be similar on paper, but their actual performance will depend on the complete design.
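For reference, those GFLOP/s figures are just shader count × 2 FLOPs per clock (one fused multiply-add) × clock speed. A quick sketch; the clocks below are my approximations chosen to reproduce the quoted numbers, not official spec-sheet values:

```python
# FP32 throughput (GFLOP/s) = shaders * 2 (an FMA counts as 2 FLOPs) * clock (GHz).

def fp32_gflops(shaders: int, clock_ghz: float) -> float:
    return shaders * 2 * clock_ghz

print(fp32_gflops(4096, 1.247))  # Vega 64:     ~10215 GFLOP/s
print(fp32_gflops(2560, 1.607))  # GTX 1080:    ~8228 GFLOP/s
print(fp32_gflops(3584, 1.480))  # GTX 1080 Ti: ~10609 GFLOP/s
```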

Again, this is all theoretical. If the same bandwidth is achieved by two cards, but one runs the memory at 100 MHz while the other runs at 1,000 MHz, the latter can have a tenth of the former's latency (assuming that the data is already available to read on the next clock cycle - it usually isn't).
If I may add, memory latency is substantial: for DDR it's 50-60 ns for an access, even more for GDDR. When you're talking about speeds of 1000 MHz and above, the latency factor becomes negligible, and higher clocks more or less just impact the bandwidth.

AMD already said that there will be no gaming 7nm VEGA

So stop dreaming :-D
AMD has said all along that Vega 20 is targeting the "professional" market.
AMD can certainly change their mind, but as of this moment their position is still unchanged. But why would they bring Vega 20 to the consumer market? It only scales well with simple compute workloads, and Vega 20 is a GPU built for full FP64 support, which has no relevance for consumers.
 
Joined
Mar 21, 2016
Messages
2,508 (0.79/day)
Bandwidth is roughly bus width multiplied by memory speed (transfer rate). HBM so far is much slower but sits on a much wider bus.
Video RAM is sensitive to bandwidth much more than it is to latency.

The choice between HBM2, GDDR5 or GDDR5X is not about performance. Given the right combination of speed and bus width they can all get similar enough results.
Considerations are about cost, space on PCB/interposer, power consumption etc.

That depends on what frequency we are talking about. The frequency at which you can request data from GDDR5(X) and HBM is actually (roughly) the same. GDDR5(X) can simply transfer the data back a lot (8-16 times) faster than it can be requested (per pin, or effectively per same width of memory bus).
True latency is what really matters. Minimal latency without enough bandwidth isn't good, and tons of bandwidth with too-high latency isn't good either. That said, synchronization will come into play and make latency or bandwidth appear better at times for certain applications and use cases. I wonder if GDDR5 suffers from more texture flickering/pop-in than HBM, since it has more sequential burst reads while HBM seems better suited to random reads for streaming in data on the fly. Vega's HBM is better load-balanced, but its latency and bandwidth are a problem. The bus width wasn't wide given the HBM clock-speed scaling, for starters, which is why Fury ended up with better bandwidth overall. I think the bus width and higher HBM clock speed will certainly help it a fair amount in an apples-to-apples comparison with its predecessor. I'd have liked to see more of a bump in the HBM clock speed, but it's at least a step in the right direction. Perhaps the HBM will overclock better this time around, or there will be a clock refresh on the HBM itself not too far off.

Again, this is all theoretical. If the same bandwidth is achieved by two cards, but one runs the memory at 100 MHz while the other runs at 1,000 MHz, the latter can have a tenth of the former's latency (assuming that the data is already available to read on the next clock cycle - it usually isn't).
At least that's my understanding/guess about what previous posters were trying to say.
As far as a mere clock-speed-to-clock-speed comparison goes, yes, that is true. That's why minimum frame rates improve the most with memory overclocking: they are the most latency-sensitive, and yes, synchronization comes into play. The minimum frame rate is a big deal for 4K at the moment, since GPUs haven't trivialized it quite yet like they have 1080p and even 1440p.

It might be wise to distinguish between theoretical specs and actual performance. Just look at:
Vega 64: 4096 cores, 10215 GFlop/s, 483.8 GB/s.
GTX 1080: 2560 cores, 8228 GFlop/s, 320.2 GB/s.
I wonder which one performs better…
Compared to:
GTX 1080 Ti: 3584 cores, 10609 GFlop/s, 484.3 GB/s.
As we all know, Vega 64 has resources comparable to the GTX 1080 Ti, so it's not the lack of resources, as many AMD fans claim, but the lack of proper resource management.
In conclusion, theoretical specs might be similar on paper, but their actual performance will depend on the complete design.
TMUs/ROPs and pixel clock speed also come into play. If I remember right, Vega has fewer in one of those areas, and lower overall latency because of the GDDR5 clock speed being much higher; even despite a narrower bus, it's still better on a clock-speed-per-bit basis, if I'm not mistaken. That's without even getting into the compression advantage.
 
Joined
Aug 29, 2014
Messages
103 (0.03/day)
I learned not to get excited about AMD video card performance claims. They will not take the performance crown from Nvidia :(
 
Joined
Apr 12, 2013
Messages
7,531 (1.77/day)
Again, this is all theoretical. If the same bandwidth is achieved by two cards, but one runs the memory at 100 MHz while the other runs at 1,000 MHz, the latter can have a tenth of the former's latency (assuming that the data is already available to read on the next clock cycle - it usually isn't).
At least that's my understanding/guess about what previous posters were trying to say.
Are we talking about effective memory speed, like 16000 MHz for GDDR5X, or actual speeds? I think I asked this before ~ but we don't have latency numbers to compare HBM vs GDDR6 or GDDR5X, do we? The VRAM does QDR, so it isn't actually running at 8000 MHz or whatever, unlike desktop memory, which is DDR but runs at similar speeds.
 
Joined
Jul 9, 2015
Messages
3,413 (1.00/day)
System Name M3401 notebook
Processor 5600H
Motherboard NA
Memory 16GB
Video Card(s) 3050
Storage 500GB SSD
Display(s) 14" OLED screen of the laptop
Software Windows 10
Benchmark Scores 3050 scores good 15-20% lower than average, despite ASUS's claims that it has uber cooling.
the performance crown
Which GPU are you using now, the author of 3 posts?

Yes, I think Hawaii was the wake-up call for AMD because it showed an architecture that really was no longer up to snuff for gaming purposes
Yeah, that 290X merely beating Nvidia's Titan was sooo bad.

(too hot, too hungry)
That fanboy virtual reality bubble never ceases to amaze me:


https://www.tomshardware.com/reviews/radeon-r9-290x-hawaii-review,3650-29.html
 

bug

Joined
May 22, 2015
Messages
13,775 (3.96/day)
Which GPU are you using now, the author of 3 posts?


Yeah, that 290X merely beating Nvidia's Titan was sooo bad.


That fanboy virtual reality bubble never ceases to amaze me:


https://www.tomshardware.com/reviews/radeon-r9-290x-hawaii-review,3650-29.html
Yeah, that card wasn't power hungry at all: https://www.techpowerup.com/reviews/AMD/R9_290X/28.html
Oh wait, silly me, it was offering performance at an affordable price: https://www.techpowerup.com/reviews/AMD/R9_290X/29.html

Though to be honest the 280 and 285 were really good cards. But 290X was clearly pushed beyond that architecture's sweet spot.
 
Joined
Jul 9, 2015
Messages
3,413 (1.00/day)
System Name M3401 notebook
Processor 5600H
Motherboard NA
Memory 16GB
Video Card(s) 3050
Storage 500GB SSD
Display(s) 14" OLED screen of the laptop
Software Windows 10
Benchmark Scores 3050 scores good 15-20% lower than average, despite ASUS's claims that it has uber cooling.
Joined
Jun 10, 2014
Messages
2,987 (0.78/day)
AMD is exhausting their last die shrink until 2030 to achieve the performance of Pascal
TSMC 7 nm is comparable to Intel 10 nm, so we can expect at least one more shrink before it stops.

But your point is right; this might be the last "good shrink" for a long time. Moving an inefficient architecture to a new node will not make it great, and since the competition will also move to the new node, the gap is only going to increase.
 

bug

Joined
May 22, 2015
Messages
13,775 (3.96/day)
TSMC 7 nm is comparable to Intel 10 nm, so we can expect at least one more shrink before it stops.

But your point is right; this might be the last "good shrink" for a long time. Moving an inefficient architecture to a new node will not make it great, and since the competition will also move to the new node, the gap is only going to increase.
That may not be that bad. Intel invented tick-tock precisely because moving to a new node while simultaneously implementing a new architecture has traditionally been too challenging. It may be more cost-effective for AMD (and in turn for us) to move to 7 nm first and change the architecture later. Even if that means we'll be stuck with Vega for a while longer.
 
Joined
Jun 10, 2014
Messages
2,987 (0.78/day)
That may not be that bad. Intel invented tick-tock precisely because moving to a new node while simultaneously implementing a new architecture has traditionally been too challenging. It may be more cost-effective for AMD (and in turn for us) to move to 7 nm first and change the architecture later. Even if that means we'll be stuck with Vega for a while longer.
Sure, which is basically what they are already doing with the Vega 20.

You see people arguing that a shrunk Vega will matter, but we know it wouldn't even come close to Pascal in efficiency. AMD isn't even planning a consumer Vega on 7 nm.
 

bug

Joined
May 22, 2015
Messages
13,775 (3.96/day)
Sure, which is basically what they are already doing with the Vega 20.

You see people arguing that a shrunk Vega will matter, but we know it wouldn't even come close to Pascal in efficiency. AMD isn't even planning a consumer Vega on 7 nm.
Eh, like anyone cares about matching Pascal two years after Pascal's release... The only way for people to be interested in that would be if the cards sold for a lot less than their Pascal counterparts. Thanks to Vega being stuck with HBM, that's not going to happen.
I was strictly talking about the "Moving an inefficient architecture to a new node will not make it great" part - it will not make it great, but maybe that's not the point. Though God knows what AMD is thinking.
 
Joined
Jan 15, 2015
Messages
362 (0.10/day)
I learned not to get excited about AMD video card performance claims. They will not take the performance crown from Nvidia :(
AMD being small is unfortunate for competition because the company tries to create a Jack-of-All-Trades design. While this can be profitable for the company, as we've seen with Zen, it's not the path to class-leading performance. When your competitors have more money than you do they have more luxuries. Money = luxury budget. It's no different from the "real world" experience of people. It's why Xerox could afford to have PARC, for a time. For most people, gaming is a luxury business. Capturing PC gaming market profits is good, as is having robust pioneering R&D (like a PARC) but the more luxurious something is (distance from the Jack-of-All-Trades middle), the harder it will be to convince a board to approve it. This is compounded as the company shrinks in comparison with its competitors.

Zen has been a huge win but it came with compromises over pure performance in the PC enthusiast space, like low clocks/minimal overclocking and latency. The node is a big factor but Zen cores could have been designed for more performance at higher energy consumption and increased die space. AMD specifically said it wanted to make a core that could scale all the way from very low-wattage mobile to high-performance. That's nice but the word "scale" isn't a magic wand that erases the benefit of having more targeted designs. Those cost more money, though, to make and usually involve more risk. The key with Zen was to do just enough and keep it affordable. The high efficiency of the SMT was really a big design win, which is a bit droll considering its bet on CMT previously.

Gaming enthusiasts also are competing with AI/science/workstation/supercomputing, crypto, the laptop space, "consoles", etc. AMD wants one core that can do it all but what we would like is a special core for each niche — like a big core x86 with a loose library for high performance instead of just bigger core count. If AMD were bigger and richer it could do more of that. Crypto is probably more of a target for AMD's designers than gaming, outside of the console space. Crypto seems to fit better with the AI/science/workstation/supercomputing "pro compute" area. And, it, not PC gaming, holds the promise of demand higher than supply.

I wonder, for CPUs, if Big-Little will become a desktop trend. It seems to offer the benefits of turbo to a greater degree. Create several more powerful cores and supplement them with smaller ones. I'm sure articles have been written on this. But, maybe we'll someday see a desktop CPU with two large high-performance cores surrounded by smaller ones. Of course, like the enthusiast-grade gaming GPU, the enthusiast gamer x86 CPU seems to be mostly a thing of the past — or an afterthought at best. What advantage do those two big cores have for servers, for instance? I suppose it could be useful for some pro markets, though, like financial analysis (e.g. high-speed trading). With that, though, I don't see why supercomputers can't be harnessed. Is it really necessary to focus on a high-clock micro for those systems? Is the latency that bad?
 