
NVIDIA GP100 Silicon Moves to Testing Phase

64K

Joined
Mar 13, 2014
Messages
6,773 (1.73/day)
Processor i7 7700k
Motherboard MSI Z270 SLI Plus
Cooling CM Hyper 212 EVO
Memory 2 x 8 GB Corsair Vengeance
Video Card(s) Temporary MSI RTX 4070 Super
Storage Samsung 850 EVO 250 GB and WD Black 4TB
Display(s) Temporary Viewsonic 4K 60 Hz
Case Corsair Obsidian 750D Airflow Edition
Audio Device(s) Onboard
Power Supply EVGA SuperNova 850 W Gold
Mouse Logitech G502
Keyboard Logitech G105
Software Windows 10
Joined
Jul 18, 2007
Messages
2,693 (0.42/day)
System Name panda
Processor 6700k
Motherboard sabertooth s
Cooling raystorm block<black ice stealth 240 rad<ek dcc 18w 140 xres
Memory 32gb ripjaw v
Video Card(s) 290x gamer<ntzx g10<antec 920
Storage 950 pro 250gb boot 850 evo pr0n
Display(s) QX2710LED@110hz lg 27ud68p
Case 540 Air
Audio Device(s) nope
Power Supply 750w superflower
Mouse g502
Keyboard shine 3 with grey, black and red caps
Software win 10
Benchmark Scores http://hwbot.org/user/marsey99/

rtwjunkie

PC Gaming Enthusiast
Supporter
Joined
Jul 25, 2008
Messages
13,994 (2.35/day)
Location
Louisiana
Processor Core i9-9900k
Motherboard ASRock Z390 Phantom Gaming 6
Cooling All air: 2x140mm Fractal exhaust; 3x 140mm Cougar Intake; Enermax ETS-T50 Black CPU cooler
Memory 32GB (2x16) Mushkin Redline DDR-4 3200
Video Card(s) ASUS RTX 4070 Ti Super OC 16GB
Storage 1x 1TB MX500 (OS); 2x 6TB WD Black; 1x 2TB MX500; 1x 1TB BX500 SSD; 1x 6TB WD Blue storage (eSATA)
Display(s) Infievo 27" 165Hz @ 2560 x 1440
Case Fractal Design Define R4 Black -windowed
Audio Device(s) Soundblaster Z
Power Supply Seasonic Focus GX-1000 Gold
Mouse Coolermaster Sentinel III (large palm grip!)
Keyboard Logitech G610 Orion mechanical (Cherry Brown switches)
Software Windows 10 Pro 64-bit (Start10 & Fences 3.0 installed)
throw in the nv greed factor and it is $650.

welcome to 2015 nvidia, amd has been waiting for you.

You call it greed, but in the free market, companies only charge what people are willing to pay. As long as people are buying boatloads of a product at a given price, it won't drop.

If I manufactured a widget, and people were willing to buy all I made at $200, despite some people saying it's too expensive, I'm going to sell them for $200, because I have my own expenses and bills to pay for and my family to provide for.
 

FordGT90Concept

"I go fast!1!11!1!"
Joined
Oct 13, 2008
Messages
26,259 (4.46/day)
Location
IA, USA
System Name BY-2021
Processor AMD Ryzen 7 5800X (65w eco profile)
Motherboard MSI B550 Gaming Plus
Cooling Scythe Mugen (rev 5)
Memory 2 x Kingston HyperX DDR4-3200 32 GiB
Video Card(s) AMD Radeon RX 7900 XT
Storage Samsung 980 Pro, Seagate Exos X20 TB 7200 RPM
Display(s) Nixeus NX-EDG274K (3840x2160@144 DP) + Samsung SyncMaster 906BW (1440x900@60 HDMI-DVI)
Case Coolermaster HAF 932 w/ USB 3.0 5.25" bay + USB 3.2 (A+C) 3.5" bay
Audio Device(s) Realtek ALC1150, Micca OriGen+
Power Supply Enermax Platimax 850w
Mouse Nixeus REVEL-X
Keyboard Tesoro Excalibur
Software Windows 10 Home 64-bit
Benchmark Scores Faster than the tortoise; slower than the hare.
My point was that there is nothing FP32 can't do that FP16 can but there is a lot FP32 can't do that FP64 can. We should be moving towards FP64 and not away from it unless the architecture is designed in such a way as to allow four FP16, two FP32, or one FP64 operation to be completed simultaneously. I know that is not true of FP64 so I doubt it is true of FP16 either. It makes more sense to make a FP64 monster and feed it with FP16 and FP32 workloads than to focus on FP16 and nerf FP64.
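To put numbers on the "four FP16, two FP32, or one FP64" idea, here is a minimal sketch using only Python's standard `struct` module. The 64-bit lane width and the helper names `values_per_lane` and `pack_lane` are assumptions made up for illustration. It only shows how many values of each format fit in the same storage, and says nothing about whether any real GPU executes them at that rate.

```python
import struct

LANE_BYTES = 8  # hypothetical 64-bit lane, chosen only for illustration

# struct format codes for IEEE 754 half, single and double precision
FORMAT_CODES = {"FP16": "e", "FP32": "f", "FP64": "d"}


def values_per_lane(name):
    """Return how many values of the given format fit in one lane."""
    return LANE_BYTES // struct.calcsize("<" + FORMAT_CODES[name])


def pack_lane(name, values):
    """Pack exactly one lane's worth of values into raw bytes."""
    fmt = "<" + str(values_per_lane(name)) + FORMAT_CODES[name]
    return struct.pack(fmt, *values)


if __name__ == "__main__":
    for name in FORMAT_CODES:
        print(name, values_per_lane(name), "value(s) per 64-bit lane")
    print(len(pack_lane("FP16", [1.0, 2.0, 3.0, 4.0])), "bytes for four FP16 values")
```

The storage ratio is the easy part; whether the execution units can actually process the packed values in one pass is a separate hardware design question.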
 
Joined
Apr 2, 2011
Messages
2,810 (0.56/day)
So, the price tag associated with this particular product indicates nothing about the final price tag. When you ship something like this you do the replacement value of the goods+20%. As this is a one-off process, and the goods are theoretically having a setup and confirmation run just for this part, the value will be huge. You've got setup time, operator time, development time, etc... Combine this with a limited engineering sample of the HBM2 memory, and you've got an astronomically priced one-off card. This is why people don't commission custom GPUs.

As far as performance, I'd settle for a 50% performance improvement with a much smaller cooling budget. Nvidia has done well with their recent offerings, but not having GDDR5 and actually running these cards cooler would allow for either less power draw (can you say gaming rig on more often?), or a smaller size allowing for amazing performance.



The cynic inside me also has one question. Is the cited 60-90% improvement raw compute, or improvements based around DX12 mathematics? Given this is basically FUD, we'll only find out next year. Set anticipation to reasonable...


Edit:
Tripping speed-traps already? Sounds like this is gonna be one fast processor!! ba bum tss!!:D

Seriously though, goodbye 28nm, you shall not be missed, about time we moved to a smaller process :rockout:

I can see the sentiment, but heartily disagree. 28nm has had some amazing times, due in no small part to not leaving in a reasonable fashion. We've seen the 6xx, 7xx, and 9xx series (arguably the 9xx is the best DX11 will be getting) from Nvidia. We've seen the 7xxx, 2xx, and 3xx series from AMD (arguably the cards that made bitcoin famous/infamous). 28nm is something really worth celebrating.

That said, it's time for 16/14nm. Good lord, do we need a change.



Surprisingly though, think back on zombie processes. There are still parts built on the 45nm process, SB-e was 32nm with a 65nm PCH, and even the flagship PCH of Intel (Z170) is built on the 22nm lithographic process (CPUs really are the exception to the rule that low cost processes drive silicon adoption, but somebody has to be the trailblazer).
 
Joined
Apr 19, 2011
Messages
2,198 (0.44/day)
Location
So. Cal.
I take it this is just the GP100 TSMC silicon chip... I suppose these can have some functionality and operational parameters checked; however it's not until they have such chips and HBM2 added to an interposer that any real design validation can be made. It seems like it would be at least 6-8 months of everything going correctly and moving forward without any glitches before Nvidia has the professional (Tesla) versions fully vetted, finalized, and ready to intro to HPC clients. Then real production would commence, and it's probably a good 10-12 weeks to have complete packaged interposers mounted to PCBs. So figure end of 2016 for Tesla cards to start moving to fulfill the HPC orders. This news is interesting, but it would be more substantive for gamers if this was about the GP104, though much of the work is congruent and should quickly parlay into the first gaming card.
 
Joined
Apr 26, 2009
Messages
517 (0.09/day)
Location
You are here.
System Name Prometheus
Processor Intel i7 14700K
Motherboard ASUS ROG STRIX B760-I
Cooling Noctua NH-D12L
Memory Corsair 32GB DDR5-7200
Video Card(s) MSI RTX 4070Ti Ventus 3X OC 12GB
Storage WD Black SN850 1TB
Display(s) DELL U4320Q 4K
Case SSUPD Meshroom D Fossil Gray
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Corsair SF750 Platinum SFX
Mouse Razer Orochi V2
Keyboard Nuphy Air75 V2 White
Software Windows 11 Pro x64
welcome to 2015 nvidia, amd has been waiting for you.

Actually nVIDIA is way ahead of AMD. HBM1 doesn't seem to provide any tangible performance benefits at this time, and the 4GB limitation will hit everyone like a brick towards the second half of next year. So it made ZERO sense for nVIDIA to make an HBM1 product.

HBM2 provides a MINIMUM of 16GB of VRAM at twice the speed/bandwidth. If AMD survives until HBM2 you'll see marketing from them that will make the Fury X look like a school project. But since AMD is very-very-very quiet about HBM2, I don't think they'll get there too soon.
 
Joined
Jul 31, 2014
Messages
481 (0.13/day)
System Name Diablo | Baal | Mephisto | Andariel
Processor i5-3570K@4.4GHz | 2x Xeon X5675 | i7-4710MQ | i7-2640M
Motherboard Asus Sabertooth Z77 | HP DL380 G6 | Dell Precision M4800 | Lenovo Thinkpad X220 Tablet
Cooling Swiftech H220-X | Chassis cooled (6 fans + HS) | dual-fanned heatpipes | small-fanned heatpipe
Memory 32GiB DDR3-1600 CL9 | 96GiB DDR3-1333 ECC RDIMM | 32GiB DDR3L-1866 CL11 | 8GiB DDR3L-1600 CL11
Video Card(s) Dual GTX 670 in SLI | Embedded ATi ES1000 | Quadro K2100M | Intel HD 3000
Storage many, many SSDs and HDDs....
Display(s) 1 Dell U3011 + 2x Dell U2410 | HP iLO2 KVMoIP | 3200x1800 Sharp IGZO | 1366x768 IPS with Wacom pen
Case Corsair Obsidian 550D | HP DL380 G6 Chassis | Dell Precision M4800 | Lenovo Thinkpad X220 Tablet
Audio Device(s) Auzentech X-Fi HomeTheater HD | None | On-board | On-board
Power Supply Corsair AX850 | Dual 750W Redundant PSU (Delta) | Dell 330W+240W (Flextronics) | Lenovo 65W (Delta)
Mouse Logitech G502, Logitech G700s, Logitech G500, Dell optical mouse (emergency backup)
Keyboard 1985 IBM Model F 122-key, Ducky YOTT MX Black, Dell AT101W, 1994 IBM Model M, various integrated
Software FAAAR too much to list
My point was that there is nothing FP32 can't do that FP16 can but there is a lot FP32 can't do that FP64 can. We should be moving towards FP64 and not away from it unless the architecture is designed in such a way as to allow four FP16, two FP32, or one FP64 operation to be completed simultaneously. I know that is not true of FP64 so I doubt it is true of FP16 either. It makes more sense to make a FP64 monster and feed it with FP16 and FP32 workloads than to focus on FP16 and nerf FP64.

It's a speed vs accuracy argument, and for most consumer cases, FP32 is plenty. In some niches, FP16 is plenty, and for a minor increase in die area, you can pretty much double performance by using FP16 instead of FP32 (neural nets, low-fidelity image processing and rendering effects). And quite obviously, demand from enough very well-connected devs is there too, else nV wouldn't bother at all. Expect AMD to come up with something very similar.
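A quick way to see the accuracy side of that trade-off is to round-trip a few values through each format. This is only a sketch using Python's standard `struct` module; the helper name `round_trip` and the sample values are picked for illustration, and mapping an out-of-range value to infinity is an assumption that mimics typical IEEE 754 overflow behaviour rather than anything a specific GPU is confirmed to do.

```python
import math
import struct

# struct format codes for IEEE 754 half, single and double precision
FORMATS = {"FP16": "<e", "FP32": "<f", "FP64": "<d"}


def round_trip(value, fmt):
    """Store value in the given format and read it back as a Python float."""
    try:
        return struct.unpack(fmt, struct.pack(fmt, value))[0]
    except OverflowError:
        # struct refuses values beyond the format's range; treat that as
        # overflow to infinity (assumption, for illustration only).
        return math.inf


if __name__ == "__main__":
    for sample in (0.1, 2049.0, 70000.0, 16777217.0):
        for name, fmt in FORMATS.items():
            print(f"{sample!r:>12} stored as {name}: {round_trip(sample, fmt)!r}")
```

FP16 already loses whole integers above 2048 and tops out at 65504, while FP32 stays exact up to 2^24. That is fine for colour values or neural-net weights and hopeless for anything that needs a wide range, which is roughly the niche argument above.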

I take it this is just the GP100 TSMC silicon chip... I suppose these can have some functionality and operational parameters checked; however it's not until they have such chips and HBM2 added to an interposer that any real design validation can be made. It seems like it would be at least 6-8 months of everything going correctly and moving forward without any glitches before Nvidia has the professional (Tesla) versions fully vetted, finalized, and ready to intro to HPC clients. Then real production would commence, and it's probably a good 10-12 weeks to have complete packaged interposers mounted to PCBs. So figure end of 2016 for Tesla cards to start moving to fulfill the HPC orders. This news is interesting, but it would be more substantive for gamers if this was about the GP104, though much of the work is congruent and should quickly parlay into the first gaming card.

Interposer assembly would likely be done at the fab, since the interposer is really just another silicon die you drop other dies on, and where else besides fabs would you find the tech to do such an assembly?

Most likely SK Hynix is shipping full HBM2 wafers straight to TSMC for final assembly on the interposers.
 
Joined
Apr 5, 2010
Messages
734 (0.14/day)
Location
Israel
System Name PC ?
Processor AMD Ryzen 9 5950X
Motherboard Gigabyte X570 AORUS XTREME
Cooling NZXT Kraken X62
Memory 64gb of G.Skill Trident Z Neo 32GB 3600 / CL16
Video Card(s) Sapphire RX 7900 XTX NITRO+
Storage C:/ADATA XPG SX8200 Pro 2TB - D:/7TB of Storage (WD-Bx2) - X:/Samsung 840 EVO 1TB
Display(s) Samsung Neo G9 57"
Case Corsair 1000D
Audio Device(s) Cambridge Audio CXA60 + Klipsch RP-160M
Power Supply Seasonic PRIME Ultra Titanium 1000TR
Mouse Logitech G900
Keyboard Logitech G Pro Keyboard
Software Windows 10 Pro (64bit)
Actually nVIDIA is way ahead of AMD. HBM1 doesn't seem to provide any tangible performance benefits at this time, and the 4GB limitation will hit everyone like a brick towards the second half of next year. So it made ZERO sense for nVIDIA to make an HBM1 product.

HBM2 provides a MINIMUM of 16GB of VRAM at twice the speed/bandwidth. If AMD survives until HBM2 you'll see marketing from them that will make the Fury X look like a school project. But since AMD is very-very-very quiet about HBM2, I don't think they'll get there too soon.

HBM1 is four times faster than GDDR5, so you won't see anything new with HBM2, just a higher price tag.

"School project"? Just remember that AMD released the first GDDR5 too.
I think some of the nvidia team is going to this school.
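For anyone wanting to sanity-check bandwidth comparisons like the one above, theoretical peak bandwidth is just bus width times per-pin data rate. The sketch below is a rough illustration: the helper name `peak_bandwidth_gbs` is made up, and the bus widths and per-pin rates are public launch specifications recalled from memory rather than figures from this thread, so treat them as assumptions.

```python
def peak_bandwidth_gbs(bus_width_bits, gbps_per_pin):
    """Theoretical peak memory bandwidth in GB/s (bits per second / 8)."""
    return bus_width_bits * gbps_per_pin / 8


# (bus width in bits, data rate per pin in Gbps) -- assumed public specs
CONFIGS = {
    "HBM1, 4 stacks (Fury X)": (4096, 1.0),
    "GDDR5, 384-bit (GTX 980 Ti)": (384, 7.0),
    "GDDR5, 512-bit (R9 290X)": (512, 5.0),
    "HBM2, 4 stacks at spec maximum": (4096, 2.0),
}

if __name__ == "__main__":
    for name, (width, rate) in CONFIGS.items():
        print(f"{name}: {peak_bandwidth_gbs(width, rate):.0f} GB/s")
```

Under these assumed figures the card-level gap between HBM1 and a wide GDDR5 bus comes out closer to 1.5x than 4x; the larger ratio only appears when comparing a single HBM stack against a single GDDR5 chip.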
 
Joined
Apr 26, 2009
Messages
517 (0.09/day)
Location
You are here.
System Name Prometheus
Processor Intel i7 14700K
Motherboard ASUS ROG STRIX B760-I
Cooling Noctua NH-D12L
Memory Corsair 32GB DDR5-7200
Video Card(s) MSI RTX 4070Ti Ventus 3X OC 12GB
Storage WD Black SN850 1TB
Display(s) DELL U4320Q 4K
Case SSUPD Meshroom D Fossil Gray
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Corsair SF750 Platinum SFX
Mouse Razer Orochi V2
Keyboard Nuphy Air75 V2 White
Software Windows 11 Pro x64
Releasing "first" something is like posting "First!" on a YouTube video. The idea is to actually gain something by using that technology. When you have the competition's 256bit/384bit GDDR5 cards matching or outperforming your HBM1 cards, that you invested heavily in and that also introduce availability problems, how is it a smart move?

AMD also released the first 1GHz processor by a few hours. They also released the first desktop 64 bit processor. They also released the first desktop processor with an integrated memory controller. None of those things gave them any tangible advantages. Some did, but they had no idea about how to exploit them. And look at them now. It's not about being "first", it's about perfecting something until it makes sense to implement it in an actual product. Which is something Apple has done for years.

HBM1 doesn't make sense now. It did one or two years ago, about that time when nVIDIA presented their "test" HBM board. They showed it to us, as in "we're working on it", and when it makes sense, they'll release it.
 
Joined
Apr 5, 2010
Messages
734 (0.14/day)
Location
Israel
System Name PC ?
Processor AMD Ryzen 9 5950X
Motherboard Gigabyte X570 AORUS XTREME
Cooling NZXT Kraken X62
Memory 64gb of G.Skill Trident Z Neo 32GB 3600 / CL16
Video Card(s) Sapphire RX 7900 XTX NITRO+
Storage C:/ADATA XPG SX8200 Pro 2TB - D:/7TB of Storage (WD-Bx2) - X:/Samsung 840 EVO 1TB
Display(s) Samsung Neo G9 57"
Case Corsair 1000D
Audio Device(s) Cambridge Audio CXA60 + Klipsch RP-160M
Power Supply Seasonic PRIME Ultra Titanium 1000TR
Mouse Logitech G900
Keyboard Logitech G Pro Keyboard
Software Windows 10 Pro (64bit)
Releasing "first" something is like posting "First!" on a YouTube video. The idea is to actually gain something by using that technology. When you have the competition's 256bit/384bit GDDR5 cards matching or outperforming your HBM1 cards, that you invested heavily in and that also introduce availability problems, how is it a smart move?

AMD also released the first 1GHz processor by a few hours. They also released the first desktop 64 bit processor. They also released the first desktop processor with an integrated memory controller. None of those things gave them any tangible advantages. Some did, but they had no idea about how to exploit them. And look at them now. It's not about being "first", it's about perfecting something until it makes sense to implement it in an actual product. Which is something Apple has done for years.

HBM1 doesn't make sense now. It did one or two years ago, about that time when nVIDIA presented their "test" HBM board. They showed it to us, as in "we're working on it", and when it makes sense, they'll release it.
Yeah right, I still remember the 7970 and 680, when most nvidia cards had less memory than amd's and almost every nvidia user told me it was useless to have 3-4GB of memory on a video card. Where is the 680 now? Meanwhile amd is still selling the same old chip under a new name... and it's still pretty good.
 
Joined
Apr 19, 2011
Messages
2,198 (0.44/day)
Location
So. Cal.
Interposer assembly would be likely be done at the fab, since the interposer is really just another silicon die you drop other dies on, and where else beside fabs would you find the tech to do such an assembly?

Most likely SK Hynix is shipping full HBM2 wafers straight to TSMC for final assembly on the interposers.

Well, I don't know... have we heard anything that TSMC is making the interposers? I suppose that could be a possibility, especially if HBM2 has some different technique. TSMC may still have an older fab process that they could transition to making an interposer, but IDK if they'd even get into the packaging on the interposer. If they intended to offer that on a large scale I'd think they would've liked to have started cutting their teeth doing it for AMD.

AMD has its interposer coming from UMC, made on their 65nm process, with Amkor then handling the packaging and assembly of the chip and HBM onto the interposer.
 
Joined
Sep 7, 2011
Messages
2,785 (0.58/day)
Location
New Zealand
System Name MoneySink
Processor 2600K @ 4.8
Motherboard P8Z77-V
Cooling AC NexXxos XT45 360, RayStorm, D5T+XSPC tank, Tygon R-3603, Bitspower
Memory 16GB Crucial Ballistix DDR3-1600C8
Video Card(s) GTX 780 SLI (EVGA SC ACX + Giga GHz Ed.)
Storage Kingston HyperX SSD (128) OS, WD RE4 (1TB), RE2 (1TB), Cav. Black (2 x 500GB), Red (4TB)
Display(s) Achieva Shimian QH270-IPSMS (2560x1440) S-IPS
Case NZXT Switch 810
Audio Device(s) onboard Realtek yawn edition
Power Supply Seasonic X-1050
Software Win8.1 Pro
Benchmark Scores 3.5 litres of Pale Ale in 18 minutes.
I take it this is just the GP100 TSMC silicon chip... I suppose these can have some functionality and operational parameters checked; however it's not until they have such chips and HBM2 added to an interposer that any real design validation can be made.
That isn't how it works. The raw silicon is wired into a test rig that simulates the memory subsystems (along with other parameters such as PCI-E interface for bus simulation/validation for platform and multi-GPU). The silicon is then put through a series of logic runtime verification and validation protocols (RTL - register transfer level test/validation) to make sure the logic blocks work as intended, plus power management (inc. clock gating and RTAPI) and power verification to ensure that the individual logic blocks operate within design parameters. Obviously there is a ton of other validation also, but the test/verification team do not need the whole card (or interposer in this case) to validate the silicon - interface with any DDR3 RAM would suffice.
It seems like it would be at least 6-8 months of everything going correctly and moving forward without any glitches before Nvidia has the professional (Tesla) versions fully vetted, finalized, and ready to intro to HPC clients. Then real production would commence, and it's probably a good 10-12 weeks to have complete packaged interposers mounted to PCBs. So figure end of 2016 for Tesla cards to start moving to fulfill the HPC orders.
The initial timetable was to be early Q3. Deviation from that obviously depends upon a number of factors - whether the silicon needs revision, yields of 16nmFF+ (I am surprised that the TSMC naysayers haven't commented on the company's ability to produce a monolithic GPU on the process), and the HBM2 production timetable. Samsung have said volume production should start in early 2016. All in all, the best actual indicator would be an announced HPC contract, of which I don't think there are any at the moment (the announced HPC contracts are for Volta in 2018).
This news is interesting, but it would be more substantive for gamers if this was about the GP104, though much of the work is congruent and should quickly parlay into the first gaming card.
True enough. GP100 won't see the light of day - at least initially - as a GeForce card. Nvidia seems to have a built-in ready market for Tesla SKUs, judging by the groundswell in deep learning programming taking place. Makes no sense to sell a card to a gamer for $1K when you can easily sell the same product for $4-5K to a company investing in coding for artificial neural networking.
Well, I don't know... have we heard anything that TSMC is making the interposers? I suppose that could be a possibility, especially if HBM2 has some different technique. TSMC may still have an older fab process that they could transition to making an interposer, but IDK if they'd even get into the packaging on the interposer. If they intended to offer that on a large scale I'd think they would've liked to have started cutting their teeth doing it for AMD.
AFAIK, most foundries looking to get into interposer manufacture were waiting on test/validation tooling to catch up with what is largely an old foundry process (which probably added to the Fury delays). This chart gives an idea of who is involved in the die stacking business, and in what capacity (it isn't up to the minute, but is pretty current).


AMD has its interposer coming from UMC, made on their 65nm process, with Amkor then handling the packaging and assembly of the chip and HBM onto the interposer.
Correct, and I wouldn't be surprised to see Amkor (or ASE) responsible for the interposer integration for Pascal either.
 
Joined
Jul 31, 2014
Messages
481 (0.13/day)
System Name Diablo | Baal | Mephisto | Andariel
Processor i5-3570K@4.4GHz | 2x Xeon X5675 | i7-4710MQ | i7-2640M
Motherboard Asus Sabertooth Z77 | HP DL380 G6 | Dell Precision M4800 | Lenovo Thinkpad X220 Tablet
Cooling Swiftech H220-X | Chassis cooled (6 fans + HS) | dual-fanned heatpipes | small-fanned heatpipe
Memory 32GiB DDR3-1600 CL9 | 96GiB DDR3-1333 ECC RDIMM | 32GiB DDR3L-1866 CL11 | 8GiB DDR3L-1600 CL11
Video Card(s) Dual GTX 670 in SLI | Embedded ATi ES1000 | Quadro K2100M | Intel HD 3000
Storage many, many SSDs and HDDs....
Display(s) 1 Dell U3011 + 2x Dell U2410 | HP iLO2 KVMoIP | 3200x1800 Sharp IGZO | 1366x768 IPS with Wacom pen
Case Corsair Obsidian 550D | HP DL380 G6 Chassis | Dell Precision M4800 | Lenovo Thinkpad X220 Tablet
Audio Device(s) Auzentech X-Fi HomeTheater HD | None | On-board | On-board
Power Supply Corsair AX850 | Dual 750W Redundant PSU (Delta) | Dell 330W+240W (Flextronics) | Lenovo 65W (Delta)
Mouse Logitech G502, Logitech G700s, Logitech G500, Dell optical mouse (emergency backup)
Keyboard 1985 IBM Model F 122-key, Ducky YOTT MX Black, Dell AT101W, 1994 IBM Model M, various integrated
Software FAAAR too much to list
That isn't how it works. The raw silicon is wired into a test rig that simulates the memory subsystems (along with other parameters such as PCI-E interface for bus simulation/validation for platform and multi-GPU). The silicon is then put through a series of logic runtime verification and validation protocols (RTL - register transfer level test/validation) to make sure the logic blocks work as intended, plus power management (inc. clock gating and RTAPI) and power verification to ensure that the individual logic blocks operate within design parameters. Obviously there is a ton of other validation also, but the test/verification team do not need the whole card (or interposer in this case) to validate the silicon - interface with any DDR3 RAM would suffice.

Oh yeah, forgot they could do that..

You wouldn't get much use out of testing using lower clockspeeds though, since that gives no info on actual power and thermal properties, and HBM/HBM2 are already running at (comparatively) low speeds, so lowering further wouldn't be too useful either in reducing crosstalk and general noise to make it work over longer distances. Also, I don't think you could sub in DDR3, since HBM (afaik) uses completely different electrical protocols for data transfer... As for testing PCI-E & RTL, it's not too hard to do such a thing on hardware if you're willing to sacrifice runtime speed, but then you don't gain anything since you miss timing issues (at least, that was my experience with one of my FPGA projects...). Personally, I think this is the first revision of final chips going out, possibly on a very custom interposer/PCB combo with a pile more memory debugging tools and measurement points...

Of course, I could be completely wrong.. It's not like I've worked on tapeout before, just inferring a lot...
 
Joined
Apr 19, 2011
Messages
2,198 (0.44/day)
Location
So. Cal.
HBM2 added to an interposer that any real design validation can be made.

Obviously there is a ton of other validation also, but the test/verification team do not need the whole card (or interposer in this case) to validate the silicon - interface with any DDR3 RAM would suffice.
Correct, that probably wasn't the proper choice of words; design validation can be done at the chip level and they'll have a good idea of the performance they should see. I think there will be just a little more testing work (at least another layer for the complete package) that hadn't been there in the past... before the rubber can meet the road.
 
Joined
Sep 7, 2011
Messages
2,785 (0.58/day)
Location
New Zealand
System Name MoneySink
Processor 2600K @ 4.8
Motherboard P8Z77-V
Cooling AC NexXxos XT45 360, RayStorm, D5T+XSPC tank, Tygon R-3603, Bitspower
Memory 16GB Crucial Ballistix DDR3-1600C8
Video Card(s) GTX 780 SLI (EVGA SC ACX + Giga GHz Ed.)
Storage Kingston HyperX SSD (128) OS, WD RE4 (1TB), RE2 (1TB), Cav. Black (2 x 500GB), Red (4TB)
Display(s) Achieva Shimian QH270-IPSMS (2560x1440) S-IPS
Case NZXT Switch 810
Audio Device(s) onboard Realtek yawn edition
Power Supply Seasonic X-1050
Software Win8.1 Pro
Benchmark Scores 3.5 litres of Pale Ale in 18 minutes.
Personally, I think this is the first revision of final chips going out, possibly on a very custom interposer/PCB combo with a pile more memory debugging tools and measurement points...
At this stage in development that is probably a given. I think the GP100 at this stage looks more like something out of Frankenstein's lab than any polished and recognizable consumer product. The last test/verification silicon I saw looked like a throwback to wire-wrapped DIY/homebuilts from the early 70's.
Also, I don't think you could sub in DDR3, since HBM (afaik) uses completely different electrical protocols for data transfer...
Probably a poor choice of example on my part. I was trying to convey that to verify/validate the memory controllers it should be possible to test via single-channel DDR (rather than the whole wide HBM interface). Is it not possible to have multiple single-channel DDR interfaces emulate HBM for test/verification purposes?
 
Joined
Jul 31, 2014
Messages
481 (0.13/day)
System Name Diablo | Baal | Mephisto | Andariel
Processor i5-3570K@4.4GHz | 2x Xeon X5675 | i7-4710MQ | i7-2640M
Motherboard Asus Sabertooth Z77 | HP DL380 G6 | Dell Precision M4800 | Lenovo Thinkpad X220 Tablet
Cooling Swiftech H220-X | Chassis cooled (6 fans + HS) | dual-fanned heatpipes | small-fanned heatpipe
Memory 32GiB DDR3-1600 CL9 | 96GiB DDR3-1333 ECC RDIMM | 32GiB DDR3L-1866 CL11 | 8GiB DDR3L-1600 CL11
Video Card(s) Dual GTX 670 in SLI | Embedded ATi ES1000 | Quadro K2100M | Intel HD 3000
Storage many, many SSDs and HDDs....
Display(s) 1 Dell U3011 + 2x Dell U2410 | HP iLO2 KVMoIP | 3200x1800 Sharp IGZO | 1366x768 IPS with Wacom pen
Case Corsair Obsidian 550D | HP DL380 G6 Chassis | Dell Precision M4800 | Lenovo Thinkpad X220 Tablet
Audio Device(s) Auzentech X-Fi HomeTheater HD | None | On-board | On-board
Power Supply Corsair AX850 | Dual 750W Redundant PSU (Delta) | Dell 330W+240W (Flextronics) | Lenovo 65W (Delta)
Mouse Logitech G502, Logitech G700s, Logitech G500, Dell optical mouse (emergency backup)
Keyboard 1985 IBM Model F 122-key, Ducky YOTT MX Black, Dell AT101W, 1994 IBM Model M, various integrated
Software FAAAR too much to list
At this stage in development that is probably a given. I think the GP100 at this stage looks more like something out of Frankenstein's lab than any polished and recognizable consumer product. The last test/verification silicon I saw looked like a throwback to wire-wrapped DIY/homebuilts from the early 70's.

I'm not so sure.. have you seen the really low-level devkits Intel sells? They clip onto the topside pads of the CPUs and give you essentially complete control over the CPU's pipeline, including (to the best of my knowledge at least..) full ability to stop the processor and step through instructions, and even go back.. Among the various bits they grant, they also pull a bunch of info about strange stuff like hit rates and such... They may well be using near-final packages with the pins broken out by the PCB instead...

Either way, can't be too far off now... 6-9 months I'd say...
 
Joined
Jun 13, 2012
Messages
1,388 (0.31/day)
Processor i7-13700k
Motherboard Asus Tuf Gaming z790-plus
Cooling Coolermaster Hyper 212 RGB
Memory Corsair Vengeance RGB 32GB DDR5 7000mhz
Video Card(s) Asus Dual Geforce RTX 4070 Super ( 2800mhz @ 1.0volt, ~60mhz overlock -.1volts)
Storage 1x Samsung 980 Pro PCIe4 NVme, 2x Samsung 1tb 850evo SSD, 3x WD drives, 2 seagate
Display(s) Acer Predator XB273u 27inch IPS G-Sync 165hz
Power Supply Corsair RMx Series RM850x (OCZ Z series PSU retired after 13 years of service)
Mouse Logitech G502 hero
Keyboard Logitech G710+
If people want to make statements about the pricing in a negative manner, it seems churlish given the recent history going back to what.... 8800 Ultra days? Flagship = most expensive (unless of course your flagship card underperforms your main flagship card but you still sell it for $650 anyway).
Neither brand produces 'affordable' flagships these days. Unfortunately.
throw in the nv greed factor and it is $650.
welcome to 2015 nvidia, amd has been waiting for you.
Wow, amazing how quickly everyone forgot that the Fury X was gonna be an $850 card, but thanks to Nvidia putting the 980 Ti at $650, the Fury X ended up being sold at $650. Let's not forget about the GTX 970 being sold at $330, which FORCED the 290X at the time down to nearly the same price from its, what, $500 price tag. So lately it's been Nvidia forcing prices down, not AMD.

60 - 90% performance improvement. Well this certainly looks promising, but I guess the price will also be TITAN like.

Anyway I'll be waiting for a full GP104 based product.
The 60-90% figure, since it comes from nvidia, can be held credible, as nvidia does have a track record of telling the truth about the performance of its cards (cue the people bringing up the gtx970 issue).
 
Joined
Jul 31, 2014
Messages
481 (0.13/day)
System Name Diablo | Baal | Mephisto | Andariel
Processor i5-3570K@4.4GHz | 2x Xeon X5675 | i7-4710MQ | i7-2640M
Motherboard Asus Sabertooth Z77 | HP DL380 G6 | Dell Precision M4800 | Lenovo Thinkpad X220 Tablet
Cooling Swiftech H220-X | Chassis cooled (6 fans + HS) | dual-fanned heatpipes | small-fanned heatpipe
Memory 32GiB DDR3-1600 CL9 | 96GiB DDR3-1333 ECC RDIMM | 32GiB DDR3L-1866 CL11 | 8GiB DDR3L-1600 CL11
Video Card(s) Dual GTX 670 in SLI | Embedded ATi ES1000 | Quadro K2100M | Intel HD 3000
Storage many, many SSDs and HDDs....
Display(s) 1 Dell U3011 + 2x Dell U2410 | HP iLO2 KVMoIP | 3200x1800 Sharp IGZO | 1366x768 IPS with Wacom pen
Case Corsair Obsidian 550D | HP DL380 G6 Chassis | Dell Precision M4800 | Lenovo Thinkpad X220 Tablet
Audio Device(s) Auzentech X-Fi HomeTheater HD | None | On-board | On-board
Power Supply Corsair AX850 | Dual 750W Redundant PSU (Delta) | Dell 330W+240W (Flextronics) | Lenovo 65W (Delta)
Mouse Logitech G502, Logitech G700s, Logitech G500, Dell optical mouse (emergency backup)
Keyboard 1985 IBM Model F 122-key, Ducky YOTT MX Black, Dell AT101W, 1994 IBM Model M, various integrated
Software FAAAR too much to list
The 60-90% figure, since it comes from nvidia, can be held credible, as nvidia does have a track record of telling the truth about the performance of its cards (cue the people bringing up the gtx970 non-issue).

FTFY :p
 
Joined
Feb 24, 2009
Messages
3,516 (0.61/day)
System Name Money Hole
Processor Core i7 970
Motherboard Asus P6T6 WS Revolution
Cooling Noctua UH-D14
Memory 2133Mhz 12GB (3x4GB) Mushkin 998991
Video Card(s) Sapphire Tri-X OC R9 290X
Storage Samsung 1TB 850 Evo
Display(s) 3x Acer KG240A 144hz
Case CM HAF 932
Audio Device(s) ADI (onboard)
Power Supply Enermax Revolution 85+ 1050w
Mouse Logitech G602
Keyboard Logitech G710+
Software Windows 10 Professional x64
Releasing "first" something is like posting "First!" on a YouTube video. The idea is to actually gain something by using that technology. When you have the competition's 256bit/384bit GDDR5 cards matching or outperforming your HBM1 cards, that you invested heavily in and that also introduce availability problems, how is it a smart move?

AMD also released the first 1GHz processor by a few hours. They also released the first desktop 64-bit processor. They also released the first desktop processor with an integrated memory controller. None of those things gave them any tangible advantages. Some did, but they had no idea about how to exploit them. And look at them now. It's not about being "first", it's about perfecting something until it makes sense to implement it in an actual product. Which is something Apple has done for years.

HBM1 doesn't make sense now. It did one or two years ago, about that time when nVIDIA presented their "test" HBM board. They showed it to us, as in "we're working on it", and when it makes sense, they'll release it.

Actually, releasing the first 64-bit processor is about the only thing AMD has done that had significant tangible benefits. AMD has to license x86, but Intel has to license x64 from AMD, since AMD authored the implementation that everyone went with instead of Intel's.
 
Joined
Apr 30, 2012
Messages
3,881 (0.85/day)
It's a speed vs accuracy argument, and for most consumer cases, FP32 is plenty. In some niches, FP16 is plenty, and for a minor increase in die area, you can pretty much double performance by using FP16 instead of FP32 (neural nets, low-fidelity image processing and rendering effects). And quite obviously, demand from enough very well-connected devs is there too, else nV wouldn't bother at all. Expect AMD to come up with something very similar.

It's more about the direction Nvidia wants to take Pascal with CUDA

New Features in CUDA 7.5 - 16-bit Floating Point (FP16) Data

as far as AMD


Fiji should have an improvement over Tonga. One would expect Arctic Islands to improve on Fiji.

Game-wise, I just think it will be used for artsy post-processing effects.
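To put rough numbers on the speed vs accuracy trade-off discussed above, here is a minimal sketch, not anything taken from Nvidia or AMD material. It uses Python's standard `struct` module to round values to IEEE 754 half precision (FP16) and single precision (FP32) on the CPU. The helper names `round_to`, `to_fp16` and `to_fp32` are made up for this illustration, and it only shows storage size and rounding error, not GPU throughput.

```python
import struct


def round_to(fmt, x):
    """Round a Python float (FP64) to the nearest value storable in `fmt`."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def to_fp16(x):
    # "e" is the IEEE 754 binary16 (half precision) struct format
    return round_to("<e", x)


def to_fp32(x):
    # "f" is the IEEE 754 binary32 (single precision) struct format
    return round_to("<f", x)


# Storage cost: two FP16 values fit in the space of one FP32 value.
size16 = struct.calcsize("<e")  # 2 bytes
size32 = struct.calcsize("<f")  # 4 bytes

# Accuracy cost: the same input picks up a larger rounding error in FP16.
value = 0.1
err16 = abs(to_fp16(value) - value)
err32 = abs(to_fp32(value) - value)

# Above 2048, FP16 can only represent every second integer,
# so 2049 cannot be stored exactly and rounds to 2048.
big16 = to_fp16(2049.0)
big32 = to_fp32(2049.0)

print(f"bytes per value: FP16={size16}, FP32={size32}")
print(f"rounding error for 0.1: FP16={err16:.3e}, FP32={err32:.3e}")
print(f"2049 stored as: FP16={big16}, FP32={big32}")
```

The halved storage is where the "pretty much double performance" idea comes from, and the coarser rounding is why it tends to be limited to workloads like neural nets or post-processing effects that tolerate the error.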
 

FordGT90Concept

"I go fast!1!11!1!"
Joined
Oct 13, 2008
Messages
26,259 (4.46/day)
Location
IA, USA
System Name BY-2021
Processor AMD Ryzen 7 5800X (65w eco profile)
Motherboard MSI B550 Gaming Plus
Cooling Scythe Mugen (rev 5)
Memory 2 x Kingston HyperX DDR4-3200 32 GiB
Video Card(s) AMD Radeon RX 7900 XT
Storage Samsung 980 Pro, Seagate Exos X20 TB 7200 RPM
Display(s) Nixeus NX-EDG274K (3840x2160@144 DP) + Samsung SyncMaster 906BW (1440x900@60 HDMI-DVI)
Case Coolermaster HAF 932 w/ USB 3.0 5.25" bay + USB 3.2 (A+C) 3.5" bay
Audio Device(s) Realtek ALC1150, Micca OriGen+
Power Supply Enermax Platimax 850w
Mouse Nixeus REVEL-X
Keyboard Tesoro Excalibur
Software Windows 10 Home 64-bit
Benchmark Scores Faster than the tortoise; slower than the hare.
It's a speed vs accuracy argument, and for most consumer cases, FP32 is plenty. In some niches, FP16 is plenty, and for a minor increase in die area, you can pretty much double performance by using FP16 instead of FP32 (neural nets, low-fidelity image processing and rendering effects). And quite obviously, demand from enough very well-connected devs is there too, else nV wouldn't bother at all. Expect AMD to come up with something very similar.
GCN 1.2 already supports FP16 but they nerfed FP64 to get it.

Edit: @Xzibit beat me to it.
 
Joined
Sep 7, 2011
Messages
2,785 (0.58/day)
Location
New Zealand
System Name MoneySink
Processor 2600K @ 4.8
Motherboard P8Z77-V
Cooling AC NexXxos XT45 360, RayStorm, D5T+XSPC tank, Tygon R-3603, Bitspower
Memory 16GB Crucial Ballistix DDR3-1600C8
Video Card(s) GTX 780 SLI (EVGA SC ACX + Giga GHz Ed.)
Storage Kingston HyperX SSD (128) OS, WD RE4 (1TB), RE2 (1TB), Cav. Black (2 x 500GB), Red (4TB)
Display(s) Achieva Shimian QH270-IPSMS (2560x1440) S-IPS
Case NZXT Switch 810
Audio Device(s) onboard Realtek yawn edition
Power Supply Seasonic X-1050
Software Win8.1 Pro
Benchmark Scores 3.5 litres of Pale Ale in 18 minutes.
Joined
Jun 13, 2012
Messages
1,388 (0.31/day)
Processor i7-13700k
Motherboard Asus Tuf Gaming z790-plus
Cooling Coolermaster Hyper 212 RGB
Memory Corsair Vengeance RGB 32GB DDR5 7000mhz
Video Card(s) Asus Dual Geforce RTX 4070 Super ( 2800mhz @ 1.0volt, ~60mhz overlock -.1volts)
Storage 1x Samsung 980 Pro PCIe4 NVme, 2x Samsung 1tb 850evo SSD, 3x WD drives, 2 seagate
Display(s) Acer Predator XB273u 27inch IPS G-Sync 165hz
Power Supply Corsair RMx Series RM850x (OCZ Z series PSU retired after 13 years of service)
Mouse Logitech G502 hero
Keyboard Logitech G710+
Why would you choose to highlight a report from over two months ago over reports from two days ago stating volume production in Q1?
The problem with the report about AMD getting "priority", as claimed in that other guy's story: is AMD even close to being ready with Arctic Islands to make use of that so-called priority? We could see Pascal in Q2, but is AMD gonna have one before Q3? If not, then screwing Nvidia over when they are willing to pay for the chips now and use them would be kinda stupid.
 