
NVIDIA Unlocks GPU System Processor (GSP) for Improved System Performance

AleksandarK

News Editor
Staff member
Joined
Aug 19, 2017
Messages
2,654 (0.99/day)
In 2016, NVIDIA announced that it was working on replacing its Fast Logic Controller processor, codenamed Falcon, with a new GPU System Processor (GSP) based on the RISC-V Instruction Set Architecture (ISA). This RISC-V processor is codenamed NV-RISCV and serves as the GPU's controller core, coordinating everything in the massive pool of GPU cores. Today, NVIDIA has decided to open this NV-RISCV CPU to a broader spectrum of applications, starting with the 510.39 drivers. According to NVIDIA's documentation, this is only available on select GPUs for now, mainly data-center-oriented Tesla accelerators.

NVIDIA Documents said:
Some GPUs include a GPU System Processor (GSP) which can be used to offload GPU initialization and management tasks. This processor is driven by the firmware file /lib/firmware/nvidia/510.39.01/gsp.bin. A few select products currently use GSP by default, and more products will take advantage of GSP in future driver releases.
Offloading tasks which were traditionally performed by the driver on the CPU can improve performance due to lower latency access to GPU hardware internals.
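
To check whether that firmware file is present for a given driver version, a minimal Python sketch could look like the following. The helper names `gsp_firmware_path` and `gsp_firmware_present` are made up for illustration; only the directory layout comes from the NVIDIA text quoted above, and the version string has to match whatever driver is actually installed.

```python
from pathlib import Path

# Base directory taken from the NVIDIA text quoted above.
FIRMWARE_ROOT = Path("/lib/firmware/nvidia")


def gsp_firmware_path(driver_version: str, root: Path = FIRMWARE_ROOT) -> Path:
    """Build the expected gsp.bin location for a driver version (hypothetical helper)."""
    return root / driver_version / "gsp.bin"


def gsp_firmware_present(driver_version: str, root: Path = FIRMWARE_ROOT) -> bool:
    """Return True if the firmware file exists on this machine."""
    return gsp_firmware_path(driver_version, root).is_file()


if __name__ == "__main__":
    version = "510.39.01"  # the version named in the quoted document
    print(gsp_firmware_path(version), gsp_firmware_present(version))
```

A missing file only says the firmware is not installed under that path; it says nothing about whether the GPU itself has a GSP.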



As this document shows, many tasks like GPU management and initialization were performed by the driver on the CPU. The CPU is traditionally external to the GPU, so every request pays a higher latency. A processor embedded in the GPU can deliver the requested data or action much faster, lowering latency and improving performance. We have yet to see what NVIDIA can do with it, and how significant the performance penalty of the old approach was when the GSP was not enabled. This also points to a new direction for GPUs and accelerators alike: a more independent design where CPUs are integrated on-die instead of depending on external hardware.
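
The latency argument can be put into rough numbers with a toy model. This is only a sketch: the function names are hypothetical, and the latency figures in the usage example are placeholders chosen for easy arithmetic, not measurements of any real GPU or driver.

```python
def total_time_us(steps: int, per_step_latency_us: float, setup_us: float = 0.0) -> float:
    """Total time for `steps` sequential requests that each pay the same latency."""
    return setup_us + steps * per_step_latency_us


def speedup(steps: int, host_round_trip_us: float,
            on_gpu_latency_us: float, launch_us: float) -> float:
    """Ratio of host-driven time to offloaded time in this simplified model."""
    host = total_time_us(steps, host_round_trip_us)
    offloaded = total_time_us(steps, on_gpu_latency_us, setup_us=launch_us)
    return host / offloaded


if __name__ == "__main__":
    # Placeholder values: 100 management steps, 10 us per host round trip,
    # 1 us per on-GPU step, and a one-time 10 us cost to start the offloaded task.
    print(total_time_us(100, 10.0))            # host-driven total
    print(total_time_us(100, 1.0, 10.0))       # offloaded total
    print(speedup(100, 10.0, 1.0, 10.0))
```

The model ignores overlap, batching and anything else a real driver does, so it only illustrates why many small round trips add up.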

So far, only select GPUs get their GSP unlocked, and the complete list can be found in the document and the image above. It is advised to check the website for the current list, as NVIDIA can update it at any time.

 
Joined
Apr 16, 2021
Messages
49 (0.04/day)
Location
Bavaria, Germany
System Name Monster
Processor AMD Ryzen 9 3950X
Motherboard Gigabyte X570 Aorus Xtreme (rev1.0)
Cooling Custom Loop, CPU only currently, watercool Heatkiller IV Pro
Memory G.Skill Trident Z Neo 4x16 GB DDR4-3600C16
Video Card(s) Asus ROG Strix RTX 3090 O24G
Storage Samsung 960 Evo 500 GB, Seagate FireCuda 510 2 TB, Seagate Barracuda 5400 RPM 4 TB
Display(s) Asus PG279Q, Benq SW240, Samsung S24D340
Case Phanteks Enthoo Luxe 2
Audio Device(s) Focusrite Clarett 2Pre USB, EV RE320, Beyerdynamic DT 1990 Pro
Power Supply Corsair AX1200i
Mouse Logitech G502 Lightspeed (and Powerplay)
Keyboard Logitech G910 Orion Spark
This could actually be interesting for higher end gaming I'm guessing, but how it could actually apply to gaming is to be seen.
 
Joined
Dec 17, 2011
Messages
359 (0.08/day)
This can be the next big thing after Ray Tracing. Consoles since 2013 have had a CPU + GPU connected to a unified GDDR memory pool so developers must have developed some optimizations regarding this. Those optimizations can be brought over to discrete GPUs having something like 4 Alder Lake E cores and reduce the gaming load on CPUs.
 

silentbogo

Moderator
Staff member
Joined
Nov 20, 2013
Messages
5,560 (1.37/day)
Location
Kyiv, Ukraine
System Name WS#1337
Processor Ryzen 7 5700X3D
Motherboard ASUS X570-PLUS TUF Gaming
Cooling Xigmatek Scylla 240mm AIO
Memory 64GB DDR4-3600(4x16)
Video Card(s) MSI RTX 3070 Gaming X Trio
Storage ADATA Legend 2TB
Display(s) Samsung Viewfinity Ultra S6 (34" UW)
Case ghetto CM Cosmos RC-1000
Audio Device(s) ALC1220
Power Supply SeaSonic SSR-550FX (80+ GOLD)
Mouse Logitech G603
Keyboard Modecom Volcano Blade (Kailh choc LP)
VR HMD Google dreamview headset(aka fancy cardboard)
Software Windows 11, Ubuntu 24.04 LTS
This novel RISC-V processor is codenamed NV-RISCV and has been used for an unknown period as GPU's controller core
Not sure why it is "unknown". The talk about switching FALCON to NV-RISCV has been around for over 5 years. Everything since Turing uses the new scheduler.
This could actually be interesting for higher end gaming I'm guessing, but how it could actually apply to gaming is to be seen.
This can be the next big thing after Ray Tracing. Consoles since 2013 have had a CPU + GPU connected to a unified GDDR memory pool so developers must have developed some optimizations regarding this.
It's not a "big" thing. Both falcon and NV-RISCV are tiny microcontrollers built into each GPU whose primary purpose in the system (I mean GPU as a "system") is to do scheduling. It also does other things, which do not require tons of compute power, like validating firmware and managing platform security. Basically it's an equivalent of PSP or ME from Nvidia. Nothing really useful or revolutionary, and it's not going to replace your CPU. Maybe it'll be useful for GPGPU compute, so you can tweak the thread/block scheduling more efficiently, but that's about as far as use cases go.
 
Joined
Oct 30, 2008
Messages
1,768 (0.30/day)
System Name Lailalo
Processor Ryzen 9 5900X Boosts to 4.95Ghz
Motherboard Asus TUF Gaming X570-Plus (WIFI)
Cooling Noctua
Memory 32GB DDR4 3200 Corsair Vengeance
Video Card(s) XFX 7900XT 20GB
Storage Samsung 970 Pro Plus 1TB, Crucial 1TB MX500 SSD, Seagate 3TB
Display(s) LG Ultrawide 29in @ 2560x1080
Case Coolermaster Storm Sniper
Power Supply XPG 1000W
Mouse G602
Keyboard G510s
Software Windows 10 Pro / Windows 10 Home
Miners be like...
Jack Nicholson Yes GIF by The Taboo Group
 

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
27,965 (3.71/day)
Processor Ryzen 7 5700X
Memory 48 GB
Video Card(s) RTX 4080
Storage 2x HDD RAID 1, 3x M.2 NVMe
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 10 64-bit
is to do scheduling
I think it's more than scheduling, but even if it's just scheduling it could be a huge thing for serious compute (and miners)--you can basically customize your GPU.
It's questionable though how much NVIDIA opens this up
 
Joined
May 4, 2009
Messages
1,972 (0.35/day)
Location
Bulgaria
System Name penguin
Processor R7 5700G
Motherboard Asrock B450M Pro4
Cooling Some CM tower cooler that will fit my case
Memory 4 x 8GB Kingston HyperX Fury 2666MHz
Video Card(s) IGP
Storage ADATA SU800 512GB
Display(s) 27' LG
Case Zalman
Audio Device(s) stock
Power Supply Seasonic SS-620GM
Software win10
They would need to expose it through some kind of an api and each existing game/application would have to be modified to specifically target that api...which is kinda unlikely. This is more useful for HPC scenarios where you rewrite your code almost daily to optimize it to be faster or use less resources. If certain tasks can bypass hundreds of little CPU > PCIE > ( RAM > PCIE ) > GPU steps and instead start one task that only runs in EPU <> GPU mode, then that's always preferable.
 

AleksandarK

News Editor
Staff member
Not sure why it is "unknown". The talk about switching FALCON to NV-RISCV has been around for over 5 years. Everything since Turing uses the new scheduler.
We are not sure exactly where it starts. For that, NVIDIA has to clarify. Where did you find that it starts with Turing? :)
 

silentbogo

Moderator
Staff member
I think it's more than scheduling
I mentioned a few more things, but it's hard to tell, because just like with ME and PSP we don't really know what it does.
EDIT: Apparently it's also taking care of power management and display outputs.
We are not sure exactly where it starts. For that, NVIDIA has to clarify. Where did you find that it starts with Turing? :)
Kinda weird question, given that you have the answer in your article's links

Turing and later GPUs are capable of using the GSP firmware by setting the kernel module parameter NVreg_EnableGpuFirmware=1.
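
For anyone who wants to try that parameter, here is a small Python sketch of the two pieces involved: building the `options` line for a modprobe.d file, and checking afterwards whether GSP firmware is actually in use. The `NVreg_EnableGpuFirmware` name is from the quote above; the function names are hypothetical, and the `GSP Firmware Version` field of `nvidia-smi -q` is an assumption on my part about that tool's output, so verify it against your driver's README.

```python
import re
from typing import Optional


def modprobe_option_line(enable: bool = True) -> str:
    """Return a modprobe.d style line that sets NVreg_EnableGpuFirmware."""
    return f"options nvidia NVreg_EnableGpuFirmware={1 if enable else 0}"


def gsp_version_from_smi(smi_output: str) -> Optional[str]:
    """Extract the 'GSP Firmware Version' value from `nvidia-smi -q` text.

    Returns None when the field is missing or reads N/A (assumed to mean
    that GSP firmware is not in use).
    """
    match = re.search(
        r"^\s*GSP Firmware Version\s*:\s*(\S+)\s*$", smi_output, re.MULTILINE
    )
    if match is None or match.group(1) == "N/A":
        return None
    return match.group(1)


if __name__ == "__main__":
    print(modprobe_option_line())
    # Made-up sample line in the assumed nvidia-smi format.
    print(gsp_version_from_smi("    GSP Firmware Version    : 510.39.01\n"))
```

The sketch deliberately does not write any file or call `nvidia-smi` itself; where the options line goes (some file under /etc/modprobe.d/) and whether the initramfs needs regenerating depends on the distribution.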
 

bug

Joined
May 22, 2015
Messages
13,843 (3.95/day)
Processor Intel i5-12600k
Motherboard Asus H670 TUF
Cooling Arctic Freezer 34
Memory 2x16GB DDR4 3600 G.Skill Ripjaws V
Video Card(s) EVGA GTX 1060 SC
Storage 500GB Samsung 970 EVO, 500GB Samsung 850 EVO, 1TB Crucial MX300 and 2TB Crucial MX500
Display(s) Dell U3219Q + HP ZR24w
Case Raijintek Thetis
Audio Device(s) Audioquest Dragonfly Red :D
Power Supply Seasonic 620W M12
Mouse Logitech G502 Proteus Core
Keyboard G.Skill KM780R
Software Arch Linux + Win10
They would need to expose it through some kind of an api and each existing game/application would have to be modified to specifically target that api...which is kinda unlikely. This is more useful for HPC scenarios where you rewrite your code almost daily to optimize it to be faster or use less resources. If certain tasks can bypass hundreds of little CPU > PCIE > ( RAM > PCIE ) > GPU steps and instead start one task that only runs in EPU <> GPU mode, then that's always preferable.
If the ISA is documented, you don't need an(other) API.
 
Joined
Aug 20, 2007
Messages
21,542 (3.40/day)
System Name Pioneer
Processor Ryzen R9 9950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory 64GB (4x 16GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage Intel 905p Optane 960GB boot, +2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64 / Windows 11 Enterprise IoT 2024
Joined
Oct 4, 2017
Messages
706 (0.27/day)
Location
France
Processor RYZEN 7 5800X3D
Motherboard Aorus B-550I Pro AX
Cooling HEATKILLER IV PRO , EKWB Vector FTW3 3080/3090 , Barrow res + Xylem DDC 4.2, SE 240 + Dabel 20b 240
Memory Viper Steel 4000 PVS416G400C6K
Video Card(s) EVGA 3080Ti FTW3
Storage XPG SX8200 Pro 512 GB NVMe + Samsung 980 1TB
Display(s) Dell S2721DGF
Case NR 200
Power Supply CORSAIR SF750
Mouse Logitech G PRO
Keyboard Meletrix Zoom 75 GT Silver
Software Windows 11 22H2
Could this be a solution to mitigate the driver overhead issues on their mainstream lineups ?
 

Stefem

New Member
Joined
Mar 17, 2019
Messages
12 (0.01/day)
We are not sure exactly where it starts. For that, NVIDIA has to clarify. Where did you find that it starts with Turing? :)
It was also part of Volta. With that GPU, NVIDIA actually found a bug in the RISC-V ISA (an actual one, not like that other "unnamed" company's, which ended up being their own bug :laugh:), triggered by the extreme speed of that processor. The RISC-V core may also be present in older architectures alongside the original FALCON microcontroller.

Maybe it'll be useful for GPGPU compute, so you can tweak the thread/block scheduling more efficiently, but that's about as far as use cases go.
It's useful for gaming too
 
Joined
Jun 22, 2006
Messages
1,097 (0.16/day)
System Name Beaver's Build
Processor AMD Ryzen 9800X3D
Motherboard Asus TUF Gaming X670E Plus WiFi
Cooling Corsair H115i RGB PLATINUM 97 CFM Liquid
Memory G.SKILL Trident Z5 Neo DDR5-6000 CL30 RAM 32GB (2x16GB)
Video Card(s) NVIDIA GeForce RTX 4090 Founders Edition
Storage WD_BLACK 8TB SN850X NVMe
Display(s) Alienware AW3225QF 32" 4K 240 Hz OLED
Case Fractal Design Design Define R6 USB-C
Audio Device(s) Focusrite 2i4 USB Audio Interface
Power Supply SuperFlower LEADEX TITANIUM 1600W
Mouse Razer DeathAdder V2
Keyboard Corsair K70 RGB Pro
Software Microsoft Windows 11 Pro
Benchmark Scores 3dmark = https://www.3dmark.com/spy/51229598
Didn't coreteks allude to a co-processor before the launch of 30 series cards?
This might be it.
"GeForce RTX 3080 and RTX 3090 rumored to pack 'traversal coprocessor'" -- https://www.tweaktown.com/news/7320...ored-to-pack-traversal-coprocessor/index.html

 

Stefem

New Member
Didn't coreteks allude to a co-processor before the launch of 30 series cards?
This might be it.
Who's coreteks? The dumba... one that claimed Ampere had a separate processor for RT mounted on the back?
Seriously, why are some of you guys listening to those... people playing at being an expert on the internet? The amount of BS and misinformation that people like that spread is incredible.
 
Joined
Jun 22, 2006
Messages
1,097 (0.16/day)
System Name Beaver's Build
Processor AMD Ryzen 9800X3D
Motherboard Asus TUF Gaming X670E Plus WiFi
Cooling Corsair H115i RGB PLATINUM 97 CFM Liquid
Memory G.SKILL Trident Z5 Neo DDR5-6000 CL30 RAM 32GB (2x16GB)
Video Card(s) NVIDIA GeForce RTX 4090 Founders Edition
Storage WD_BLACK 8TB SN850X NVMe
Display(s) Alienware AW3225QF 32" 4K 240 Hz OLED
Case Fractal Design Design Define R6 USB-C
Audio Device(s) Focusrite 2i4 USB Audio Interface
Power Supply SuperFlower LEADEX TITANIUM 1600W
Mouse Razer DeathAdder V2
Keyboard Corsair K70 RGB Pro
Software Microsoft Windows 11 Pro
Benchmark Scores 3dmark = https://www.3dmark.com/spy/51229598
Who's coreteks? The dumba... one that claimed Ampere had a separate processor for RT mounted on the back?
Seriously, why are some of you guys listening to those... people playing at being an expert on the internet? The amount of BS and misinformation that people like that spread is incredible.

NVIDIA Ampere “Traversal Coprocessor” Won’t be a Separate Chip; Likely an On-Die Component

"The TTU (coprocessor) continuously interacts with the L1 cache which would be a slow process if the component off-die. Finally, both the “Top-Level” and the “Bottom Level” BVH Traversal as well as the Ray Transformation and Ray/Triangle Intersection Testing (Basically the entire RT pipeline) has access to the SM L0 cache which would only be ideal if the “coprocessor” is an on-die component."
 

Stefem

New Member

NVIDIA Ampere “Traversal Coprocessor” Won’t be a Separate Chip; Likely an On-Die Component

"The TTU (coprocessor) continuously interacts with the L1 cache which would be a slow process if the component off-die. Finally, both the “Top-Level” and the “Bottom Level” BVH Traversal as well as the Ray Transformation and Ray/Triangle Intersection Testing (Basically the entire RT pipeline) has access to the SM L0 cache which would only be ideal if the “coprocessor” is an on-die component."
Yep, it made no sense
 
Joined
Aug 6, 2020
Messages
729 (0.46/day)
So, if they already have this thing shipping, why haven't they used this hardware scheduler to accelerate multi-CPU scaling in DX12/Vulkan drivers yet?
 

silentbogo

Moderator
Staff member
It's useful for gaming too
Oyeah... gives devs of modern AAA titles the opportunity to f#@-up few more things :D
So, if they already have this thing shipping, why haven't they used this hardware scheduler to accelerate multi-CPU scaling in DX12/Vulkan drivers yet?
Multi-CPU scaling has nothing to do with this topic. Hardware Accelerated Scheduling is already a part of Windows, and since the option no longer exists in Windows settings and the corresponding registry key is set to 2, I can safely assume it's enabled by default now.
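
The registry check mentioned here can be sketched in Python. The key path and the `HwSchMode` value name are assumptions that are not stated in the thread, as is the reading of 1 as "disabled"; only "2 means enabled" comes from the post. The function names are hypothetical.

```python
import sys
from typing import Optional

# Assumed registry location and value name; not stated in the thread.
HAGS_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
HAGS_VALUE = "HwSchMode"


def describe_hags(value: Optional[int]) -> str:
    """Map the raw registry value to a label (1 = disabled is an assumption)."""
    if value == 2:
        return "enabled"
    if value == 1:
        return "disabled"
    return "unknown"


def read_hags_value() -> Optional[int]:
    """Read the value on Windows; return None on other systems or if it is absent."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, HAGS_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, HAGS_VALUE)
            return int(value)
    except OSError:
        return None


if __name__ == "__main__":
    print(describe_hags(read_hags_value()))
```

Reading the key needs no elevated rights, and the sketch never writes to the registry.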
 
Joined
Aug 20, 2007
Messages
21,542 (3.40/day)
System Name Pioneer
Processor Ryzen R9 9950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory 64GB (4x 16GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage Intel 905p Optane 960GB boot, +2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64 / Windows 11 Enterprise IoT 2024

silentbogo

Moderator
Staff member
Does for me.
That's weird. I could've sworn mine disappeared since 21H1, but just checked in case I'm hallucinating - and it's there again...
 

Mussels

Freshwater Moderator
Joined
Oct 6, 2004
Messages
58,413 (7.91/day)
Location
Oystralia
System Name Rainbow Sparkles (Power efficient, <350W gaming load)
Processor Ryzen R7 5800x3D (Undervolted, 4.45GHz all core)
Motherboard Asus x570-F (BIOS Modded)
Cooling Alphacool Apex UV - Alphacool Eisblock XPX Aurora + EK Quantum ARGB 3090 w/ active backplate
Memory 2x32GB DDR4 3600 Corsair Vengeance RGB @3866 C18-22-22-22-42 TRFC704 (1.4V Hynix MJR - SoC 1.15V)
Video Card(s) Galax RTX 3090 SG 24GB: Underclocked to 1700Mhz 0.750v (375W down to 250W))
Storage 2TB WD SN850 NVME + 1TB Samsung 970 Pro NVME + 1TB Intel 6000P NVME USB 3.2
Display(s) Phillips 32 32M1N5800A (4k144), LG 32" (4K60) | Gigabyte G32QC (2k165) | Phillips 328m6fjrmb (2K144)
Case Fractal Design R6
Audio Device(s) Logitech G560 | Corsair Void pro RGB |Blue Yeti mic
Power Supply Fractal Ion+ 2 860W (Platinum) (This thing is God-tier. Silent and TINY)
Mouse Logitech G Pro wireless + Steelseries Prisma XL
Keyboard Razer Huntsman TE ( Sexy white keycaps)
VR HMD Oculus Rift S + Quest 2
Software Windows 11 pro x64 (Yes, it's genuinely a good OS) OpenRGB - ditch the branded bloatware!
Benchmark Scores Nyooom.
Getting some serious envy looking at those 80GB cards in the specs list :O
Imagine the fun cooling that much GDDR6X...

Is this going to be a driver drop in replacement for hardware scheduling? Will they tie it in with the launch of things like RTX-IO, so they have a cute little CPU smashing the numbers ahead of AMD?
 
Joined
Aug 20, 2007
Messages
21,542 (3.40/day)
System Name Pioneer
Processor Ryzen R9 9950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory 64GB (4x 16GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage Intel 905p Optane 960GB boot, +2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64 / Windows 11 Enterprise IoT 2024
Getting some serious envy looking at those 80GB cards in the specs list :O
Wait, 80GB cards? Wut?

Oh yes, HPC. I thought you meant a users System Specs, lol.
 