
Intel-powered Aurora Supercomputer Ranks Fastest for AI

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,205 (7.55/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
At ISC High Performance 2024, Intel announced in collaboration with Argonne National Laboratory and Hewlett Packard Enterprise (HPE) that the Aurora supercomputer has broken the exascale barrier at 1.012 exaflops and is the fastest AI system in the world dedicated to AI for open science, achieving 10.6 AI exaflops. Intel will also detail the crucial role of open ecosystems in driving AI-accelerated high-performance computing (HPC). "The Aurora supercomputer surpassing exascale will allow it to pave the road to tomorrow's discoveries. From understanding climate patterns to unraveling the mysteries of the universe, supercomputers serve as a compass guiding us toward solving truly difficult scientific challenges that may improve humanity," said Ogi Brkic, Intel vice president and general manager of Data Center AI Solutions.

Designed as an AI-centric system from its inception, Aurora will allow researchers to harness generative AI models to accelerate scientific discovery. Significant progress has been made in Argonne's early AI-driven research. Success stories include mapping the human brain's 80 billion neurons, high-energy particle physics enhanced by deep learning, and drug design and discovery accelerated by machine learning, among others. The Aurora supercomputer is an expansive system with 166 racks, 10,624 compute blades, 21,248 Intel Xeon CPU Max Series processors, and 63,744 Intel Data Center GPU Max Series units, making it one of the world's largest GPU clusters.



Aurora also includes the largest open, Ethernet-based supercomputing interconnect on a single system, with 84,992 HPE Slingshot fabric endpoints. The Aurora supercomputer came in second on the high-performance LINPACK (HPL) benchmark but broke the exascale barrier at 1.012 exaflops while utilizing only 9,234 nodes, 87% of the system. The Aurora supercomputer also secured the third spot on the high-performance conjugate gradient (HPCG) benchmark at 5,612 teraflops (TF/s) with 39% of the machine. This benchmark aims to assess more realistic scenarios, providing insights into communication and memory access patterns, which are important factors in real-world HPC applications. It complements benchmarks like LINPACK by offering a more comprehensive view of a system's capabilities.

At the heart of the Aurora supercomputer is the Intel Data Center GPU Max Series. The Intel Xe GPU architecture is foundational to the Max Series, featuring specialized hardware like matrix and vector compute blocks optimized for both AI and HPC tasks. The compute performance delivered by the Xe architecture's design is the reason the Aurora supercomputer secured the top spot in the high-performance LINPACK mixed-precision (HPL-MxP) benchmark, which best highlights the importance of AI workloads in HPC.

The Xe architecture's parallel processing capabilities excel in managing the intricate matrix-vector operations inherent in neural network AI computation. These compute cores are pivotal in accelerating matrix operations crucial for deep learning models. Complemented by Intel's suite of software tools, including Intel oneAPI DPC++/C++ Compiler, a rich set of performance libraries, and optimized AI frameworks and tools, the Xe architecture fosters an open ecosystem for developers that is characterized by flexibility and scalability across various devices and form factors.

In his special session at ISC 2024 on Tuesday, May 14 at 6:45 p.m. (GMT+2) in Hall 4 of the Congress Center Hamburg, Germany, Andrew Richards, CEO of Codeplay, an Intel company, will address the growing demand for accelerated computing and software in HPC and AI. He will highlight the importance of oneAPI, offering a unified programming model across diverse architectures. Built on open standards, oneAPI empowers developers to craft code that runs seamlessly on different hardware platforms without extensive modifications or vendor lock-in. This is also the goal of the Linux Foundation's Unified Acceleration Foundation (UXL), in which Arm, Google, Intel, Qualcomm and others are developing an open ecosystem for all accelerators and unified heterogeneous compute on open standards to break proprietary lock-in. The UXL Foundation is adding more members to its growing coalition.
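As a rough illustration of that unified model, here is a minimal SYCL 2020 sketch (an assumption-laden example using the standard oneAPI DPC++ toolchain, not code from Intel, Argonne, or the Aurora project): the same kernel source is dispatched to whichever CPU or GPU the runtime selects.

```cpp
// Minimal SYCL 2020 sketch: one kernel source, any supported CPU/GPU backend.
// Build with the oneAPI DPC++ compiler, e.g. `icpx -fsycl vadd.cpp`.
#include <sycl/sycl.hpp>
#include <vector>
#include <cstdio>

int main() {
    sycl::queue q{sycl::default_selector_v};   // runtime picks a CPU or GPU
    std::printf("Running on: %s\n",
                q.get_device().get_info<sycl::info::device::name>().c_str());

    constexpr size_t n = 1 << 20;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);
    {
        sycl::buffer bufA(a), bufB(b), bufC(c);
        q.submit([&](sycl::handler &h) {
            sycl::accessor A(bufA, h, sycl::read_only);
            sycl::accessor B(bufB, h, sycl::read_only);
            sycl::accessor C(bufC, h, sycl::write_only, sycl::no_init);
            h.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
                C[i] = A[i] + B[i];            // same source on any device
            });
        });
    }   // buffer destruction copies results back to the host vectors
    std::printf("c[0] = %.1f\n", c[0]);
    return 0;
}
```

Because the kernel is written against the open SYCL standard rather than a vendor API, the same source can in principle be retargeted to other vendors' hardware wherever a SYCL implementation exists, which is the portability argument oneAPI and UXL are making.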


Meanwhile, Intel Tiber Developer Cloud is expanding its compute capacity with new state-of-the-art hardware platforms and new service capabilities allowing enterprises and developers to evaluate the latest Intel architecture, to innovate and optimize AI models and workloads quickly, and then to deploy AI models at scale. New hardware includes previews of Intel Xeon 6 E-core and P-core systems for select customers, and large-scale Intel Gaudi 2-based and Intel Data Center GPU Max Series-based clusters. New capabilities include Intel Kubernetes Service for cloud-native AI training and inference workloads and multiuser accounts.

New supercomputers being deployed with Intel Xeon CPU Max Series and Intel Data Center GPU Max Series technologies underscore Intel's goal to advance HPC and AI. Systems include the Euro-Mediterranean Centre on Climate Change's (CMCC) Cassandra, to accelerate climate change modeling; the Italian National Agency for New Technologies, Energy and Sustainable Economic Development's (ENEA) CRESCO 8, to enable breakthroughs in fusion energy; the Texas Advanced Computing Center's (TACC) system, which is in full production to enable data analysis in areas ranging from biology to supersonic turbulence flows and atomistic simulations of a wide range of materials; as well as the United Kingdom Atomic Energy Authority's (UKAEA) system, to solve memory-bound problems that underpin the design of future fusion power plants.

The result from the mixed-precision AI benchmark will be foundational for Intel's next-generation GPU for AI and HPC, code-named Falcon Shores. Falcon Shores will leverage the next-generation Intel Xe architecture with the best of Intel Gaudi. This integration enables a unified programming interface.

Early performance results on Intel Xeon 6 with P-cores and Multiplexer Combined Ranks (MCR) memory at 8800 megatransfers per second (MT/s) deliver up to 2.3x performance improvement for real-world HPC applications, like Nucleus for European Modeling of the Ocean (NEMO), when compared to the previous generation, setting a strong foundation as the preferred host CPU choice for HPC solutions.

View at TechPowerUp Main Site
 
Joined
Sep 6, 2013
Messages
3,328 (0.81/day)
Location
Athens, Greece
System Name 3 desktop systems: Gaming / Internet / HTPC
Processor Ryzen 5 5500 / Ryzen 5 4600G / FX 6300 (12 years later got to see how bad Bulldozer is)
Motherboard MSI X470 Gaming Plus Max (1) / MSI X470 Gaming Plus Max (2) / Gigabyte GA-990XA-UD3
Cooling Νoctua U12S / Segotep T4 / Snowman M-T6
Memory 32GB - 16GB G.Skill RIPJAWS 3600+16GB G.Skill Aegis 3200 / 16GB JUHOR / 16GB Kingston 2400MHz (DDR3)
Video Card(s) ASRock RX 6600 + GT 710 (PhysX)/ Vega 7 integrated / Radeon RX 580
Storage NVMes, ONLY NVMes/ NVMes, SATA Storage / NVMe boot(Clover), SATA storage
Display(s) Philips 43PUS8857/12 UHD TV (120Hz, HDR, FreeSync Premium) ---- 19'' HP monitor + BlitzWolf BW-V5
Case Sharkoon Rebel 12 / CoolerMaster Elite 361 / Xigmatek Midguard
Audio Device(s) onboard
Power Supply Chieftec 850W / Silver Power 400W / Sharkoon 650W
Mouse CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Keyboard CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Software Windows 10 / Windows 10&Windows 11 / Windows 10
Intel managed to score one win with over 2 times the power consumption, and just as Nvidia is getting ready to annihilate everything in AI benchmarks with its latest chips.
In the end Intel will be slower than Nvidia in AI, slower in everything else compared to AMD and a joke in efficiency compared to either AMD or Nvidia.
 
Joined
Dec 12, 2016
Messages
1,802 (0.62/day)
I hate to pile on, but Aurora was a joke from start to finish. A cautionary tale of how not to deploy a supercomputer. Delayed multiple times, it launched with only half its nodes back in Nov 2023. Target performance was supposed to be over 2 exaflops.

It’s time the general market realizes that the corporate structure at Intel is no longer able to run the company. It can only run the company into the ground.

Edit: “Designed as an AI-centric system from its inception”. Funny, nothing about AI was mentioned in the 2015 announcement.

 
Last edited:
Joined
Jan 8, 2017
Messages
9,424 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
If it doesn't have shared memory it is not "a computer". It is a cluster.
Strange take, it obviously is a computer, I don't know what shared memory is even supposed to mean in this context.
 
Joined
Sep 6, 2013
Messages
3,328 (0.81/day)
Location
Athens, Greece
System Name 3 desktop systems: Gaming / Internet / HTPC
Processor Ryzen 5 5500 / Ryzen 5 4600G / FX 6300 (12 years later got to see how bad Bulldozer is)
Motherboard MSI X470 Gaming Plus Max (1) / MSI X470 Gaming Plus Max (2) / Gigabyte GA-990XA-UD3
Cooling Νoctua U12S / Segotep T4 / Snowman M-T6
Memory 32GB - 16GB G.Skill RIPJAWS 3600+16GB G.Skill Aegis 3200 / 16GB JUHOR / 16GB Kingston 2400MHz (DDR3)
Video Card(s) ASRock RX 6600 + GT 710 (PhysX)/ Vega 7 integrated / Radeon RX 580
Storage NVMes, ONLY NVMes/ NVMes, SATA Storage / NVMe boot(Clover), SATA storage
Display(s) Philips 43PUS8857/12 UHD TV (120Hz, HDR, FreeSync Premium) ---- 19'' HP monitor + BlitzWolf BW-V5
Case Sharkoon Rebel 12 / CoolerMaster Elite 361 / Xigmatek Midguard
Audio Device(s) onboard
Power Supply Chieftec 850W / Silver Power 400W / Sharkoon 650W
Mouse CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Keyboard CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Software Windows 10 / Windows 10&Windows 11 / Windows 10
Edit: “Designed as an AI-centric system from its inception”. Funny, nothing about AI was mentioned in the 2015 announcement.
The only one who can say they were thinking about AI in 2015 and I'll believe him is Huang. Everybody else was either sleeping, or didn't have the means, financial or hardware, to set such a goal.
Not to mention that if Intel had been thinking about AI in 2015, it would be the main competitor to Nvidia today.
 
Joined
Oct 31, 2020
Messages
84 (0.06/day)
Processor 5800X3D
Motherboard ROG Strix X570-F Gaming
Cooling Arctic Liquid Freezer II 280
Memory G Skill F4-3800C14-8GTZN
Video Card(s) PowerColor RX 6900xt Red Devil
Storage Samsung SSD 970 EVO Plus 250GB [232 GB], Samsung SSD 970 EVO Plus 500GB
Display(s) Samsung C32HG7xQQ (DisplayPort)
Case Graphite Series™ 730T Full-Tower Case
Power Supply Corsair RM1000x
Mouse Basillisk X Hyperspeed
Keyboard Blackwidow Ultimate
Software Win 10 Home

Attachments

  • image_2024-05-13_145252046.png (66.4 KB)
Joined
Dec 12, 2016
Messages
1,802 (0.62/day)
In related supercomputer news, AMD has increased its number of systems to 157 from a low of 2 systems in June 2019, an almost 8,000% increase in five years. Arm-based systems are at 16 now (Fujitsu and Nvidia). Intel continues to drop and has no hope of reversing its downward trajectory on the TOP500 list.

Edit: For co-accelerators: Intel -5- AMD -14- Nvidia -A WHOLE LOT MORE!-
 
Joined
Dec 28, 2012
Messages
3,857 (0.89/day)
System Name Skunkworks 3.0
Processor 5800x3d
Motherboard x570 unify
Cooling Noctua NH-U12A
Memory 32GB 3600 mhz
Video Card(s) asrock 6800xt challenger D
Storage Sabarent rocket 4.0 2TB, MX 500 2TB
Display(s) Asus 1440p144 27"
Case Old arse cooler master 932
Power Supply Corsair 1200w platinum
Mouse *squeak*
Keyboard Some old office thing
Software Manjaro
I hate to pile on, but Aurora was a joke from start to finish. A cautionary tale of how not to deploy a supercomputer. Delayed multiple times, it launched with only half its nodes back in Nov 2023. Target performance was supposed to be over 2 exaflops.

It’s time the general market realizes that the corporate structure at Intel is no longer able to run the company. It can only run the company into the ground.

Edit: “Designed as an AI-centric system from its inception”. Funny, nothing about AI was mentioned in the 2015 announcement.

I wonder how much longer Pat will last; he's got to be able to deliver things on time or, if they are late, blow expectations away. He's currently doing neither.
 
Joined
May 25, 2014
Messages
294 (0.08/day)
I wonder how much longer Pat will last; he's got to be able to deliver things on time or, if they are late, blow expectations away. He's currently doing neither.
Yeah, not really seeing anything positive out of Intel; the products, if anything, are getting worse or staying bad. There is insane product segmentation and naming. Too much power consumption. Too much heat. Not enough features.

Intel keeps bragging about their process node roadmap, but I don't see this paying off in real products; TSMC is handing Intel its a$$. If Arrow Lake is a disappointment then they are in catastrophic trouble; I don't see how removing hyperthreading and adding more E-cores will change anything. I'd rather have 8 really powerful cores, 3D V-Cache, and less power consumption. For a server, I don't want a heterogeneous architecture; it screws things up badly. And where's AVX-512? Why can't the consumer space have more than just enough lanes to run a GPU and an M.2 NVMe? Stuff is stagnating here, boys.
 
Joined
Nov 4, 2005
Messages
11,972 (1.72/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs, 24TB Enterprise drives
Display(s) 55" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
Remember everyone, Intel didn't lie, they just candy-coated the kickback truth for some.
 
Joined
Sep 23, 2008
Messages
311 (0.05/day)
Location
Richmond, VA
Processor i7-14700k
Motherboard MSI Z790 Carbon Wifi
Cooling DeepCool LS720
Memory 32gb GSkill DDR5-6400 CL32 Trident Z5
Video Card(s) Intel ARC A770 LE
Storage 990 Pro 1tb, 980 Pro 512gb, WD black 4tb
Display(s) 3 x HP EliteDisplay E273
Case Corsair 5000D Airflow
Power Supply Corsair RM850x
Mouse Logitec MK520
Keyboard Logitec MK520
Software Win 11 Pro 64bit
Benchmark Scores Cinebench R23 Multi 35805
slower in everything else compared to AMD and a joke in efficiency compared to either AMD or Nvidia.
Other than games...what is AMD faster than Intel at?
 
Joined
Mar 18, 2023
Messages
862 (1.42/day)
System Name Never trust a socket with less than 2000 pins
Strange take, it obviously is a computer, I don't know what shared memory is even supposed to mean in this context.

All cores in the computer can reach all RAM locations. That means you can use threads or processes for parallelism without having to do inter-machine communication (aka networking).

A "supercomputer" like this is just an optimized network of individual computers.
 
Joined
Sep 6, 2013
Messages
3,328 (0.81/day)
Location
Athens, Greece
System Name 3 desktop systems: Gaming / Internet / HTPC
Processor Ryzen 5 5500 / Ryzen 5 4600G / FX 6300 (12 years later got to see how bad Bulldozer is)
Motherboard MSI X470 Gaming Plus Max (1) / MSI X470 Gaming Plus Max (2) / Gigabyte GA-990XA-UD3
Cooling Νoctua U12S / Segotep T4 / Snowman M-T6
Memory 32GB - 16GB G.Skill RIPJAWS 3600+16GB G.Skill Aegis 3200 / 16GB JUHOR / 16GB Kingston 2400MHz (DDR3)
Video Card(s) ASRock RX 6600 + GT 710 (PhysX)/ Vega 7 integrated / Radeon RX 580
Storage NVMes, ONLY NVMes/ NVMes, SATA Storage / NVMe boot(Clover), SATA storage
Display(s) Philips 43PUS8857/12 UHD TV (120Hz, HDR, FreeSync Premium) ---- 19'' HP monitor + BlitzWolf BW-V5
Case Sharkoon Rebel 12 / CoolerMaster Elite 361 / Xigmatek Midguard
Audio Device(s) onboard
Power Supply Chieftec 850W / Silver Power 400W / Sharkoon 650W
Mouse CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Keyboard CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Software Windows 10 / Windows 10&Windows 11 / Windows 10
Other than games...what is AMD faster than Intel at?
You do realize that supercomputers are not intended for gaming, right?

I wonder how much longer Pat will last; he's got to be able to deliver things on time or, if they are late, blow expectations away. He's currently doing neither.
He is focused on fabs and how many billions he can get to build them.
Intel stock up as $11B Apollo deal nears completion: WSJ
We can laugh at him constantly for the next 1-2-3 years, but if his plan works, if he builds those fabs and they are competitive with or even better than TSMC's, Intel will once again become a successful behemoth and, most of all, its fate won't depend on x86's success/survival against ARM or whatever else comes in the future.
 
Joined
Nov 6, 2016
Messages
1,749 (0.60/day)
Location
NH, USA
System Name Lightbringer
Processor Ryzen 7 2700X
Motherboard Asus ROG Strix X470-F Gaming
Cooling Enermax Liqmax Iii 360mm AIO
Memory G.Skill Trident Z RGB 32GB (8GBx4) 3200Mhz CL 14
Video Card(s) Sapphire RX 5700XT Nitro+
Storage Hp EX950 2TB NVMe M.2, HP EX950 1TB NVMe M.2, Samsung 860 EVO 2TB
Display(s) LG 34BK95U-W 34" 5120 x 2160
Case Lian Li PC-O11 Dynamic (White)
Power Supply BeQuiet Straight Power 11 850w Gold Rated PSU
Mouse Glorious Model O (Matte White)
Keyboard Royal Kludge RK71
Software Windows 10
I've got a strong feeling that Intel practically gave this hardware away... why else would anyone base a supercomputer on it when AMD and Nvidia are both objectively better options?
 
Joined
Jan 8, 2017
Messages
9,424 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
All cores in the computer can reach all RAM locations.
This actually isn't even true all of the time; depending on the topology, some cores might not in fact have direct access to RAM and have to interface with what's effectively an on-chip network controller to talk to an available memory controller, which might not even be on the same chip. Your notion of a "computer" is very outdated, and I don't think a computer was ever defined how you think it is. How the memory system works is nothing more than an implementation detail; everything that's Turing complete is a computer, and it makes no sense to say that something isn't a computer just because it's made up of multiple nodes. In a cluster, each node can in fact access memory locations from other nodes; that's necessary.

Your PC has a CPU and a GPU, usually with their own separate physical memory; they can't interface with each other directly, but they can through software, same as in a supercomputer. Does your PC count as a cluster? A cluster is made out of computers, but a cluster is not a computer, even though they can all access each other's memory, which is the point of having a cluster?

Other than games...what is AMD faster than Intel at?
Man, it's crazy to think that a couple of years ago the roles would have been completely reversed but the argument would have been the same. :roll:
 
Last edited:
Joined
Mar 18, 2023
Messages
862 (1.42/day)
System Name Never trust a socket with less than 2000 pins
This actually isn't even true all of the time; depending on the topology, some cores might not in fact have direct access to RAM and have to interface with what's effectively an on-chip network controller to talk to an available memory controller, which might not even be on the same chip.

I actually have systems with 4 NUMA banks. The point is that virtual addresses on all cores in all CPUs are mapped to the physical RAM somewhere in the machine. So while the hardware might access a given piece of RAM through some other core's memory controller, that is transparent to the software I am running. That is no longer the case with networked "computers" such as this one.
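A hedged sketch of that transparency (my own illustration, assuming Linux with libnuma installed and linking against -lnuma): the buffer is physically placed on one NUMA node, yet any core can touch it with ordinary loads and stores, and the extra interconnect hop never shows up in the code.

```cpp
// NUMA transparency sketch: remote-node memory is still just memory to software.
#include <numa.h>
#include <cstdio>
#include <cstring>

int main() {
    if (numa_available() < 0) {               // -1 means no NUMA support
        std::puts("NUMA not available on this system");
        return 1;
    }
    const int last_node = numa_max_node();    // highest NUMA node id
    const size_t size = 64 * 1024 * 1024;     // 64 MiB
    // Physically place the allocation on the last node...
    char *buf = static_cast<char *>(numa_alloc_onnode(size, last_node));
    if (!buf) return 1;
    // ...yet any core on any node reaches it with plain loads and stores;
    // the interconnect hop is invisible to this program.
    std::memset(buf, 0xAB, size);
    std::printf("Touched %zu bytes resident on node %d\n", size, last_node);
    numa_free(buf, size);
    return 0;
}
```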
 
Joined
Jun 29, 2018
Messages
536 (0.23/day)
This actually isn't even true all of the time; depending on the topology, some cores might not in fact have direct access to RAM and have to interface with what's effectively an on-chip network controller to talk to an available memory controller, which might not even be on the same chip. Your notion of a "computer" is very outdated, and I don't think a computer was ever defined how you think it is. How the memory system works is nothing more than an implementation detail; everything that's Turing complete is a computer, and it makes no sense to say that something isn't a computer just because it's made up of multiple nodes.
Even in NUMA designs the physical address space is still uniform. What I mean by that is that a core in one module/chiplet/socket can access every memory location by physical address, regardless of how it's achieved and how long it takes.
In most HPC clusters, even with technologies like RDMA, there is no uniform physical address space so that's not possible. There will be an explicit translation layer from physical address space of one node to physical address space of another. It can be made to look like it's uniform with PGAS, but it's not at the hardware level.
I wrote "most" because there are some specialized designs that do have uniform physical address space across multiple nodes like IBM Power10 with PowerAXON, and that is done at the hardware level.

Your PC has a CPU and a GPU, usually with their own separate physical memory; they can't interface with each other directly, but they can through software, same as in a supercomputer. Does your PC count as a cluster?
Sure they can interface directly. That's one of the ways GPU drivers communicate with hardware - via PCI BARs (previously in up to 256MB windows, but with ReBAR this can be exceeded). This mechanism can also be used to facilitate direct communication between PCIe devices like network cards, an example of which is NVIDIA GPUDirect RDMA.
 
Joined
Jan 8, 2017
Messages
9,424 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
I wrote "most" because there are some specialized designs that do have uniform physical address space across multiple nodes like IBM Power10 with PowerAXON, and that is done at the hardware level.
I don't understand the relevance of whether they have a uniform physical address space at the hardware level or not; they still share memory.

The point is that virtual addresses on all cores in all CPUs are mapped to the physical RAM somewhere in the machine.
I just don't see how that could ever mean something is not a computer. Also, that virtual address space can map to anything, it can be system memory, disk, or memory from an entirely different machine for that matter. You can absolutely have the same virtual memory space across however many nodes you want.
 
Joined
Mar 18, 2023
Messages
862 (1.42/day)
System Name Never trust a socket with less than 2000 pins
You can absolutely have the same virtual memory space across however many nodes you want.

That requires very extensive software support that is practically not used in high-performance computing. Most people use MPI, which is explicit networking and destroys the simple (for software) model.
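A minimal sketch of that explicit-networking model (illustrative, assuming a standard MPI implementation such as MPICH or Open MPI, not code from any system discussed here): nothing moves between ranks until the program says so.

```cpp
// MPI sketch: data crosses node boundaries only through explicit messages.
// Compile with mpicxx, run with e.g. `mpirun -np 2 ./a.out`.
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    std::vector<double> buf(1024, rank == 0 ? 3.14 : 0.0);
    if (size >= 2) {
        if (rank == 0) {
            // Rank 0's memory is invisible to rank 1 until this explicit send.
            MPI_Send(buf.data(), static_cast<int>(buf.size()), MPI_DOUBLE,
                     1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(buf.data(), static_cast<int>(buf.size()), MPI_DOUBLE,
                     0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            std::printf("rank 1 received %.2f\n", buf[0]);
        }
    }
    MPI_Finalize();
    return 0;
}
```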
 
Joined
Jun 29, 2018
Messages
536 (0.23/day)
I don't understand the relevance of whether they do have a uniform physical address space at the hardware level or not, they still share memory.
I just don't see how that could ever mean something is not a computer.
Only systems with uniform physical address spaces truly share memory. Systems without that are clusters of computers at most, and not singular computers.
You can't run normal software on a cluster node and expect it to magically be able to utilize memory on every node of it.
Also, that virtual address space can map to anything, [...]
Why do you switch from physical to virtual in this argument?
[...] or memory from an entirely different machine for that matter. You can absolutely have the same virtual memory space across however many nodes you want.
Not easily.
 
Joined
Jan 8, 2017
Messages
9,424 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
That requires very extensive software support that is practically not used
In practice this is always the case, but this applies to literally everything: you always optimize to reduce communication between parts in a system; it doesn't mean the system can't act as a whole.

Why do you switch from physical to virtual in this argument?
I didn't, the other guy brought it up.

Not easily.
What does it matter if it's easy or not in this context? The argument boils down to "this isn't a real computer because memory can't be shared", but of course it can; that's the point of having a cluster.
 
Joined
Mar 18, 2023
Messages
862 (1.42/day)
System Name Never trust a socket with less than 2000 pins
Then let me ask you this:

What is, in your opinion, the difference between this supercomputer on one hand and a couple of racks of "normal" machines with very fast networking?

I mean from a software/programming standpoint.
 
Joined
Jun 29, 2018
Messages
536 (0.23/day)
What does it matter if it's easy or not in this context?
It matters because it's not being done routinely. I know of only one modern production implementation and that's IBM Power10. Software-based emulation has heavy drawbacks, and would require extreme network performance in both bandwidth and latency.

The argument boils down to "this isn't a real computer because memory can't be shared", but of course it can; that's the point of having a cluster.
That was never the argument; please re-read the first post you replied to. It was about being a singular computer vs. a cluster of computers. Programming for a cluster is way harder than for a singular computer, just like MT programming is more difficult than ST.
Clusters with proper memory sharing are rare.
 