News Posts matching #Neoverse

Arm and Partners Develop AI CPU: Neoverse V3 CSS Made on 2 nm Samsung GAA FET

Yesterday, Arm announced significant progress in its Total Design initiative. The program, launched a year ago, aims to accelerate the development of custom silicon for data centers by fostering collaboration among industry partners. The ecosystem has now grown to nearly 30 participating companies, with recent additions such as Alcor Micro, Egis, PUF Security, and SEMIFIVE. A notable development is a partnership between Arm, Samsung Foundry, ADTechnology, and Rebellions to create an AI CPU chiplet platform. The collaboration aims to deliver a solution for cloud, HPC, and AI/ML workloads, combining Rebellions' AI accelerator with ADTechnology's compute chiplet, implemented on Samsung Foundry's 2 nm Gate-All-Around (GAA) FET technology. The platform is expected to offer significant efficiency gains for generative AI workloads, with estimates suggesting a 2-3x improvement over standard CPU designs for LLMs like Llama 3.1 with 405 billion parameters.

Arm's approach emphasizes the importance of CPU compute in supporting the complete AI stack, including data pre-processing, orchestration, and advanced techniques like retrieval-augmented generation (RAG). The company's Compute Subsystems (CSS) are designed to address these requirements, providing a foundation for partners to build diverse chiplet solutions. Several companies, including Alcor Micro and Alphawave, have already announced plans to develop CSS-powered chiplets for various AI and high-performance computing applications. The initiative also focuses on software readiness, ensuring that major frameworks and operating systems are compatible with Arm-based systems. Recent efforts include the introduction of Arm Kleidi technology, which optimizes CPU-based inference for open-source projects like PyTorch and llama.cpp. Notably, Google claims that most AI workloads are inferenced on CPUs, so building the most efficient and most performant CPUs for AI makes a lot of sense.

NVIDIA Cancels Dual-Rack NVL36x2 in Favor of Single-Rack NVL72 Compute Monster

NVIDIA has reportedly discontinued its dual-rack GB200 NVL36x2 GPU model, opting to focus on the single-rack GB200 NVL72 and NVL36 models. This shift, revealed by industry analyst Ming-Chi Kuo, aims to simplify NVIDIA's offerings in the AI and HPC markets. The decision was influenced by major clients like Microsoft, who prefer the NVL72's improved space efficiency and potential for enhanced inference performance. While both models perform similarly in AI large language model (LLM) training, the NVL72 is expected to excel in non-parallelizable inference tasks. As a reminder, the NVL72 features 36 Grace CPUs, delivering 2,592 Arm Neoverse V2 cores and 17 TB of LPDDR5X memory with 18.4 TB/s of aggregate bandwidth. Additionally, it includes 72 Blackwell GB200 SXM GPUs with a massive 13.5 TB of combined HBM3e, running at 576 TB/s aggregate bandwidth.
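The rack-level figures above can be sanity-checked by dividing the aggregate totals by the CPU and GPU counts (a back-of-the-envelope sketch; the per-device numbers are rounded in NVIDIA's marketing materials):

```python
# Back-of-the-envelope check of the GB200 NVL72's rack-level figures:
# divide the quoted aggregate totals by the CPU/GPU counts.

grace_cpus = 36
blackwell_gpus = 72

cores_total = grace_cpus * 72           # each Grace CPU has 72 Neoverse V2 cores
lpddr5x_per_cpu_tb = 17 / grace_cpus    # ~0.47 TB (~480 GB) of LPDDR5X per Grace
lpddr5x_bw_per_cpu = 18.4 / grace_cpus  # ~0.51 TB/s of memory bandwidth per Grace

hbm3e_per_gpu_gb = 13.5 * 1000 / blackwell_gpus  # ~187.5 GB of HBM3e per GPU
hbm3e_bw_per_gpu = 576 / blackwell_gpus          # 8 TB/s of HBM3e bandwidth per GPU

print(cores_total)       # 2592, matching the quoted core count
print(hbm3e_per_gpu_gb)  # 187.5
print(hbm3e_bw_per_gpu)  # 8.0
```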

However, this shift presents significant challenges. The NVL72's power consumption of around 120 kW far exceeds typical data center capabilities, potentially limiting its immediate widespread adoption. The discontinuation of the NVL36x2 has also sparked concerns about NVIDIA's execution capabilities and may disrupt the supply chain for assembly and cooling solutions. Despite these hurdles, industry experts view this as a pragmatic approach to product planning in the dynamic AI landscape. While some customers may be disappointed by the dual-rack model's cancellation, NVIDIA's long-term outlook in the AI technology market remains strong. The company continues to work with clients and listen to their needs as it positions itself as a leader in high-performance computing solutions.

MediaTek Joins Arm Total Design to Shape the Future of AI Computing

MediaTek announced today at COMPUTEX 2024 that the company has joined Arm Total Design, a fast-growing ecosystem that aims to accelerate and simplify the development of products based on Arm Neoverse Compute Subsystems (CSS). Arm Neoverse CSS is designed to meet the performance and efficiency needs of AI applications in the data center, infrastructure systems, telecommunications, and beyond.

"Together with Arm, we're enabling our customers' designs to meet the most challenging workloads for AI applications, maximizing performance per watt," said Vince Hu, Corporate Vice President at MediaTek. "We will be working closely with Arm as we expand our footprint into data centers, utilizing our expertise in hybrid computing, AI, SerDes and chiplets, and advanced packaging technologies to accelerate AI innovation from the edge to the cloud."

European Supercomputer Chip SiPearl Rhea Delayed, But Upgraded with More Cores

The rollout of SiPearl's much-anticipated Rhea processor for European supercomputers has been pushed back by a year to 2025, but the delay comes with a silver lining - a significant upgrade in core count and potential performance. Originally slated to arrive in 2024 with 72 cores, the homegrown high-performance chip will now pack 80 cores when it launches. SiPearl and its partners describe the delay as a strategic choice to ensure the quality and capabilities of the flagship European processor. The additional 12 months will allow the engineering teams to further refine the chip's architecture, carry out extensive testing, and optimize software stacks to take full advantage of Rhea's computing power. Now called Rhea1, the chip is a crucial component of the European Processor Initiative's mission to develop domestic high-performance computing technologies and reduce reliance on foreign processors. Supercomputer-scale simulations spanning climate science, drug discovery, energy research, and more all require astonishing amounts of raw compute.

By scaling up to 80 cores based on the Arm Neoverse V1, Rhea1 aims to go toe-to-toe with the world's most powerful processors optimized for supercomputing workloads. SiPearl plans to use TSMC's N6 manufacturing process. The CPU will have 256-bit DDR5 memory connections, 104 PCIe 5.0 lanes, and four stacks of HBM2E memory. The roadmap shift also provides more time for the expansive European supercomputing ecosystem to prepare robust software stacks tailored for the upgraded Rhea silicon. Ensuring a smooth deployment with existing models and enabling future breakthroughs are top priorities. While the delay is a setback for SiPearl's launch schedule, the substantial upgrade could pay significant dividends for Europe's ambitions to join the elite ranks of worldwide supercomputer power. All eyes will be on Rhea1's delivery in 2025, particularly from Europe's governments, which are funding the project.

Google Launches Axion Arm-based CPU for Data Center and Cloud

Google has officially joined the club of custom Arm-based, in-house-developed CPUs. As of today, Google's in-house semiconductor development team has launched the "Axion" CPU based on the Arm instruction set architecture. Using Arm Neoverse V2 cores, Google claims that the Axion CPU outperforms general-purpose Arm chips by 30% and Intel's processors by a staggering 50% in terms of performance. This custom silicon will fuel various Google Cloud offerings, including Compute Engine, Kubernetes Engine, Dataproc, Dataflow, and Cloud Batch. The Axion CPU, designed from the ground up, will initially support Google's AI-driven services like YouTube ads and Google Earth Engine. According to Mark Lohmeyer, Google Cloud's VP and GM of compute and machine learning infrastructure, Axion will soon be available to cloud customers, enabling them to leverage its performance without overhauling their existing applications.

Google's foray into custom silicon aligns with the strategies of its cloud rivals, Microsoft and Amazon. Microsoft recently unveiled its own AI chip for training large language models and an Arm-based CPU called Cobalt 100 for cloud and AI workloads. Amazon, on the other hand, has been offering Arm-based servers through its custom Graviton CPUs for several years. While Google won't sell these chips directly to customers, it plans to make them available through its cloud services, enabling businesses to rent and leverage their capabilities. As Amin Vahdat, the executive overseeing Google's in-house chip operations, stated, "Becoming a great hardware company is very different from becoming a great cloud company or a great organizer of the world's information."

Arm Launches Next-Generation Neoverse CSS V3 and N3 Designs for Cloud, HPC, and AI Acceleration

Last year, Arm introduced its Neoverse Compute Subsystem (CSS) for the N2 and V2 series of data center processors, providing a reference platform for the development of efficient Arm-based chips. Major cloud service providers like AWS with Graviton 4 and Trainium 2, Microsoft with Cobalt 100 and Maia 100, and even NVIDIA with the Grace CPU and BlueField DPUs are already utilizing custom Arm server CPU and accelerator designs based on the CSS foundation in their data centers. The CSS allows hyperscalers to optimize Arm processor designs specifically for their workloads, focusing on efficiency rather than outright performance. Today, Arm has unveiled the next-generation CSS N3 and V3 for even greater efficiency and AI inferencing capabilities. The N3 design provides up to 32 high-efficiency cores per die with improved branch prediction and larger caches to boost AI performance by 196%, while the V3 design scales up to 64 cores and is 50% faster overall than previous generations.

Both the N3 and V3 leverage advanced features like DDR5, PCIe 5.0, CXL 3.0, and chiplet architecture, continuing Arm's push to make chiplets the standard for data center and cloud architectures. The chiplet approach enables customers to connect their own accelerators and other chiplets to the Arm cores via UCIe interfaces, reducing costs and time-to-market. Looking ahead, Arm has a clear roadmap for its Neoverse platform. The upcoming CSS V4 "Adonis" and N4 "Dionysus" designs will build on the improvements in the N3 and V3, advancing Arm's goal of greater efficiency and performance using optimized chiplet architectures. As more major data center operators introduce custom Arm-based designs, the Neoverse CSS aims to provide a flexible, efficient foundation to power the next generation of cloud computing.

Intel Foundry Services Get 18A Order: Arm-based 64-Core Neoverse SoC

Faraday Technology Corporation, a Taiwanese silicon IP designer, has announced plans to develop a new 64-core system-on-chip (SoC) utilizing Intel's most advanced 18A process technology. The Arm-based SoC will integrate Arm Neoverse compute subsystems (CSS) to deliver high performance and efficiency for data centers, infrastructure edge, and 5G networks. This collaboration brings together Faraday, Arm, and Intel Foundry Services. Faraday will leverage its ASIC design and IP solutions expertise to build the SoC. Arm will provide the Neoverse compute subsystem IP to enable scalable computing. Intel Foundry Services will manufacture the chip using its cutting-edge 18A process, which promises best-in-class transistor performance.

The new 64-core SoC will be a key component of Faraday's upcoming SoC evaluation platform. This platform aims to accelerate customer development of data center servers, high-performance computing ASICs, and custom SoCs. The platform will also incorporate interface IPs from the Arm Total Design ecosystem for complete implementation and verification. Both Arm and Intel Foundry Services expressed excitement about working with Faraday on this advanced Arm-based custom silicon project. "We're thrilled to see industry leaders like Faraday and Intel on the cutting edge of Arm-based custom silicon development," said an Arm spokesperson. Intel SVP Stuart Pann said, "We are pleased to work with Faraday in the development of the SoC based on Arm Neoverse CSS utilizing our most competitive Intel 18A process technology." The collaboration represents Faraday's strategic focus on leading-edge technologies to meet evolving application requirements. With its extensive silicon IP portfolio and design capabilities, Faraday wants to deliver innovative solutions and break into next-generation computing design.

Microsoft Introduces 128-Core Arm CPU for Cloud and Custom AI Accelerator

During its Ignite conference, Microsoft introduced a duo of custom-designed silicon made to accelerate AI and excel in cloud workloads. First of the two is Microsoft's Azure Cobalt 100 CPU, a 128-core design that implements the 64-bit Armv9 instruction set in a cloud-native package set to become part of Microsoft's offerings. While there aren't many details regarding the configuration, the company claims a performance target of up to 40% higher than the current generation of Arm servers running on the Azure cloud. The SoC uses Arm's Neoverse CSS platform customized for Microsoft, presumably with Arm Neoverse N2 cores.

The next and hottest topic in the server space is AI acceleration, which is needed for running today's large language models. Microsoft hosts OpenAI's ChatGPT, Microsoft's Copilot, and many other AI services. To help make them run as fast as possible, Microsoft's Project Athena has now been productized as the Maia 100 AI accelerator, which is manufactured on TSMC's 5 nm process. It features 105 billion transistors and supports various MX data formats, even those smaller than 8 bits, for maximum performance. The chip is currently being tested on GPT-3.5 Turbo, and we have yet to see performance figures and comparisons with competing hardware like NVIDIA's H100/H200 and AMD's MI300X. The Maia 100 has an aggregate bandwidth of 4.8 Terabits per accelerator and uses a custom Ethernet-based networking protocol for scaling. These chips are expected to appear in Microsoft data centers early next year, and we hope to get some performance numbers soon.
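Since networking bandwidth is quoted in bits while memory figures are usually given in bytes, the 4.8 Terabit figure is easier to compare after a unit conversion (a simple sketch; the figure itself is Microsoft's quoted number):

```python
# Convert the Maia 100's quoted aggregate networking bandwidth
# from terabits to gigabytes (1 byte = 8 bits, 1 terabit = 1000 gigabits).
terabits = 4.8
gigabytes = terabits * 1000 / 8  # gigabytes moved per unit time
print(gigabytes)  # 600.0
```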

Socionext Announces Collaboration with Arm and TSMC on 2nm Multi-Core Leading CPU Chiplet Development

Socionext today announced a collaboration with Arm and TSMC for the development of an innovative power-optimized 32-core CPU chiplet in TSMC's 2 nm silicon technology, delivering scalable performance for hyperscale data center server, 5/6G infrastructure, DPU, and edge-of-network markets.

The engineering samples are targeted to be available in 1H2025. This advanced CPU chiplet proof-of-concept using Arm Neoverse CSS technology is designed for single or multiple instantiations within a single package, along with IO and application-specific custom chiplets to optimize performance for a variety of end applications.

Arm and Synopsys Strengthen Partnership to Accelerate Custom Silicon on Advanced Nodes

Synopsys today announced it has expanded its collaboration with Arm to provide optimized IP and EDA solutions for the newest Arm technology, including the Arm Neoverse V2 platform and Arm Neoverse Compute Subsystem (CSS). Synopsys has joined Arm Total Design, where it will leverage its deep design expertise, the Synopsys.ai full-stack AI-driven EDA suite, and Synopsys Interface, Security, and Silicon Lifecycle Management IP to help mutual customers speed development of their Arm-based CSS solutions. The expanded partnership builds on three decades of collaboration to enable mutual customers to quickly develop specialized silicon at lower cost, with less risk and faster time to market.

"With Arm Total Design, our aim is to enable rapid innovation on Arm Neoverse CSS and engage critical ecosystem expertise at every stage of SoC development," said Mohamed Awad, senior vice president and general manager, Infrastructure Line of Business at Arm. "Our deep technical collaboration with Synopsys to deliver pre-integrated and validated IP and EDA tools will help our mutual customers address the industry's most complex computing challenges with specialized compute."

NVIDIA Unveils Next-Generation GH200 Grace Hopper Superchip Platform With HBM3e

NVIDIA today announced the next-generation NVIDIA GH200 Grace Hopper platform - based on a new Grace Hopper Superchip with the world's first HBM3e processor - built for the era of accelerated computing and generative AI. Created to handle the world's most complex generative AI workloads, spanning large language models, recommender systems and vector databases, the new platform will be available in a wide range of configurations. The dual configuration - which delivers up to 3.5x more memory capacity and 3x more bandwidth than the current generation offering - comprises a single server with 144 Arm Neoverse cores, eight petaflops of AI performance and 282 GB of the latest HBM3e memory technology.
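The dual configuration's totals split evenly across its two Grace Hopper Superchips, which gives the per-chip figures (a sketch derived from the quoted totals; NVIDIA rounds these numbers in its announcement):

```python
# Split the dual GH200 configuration's quoted totals across its
# two Grace Hopper Superchips to get per-chip figures.
superchips = 2
cores_each = 144 // superchips    # 72 Neoverse cores per Grace CPU
hbm3e_each_gb = 282 / superchips  # 141 GB of HBM3e per superchip
ai_pflops_each = 8 / superchips   # 4 petaflops of AI compute per superchip
print(cores_each, hbm3e_each_gb, ai_pflops_each)  # 72 141.0 4.0
```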

"To meet surging demand for generative AI, data centers require accelerated computing platforms with specialized needs," said Jensen Huang, founder and CEO of NVIDIA. "The new GH200 Grace Hopper Superchip platform delivers this with exceptional memory technology and bandwidth to improve throughput, the ability to connect GPUs to aggregate performance without compromise, and a server design that can be easily deployed across the entire data center."

China Hosts 40% of all Arm-based Servers in the World

The escalating challenges in acquiring high-performance x86 servers have prompted Chinese data center companies to accelerate the shift to Arm-based system-on-chips (SoCs). Investment banking firm Bernstein reports that approximately 40% of all Arm-powered servers globally are currently being used in China. While most servers operate on x86 processors from AMD and Intel, there's a growing preference for Arm-based SoCs, especially in the Chinese market. Several global tech giants, including AWS, Ampere, Google, Fujitsu, Microsoft, and Nvidia, have already adopted or developed Arm-powered SoCs. However, Arm-based SoCs are increasingly attractive to Chinese firms, given the difficulty in consistently sourcing Intel's Xeon or AMD's EPYC. Chinese companies like Alibaba, Huawei, and Phytium are pioneering the development of these Arm-based SoCs for client and data center processors.

However, the US government's restrictions present some challenges. Both Huawei and Phytium, blacklisted by the US, cannot access TSMC's cutting-edge process technologies, limiting their ability to produce competitive processors. Although Alibaba's T-Head can leverage TSMC's latest innovations, it can't license Arm's high-performance computing Neoverse V-series CPU cores due to various export control rules. Despite these challenges, many chip designers are considering alternatives such as RISC-V, an unrestricted, rapidly evolving open-source instruction set architecture (ISA) suitable for designing highly customized general-purpose cores for specific workloads. Still, with the backing of influential firms like AWS, Google, Nvidia, Microsoft, Qualcomm, and Samsung, the Armv8 and Armv9 instruction set architectures continue to hold an edge over RISC-V. These companies' support ensures that the software ecosystem remains compatible with their CPUs, which will likely continue to drive the adoption of Arm in the data center space.

NVIDIA Collaborates With SoftBank Corp. to Power SoftBank's Next-Gen Data Centers Using Grace Hopper Superchip for Generative AI and 5G/6G

NVIDIA and SoftBank Corp. today announced they are collaborating on a pioneering platform for generative AI and 5G/6G applications that is based on the NVIDIA GH200 Grace Hopper Superchip and which SoftBank plans to roll out at new, distributed AI data centers across Japan. Paving the way for the rapid, worldwide deployment of generative AI applications and services, SoftBank will build data centers that can, in collaboration with NVIDIA, host generative AI and wireless applications on a multi-tenant common server platform, which reduces costs and is more energy efficient.

The platform will use the new NVIDIA MGX reference architecture with Arm Neoverse-based GH200 Superchips and is expected to improve performance, scalability and resource utilization of application workloads. "As we enter an era where society coexists with AI, the demand for data processing and electricity requirements will rapidly increase. SoftBank will provide next-generation social infrastructure to support the super-digitalized society in Japan," said Junichi Miyakawa, president and CEO of SoftBank Corp. "Our collaboration with NVIDIA will help our infrastructure achieve a significantly higher performance with the utilization of AI, including optimization of the RAN. We expect it can also help us reduce energy consumption and create a network of interconnected data centers that can be used to share resources and host a range of generative AI applications."

NVIDIA Grace Drives Wave of New Energy-Efficient Arm Supercomputers

NVIDIA today announced a supercomputer built on the NVIDIA Grace CPU Superchip, adding to a wave of new energy-efficient supercomputers based on the Arm Neoverse platform. The Isambard 3 supercomputer to be based at the Bristol & Bath Science Park, in the U.K., will feature 384 Arm-based NVIDIA Grace CPU Superchips to power medical and scientific research, and is expected to deliver 6x the performance and energy efficiency of Isambard 2, placing it among Europe's most energy-efficient systems.

It will achieve about 2.7 petaflops of FP64 peak performance and consume less than 270 kilowatts of power, ranking it among the world's three greenest non-accelerated supercomputers. The project is being led by the University of Bristol, as part of the research consortium the GW4 Alliance, together with the universities of Bath, Cardiff and Exeter.
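The "greenest" ranking follows from a simple performance-per-watt calculation using the two quoted figures (a rough estimate; real ranking lists use measured LINPACK results rather than peak numbers):

```python
# Rough energy-efficiency estimate for Isambard 3:
# FP64 peak performance divided by power draw, in gigaflops per watt.
peak_gflops = 2.7e6  # 2.7 petaflops = 2,700,000 gigaflops
power_watts = 270e3  # less than 270 kilowatts
gflops_per_watt = peak_gflops / power_watts
print(gflops_per_watt)  # 10.0 GFLOPS/W at peak
```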

India Homegrown HPC Processor Arrives to Power Nation's Exascale Supercomputer

With more countries creating initiatives to develop homegrown processors capable of powering supercomputing facilities, India has just presented its own milestone: the Aum HPC processor. According to a report by The Next Platform, India has developed a processor to power its exascale high-performance computing (HPC) system. Called Aum HPC, the CPU was developed under the National Supercomputing Mission of the Indian government, which funded the Indian Institute of Science, the Department of Science and Technology, the Ministry of Electronics and Information Technology, and C-DAC to design and manufacture the Aum HPC processors and strengthen the country's technological independence.

The Aum HPC is based on the Armv8.4 ISA and uses a chiplet design. Each compute chiplet features 48 Arm "Zeus" cores based on Neoverse V1 IP, so with two chiplets the processor has 96 cores in total. Each core gets 1 MB of L2 cache and 1 MB of system cache, for 96 MB of L2 cache and 96 MB of system cache in total. For memory, the processor uses 16-channel 32-bit DDR5-5200 with a bandwidth of 332.8 GB/s. On top of that, there is 64 GB of HBM3 with four controllers, capable of achieving a bandwidth of 2.87 TB/s. As for connectivity, the Aum HPC processor has 64 PCIe Gen 5 lanes with CXL enabled. It is manufactured on a 5 nm node from TSMC. With a 3.0 GHz typical and 3.5+ GHz turbo frequency, the Aum HPC processor is rated for a TDP of 300 W and is capable of producing 4.6+ TeraFLOPS per socket. Below are illustrations and tables comparing Aum HPC to the Fujitsu A64FX, another Arm-based HPC design.
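The quoted DDR5 bandwidth follows directly from the channel configuration: channels times bus width in bytes times transfer rate (a sketch of the standard peak-bandwidth formula):

```python
# Verify the Aum HPC's quoted DDR5 bandwidth:
# channels x bus width (bytes) x transfer rate (GT/s) = GB/s.
channels = 16
bus_width_bytes = 32 // 8  # 32-bit channels = 4 bytes per transfer
transfer_rate_gt = 5.2     # DDR5-5200 = 5200 MT/s = 5.2 GT/s
bandwidth_gbs = channels * bus_width_bytes * transfer_rate_gt
print(bandwidth_gbs)  # 332.8, matching the quoted figure
```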

Export Regulations Hinder China's Plans for Custom Arm-Based Processors

The United States has recently imposed several sanctions on technology exports to China. These sanctions are designed to restrict the transfer of specific technologies and sensitive information to Chinese entities, particularly those with ties to the Chinese military or government. The primary motivation behind these sanctions is to protect American national security interests, as well as to protect American companies from unfair competition. According to the Financial Times, Chinese tech giant Alibaba cannot access Arm licenses for Neoverse V1 technology. The technology group that the Neoverse V-series falls under is governed by the Wassenaar Arrangement, a multilateral export control regime (MECR) with 42 participating states. This agreement prohibits the sale of technology that could be used for military purposes.

The US argues that Arm's Neoverse V1 IP is not only a product of the UK's Arm but also a design made in the US, meaning it qualifies as US technology. Since Alibaba's T-Head group, responsible for designing the processors that go into Alibaba's cloud services, cannot use Neoverse V1, it has to look for alternative solutions. The Neoverse V1 and V2 cannot be sold in China, while the Neoverse N1 and N2 can. An Alibaba T-Head engineer argued, "We feel that the western world sees us as second-class people. They won't sell good products to us even if we have money."

Arm Announces Next-Generation Neoverse Cores for High Performance Computing

The demand for data is insatiable, from 5G to the cloud to smart cities. As a society we want more autonomy, information to fuel our decisions and habits, and connection - to people, stories, and experiences.

To address these demands, the cloud infrastructure of tomorrow will need to handle the coming data explosion and the effective processing of ever more complex workloads - all while increasing power efficiency and minimizing carbon footprint. It's why the industry is increasingly looking to the performance, power efficiency, specialized processing and workload acceleration enabled by Arm Neoverse to redefine and transform the world's computing infrastructure.

Microsoft Brings Ampere Altra Arm Processors to Azure Cloud Offerings

Microsoft is announcing the general availability of the latest Azure Virtual Machines featuring the Ampere Altra Arm-based processor. The new virtual machines will be generally available on September 1, and customers can now launch them in 10 Azure regions and multiple availability zones around the world. In addition, the Arm-based virtual machines can be included in Kubernetes clusters managed using Azure Kubernetes Service (AKS). This ability has been in preview and will be generally available over the coming weeks in all the regions that offer the new virtual machines.

Earlier this year, we launched the preview of the new general-purpose Dpsv5 and Dplsv5 and memory optimized Epsv5 Azure Virtual Machine series, built on the Ampere Altra processor. These new virtual machines have been engineered to efficiently run scale-out, cloud-native workloads. Since then, hundreds of customers have tested and experienced firsthand the excellent price-performance that the Arm architecture can provide for web and application servers, open-source databases, microservices, Java and .NET applications, gaming, media servers, and more. Starting today, all Azure customers can deploy these new virtual machines using the Azure portal, SDKs, API, PowerShell, and the command-line interface (CLI).

Alibaba Previews Home-Grown CPUs with 128 Armv9 Cores, DDR5, and PCIe 5.0 Technology

One of the largest cloud providers in China, Alibaba, has today announced a preview of a new instance powered by its Yitian 710 processor. The processor is the culmination of Alibaba's efforts to develop a home-grown design capable of powering its cloud instances and the infrastructure needed for them and its clients. Without much further ado, the Yitian 710 is based on the Armv9 ISA and features 128 cores. Running at up to 3.2 GHz, these cores are paired with eight-channel DDR5 memory to enable sufficient data transfer. In addition, the CPU supports 96 PCIe 5.0 lanes for IO with storage and accelerators. The cores are most likely custom designs, and we don't know if they are based on a blueprint from Arm's Neoverse. The CPU is manufactured at TSMC's facilities on a 5 nm node and features 60 billion transistors.

Alibaba offers these processors as part of its Elastic Compute Service (ECS) instance family called g8m, where users can select 1/2/4/8/16/32/64/128 vCPUs, with each vCPU mapping to one physical CPU core. Alibaba is running this as a trial and notes that users should not run production code on these instances, as they will disappear after two months. Only 100 instances are available for now, based in Alibaba's Hangzhou zone in China. The company claims that instances based on Yitian 710 processors offer 100 percent higher efficiency than existing AMD/Intel solutions; however, it hasn't published data to back this up. The Chinese cloud giant is likely testing whether home-grown hardware can satisfy the needs of its clients so that it can continue on the path to self-sufficiency.
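The selectable g8m sizes follow a simple powers-of-two progression up to the chip's 128 physical cores (a small sketch, assuming the 1:1 vCPU-to-core mapping stated above):

```python
# g8m instance sizes: vCPU counts double from 1 up to the Yitian 710's
# 128 physical cores; with a 1:1 vCPU-to-core mapping, the largest
# instance spans the entire chip.
vcpu_options = [2 ** i for i in range(8)]
print(vcpu_options)      # [1, 2, 4, 8, 16, 32, 64, 128]
print(vcpu_options[-1])  # 128 = the full chip
```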

NVIDIA Claims Grace CPU Superchip is 2X Faster Than Intel Ice Lake

When NVIDIA announced its Grace CPU Superchip, the company officially showed its effort to create an HPC-oriented processor to compete with Intel and AMD. The Grace CPU Superchip combines two Grace CPU modules connected via NVLink-C2C technology to deliver 144 Armv9 cores and 1 TB/s of memory bandwidth. Each core is an Arm Neoverse N2 "Perseus" design, configured to achieve the highest throughput and bandwidth. As far as performance is concerned, the only detail NVIDIA provides on its website is an estimated SPECrate 2017_int_base score of over 740. Thanks to our colleagues over at Tom's Hardware, we have another performance figure to look at.

NVIDIA has published a slide comparing Grace to Intel's Ice Lake server processors. One Grace CPU Superchip was compared to two Xeon Platinum 8360Y Ice Lake CPUs in a dual-socket server node. The Grace CPU Superchip outperformed the Ice Lake configuration by a factor of two and provided 2.3 times the efficiency in a WRF simulation. This HPC application is CPU-bound, allowing the new Grace CPU to show off. NVIDIA also made a graph showcasing all the HPC applications running on Arm today, with many more to come, which you can see below. Remember that this information comes from NVIDIA, so we have to wait for the 2023 launch to see it in action.
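Since efficiency here means performance per watt, the two quoted ratios together imply the Grace node also drew less power than the dual-socket Ice Lake node (a derivation from NVIDIA's quoted figures, not a measured number):

```python
# If Grace delivers 2x the performance at 2.3x the efficiency
# (performance per watt), its relative power draw follows from:
# efficiency_ratio = perf_ratio / power_ratio.
perf_ratio = 2.0
efficiency_ratio = 2.3
power_ratio = perf_ratio / efficiency_ratio
print(round(power_ratio, 2))  # 0.87: Grace draws roughly 87% of the power
```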

ARM-based Server Penetration Rate to Reach 22% by 2025 with Cloud Data Centers Leading the Way, Says TrendForce

According to TrendForce research, corporate demand for digital transformation including artificial intelligence and high-performance computing has accelerated in recent years, which has led to increasing adoption of cloud computing. In order to improve service flexibility, the world's major cloud service providers have gradually introduced ARM-based servers. The penetration rate of ARM architecture in data center servers is expected to reach 22% by 2025.

In the past few years, ARM architecture processors have matured in the fields of mobile terminals and the Internet of Things, but progress in the server field has been relatively slow. However, companies have diversified cloud workloads in recent years, and the market has begun to pay attention to the benefits ARM architecture processing can provide to data centers. TrendForce believes that ARM-based processors have three major advantages. First, they can support diverse and rapidly changing workloads and are more scalable and cost-effective. Second, ARM-based processors provide greater customization for different niche markets with a more flexible ecosystem. Third, their physical footprint is relatively small, which meets the needs of today's micro data centers.

Amazon Announces Arm Based Graviton3 Processors, Opens up EC2 C7g Preview Instances

Amazon continues to grow its AWS business, both with new instances powered by AMD's third-generation EPYC processors and with EC2 G5g instances powered by its current Graviton2 processors and NVIDIA's T4G Tensor Core GPUs. Now, the company is also opening up its first EC2 C7g preview instances using its brand-new Graviton3 processors, which the company claims offer vastly improved performance over the Graviton2 on specific workloads.

EC2 stands for Elastic Compute Cloud, and judging by the fact that the Graviton3 is said to offer up to twice the floating-point performance for scientific workloads, twice the speed for cryptographic workloads, and up to three times the speed for machine learning workloads, you can guess who these new EC2 instances are intended for. Amazon didn't reveal much in terms of technical details about the Graviton3, but it will utilize DDR5 memory, which makes it one of the first, if not the first, server CPUs to use DDR5. It's also said to use up to 60 percent less energy than the Graviton2, while delivering up to 25 percent more compute performance. It's implied that it uses the Armv9 architecture and Neoverse N2 cores, although this hasn't been officially announced.
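Combining the two headline numbers gives a sense of the overall efficiency claim: 25 percent more compute at 60 percent less energy works out to roughly a 3x improvement in performance per watt over Graviton2. A short sketch of that arithmetic (the input factors are Amazon's claims; the combined figure is our calculation):

```python
# Graviton3 vs. Graviton2, per Amazon's stated figures
compute_gain = 1.25   # "up to 25 percent more compute performance"
energy_ratio = 0.40   # "up to 60 percent less energy"

perf_per_watt_gain = compute_gain / energy_ratio
print(f"Implied perf/watt gain: {perf_per_watt_gain:.2f}x")  # ~3.1x
```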

Arm Announces Neoverse N2 and V1 Server Platforms

The demands of data center workloads and internet traffic are growing exponentially, and new solutions are needed to keep up with these demands while reining in the current and anticipated growth of power consumption. But the variety of workloads and applications being run today means the traditional one-size-fits-all approach to computing is not the answer. The industry demands flexibility: design freedom to achieve the right level of compute for the right application.

As Moore's Law comes to an end, solution providers are seeking specialized processing. Enabling specialized processing has been a focal point since the inception of our Neoverse line of platforms, and we expect these latest additions to accelerate this trend.

SiPearl to Manufacture its 72-Core Rhea HPC SoC at TSMC Facilities

SiPearl has this week announced its collaboration with Open-Silicon Research, the India-based entity of OpenFive, to produce a next-generation SoC designed for HPC purposes. SiPearl is a part of the European Processor Initiative (EPI) team and is responsible for designing the SoC itself, which is intended to serve as the base for the European exascale supercomputer. Through the partnership with Open-Silicon Research, SiPearl expects a service that will integrate all the IP blocks and help with the tape-out of the chip once it is done. The deadline is set for 2023; however, both companies expect the chip to ship by Q4 2022.

As for the details of the SoC: it is called Rhea, and it will be a 72-core Arm ISA-based processor with Neoverse Zeus cores interconnected by a mesh, with 68 L3 cache slices distributed across the mesh network between the cores. The chip will be manufactured on TSMC's 6 nm extreme ultraviolet lithography (EUV) process. The Rhea SoC design will utilize 2.5D packaging, with many IP blocks stitched together and HBM2E memory present on the die. It is unknown exactly what configuration of HBM2E will be used. The system will also support DDR5 memory, enabling two-level system memory by combining HBM and DDR. We are excited to see what the final product looks like, and we now await further updates on the project.

AWS Arm-based Graviton Processors See the Biggest Growth in Instance Share

Amazon Web Services (AWS), the world's largest cloud services provider, launched its Graviton series of custom processors some time ago. With Graviton, AWS planned to bring down the cost of offering some cloud services, both for the customer and for the company. By doing so, the company aimed to attract new customers by offering greater value, and that plan seems to be working out well. When AWS launched its first-generation Graviton processor, the company took everyone by surprise and showed that it is capable of designing and operating its own custom processors. The Graviton series of processors is based on the Arm Instruction Set Architecture (ISA), and the latest Graviton2 series uses Arm Neoverse N1 cores as its base.

Today, thanks to data from Liftr Insights, we get to see just how many total AWS instances are Graviton-based. The data shows some rather impressive numbers for the period from June 2019 to August 2020. In that timeframe, Intel's Xeon offerings saw their presence decrease from 88% to 70%, while AMD grew from an 11% to a 20% share. And perhaps the greatest silent winner here is the Graviton processor, which saw massive growth: in the same period, Graviton instances went from making up only 1% of all AWS instances to 10%. This 10-fold increase is no small feat, given how reluctant data center operators are to change platforms.
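The share shifts in the Liftr Insights data can be summarized in a few lines; laying out the growth factors side by side makes clear why Graviton stands out (the shares are the figures quoted above):

```python
# AWS instance share by CPU vendor, June 2019 -> August 2020 (Liftr Insights)
shares = {
    "Intel Xeon": (88, 70),
    "AMD EPYC": (11, 20),
    "AWS Graviton": (1, 10),
}

for vendor, (start, end) in shares.items():
    growth = end / start
    print(f"{vendor}: {start}% -> {end}% ({growth:.1f}x)")
# Graviton's 10x relative growth dwarfs AMD's roughly 1.8x gain,
# while Intel's share contracted.
```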