News Posts matching #TDP


Chinese Loongson 3D5000 Features 32 Cores and is 4x Faster Than the Average Arm Chip

Amid the push for technology independence, Chinese companies are releasing more products to meet the rapidly soaring demand for domestic data processing silicon. Today, we have information that China's Loongson has launched the 3D5000, a CPU with as many as 32 cores. Utilizing chiplet technology, the 3D5000 combines two 16-core 3C5000 processors built on LA464 cores, which implement the LoongArch ISA, a design that follows a combination of RISC and MIPS ISA design principles. The new chip features 64 MB of L3 cache, supports eight-channel DDR4-3200 ECC memory achieving 50 GB/s, and has five HyperTransport (HT) 3.0 interfaces. The TDP configuration of the chip is officially 300 Watts; however, normal operation is usually at around 150 Watts, with LA464 cores running at 2 GHz.

Scaling of the new chip goes beyond the chiplet and spills over into the system, as the 3D5000 supports 2P and 4P configurations, where a single motherboard can become a system of up to 128 cores. To connect them, Loongson uses a 7A2000 bridge chip that is reportedly 400% faster than the previous solution, although we have no information about the previous bridge chip. Based on the LGA-4129 package, the chip measures 75.4×58.5×6.5 mm. Regarding performance, Loongson compares it to the average Arm chip that goes into smartphones and claims that its designs are up to four times faster. In SPEC2006, performance reaches 425 points, while the chip maintains a single TeraFLOP in the double-precision 64-bit format. The processor was also built for security: it has custom hardware-baked mitigations to prevent Spectre and Meltdown, an on-package Trusted Platform Module (TPM), and a secret China-made security algorithm with an embedded custom security module that performs encryption and decryption at 5 Gbps.

Intel Issues Discontinuation Notice for Many 11th Gen Core Processors

Not entirely unexpectedly, Intel has started to discontinue its 11th Gen Core processors, also known as Tiger Lake. In a product change notification (PCN), the company has listed no fewer than five mobile and four desktop parts that it will stop accepting orders for from the end of June this year, with the last shipment taking place at the end of January 2024.

The discontinued range covers everything from Core i3 to Core i9 models, and the full range of discontinued models can be found in the screenshot below. It should be noted that the desktop parts are the B SKU parts that were, for example, found in Intel's NUC 11 Extreme and are 65 W TDP parts. Most of the mobile parts are still available in products being sold, albeit mostly in older SKUs that have since been replaced by models with 12th and 13th Gen Core processors. None of the products in the PCN were available directly to end consumers to purchase, as far as TPU is aware.

Intel Xeon W-3400/2400 "Sapphire Rapids" Processors Run First Benchmarks

Thanks to Puget Systems, we have a preview of Intel's latest Xeon W-3400 and Xeon W-2400 workstation processors based on Sapphire Rapids core technology. Delivering up to 56 cores and 112 threads, these CPUs are paired with up to eight TeraBytes of eight-channel DDR5-4800 memory. For expansion, they offer up to 112 PCIe 5.0 lanes and come with a TDP of up to 350 Watts; some models are unlocked for overclocking. This interesting HEDT family for workstation usage comes at a premium with an MSRP of $5,889 for the top-end SKU, and motherboard prices are also on the pricey side. However, all of this should come as no surprise given the performance professionals expect from these chips. Puget Systems has published test results that include Photoshop, After Effects, Premiere Pro, DaVinci Resolve, Unreal Engine, Cinebench R23.2, Blender, and V-Ray. Note that Puget Systems said: "While this post has been an interesting preview of the new Xeon processors, there is still a TON of testing we want to do. The optimizations Intel is working on is of course at the top, but there are several other topics we are highly interested in." So we expect better numbers in the future.
Below, you can see the comparison with AMD's competing Threadripper Pro HEDT SKUs, along with power usage using different Windows OS power profiles:

Inspur Announces G7 Server Platform Supports the Latest 4th Gen Intel Xeon Scalable Processors

Inspur Information, a leading IT infrastructure solutions provider, announced that its G7 server platform fully supports 4th Gen Intel Xeon Scalable Processors. The 16 servers making up the brand-new lineup are industry-leading in terms of performance, openness, intelligent operation & maintenance, and sustainability. Compared with the previous generation of Intel-based products, these servers have 61% higher performance and up to 30% higher computing performance per unit of power consumption. The server platform is designed to be deployed in general-purpose computing, critical computing, AI, and other application scenarios.

Inspur Information's brand-new G7 platform was designed with green technology, open-source solutions, security, and intelligence as priorities. It is an industry-leading example of system design, energy efficiency, and operation & maintenance management. G7 servers support diversified computing, with the most comprehensive product lineup in the industry. With a focus on green energy, this product line supports cold plate and immersion cooling schemes, and has unique cooling designs such as a T-shaped radiator and advance heat detection with intelligent regulation, which all work together to reduce energy consumption by up to 30%. The new series also supports cloud operation and maintenance for intelligent fault diagnosis with an accuracy rate of up to 95%.

Intel Announces 13th Gen Core "Raptor Lake" Mobile Processor Family

Intel today launched its 13th Generation Core "Raptor Lake" mobile processor family. Mobile (notebook) processors make up the bulk of Intel's client processor sales, and so the company is targeting a variety of form-factors and markets. The series begins with the 13th Gen Core HX line of enthusiast-segment processors for gamers, on-site creators, and mobile workstations. These processors come with core-counts of up to 8P+16E, making them the first 24-core mobile processors. The chips are classified as having 55 W TDP, with maximum turbo power values set as high as 157 W.

The 13th Gen Core HX family includes three 8P+16E processor models, led by the Core i9-13980HX, a unique 8P+12E model leading the Core i7 pack, the i7-13700HX; some 8P+8E models, followed by 6P+8E and 6P+4E models making up the Core i5 lineup. All parts have 55 W base power and 157 W max turbo power; at least three of these get the full vPro Enterprise feature-set.

AMD Ryzen 7000 non-X Series to Launch on January 10th

A few months ago, AMD launched its highly anticipated Ryzen 7000 series of processors based on the Zen 4 architecture. However, the company only launched the "X" SKUs (for example, the 7900X) for now, while the remaining ones are awaiting a launch date. Today, we have information from VideoCardz that confirms AMD's new launch on January 10th, when team red plans to update its remaining processor family with Ryzen 7000 series non-X SKUs. There will be three initial models to choose from: Ryzen 9 7900 (12C/24T), Ryzen 7 7700 (8C/16T), and Ryzen 5 7600 (6C/12T). These SKUs follow the traditional Zen 4 path; however, the only distinction from their "X" counterparts is the TDP reduced to 65 Watts, down from up to 170 Watts in some of those models.

A leaked slide from AMD's product presentation regarding these SKUs shows a comparison between AMD's own Ryzen 9 5900X and Ryzen 9 7900, where the Zen 4 variant beats the older SKU by a significant percentage. Pricing and further details are listed on the slides below.

AMD Ryzen 7000 non-X Processor SKUs Confirmed with 65W TDP, Boxed Coolers

Ahead of their market debut in early January, we got confirmation of the specifications of the three upcoming AMD Ryzen 7000 series non-X processor SKUs. There will indeed only be three new SKUs: the 6-core/12-thread Ryzen 5 7600, the 8-core/16-thread Ryzen 7 7700, and the 12-core/24-thread Ryzen 9 7900, with no 16-core part. All three SKUs have their TDP rated at 65 W, which means that their PIB (processor in box) retail packages will include a stock cooling solution. The 7600 comes with a Wraith Stealth cooler that's capable of handling thermal loads of 65 W TDP processors at stock speeds, while the 7700 and 7900 will include a feature-packed Wraith Prism RGB cooler that's designed for 140 W TDP processors. Since Socket AM5 has cooler compatibility with AM4, AMD could simply be reusing the same coolers it packed with past-generation Ryzen processors.

The Ryzen 5 7600 comes with an MSRP of USD $229, clock speeds of up to 5.10 GHz boost, and targets the likes of the Intel Core i5-13600 or i5-12600. The $329 MSRP Ryzen 7 7700 ticks at speeds of up to 5.30 GHz boost, and is designed to compete with the Core i7-13700 or i7-12700. The Ryzen 9 7900 has an interesting price tag of $429 (MSRP), ticks at speeds of up to 5.40 GHz boost, and purportedly competes against the Core i9-13900 (non-K) and i9-12900. The three chips should be drop-in compatible with Socket AM5 motherboards being sold right now, likely with no need for a BIOS update. Although the launch of these three SKUs in January is certain, the company might use the 2023 International CES keynote address by its CEO Dr Lisa Su to either tease or announce the Ryzen 7000X3D processors featuring 3D Vertical Cache memory, which is known to boost gaming performance.

NVIDIA GeForce RTX 4090 16 GB Laptop SKU Spotted in Next-Gen HP Omen 17 Laptop

According to the well-known hardware leaker @momomo_us, HP is preparing the launch of its next-generation Omen 17 gaming laptops. And with a new generation of chips coming to consumers, HP accidentally made some information about laptop SKUs public. Four models are listed, and they represent a combination of Intel's 13th-generation Raptor Lake mobile processors with NVIDIA's Ada Lovelace RTX 40 series graphics cards for the mobile/laptop sector. The four SKUs are: CM2007NQ/CM2005NQ with Core i7-13700HX & RTX 4060 8 GB; CM2001NQ with Core i7-13700HX & RTX 4070 8 GB; CK2007NQ/CK2004NQ with Core i7-13700HX & RTX 4080 12 GB; CK2001NQ with Core i7-13700HX & RTX 4090 16 GB.

The most exciting find here is the appearance of the xx90 series in the mobile/laptop form factor, which has not been the case before. The GeForce RTX 4090 laptop edition is supposedly equipped with 16 GB of VRAM, and the GPU SKU should be a cut-down version of AD102 GPU adjusted for power and clock constraints so it can run within a reasonable TDP. With NVIDIA seemingly giving its clients an RTX 4090 SKU option, we have to wait and see what the CUDA core counts are and how clocks scale in a more restricted laptop environment.

AMD's Navi 31 Might Clock to 3 GHz, Partner Cards Will be Able to Overclock

Based on details from a PCWorld livestream following AMD's launch of the Radeon RX 7000-series, it was revealed that AMD has designed the Navi 31 GPU to be able to scale as high as 3 GHz. In other words, it appears that AMD has power limited its cards, at least for the SKUs that the company has announced so far. This could be for many reasons, but most likely to try to find a balance between power and performance. The details of the 3 GHz scaling did however not come from AMD directly, but rather from Jarred Walton over at Tom's Hardware. That said, the information was apparently shared with the media by AMD at the event.

In the livestream, it was also confirmed that partner cards will be able to overclock, so expect to see some factory overclocked cards with higher power draw. This could be, in part, why ASUS went with a much larger cooler on its TUF Gaming Radeon RX 7900-series cards. As ASUS didn't reveal any clock speeds or TDPs of its two cards, we don't really know what to expect, but we'd be surprised if these cards weren't factory overclocked to some degree when they launch in December.

48-Core Russian Baikal-S Processor Die Shots Appear

In December of 2021, we covered the appearance of Russia's home-grown Baikal-S processor, which has 48 cores based on Arm Cortex-A75 cores. Today, thanks to the famous chip photographer Fritzchens Fritz, we have the first die shots that show us exactly how the Baikal-S SoC is structured internally and what it is made up of. Manufactured on TSMC's 16 nm process, the Baikal-S BE-S1000 design features 48 Arm Cortex-A75 cores running at a 2.0 GHz base and a 2.5 GHz boost frequency. With a TDP of 120 Watts, the design seems efficient, and the Russian company promises performance comparable to Intel Skylake Xeons or Zen1-based AMD EPYC processors. It also uses a home-grown RISC-V core for management and controlling secure boot sequences.

Below, you can see the die shots taken by Fritzchens Fritz and the annotations by Twitter user Locuza, who marked up the entire SoC. Besides the core clusters, we see that a slab of cache connects everything, with six 72-bit DDR4-3200 PHYs and memory controllers surrounding everything. This model features a pretty good selection of I/O for a server CPU, as there are five PCIe 4.0 x16 (4x4) interfaces, with three supporting CCIX 1.0. You can check out more pictures below and see the annotations for yourself.

NVIDIA Introduces L40 Omniverse Graphics Card

During its GTC 2022 session, NVIDIA introduced its new generation of gaming graphics cards based on the novel Ada Lovelace architecture. Dubbed NVIDIA GeForce RTX 40 series, it brings various updates like more CUDA cores, a new DLSS 3 version, 4th generation Tensor cores, 3rd generation Ray Tracing cores, and much more, which you can read about here. However, today, we also got a new Ada Lovelace card intended for the data center. Called the L40, it updates NVIDIA's previous Ampere-based A40 design. While the NVIDIA website provides sparse details, the new L40 GPU uses 48 GB of GDDR6 memory with ECC error correction. Using NVLink, you can get 96 GB of VRAM. The GPU SKU is unknown, but we assume that it uses AD102 with adjusted frequencies to lower the TDP and allow for passive cooling.

NVIDIA is calling this its Omniverse GPU, as it is a part of the push to separate its GPUs used for graphics from its AI/HPC models. The "L" model in the current product stack is used to accelerate graphics, with display ports installed on the GPU, while the "H" models (H100) are there to accelerate HPC/AI installations where visual elements are a secondary task. This is a further separation of the entire GPU market, where the HPC/AI SKUs get their own architecture, and GPUs for graphics processing are built on a new architecture as well. You can see the specifications provided by NVIDIA below.

AMD Ryzen 7000 Undervolting Yields Great Results with Temperatures

AMD Ryzen 7000 "Zen 4" processors can hit up to 95 °C at stock settings, with cooling most appropriate to the TDP level. This is because the PPT (package power tracking) limit for the 170 W TDP processors is as high as 230 W, and for the 105 W TDP models, it's 142 W. After reaching this temperature threshold, the processor begins to downclock itself to lower temperatures. Harukaze5719 discovered that higher than needed core voltages could be at play, and that manually undervolting the processors could free up significant thermal headroom, letting the processors hold on to higher boost multipliers better.

Intel Unveils Arc Pro Graphics Cards for Workstations and Professional Software

Intel has today unveiled another addition to its discrete Arc Alchemist graphics card lineup, this time aimed at the professional market. Intel has prepared three models for creators and entry pro-vis solutions, called Intel Arc Pro graphics cards. All GPUs are AV1 accelerated, have ray tracing support, and are designed to handle AI acceleration inside applications like Adobe Premiere Pro. At the start, we have a small A30M mobile GPU aimed at laptop designs. It has a 3.5 TeraFLOP FP32 capability inside a configurable 35-50 Watt TDP envelope, has eight ray tracing cores, and 4 GB of GDDR6 memory. Its display output connectors depend on the OEM's laptop design.

Next, we have the Arc A40 Pro discrete single-slot GPU. Having 3.5 TeraFLOPs of FP32 single-precision performance, it has eight ray tracing cores and 6 GB of GDDR6 memory. The listed maximum TDP for this model is 50 Watts. It has four mini-DP ports for video output, and it can drive two monitors at 8K 60 Hz, one at 5K 240 Hz, two at 5K 120 Hz, or four at 4K 60 Hz refresh rate. Its bigger brother, the Arc A50 Pro, is a dual-slot design with 4.8 TeraFLOPs of single-precision FP32 computing, has eight ray tracing cores, and 6 GB of GDDR6 memory as well. It has the same video output capability as the Arc A40 Pro, with a beefier cooling setup to handle the 75 Watt TDP. All software developed using the OneAPI toolkit can be accelerated using these GPUs. Intel is working with the industry to adapt professional software for Arc Pro graphics.

Potential Ryzen 7000-series CPU Specs and Pricing Leak, Ryzen 9 7950X Expected to hit 5.7 GHz

It's pretty clear that we're getting very close to the launch of AMD's AM5 platform and the Ryzen 7000-series CPUs, with spec details and even pricing brackets turning up online. Wccftech has posted what the publication believes will be the lineup we can expect to launch in just over a month's time, if rumours are to be believed. The base model is said to be the Ryzen 5 7600X, which the site claims will have a base clock of 4.7 GHz and a boost clock of 5.3 GHz. There's no change in processor core or thread count compared to the current Ryzen 5 5600X, but the L2 cache appears to have doubled, for a total of 38 MB of cache. This is followed by the Ryzen 7 7700X, which starts out a tad slower with a base clock of 4.5 GHz, but it has a slightly higher boost clock of 5.4 GHz. Likewise, the core and thread count remains unchanged, while the L2 cache also gets a bump for a total of 40 MB of cache. Both these models are said to have a 105 W TDP.

The Ryzen 9 7900X is said to have a 4.7 GHz base clock and a 5.6 GHz boost clock, so a 200 MHz jump up from the Ryzen 7 7700X. This CPU has a total of 76 MB of cache. Finally the Ryzen 9 7950X is said to have the same base clock of 4.5 GHz as the Ryzen 7 7700X, but it has the highest boost clock of all the expected models at 5.7 GHz, while having a total of 80 MB cache. These two SKUs are both said to have a 170 W TDP. Price wise, from top to bottom, we might be looking at somewhere around US$700, US$600, US$300 and US$200, so it seems like AMD has adjusted its pricing downwards by around $100 on the low-end, with the Ryzen 7 part fitting the same price bracket as the Ryzen 7 5700X. The Ryzen 9 7900X seems to have had its price adjusted upwards slightly, while the Ryzen 9 7950X seems to be expected to be priced lower than its predecessors. Take these things with the right helping of scepticism for now, as things can still change before the launch.
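The leaked cache totals can be reproduced with simple arithmetic. The sketch below is illustrative only: the leak gives the totals but not the breakdown, so the 1 MB of L2 per core, the 32 MB of L3 per chiplet (CCD), and the core and CCD counts used here are assumptions.

```python
# Assumptions (not stated in the leak): 1 MB of L2 per core and
# 32 MB of L3 per 8-core chiplet (CCD).
L2_PER_CORE_MB = 1
L3_PER_CCD_MB = 32


def total_cache_mb(cores: int, ccds: int) -> int:
    """Return combined L2 + L3 cache in MB for a given core/CCD layout."""
    return cores * L2_PER_CORE_MB + ccds * L3_PER_CCD_MB


# Core and CCD counts are likewise assumed for illustration.
lineup = {
    "Ryzen 5 7600X": (6, 1),
    "Ryzen 7 7700X": (8, 1),
    "Ryzen 9 7900X": (12, 2),
    "Ryzen 9 7950X": (16, 2),
}

for name, (cores, ccds) in lineup.items():
    print(f"{name}: {total_cache_mb(cores, ccds)} MB total cache")
```

Under these assumptions the totals come out to 38, 40, 76, and 80 MB, matching the leaked figures.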

AMD Clarifies Ryzen 7000 "Zen 4" TDP and Power Limits: 170W TDP, 230W PPT

The mention of "170 W" in one of the slides of AMD's Computex 2022 reveal of the upcoming Ryzen 7000 "Zen 4" desktop processors caused quite some confusion as to what that figure meant. AMD issued a structured clarification on the matter, laying to rest the terminology associated with it. Apparently, there will be certain SKUs of Socket AM5 processors with a TDP of 170 W. This would be the same classical definition of TDP that AMD has been consistently using. The package-power tracking (PPT), a figure that translates as the power limit for the socket, is 230 W.

This does not necessarily mean that there will be a Ryzen 7000-series SKU with 170 W TDP. AMD plans to give AM5 a similar life-cycle to AM4, which now spans five generations of Ryzen processors, and the 170 W TDP and 230 W PPT figures only denote design goals for the socket. AMD, in a statement, explained why it needed to make AM5 capable of delivering much higher power than AM4 could: to enable higher CPU core-counts in the future, more on-package hardware, and new capabilities like power-hungry instruction-sets (think AVX-512). AMD has been calculating PPT as 1.35 times TDP since the very first generation of Ryzen chips. For a 105 W TDP processor, this means roughly 142 W PPT, and the same formula continues with the Ryzen 7000 series (230 W is 1.35x 170 W).
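As a quick check of the 1.35x relationship described above, here is a minimal sketch. The function name `ppt_watts` is made up for illustration; only the 1.35 factor and the TDP values come from the article.

```python
PPT_FACTOR = 1.35  # PPT-to-TDP ratio quoted in the article


def ppt_watts(tdp_watts: float) -> float:
    """Return the package power tracking (PPT) limit for a given TDP."""
    return tdp_watts * PPT_FACTOR


for tdp in (105, 170):
    print(f"{tdp} W TDP -> {ppt_watts(tdp):.2f} W PPT")
```

The raw products are 141.75 W and 229.5 W, which are then rounded to whole Watts for the published figures.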
The AMD statement follows.

Russia to Use Chinese Zhaoxin x86 Processors Amidst Restrictions to Replace Intel and AMD Designs

Many companies, including Intel and AMD, have stopped product shipments to Russia amidst the war in Ukraine in the past few months. This has left the Russian state without any new processors from the two prominent x86 designers, thus slowing down the country's technological progress. To overcome this issue, it seems the solution lies in Chinese Zhaoxin x86 CPUs. According to the latest report from Habr, a motherboard designer called Dannie is embedding Chinese Zhaoxin x86 CPUs into motherboards to provide the motherland with an x86-capable processor. More precisely, the company has designed the BX-Z60A micro-ATX motherboard, which embeds Zhaoxin's KaiXian KX-6640MA SoC with eight cores based on the LuJiaZui microarchitecture. The SoC is clocked at a frequency range of 2.1-2.7 GHz, carries 4 MB of L2 cache, 16 lanes of PCIe 3.0, and has integrated graphics, all in a 25 Watt TDP.

As far as the motherboard is concerned, it supports two DDR4 memory slots, two PCIe x16 connectors, M.2-2280 and M.2-2230 slots, and three SATA III connectors for storage. For I/O you have USB ports, DisplayPort, HDMI, VGA/D-Sub, GbE, 3.5-mm audio, and additional PS/2 ports. This is a pretty decent selection; however, we don't know the pricing structure. A motherboard with KaiXian KX-6640MA SoC like this is certainly not cheap, so we are left to wonder if this will help Russian users deal with the newly imposed restriction on importing US tech.

NVIDIA GeForce RTX 4090 Twice as Fast as RTX 3090, Features 16128 CUDA Cores and 450W TDP

NVIDIA's next-generation GeForce RTX 40 series of graphics cards, codenamed Ada Lovelace, is shaping up to be a powerful graphics card lineup. Allegedly, we can expect to see a mid-July launch of NVIDIA's newest gaming offerings, where customers can expect some impressive performance. According to a reliable hardware leaker, kopite7kimi, NVIDIA GeForce RTX 4090 graphics card will feature AD102-300 GPU SKU. This model is equipped with 126 Streaming Multiprocessors (SMs), which brings the total number of FP32 CUDA cores to 16128. Compared to the full AD102 GPU with 144 SMs, this leads us to think that there will be an RTX 4090 Ti model following up later as well.

Paired with 24 GB of 21 Gbps GDDR6X memory, the RTX 4090 graphics card has a TDP of 450 Watts. While this number may suggest a very power-hungry design, bear in mind that the targeted performance improvement over the previous RTX 3090 model is expected to be two-fold. Paired with TSMC's new N4 node and the new architecture design, performance scaling should follow at the cost of higher TDPs. These claims are yet to be validated by real-world benchmarks from independent tech media, so please take all of this information with a grain of salt and wait for TechPowerUp reviews once the card arrives.

NVIDIA GeForce RTX 3090 Ti Gets Custom 890 Watt XOC BIOS

Extreme overclocking is an enthusiast discipline where overclockers try to push their hardware to extreme limits. Combined with powerful cooling solutions like liquid nitrogen (LN2), which reaches sub-zero temperatures, and modified hardware, the silicon can output tremendous power. Today, we are witnessing a custom XOC (eXtreme OverClocking) BIOS for the NVIDIA GeForce RTX 3090 Ti graphics card that can push the GA102 SKU to an impressive 890 Watts of power, representing almost a two-fold increase over the stock TDP. Enthusiasts pursuing high frequencies with their RTX 3090 Ti are the likely users of this XOC BIOS. Most likely, however, we will see GALAX HOF or EVGA KINGPIN cards with dual 16-pin power connectors utilize this.

As shown below, MEGAsizeGPU, the creator of this BIOS, managed to push his ASUS GeForce RTX 3090 Ti TUF with the XOC BIOS to 615 Watts, so KINGPIN and HOF designs will have to be used to reach the full power limit. The XOC BIOS was uploaded to our VGA BIOS database; however, caution is advised as this can break your graphics card.

AMD Claims Radeon RX 6500M is Faster Than Intel Arc A370M Graphics

A few days ago, Intel announced its first official discrete graphics card efforts, designed for laptops. Called the Arc Alchemist lineup, Intel has designed these SKUs to provide entry-level to high-end options covering a wide range of use cases. Today, AMD has responded with a rather exciting Tweet made by the company's @Radeon Twitter account. The company compared Intel's Arc Alchemist A370M GPU with AMD's Radeon RX 6500M mobile SKUs in the post. These GPUs are made on TSMC's N6 node, feature 4 GB GDDR6 64-bit memory, 1024 FP32 cores, and have the same configurable TDP range of 35-50 Watts.

Below, you can see AMD's benchmarks of the following select games: Hitman 3, Total War Saga: Troy, F1 2021, Strange Brigade (High), and Final Fantasy XIV. The Radeon RX 6500M GPU manages to win in all of these games, thus explaining AMD's "FTW" hashtag on Twitter. Remember that these are vendor-supplied benchmark runs, so we have to wait for some media results to surface.

AMD Ryzen 7000 Series "Raphael" Processors to Come with up to 170 Watt TDP for 16-Core SKUs

AMD is slowly preparing to transition its consumer base into a new platform and processor architecture with the launch of Ryzen 7000 series processors codenamed Raphael. Based on the new AM5 LGA socket, these processors will come with up to 16 cores and 32 threads at the top-end configurations. Thanks to the latest round of rumors, we managed to find out just what TDP rating two SKUs will carry. According to a well-known leaker @graymon55, AMD is rating the 12-core SKU with a TDP of 105 Watts. On the other hand, the top-end 16-core 7000 series SKU replacing the current Ryzen 9 5950X will carry a large TDP of 170 Watts.

The 170 Watt TDP configuration will likely require better cooling efforts. AMD will probably advise users to invest in better cooling solutions, such as AIO liquid coolers or giant air coolers.

NVIDIA H100 is a Compute Monster with 80 Billion Transistors, New Compute Units and HBM3 Memory

During the GTC 2022 keynote, NVIDIA announced the newest addition to its accelerator card family. Called the NVIDIA H100 accelerator, it is the company's most powerful creation ever. Built from 80 billion transistors on TSMC's 4N (4 nm-class) process, the H100 can output some insane performance, according to NVIDIA. Featuring a new fourth-generation Tensor Core design, it can deliver a six-fold performance increase compared to A100 Tensor Cores and a two-fold MMA (Matrix Multiply Accumulate) improvement. Additionally, new DPX instructions accelerate Dynamic Programming algorithms up to seven times over the previous A100 accelerator. Thanks to the new Hopper architecture, the Streaming Multiprocessor structure has been optimized for better transfer of large data blocks.

The full GH100 chip implementation features 144 SMs, and 128 FP32 CUDA cores per SM, resulting in 18,432 CUDA cores at maximum configuration. The NVIDIA H100 GPU with SXM5 board form-factor features 132 SMs, totaling 16,896 CUDA cores, while the PCIe 5.0 add-in card has 114 SMs, totaling 14,592 CUDA cores. As much as 80 GB of HBM3 memory surrounds the GPU at 3 TB/s bandwidth. Interestingly, the SXM5 variant features a very large TDP of 700 Watts, while the PCIe card is limited to 350 Watts. This is the result of better cooling solutions offered for the SXM form-factor. As far as performance figures are concerned, the SXM and PCIe versions provide two distinctive figures for each implementation. You can check out the performance estimates in various precision modes below. You can read more about the Hopper architecture and what makes it special in this whitepaper published by NVIDIA.
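The CUDA core totals above follow directly from the SM counts and the 128 FP32 cores per SM quoted in the article. A minimal sketch of that arithmetic (the function and dictionary names are made up for illustration):

```python
FP32_CORES_PER_SM = 128  # per-SM figure given in the article


def cuda_cores(sm_count: int) -> int:
    """Return the FP32 CUDA core count for a given number of enabled SMs."""
    return sm_count * FP32_CORES_PER_SM


# SM counts as listed in the article.
configs = {"Full GH100": 144, "H100 SXM5": 132, "H100 PCIe": 114}

for name, sms in configs.items():
    print(f"{name}: {sms} SMs -> {cuda_cores(sms):,} CUDA cores")
```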

AMD Announces Zen 3 Threadripper 5000, but only for Professionals

AMD today launched its first Ryzen Threadripper processors based on the "Zen 3" microarchitecture, with the Ryzen Threadripper PRO 5000WX series. Designed to be drop-in compatible with workstations and motherboards based on the AMD WRX80 chipset, these processors come in core-counts of up to 64-core/128-thread, with an enormous I/O offering that includes 8-channel DDR4 memory with ECC support, and a 128-lane PCI-Express 4.0 root complex. The biggest change over the previous generation Threadripper PRO 3000WX series has to be the use of "Zen 3" CCDs, each with 8 CPU cores, sharing a common 32 MB of L3 cache. AMD isn't using the "Zen 3" chiplets with 3DV Cache.

The full AMD PRO management feature-set from Ryzen PRO is available on these processors, including PRO Security, PRO Management, and a special support channel that includes planned parts and software availability. What's more, AMD has been working with ISVs of most professional content-creation software since the past generation of Ryzen Threadripper PRO, to optimize their software for the processors (high core-counts, NUMA topology, etc.). The benefits of these are shared with all generations of Threadrippers. Although all parts in the Threadripper PRO 5000WX series are rated for a TDP of 280 W, AMD claims to have worked on power-management, offering up to 67 percent lower power per core, compared to the competition (2P Xeon Scalable Platinum 8280).

Intel Details Ponte Vecchio Accelerator: 63 Tiles, 600 Watt TDP, and Lots of Bandwidth

During the International Solid-State Circuits Conference (ISSCC) 2022, Intel gave us a more detailed look at its upcoming Ponte Vecchio HPC accelerator and how it operates. Until now, Intel had led us to believe that Ponte Vecchio was made of 47 tiles glued together in one package. However, the ISSCC presentation shows that the accelerator is structured rather interestingly. There are 63 tiles in total, where 16 are reserved for compute, eight are used for RAMBO cache, two are Foveros base tiles, two represent Xe-Link tiles, eight are HBM2E tiles, and EMIB connections take up 11 tiles. This totals 47 tiles. However, an additional 16 thermal tiles used in Ponte Vecchio regulate the massive TDP output of this accelerator.

What is interesting is that Intel gave away details of the RAMBO cache. This novel SRAM technology uses four banks of 3.75 MB each, for a total of 15 MB per tile. They are connected to the fabric at 1.3 TB/s per chip. In contrast, compute tiles are connected to the chip fabric at 2.6 TB/s. With eight RAMBO cache tiles, we get an additional 120 MB of SRAM. The base tile is a 646 mm² die manufactured on the Intel 7 semiconductor process and contains 17 layers. It includes a memory controller, the Fully Integrated Voltage Regulators (FIVR), power management, a 16-lane PCIe 5.0 connection, and a CXL interface. The entire area of Ponte Vecchio is rather impressive, as the 47 active tiles take up 2,330 mm², whereas when we include thermal dies, the total area jumps to 3,100 mm². And, of course, the entire package is much larger at 4,844 mm², connected to the system with 4,468 pins.
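The tile and cache figures above can be tallied in a few lines. This is only a bookkeeping sketch using the numbers from the article; the variable names are made up for illustration.

```python
# Active tile counts as listed in the article.
active_tiles = {
    "compute": 16,
    "RAMBO cache": 8,
    "Foveros base": 2,
    "Xe-Link": 2,
    "HBM2E": 8,
    "EMIB": 11,
}
THERMAL_TILES = 16

active_total = sum(active_tiles.values())
all_tiles = active_total + THERMAL_TILES

# RAMBO cache: four banks of 3.75 MB per tile.
BANKS_PER_RAMBO_TILE = 4
MB_PER_BANK = 3.75
rambo_mb_per_tile = BANKS_PER_RAMBO_TILE * MB_PER_BANK
rambo_total_mb = rambo_mb_per_tile * active_tiles["RAMBO cache"]

print(f"Active tiles: {active_total}, including thermal tiles: {all_tiles}")
print(f"RAMBO cache: {rambo_mb_per_tile} MB per tile, {rambo_total_mb} MB total")
```

The sums give 47 active tiles, 63 tiles with the thermal tiles included, and 120 MB of RAMBO cache.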

NVIDIA GeForce RTX 3080 12 GB Edition Rumored to Launch on January 11th

During the CES 2022 keynote, we witnessed NVIDIA update its GeForce RTX 30 series family with the GeForce RTX 3050 and RTX 3090 Ti. However, this is not the end of NVIDIA's updates to the Ampere generation, as industry sources cited by Wccftech now suggest that we could see a GeForce RTX 3080 GPU with 12 GB of GDDR6X VRAM, launched as a separate product. Compared to the regular RTX 3080 that carries only 10 GB of GDDR6X, the new 12 GB version is supposed to bring a slight bump to the specification list. The GA102-220 GPU SKU found inside the 12 GB variant will feature 70 SMs with 8960 CUDA cores, 70 RT cores, and 280 TMUs.

This represents a minor improvement over the regular GA102-200 silicon inside the 10 GB model. However, the significant difference is the memory organization. With the new 12 GB model, we have a 384-bit memory bus allowing the GDDR6X modules to achieve a bandwidth of 912 GB/s while running at 19 Gbps. The overall TDP will also receive a bump to 350 Watts, compared to 320 Watts for the regular RTX 3080 model. For more information regarding final clock speeds and pricing, we have to wait for the alleged launch date of January 11th.
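The 912 GB/s figure follows from the bus width and per-pin data rate: bandwidth in GB/s is the bus width in bits times the data rate in Gbps, divided by eight bits per byte. A minimal sketch (the function name is made up for illustration):

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Return peak memory bandwidth in GB/s.

    Each pin of the bus transfers `data_rate_gbps` gigabits per second,
    and dividing by 8 converts bits to bytes.
    """
    return bus_width_bits * data_rate_gbps / 8


# Rumored RTX 3080 12 GB configuration from the article.
print(memory_bandwidth_gbs(384, 19))
```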

AMD Zen 4 "Raphael" Processors Feature Improved Thermal Sensors and Power Management

AMD is slowly preparing the launch of the latest and greatest Ryzen processor family based on the Zen 4 CPU core design. Among various things that are getting an overhaul, the Raphael processor generation is now getting revamped temperature reading and better power management circuitry. According to an Igor's Lab report, AMD has prepared a few new improvements that will make temperature reading and power management easier for PC enthusiasts. Currently, the reported CPU temperature is called Tcontrol (Tctl), which is what the cooling solution sees. If Tctl is high, the fans spin up and cool the system. If Tctl is low, the fans slow down to reduce noise.

With Raphael, the CUR_TEMP (current temperature) output part of Tctl has been upgraded to follow a much smoother curve, which avoids fan jitter because the reported value no longer spikes so suddenly. This helps reduce noise output and keeps fan speeds in the system consistent. Another note about Raphael is a new power management technique. AMD has designed the AM5 platform to avoid sudden power spikes and to maintain maximum efficiency over time. It is a design decision made from the very start, and the CPU will try to constrain itself to the TDP range that it is configured for. For more details about the circuitry, please head over to the Igor's Lab article.
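To illustrate why a smoother temperature signal means less fan jitter, here is a rough sketch using an exponential moving average. This is purely an assumption for illustration: the report does not say how AMD smooths CUR_TEMP, and the `smooth` function, the `alpha` value, and the sample readings are all made up.

```python
def smooth(readings, alpha=0.2):
    """Exponentially smooth a sequence of temperature readings.

    `alpha` controls how quickly the output follows new readings;
    smaller values damp short spikes more strongly.
    """
    smoothed = []
    current = None
    for reading in readings:
        if current is None:
            current = reading  # seed with the first reading
        else:
            current = alpha * reading + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


# Made-up readings in degrees Celsius with one brief spike.
raw = [50, 50, 85, 50, 50]
print([round(t, 1) for t in smooth(raw)])
```

A fan curve driven by the smoothed values would see only a small bump instead of a sudden 35 °C jump, so it would not need to ramp up and down abruptly.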