News Posts matching #TOP

NVIDIA Advertises "Premium AI PC" Mocking the Compute Capability of Regular AI PCs

According to a report from BenchLife, NVIDIA has started a marketing push for the "Premium AI PC," squarely aimed at the industry's latest trend, driven by Intel, AMD, and Qualcomm, of "AI PC" systems featuring a dedicated NPU for processing smaller models locally. NVIDIA approaches the category from a different angle: every PC with an RTX GPU is a "Premium AI PC," a claim that holds a lot of truth. Generally, GPUs (regardless of manufacturer) hold more computing potential than the CPU and NPU combined. By including Tensor cores in its GPUs, NVIDIA is preparing for next-generation software from application vendors and OS providers that will harness this powerful silicon and embed more functionality in the PC.

More details about Premium AI PCs and general AI PCs should emerge at the Computex event in Taiwan. In its marketing materials, NVIDIA compares AI PCs to its Premium AI PCs, which have enhanced capabilities across various applications like image/video editing and upscaling, productivity, gaming, and developer tools. Another relevant selling point is the installed base for these Premium AI PCs, which NVIDIA touts at 100 million users. Those PCs support over 500 AI applications out of the box, highlighting the importance of proper software support. NVIDIA's systems are usually more powerful, with GeForce RTX GPUs delivering anywhere from 100 to 1,300+ TOPS, compared to the roughly 40 TOPS of NPU-equipped AI PCs. How other AI PC makers plan to compete remains to be seen, but there is a high chance this battle will take the spotlight at the upcoming Computex show.

PC Market Returns to Growth in Q1 2024 with AI PCs to Drive Further 2024 Expansion

Global PC shipments grew around 3% YoY in Q1 2024 after eight consecutive quarters of declines caused by a demand slowdown and inventory correction, according to the latest data from Counterpoint Research. The growth in Q1 2024 came on a relatively low base in Q1 2023. The coming quarters of 2024 are expected to see sequential shipment growth, resulting in 3% YoY growth for the full year, driven largely by AI PC momentum, shipment recovery across different sectors, and a fresh replacement cycle.

Lenovo's PC shipments were up 8% in Q1 2024 off an easy comparison from last year. The brand reclaimed a 24% market share, up from 23% in Q1 2023. HP and Dell, with market shares of 21% and 16% respectively, remained flattish, waiting for North America to drive shipment growth in the coming quarters. Apple's shipment performance was also resilient, with 2% growth mainly supported by M3-based models.

Sony PlayStation 5 Pro Details Emerge: Faster CPU, More System Bandwidth, and Better Audio

Sony is preparing to launch its next-generation PlayStation 5 Pro console in the fall of 2024, right around the holidays. We previously covered a few graphics details about the console; today, however, we get more details about the CPU and the overall system, thanks to exclusive information from Insider Gaming. First, the sources indicate that PS5 Pro system memory will get a 28% bump in bandwidth: where the standard PS5 console has 448 GB/s, the upgraded PS5 Pro will get 576 GB/s. The memory system is reportedly also more efficient, likely thanks to an upgrade over the regular PS5's GDDR6 SDRAM. The next upgrade is the CPU, which gains special frequency modes. The CPU microarchitecture is likely unchanged, with clocks pushed to 3.85 GHz, a 10% increase over the regular PS5's 3.5 GHz.

However, this is only achieved in the "High CPU Frequency Mode," which shifts SoC power away from the GPU, downclocking it slightly to allocate more power to the CPU in highly CPU-intensive settings. The GPU itself is an RDNA 3 IP with up to 45% faster graphics rendering. Ray tracing performance can be up to four times higher than the regular PS5, while the entire GPU delivers 33.5 TeraFLOPS of FP32 single-precision compute. This comes from 30 WGPs running BVH8 shaders versus the 18 WGPs running BVH4 shaders on the regular PS5. A PSSR upscaler is present, and 8K resolution output will come with future software updates. Last but not least, on the AI front there is a custom AI accelerator capable of 300 TOPS at 8-bit INT8 and 67 TeraFLOPS at 16-bit FP16. Audio codecs are getting some love as well, with ACV running up to 35% faster.
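
As a quick sanity check, here is a minimal sketch that reproduces the quoted uplifts from the raw numbers; the dual-issue FP32 assumption used to back out a GPU clock is ours, not Insider Gaming's:

```python
# Sanity check of the leaked PS5 Pro figures against the regular PS5.
# The dual-issue assumption (4 FP32 FLOPs per shader per clock, as on
# desktop RDNA 3) is ours, not from the report.

ps5_bw, pro_bw = 448, 576            # GB/s, system memory bandwidth
print(f"Bandwidth uplift: {pro_bw / ps5_bw - 1:.1%}")    # ~28.6%

ps5_clk, pro_clk = 3.5, 3.85         # GHz, CPU clocks
print(f"CPU clock uplift: {pro_clk / ps5_clk - 1:.1%}")  # 10.0%

wgps = 30                            # work-group processors
shaders = wgps * 2 * 64              # 2 CUs per WGP, 64 ALUs per CU = 3840
fp32_flops = 33.5e12                 # quoted FP32 throughput
gpu_clk_ghz = fp32_flops / (shaders * 4) / 1e9
print(f"Implied GPU clock: {gpu_clk_ghz:.2f} GHz")       # ~2.18 GHz
```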

Groq LPU AI Inference Chip is Rivaling Major Players like NVIDIA, AMD, and Intel

AI workloads are split into two different categories: training and inference. While training requires massive compute and memory capacity, memory access speed is not a significant contributor; inference is another story. With inference, the AI model must run extremely fast to serve the end-user with as many tokens (words) as possible, giving the user answers to their prompts faster. An AI chip startup, Groq, which was in stealth mode for a long time, has been making major moves in providing ultra-fast inference speeds using its Language Processing Unit (LPU), designed for large language models (LLMs) like GPT, Llama, and Mistral. The Groq LPU is a single-core unit based on the Tensor-Streaming Processor (TSP) architecture, which achieves 750 TOPS at INT8 and 188 TeraFLOPS at FP16, with 320x320 fused dot-product matrix multiplication, in addition to 5,120 Vector ALUs.

The Groq LPU pairs massive concurrency, with 80 TB/s of bandwidth, with 230 MB of local SRAM. All of this works together to give Groq the fantastic performance that has been making waves on the internet over the past few days. Serving the Mixtral 8x7B model at 480 tokens per second, the Groq LPU provides some of the leading inference numbers in the industry. In models like Llama 2 70B with a 4,096-token context length, Groq can serve 300 tokens/s, while in the smaller Llama 2 7B with 2,048 tokens of context, the Groq LPU can output 750 tokens/s. According to the LLMPerf Leaderboard, the Groq LPU beats GPU-based cloud providers at inferencing Llama models in configurations anywhere from 7 to 70 billion parameters. In token throughput (output) and time to first token (latency), Groq leads the pack, achieving the highest throughput and second-lowest latency.
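
For a feel of what those rates mean in practice, here is a hypothetical back-of-the-envelope model of response time: total latency is time to first token plus generation time at the sustained rate. The 0.2-second TTFT placeholder and the 500-token reply length are our illustrative assumptions, not Groq's figures:

```python
# Hypothetical response-time model: total latency = time to first token
# (TTFT) plus generation time at the sustained token rate. The 0.2 s TTFT
# is an illustrative placeholder, not a Groq-published figure.

def response_time(n_tokens: int, tokens_per_s: float, ttft_s: float = 0.2) -> float:
    return ttft_s + n_tokens / tokens_per_s

# Quoted Groq throughput figures from the article
for model, tps in [("Mixtral 8x7B", 480), ("Llama 2 70B", 300), ("Llama 2 7B", 750)]:
    print(f"{model}: ~{response_time(500, tps):.2f} s for a 500-token reply")
```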

TOP500 Update Shows No Exascale Yet, Japanese Fugaku Supercomputer Still at the Top

The 58th edition of the TOP500 saw little change in the Top10. The Microsoft Azure system called Voyager-EUS2 was the only machine to shake up the top spots, claiming No. 10. Based on AMD EPYC processors with 48 cores at 2.45 GHz, working together with NVIDIA A100 GPUs with 80 GB of memory, Voyager-EUS2 also utilizes Mellanox HDR InfiniBand for data transfer.

While there were no other changes to the positions of the systems in the Top10, Perlmutter at NERSC improved its performance to 70.9 Pflop/s. Housed at the Lawrence Berkeley National Laboratory, Perlmutter's increased performance couldn't move it from its previously held No. 5 spot.

Watercool Unveils Heatkiller Multitop X2 and X3 DDC Pumps and Expansion Top

The new HEATKILLER Multitop series for Lowara/Laing DDC pumps features a unique modular design and expandability. Each top is manufactured from a solid, 30 mm thick acetal block, while the internal structure has been optimized for head and flow capacity. The series includes the SINGLETOP, the MULTITOP X2 and MULTITOP X3, and the EXPANSION-TOP. The SINGLETOP is designed to operate one DDC pump and, unlike the MULTITOPs, cannot be expanded.

The expandable MULTITOP X2 and X3 are designed to operate two and three DDC pumps in series, respectively, thereby significantly increasing the delivery head. For the MULTITOP X3, an EXPANSION TOP serves as the connection between input and output. Both the MULTITOP X2 and X3 can easily be expanded with additional EXPANSION TOPs, theoretically allowing infinite scaling.

TOP500 Expands Exaflops Capacity Amidst Low Turnover

The 56th edition of the TOP500 saw the Japanese Fugaku supercomputer solidify its number one status in a list that reflects a flattening performance growth curve. Although two new systems managed to make it into the top 10, the full list recorded the smallest number of new entries since the project began in 1993.

The entry level to the list moved up to 1.32 petaflops on the High Performance Linpack (HPL) benchmark, a small increase from 1.23 petaflops recorded in the June 2020 rankings. In a similar vein, the aggregate performance of all 500 systems grew from 2.22 exaflops in June to just 2.43 exaflops on the latest list. Likewise, average concurrency per system barely increased at all, growing from 145,363 cores six months ago to 145,465 cores in the current list.
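
The flatness of that curve is easy to quantify; a small script computing the six-month growth from the figures above:

```python
# Six-month growth between the June 2020 and November 2020 TOP500 lists,
# using the figures quoted above.
pairs = {
    "Entry level (Pflop/s)":    (1.23, 1.32),
    "Aggregate perf (Eflop/s)": (2.22, 2.43),
    "Avg. cores per system":    (145_363, 145_465),
}
for metric, (june, november) in pairs.items():
    print(f"{metric}: +{(november / june - 1) * 100:.2f}%")  # +7.32%, +9.46%, +0.07%
```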

Qualcomm Announces First Shipments of Qualcomm Cloud AI 100 Accelerator and Edge Development Kit

Qualcomm Technologies, Inc., a subsidiary of Qualcomm Incorporated, announced that the Qualcomm Cloud AI 100, a high-performance AI inference accelerator, is shipping to select worldwide customers. The Qualcomm Cloud AI 100 uses advanced signal processing and cutting-edge power efficiency to support AI solutions for multiple environments, including the datacenter, cloud edge, edge appliance, and 5G infrastructure. The newly announced Qualcomm Cloud AI 100 Edge Development Kit is engineered to accelerate adoption of edge applications by offering a complete system solution for AI processing of up to 24 simultaneous 1080p video streams, along with 5G connectivity.

"Qualcomm Technologies is well positioned to support complete edge-to-cloud high performance AI solutions that lead the industry in performance per watt," said Keith Kressin, senior vice president and general manager, computing and edge cloud, Qualcomm Technologies. "Qualcomm Cloud AI 100 is now shipping to select worldwide customers and we look forward to seeing commercial products launch in the first half of 2021."

NVIDIA Responds to Tesla's In-house Full Self-driving Hardware Development

Tesla held an investor panel in the USA yesterday (April 22), with the entire event, which focused on autonomous vehicles, also streamed on YouTube (replay here). Many things were promised in the course of the event, most of which are outside the scope of this website, but the announcement of Tesla's first full self-driving hardware module made the news in more ways than one, as reported right here on TechPowerUp. We had noted how Tesla traditionally relied on NVIDIA (and then Intel) microcontroller units, as well as NVIDIA self-driving modules, but the new in-house module steps away from the green camp in favor of more control over the feature set.

NVIDIA was quick to respond, saying Tesla's comparison was incorrect: the NVIDIA Drive Xavier at 21 TOPS was not the right point of reference; the comparison should instead have been against NVIDIA's own full self-driving hardware, the Drive AGX Pegasus, capable of 320 TOPS. Oh, and NVIDIA also claimed Tesla erroneously reported Drive Xavier's performance as 21 TOPS instead of 30 TOPS. It is interesting how quickly one company recognized itself as the unnamed competition, especially at a time when Intel, via its Mobileye division, has also been giving NVIDIA a hard time. Perhaps this is a sign of things to come: self-driving cars, and AI computing in general, are becoming too big a market to be left to third-party manufacturers, with larger companies opting for in-house hardware instead. This move does hurt NVIDIA's position in the field, as market speculation is ongoing that it may lose other customers following Tesla's departure.

U.S.A. Loses 3rd Place in TOP500 Supercomputer Standings... To Switzerland?

The United States has been pushed down the TOP500 standings for some time, courtesy of China, which took the 1st and 2nd place seats from the US with its Sunway TaihuLight and Tianhe-2 supercomputers (at Linpack performances of 93 and 33.9 petaflops, respectively). It seemed that even though the crown was stolen from America, 3rd place was relatively safe for the former champ. Not so. America has been pushed right off the podium in the latest TOP500 refresh... not by China this time, but by Switzerland.

ASUS Intros GeForce GTX 650 DirectCU Pandaren Monk Pet Bundle

ASUS GeForce GTX 650 DirectCU, DirectCU OC, and DirectCU TOP graphics cards now arrive in a special World of Warcraft: Mists of Pandaria edition. Customers get a bonus downloadable Pandaren Monk pet, coinciding with the release of the latest expansion to the globally popular MMORPG. The TOP version uses a hand-picked GPU operating 157 MHz faster than reference, and all cards keep adventuring cool and quiet with DirectCU thermal innovation. Exclusive ASUS DIGI+ VRM and Super Alloy Power contribute greater stability, overclocking headroom, and product longevity, while GPU Tweak returns to give gamers easy access to diverse card-tuning capabilities.

ASUS Announces the GeForce GTX 680 DirectCU II TOP

The ASUS GeForce GTX 680 DirectCU II TOP graphics card delivers a true flagship product for dedicated PC gamers and performance enthusiasts. The TOP-selected 28 nm NVIDIA GeForce GTX 680 GPU has been overclocked by ASUS to 1201 MHz to boost frame rates in games, offering users 143 MHz over reference. Its ASUS-designed DirectCU II cooler runs 20% cooler than stock, while the twin 100 mm fans keep noise at bay with 14 dB quieter operation.

ASUS has added 10-phase DIGI+ VRM digitally regulated power delivery with 30% noise reduction, working in tandem with durable Super Alloy Power components that last 2.5 times longer than reference. Users can tap the card's greater overclocking and overvolting capabilities through both the hardware-level VGA Hotwire and the software-level GPU Tweak utility. Also released is the ASUS GeForce GTX 680 DirectCU II OC edition, with a 1019 MHz core capable of a 1084 MHz boost clock. This card uses the same DirectCU II cooler and PCB as the TOP version.
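
For context, the quoted clocks line up with NVIDIA's stock GTX 680 specification if the 1201 MHz TOP figure is read as a boost clock; the 1058 MHz reference boost clock noted below comes from the public GTX 680 spec, not from ASUS's release:

```python
# Back out the reference clock implied by "143 MHz over reference".
top_clock = 1201        # MHz, GTX 680 DirectCU II TOP boost clock
delta = 143             # MHz over reference, per ASUS
ref_clock = top_clock - delta
print(f"Implied reference clock: {ref_clock} MHz")   # 1058 MHz, stock GTX 680 boost
print(f"Factory uplift: {delta / ref_clock:.1%}")    # ~13.5%
```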

ASUS Introduces New High-End DirectCu II Series Graphics Cards

ASUS has launched an entire range of DirectCU II-enhanced graphics cards that include the latest technologies from both AMD and NVIDIA GPU rosters. On the AMD side, ASUS offers the HD 6970 and HD 6950 graphics cards with DirectCU II while for NVIDIA, the GTX 580, GTX 570 and new GTX 560 Ti all ship with the advanced cooling technology.

Based on ASUS DirectCU, which uses copper heat pipes in direct contact with the GPU core for up to 20% cooler performance, DirectCU II adds a custom cooler that uses twin 100 mm fans for a massive 600% increase in airflow on HD 6970, HD 6950, GTX 580, and GTX 570 DirectCU II cards. ASUS has also launched the GTX 560 Ti DirectCU II TOP graphics card with dual 80 mm fans, delivering the best performance in its segment through doubled airflow.

ASUS Announces First Overclocked Radeon HD 4770 Accelerator

Barely 24 hours into the launch of the Radeon HD 4770, ASUS has announced its factory-overclocked and overclocker-friendly variant of the ATI Radeon HD 4770 (model: EAH4770 TOP/HTDI/512MD5). Giving it the "TOP" branding, the company upped the clock speeds and backed the product with its Voltage Tweak technology and its popular SmartDoctor performance-control application. The gains end-users get compared to a reference-design accelerator are two-fold.

First, ASUS set the clock speeds at 800 MHz (core) and 850 MHz (memory), both 50 MHz bumps over the reference AMD clock speeds of 750/800 MHz (core/memory). Second, using the value-added features, the GPU voltage can be raised from 0.95 V to 1.2 V, increasing the overclocking headroom. ASUS claims the core can be set at frequencies as high as 971 MHz, and the memory at 1150 MHz: a seemingly massive increment over the reference speeds, which ASUS rounds off as a 35% speed improvement. The card retains the reference AMD design and the cooler most commonly used by partners. Pricing and availability are yet to be known.
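
ASUS's "35%" figure appears to roughly average the claimed core and memory headroom; treating it that way is our reading, not ASUS's stated methodology:

```python
# Claimed overclocking headroom relative to AMD reference clocks.
ref_core, ref_mem = 750, 800      # MHz, reference HD 4770
max_core, max_mem = 971, 1150     # MHz, ASUS-claimed maximums

core_gain = max_core / ref_core - 1   # ~29.5%
mem_gain = max_mem / ref_mem - 1      # ~43.8%
print(f"Core: +{core_gain:.1%}, Memory: +{mem_gain:.1%}")
print(f"Average: +{(core_gain + mem_gain) / 2:.1%}")  # ~36.6%, near ASUS's 35%
```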

ASUS Launches EAH4890 Series with Voltage Tweak for 15% Performance Upgrades

ASUS, a world-leading producer of top-quality graphics solutions, has today introduced the ASUS EAH4890 Series, the world's first graphics cards to utilize Voltage Tweak technology. With this innovation, users can boost GPU voltages via the SmartDoctor application and enjoy up to an amazing 15% performance improvement. For users looking for a graphics card offering astounding graphical performance even at default settings, the ASUS EAH4890 TOP enables an 8% performance boost, delivering exhilarating gaming experiences.