News Posts matching #GPU

Chinese GPU Maker XCT Faces Financial Crisis and Legal Troubles

Xiangdixian Computing Technology (XCT), once hailed at its peak as China's answer to NVIDIA, is now grappling with severe financial difficulties and legal challenges. The company, which has developed its own line of GPUs based on its Tianjun chips, recently admitted that its progress in the "development of national GPU has not yet fully met the company's expectations and is facing certain market adjustment pressures." Despite producing two desktop GPU models and one workstation GPU model, XCT has been forced to address rumors of its closure. The company has undergone significant layoffs, but it claims to have retained the key research and development staff essential for GPU advancement. Adding to XCT's woes, investors have initiated legal proceedings against the company's founder, Tang Zhimin, claiming he failed to deliver on his commitment to raise 500 million yuan in Series B funding.

Among the complainants is the state-owned Jiangsu Zhongde Services Trade Industry Investment Fund, which has filed a lawsuit against three companies under Zhimin's control. Further complicating matters, Capitalonline Data Service is reportedly suing XCT for unpaid debts totaling 18.8 million yuan. There are also claims that the company's bank accounts have been frozen, potentially impeding its ability to continue operations. The situation is made worse by allegations of corruption within China's semiconductor sector, with reports of executives misappropriating investment funds. With XCT fighting for survival through restructuring efforts, its fate hangs in the balance. Unless it secures additional funding soon, the company may be forced to close its doors, dealing a blow to China's GPU ambitions.

Samsung Announces New Galaxy Book5 Pro 360

Samsung Electronics today announced the Galaxy Book5 Pro 360, a Copilot+ PC and the first in the all-new Galaxy Book5 series. Performance upgrades made possible by the Intel Core Ultra processors (Series 2) bring next-level computing power, with an NPU delivering up to 47 TOPS and more than 300 AI-accelerated features across 100+ creativity, productivity, gaming and entertainment apps. Microsoft Phone Link provides access to your Galaxy phone screen on a larger, more immersive PC display, enabling use of fan-favorite Galaxy AI features like Circle to Search with Google, Chat Assist, Live Translate and more. And with the Intel Arc GPU, graphics performance is improved by 17%. When paired with stunning features like the Dynamic AMOLED 2X display with Vision Booster and a 10-point multi-touchscreen, Galaxy Book5 Pro 360 allows creation anytime, anywhere.

"The Galaxy Book5 series brings even more cutting-edge AI experiences to Galaxy users around the world who want to enhance and simplify their everyday tasks - a vision made possible by our continued collaboration with longtime industry partners," said Dr. Hark-Sang Kim, EVP & Head of New Computing R&D Team, Mobile eXperience Business at Samsung Electronics. "As one of our most powerful PCs, Galaxy Book5 Pro 360 brings together top-tier performance with Galaxy's expansive mobile AI ecosystem for the ultimate AI PC experience."

NVIDIA Resolves "Blackwell" Yield Issues with New Photomask

During its Q2 2024 earnings call, NVIDIA confirmed that its upcoming Blackwell-based products are facing low-yield challenges. However, the company announced that it has implemented design changes to improve the production yields of its B100 and B200 processors. Despite these setbacks, NVIDIA remains optimistic about its production timeline. The tech giant plans to commence the production ramp of Blackwell GPUs in Q4 2024, with expected shipments worth several billion dollars by the end of the year. In an official statement, NVIDIA explained, "We executed a change to the Blackwell GPU mask to improve production yield." The company also reaffirmed that it had successfully sampled Blackwell GPUs with customers in the second quarter.

However, NVIDIA acknowledged that meeting demand required producing "low-yielding Blackwell material," which impacted its gross margins. During the earnings call, NVIDIA CEO Jensen Huang assured investors that B100 and B200 supply will be available, expressing confidence in the company's ability to mass-produce these chips starting in the fourth quarter. The Blackwell B100 and B200 GPUs use TSMC's CoWoS-L packaging technology and a complex design, which prompted rumors that the company was facing yield issues. Reports suggest that the initial challenges arose from mismatched thermal expansion coefficients among various components, leading to warping and system failures. The company now says a new GPU photomask resolved these problems, bringing yields back to normal levels.

ASUS Announces ESC N8-E11 AI Server with NVIDIA HGX H200

ASUS today announced the latest marvel in the groundbreaking lineup of ASUS AI servers - ESC N8-E11, featuring the intensely powerful NVIDIA HGX H200 platform. With this AI titan, ASUS has secured its first industry deal, showcasing the exceptional performance, reliability and desirability of ESC N8-E11 with HGX H200, as well as the ability of ASUS to move first and fast in creating strong, beneficial partnerships with forward-thinking organizations seeking the world's most powerful AI solutions.

Shipments of the ESC N8-E11 with NVIDIA HGX H200 are scheduled to begin in early Q4 2024, marking a new milestone in the ongoing ASUS commitment to excellence. ASUS has been actively supporting clients by assisting in the development of cooling solutions to optimize overall PUE, guaranteeing that every ESC N8-E11 unit delivers top-tier efficiency and performance - ready to power the new era of AI.

Intel Dives Deep into Lunar Lake, Xeon 6, and Gaudi 3 at Hot Chips 2024

Demonstrating the depth and breadth of its technologies at Hot Chips 2024, Intel showcased advancements across AI use cases - from the data center, cloud and network to the edge and PC - and presented the industry's most advanced and first fully integrated optical compute interconnect (OCI) chiplet for high-speed AI data processing. The company also unveiled new details about the Intel Xeon 6 SoC (code-named Granite Rapids-D), scheduled to launch during the first half of 2025.

"Across consumer and enterprise AI usages, Intel continuously delivers the platforms, systems and technologies necessary to redefine what's possible. As AI workloads intensify, Intel's broad industry experience enables us to understand what our customers need to drive innovation, creativity and ideal business outcomes. While more performant silicon and increased platform bandwidth are essential, Intel also knows that every workload has unique challenges: A system designed for the data center can no longer simply be repurposed for the edge. With proven expertise in systems architecture across the compute continuum, Intel is well-positioned to power the next generation of AI innovation." -Pere Monclus, chief technology officer, Network and Edge Group at Intel.

AMD Radeon RX 8000 "RDNA 4" GPU Spotted on Geekbench

AMD's upcoming Radeon RX 8000 "RDNA 4" GPU has been spotted on Geekbench, revealing some of its core specifications. These early benchmark appearances indicate that AMD is now testing the new GPUs internally, preparing for a launch expected next year. The leaked GPU, identified as "GFX1201", is believed to be the Navi 48 SKU - the larger of two dies planned for the RDNA 4 family.

It features 28 Compute Units in the Geekbench listing, which in this case refers to Work Group Processors (WGPs). This likely translates to 56 Compute Units, positioning it between the current RX 7700 XT (54 CU) and RX 7800 XT (60 CU) models. The clock speed is listed at 2.1 GHz, which seems low compared to current RDNA 3 GPUs that can boost to 2.5-2.6 GHz. However, this is likely due to the early nature of the samples, and we can expect higher frequencies closer to launch. Memory specifications show 16 GB of VRAM, matching current high-end models and suggesting a 256-bit bus interface. Some variants may feature 12 GB of VRAM with a 192-bit bus. While not confirmed, previous reports indicate AMD will use GDDR6 memory.
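
For readers who want to sanity-check these leak-derived figures, here is a minimal Python sketch of the arithmetic involved. It assumes the usual RDNA ratio of two Compute Units per WGP and 2 GB of GDDR6 per 32-bit memory channel; both are assumptions, since Navi 48's actual configuration is unconfirmed.

```python
# Illustrative back-of-the-envelope math for reading RDNA GPU leaks.
# The 2-CUs-per-WGP ratio and 2 GB-per-32-bit-channel figure reflect current
# RDNA 3 products; Navi 48 specifics are not confirmed.

CUS_PER_WGP = 2           # RDNA groups two Compute Units into one Work Group Processor
GB_PER_32BIT_CHANNEL = 2  # typical with 16 Gbit GDDR6 chips, one per 32-bit channel

def cus_from_wgps(wgps: int) -> int:
    """Geekbench reports WGPs as 'Compute Units'; convert to marketing CUs."""
    return wgps * CUS_PER_WGP

def likely_bus_width(vram_gb: int) -> int:
    """Estimate the memory bus width implied by a VRAM capacity."""
    return (vram_gb // GB_PER_32BIT_CHANNEL) * 32

print(cus_from_wgps(28))     # 56 CUs, between RX 7700 XT (54) and RX 7800 XT (60)
print(likely_bus_width(16))  # 256-bit for the 16 GB variant
print(likely_bus_width(12))  # 192-bit for a possible 12 GB variant
```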

Minisforum Announces New G7 Ti Mini-PC With Core i9-14900HX and RTX 4070

Minisforum is thrilled to announce the launch of the new G7 Ti Mini-PC, a marvel of engineering designed specifically for professionals in AI development, video rendering, 3D design, and AI-driven creative fields. This ultra-compact yet extraordinarily powerful system is set to revolutionize the market, offering top-tier performance in a sleek, space-saving design.

Unleashing Power with Intel Core i9-14900HX
At the heart of the G7 Ti Mini-PC lies the Intel Core i9-14900HX processor, a dynamic powerhouse that brings desktop-caliber performance to a mini-PC format. With its advanced architecture, the i9-14900HX is optimized for high-speed computing tasks and multitasking, making it an ideal choice for professionals who demand efficiency and speed in their workflow.

Arm to Dip its Fingers into Discrete GPU Game, Plans on Competing with Intel, AMD, and NVIDIA

According to a recent report from Globes, Arm, the chip design giant and maker of the Arm ISA, is reportedly developing a new discrete GPU at its Ra'anana development center in Israel. This development signals Arm's intention to compete directly with industry leaders like Intel, AMD, and NVIDIA in the massive discrete GPU market. Sources close to the matter reveal that Arm has assembled a team of approximately 100 skilled chip and software development engineers at its Israeli facility. The team is focused on creating GPUs primarily aimed at the video game market. However, industry insiders speculate that this technology could potentially be adapted for AI processing in the future, mirroring the trajectory of NVIDIA, which slowly integrated AI hardware accelerators into its lineup.

The Israeli development center is playing a crucial role in this initiative. The hardware teams oversee the development of key components for these GPUs, including the flagship Immortalis and Mali GPUs, while the software teams create interfaces for external graphics engine developers, working with both established game developers and startups. Arm is already entering the PC market through partners such as Qualcomm with its Snapdragon X chips; however, those chips use an integrated GPU, and Arm wants to offer discrete GPUs to compete in that space as well. While details are still scarce, Arm could build GPUs to accompany Arm-based Copilot+ PCs and some desktop builds. The final execution plan has yet to emerge, and it remains unclear what stage Arm's discrete GPU project has reached.

Geekbench AI Hits 1.0 Release: CPUs, GPUs, and NPUs Finally Get AI Benchmarking Solution

Primate Labs, the developer behind the popular Geekbench benchmarking suite, has launched Geekbench AI—a comprehensive benchmark tool designed to measure the artificial intelligence capabilities of various devices. Geekbench AI, previously known as Geekbench ML during its preview phase, has now reached version 1.0. The benchmark is available on multiple operating systems, including Windows, Linux, macOS, Android, and iOS, making it accessible to many users and developers. One of Geekbench AI's key features is its multifaceted approach to scoring. The benchmark utilizes three distinct precision levels: single-precision, half-precision, and quantized data. This evaluation aims to provide a more accurate representation of AI performance across different hardware designs.

In addition to speed, Geekbench AI places a strong emphasis on accuracy. The benchmark assesses how closely each test's output matches the expected results, offering insights into the trade-offs between performance and precision. The release of Geekbench AI 1.0 brings support for new frameworks, including OpenVINO, ONNX, and Qualcomm QNN, expanding its compatibility across various platforms. Primate Labs has also implemented measures to ensure fair comparisons, such as enforcing minimum runtime durations for each workload. The company noted that Samsung and NVIDIA are already using the software to measure chip performance in-house, a sign of strong early adoption. While the benchmark provides valuable insights, real-world AI applications are still limited, and reliance on a few benchmarks may paint a partial picture. Nevertheless, Geekbench AI represents a significant step forward in standardizing AI performance measurement, potentially influencing future consumer choices in the AI-driven tech market.
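
As an illustration of why the three precision levels are scored separately, the NumPy sketch below quantizes a synthetic FP32 weight tensor to FP16 and INT8 and measures the resulting drift. It is not Geekbench AI's actual methodology, just a toy example of the accuracy trade-off the benchmark tracks.

```python
# A minimal sketch (not Geekbench AI's methodology) of the precision trade-off:
# lower precision shrinks and speeds up the data but drifts from the FP32 result.
import numpy as np

rng = np.random.default_rng(0)
weights_fp32 = rng.normal(0.0, 0.5, size=10_000).astype(np.float32)

# Half precision: round each value to FP16 and back.
weights_fp16 = weights_fp32.astype(np.float16).astype(np.float32)

# INT8 quantization: map the FP32 range onto 255 integer levels, then dequantize.
scale = np.abs(weights_fp32).max() / 127.0
weights_int8 = np.round(weights_fp32 / scale).clip(-127, 127).astype(np.int8)
dequantized = weights_int8.astype(np.float32) * scale

for name, approx in [("FP16", weights_fp16), ("INT8", dequantized)]:
    err = np.abs(approx - weights_fp32).mean()
    print(f"{name}: mean absolute error vs FP32 = {err:.6f}")
```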

Huawei Reportedly Developing New Ascend 910C AI Chip to Rival NVIDIA's H100 GPU

Amidst escalating tensions in the U.S.-China semiconductor industry, Huawei is reportedly working on a new AI chip called the Ascend 910C. This development appears to be the Chinese tech giant's attempt to compete with NVIDIA's AI processors in the Chinese market. According to a Wall Street Journal report, Huawei has begun testing the Ascend 910C with various Chinese internet and telecom companies to evaluate its performance and capabilities. Notable firms such as ByteDance, Baidu, and China Mobile are said to have received samples of the chip.

Huawei has reportedly informed its clients that the Ascend 910C can match the performance of NVIDIA's H100 chip. The company has been conducting tests for several weeks, suggesting that the new processor is nearing completion. The Wall Street Journal indicates that Huawei could start shipping the chip as early as October 2024. The report also mentions that Huawei and potential customers have discussed orders for over 70,000 chips, potentially worth $2 billion.
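
For context, the reported order figures imply a rough average price per chip. The quick calculation below uses only the numbers quoted above and should not be read as a confirmed list price.

```python
# Implied per-chip price from the reported discussions: over 70,000 Ascend 910C
# units for roughly $2 billion. A rough average, not a confirmed price.
order_value_usd = 2e9
unit_count = 70_000
print(f"Implied average price per chip: ~${order_value_usd / unit_count:,.0f}")  # ~$28,571
```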

ASUS Presents Comprehensive AI Server Lineup

ASUS today announced its ambitious All in AI initiative, marking a significant leap into the server market with a complete AI infrastructure solution designed to meet the evolving demands of AI-driven applications - from edge inference and generative AI to the new wave of AI supercomputing. ASUS has proven expertise in striking the balance between hardware and software, including infrastructure and cluster architecture design, server installation, testing, onboarding, remote management and cloud services - positioning the ASUS brand and its AI server solutions to lead the way in driving innovation and enabling the widespread adoption of AI across industries.

Meeting diverse AI needs
In partnership with NVIDIA, Intel and AMD, ASUS offers comprehensive AI-infrastructure solutions with robust software platforms and services, from entry-level AI servers and machine-learning solutions to full racks and data centers for large-scale supercomputing. At the forefront is the ESC AI POD with NVIDIA GB200 NVL72, a cutting-edge rack designed to accelerate trillion-token LLM training and real-time inference operations. Complemented by the latest NVIDIA Blackwell GPUs, NVIDIA Grace CPUs and 5th Gen NVIDIA NVLink technology, ASUS servers ensure unparalleled computing power and efficiency.

SUNON Unveils a Two-Phase Liquid Cooling Solution for Advanced Workstations

In the age of AI, computing power has become a vital component for driving innovation. For most industries, using professional-grade workstations as the computing engine enables efficient computing and infinite creativity. A workstation is a multi-purpose computer that supports high-performance computing in a distributed network setting. It excels at graphics processing and task parallelism, making it suitable for a wide range of AI applications as well as common visual design tasks.

For example, a workstation can fully meet the requirements of multi-task processing, such as 3D modeling, large-scale industrial drawing, advertising rendering output, non-linear video editing, and file rendering and acceleration. It can also handle a wide range of model workloads and professional software, as well as remote system maintenance and monitoring in unsupervised settings, which has the potential to transform how these applications are deployed.

Intel Ships 0x129 Microcode Update for 13th and 14th Generation Processors with Stability Issues

Intel has officially started shipping the "0x129" microcode update for its 13th and 14th generation "Raptor Lake" and "Raptor Lake Refresh" processors. This critical update is currently being pushed to all OEM/ODM partners to address the stability issues that Intel's processors have been facing. According to Intel, the microcode update fixes "incorrect voltage requests to the processor that are causing elevated operating voltage." Intel's analysis shows that the root cause of the stability problems is excessive voltage during operation; these voltage excursions cause degradation that raises the minimum voltage required for stable operation. Intel calls this minimum "Vmin"; it is a theoretical construct rather than an actual measured voltage, analogous to the minimum speed an airplane needs to stay airborne. The 0x129 microcode patch limits the processor's voltage requests to no higher than 1.55 V, which should prevent further degradation. Overclocking is still supported; enthusiasts will have to disable the eTVB setting in their BIOS to push the processor beyond the 1.55 V threshold. The company's internal testing shows that the new default settings with limited voltages have minimal performance impact within standard run-to-run variation, with only a single game (Hitman 3: Dartmoor) showing degradation. For a full statement from Intel, see the quote below.

Intel Announces Arc A760A Automotive-grade GPU

In a strategic move to empower automakers with groundbreaking opportunities, Intel unveiled its first discrete graphics processing unit (dGPU), the Intel Arc Graphics for Automotive, at its AI Cockpit Innovation Experience event. To advance automotive AI, the product will be commercially deployed in vehicles as soon as 2025, accelerating automobile technology and unlocking a new era of AI-driven cockpit experiences and enhanced personalization for manufacturers and drivers alike.

Intel's entry into automotive discrete GPUs addresses growing demand for compute power in increasingly sophisticated vehicle cockpits. By adding the Intel Arc graphics for Automotive to its existing portfolio of AI-enhanced software-defined vehicle (SDV) system-on-chips (SoCs), Intel offers automakers an open, flexible and scalable platform solution that brings next-level, high-fidelity experiences to the vehicle.

NVIDIA's New B200A Targets OEM Customers; High-End GPU Shipments Expected to Grow 55% in 2025

Despite recent rumors speculating that NVIDIA has cancelled the B100 in favor of the B200A, TrendForce reports that NVIDIA is still on track to launch both the B100 and B200 in 2H24, targeting CSP customers. Additionally, a scaled-down B200A is planned for other enterprise clients, focusing on edge AI applications.

TrendForce reports that NVIDIA will prioritize the B100 and B200 for CSP customers with higher demand due to the tight production capacity of CoWoS-L. Shipments are expected to commence after 3Q24. In light of yield and mass production challenges with CoWoS-L, NVIDIA is also planning the B200A for other enterprise clients, utilizing CoWoS-S packaging technology.

Lossless Scaling Frame Generation Boosts Frame Rate by 4x in All PC Games, Update Arrives This Week

Lossless Scaling, an all-in-one paid gaming utility for scaling and frame generation, is set to introduce a 4x FPS mode to its Lossless Scaling Frame Generation (LSFG) technology. Officially announced on the Lossless Scaling Discord and showcased by YouTube user Vyathaen, the upcoming 4x FPS mode is expected to arrive in the utility's frame generation option this week. While YouTube videos may not fully capture the experience and benefit of this improvement, beta testers have reported that latency remains consistent with the current 2x FPS option, ensuring that most games will remain perfectly playable given a sufficiently high base framerate. For those seeking a more comprehensive demonstration, the official Lossless Scaling Discord server features a Cyberpunk 2077 video that better illustrates the capabilities of the 4x FPS mode.

The journey of Lossless Scaling has been marked by continuous innovation since its initial release. Version 2.1, launched in June, introduced a 3x FPS mode, effectively tripling framerates. It also brought performance optimizations that made the 2x FPS mode faster than in previous iterations, along with refinements for scenarios where the final frame rate surpasses the monitor's refresh rate. The software is compatible with all GPUs and PC games, including emulated titles, requiring only windowed mode and Windows 10 1903 or newer. While the LSFG frame generation technology and the LS1 upscaler are proprietary, for upscaling users can choose from many underlying options depending on their GPU, such as AMD FidelityFX Super Resolution, NVIDIA Image Scaling, Integer Scaling, Nearest Neighbor, xBR, Anime4K, Sharp Bilinear, and Bicubic CAS. Below, you can check out a YouTube video demonstrating 4x frame generation.
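
To illustrate what the frame-multiplication factors mean in practice, the small sketch below applies a simple output-equals-base-times-factor model with a refresh-rate cap; the formula is an illustrative assumption, not a description of LSFG's internals.

```python
# Rough sketch of what an LSFG-style frame multiplier means for output frame rate.
# The formula and the refresh-rate cap are illustrative assumptions.

def effective_fps(base_fps: float, factor: int, refresh_hz: float) -> float:
    """Generated frames multiply the base frame rate, but the display caps it."""
    return min(base_fps * factor, refresh_hz)

for factor in (2, 3, 4):
    print(f"{factor}x mode: 40 FPS base -> {effective_fps(40, factor, 240):.0f} FPS shown")
# Latency is still governed by the base frame rate, which is why a sufficiently
# high base framerate matters more than the multiplier itself.
```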

Apple Trained its Apple Intelligence Models on Google TPUs, Not NVIDIA GPUs

Apple has disclosed that its newly announced Apple Intelligence features were developed using Google's Tensor Processing Units (TPUs) rather than NVIDIA's widely adopted hardware accelerators like the H100. This unexpected choice was detailed in an official Apple research paper, shedding light on the company's approach to AI development. The paper outlines how systems equipped with Google's TPUv4 and TPUv5 chips played a crucial role in creating Apple Foundation Models (AFMs). These models, including AFM-server and AFM-on-device, are designed to power both online and offline Apple Intelligence features introduced at WWDC 2024. For the training of the 6.4 billion parameter AFM-server, Apple's largest language model, the company utilized an impressive array of 8,192 TPUv4 chips, provisioned as 8×1024 chip slices. The training process involved a three-stage approach, processing a total of 7.4 trillion tokens. Meanwhile, the more compact 3 billion parameter AFM-on-device model, optimized for on-device processing, was trained using 2,048 TPUv5p chips.
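
The scale of that training run is easier to grasp with a little arithmetic; the snippet below only restates the figures quoted from Apple's paper (per-chip throughput is unknown, so no training-time estimate is attempted).

```python
# Quick arithmetic on the training scale Apple describes in its paper.
# Only the figures quoted above are used.

afm_server = {
    "parameters": 6.4e9,
    "tpu_v4_chips": 8 * 1024,   # provisioned as 8 x 1024-chip slices
    "training_tokens": 7.4e12,
}
afm_on_device = {
    "parameters": 3.0e9,
    "tpu_v5p_chips": 2048,
}

tokens_per_chip = afm_server["training_tokens"] / afm_server["tpu_v4_chips"]
print(f"AFM-server: {afm_server['tpu_v4_chips']} TPUv4 chips, "
      f"~{tokens_per_chip:.2e} tokens processed per chip")
print(f"AFM-on-device: {afm_on_device['tpu_v5p_chips']} TPUv5p chips for a "
      f"{afm_on_device['parameters'] / 1e9:.0f}B-parameter model")
```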

Apple's training data came from various sources, including the Applebot web crawler and licensed high-quality datasets. The company also incorporated carefully selected code, math, and public datasets to enhance the models' capabilities. Benchmark results shared in the paper suggest that both AFM-server and AFM-on-device excel in areas such as Instruction Following, Tool Use, and Writing, positioning Apple as a strong contender in the AI race despite its relatively late entry. However, Apple's route into the AI market is more complex than that of other AI competitors. Given Apple's massive user base and the millions of devices compatible with Apple Intelligence, the AFMs have the potential to change how users interact with their devices, especially for everyday tasks, so refining the models for these tasks is critical before mass deployment. Another unexpected aspect is the transparency from Apple, a company typically known for its secrecy. The AI boom is changing some of Apple's ways, and getting a look at these inner workings is always interesting.

NVIDIA Plans RTX 3050 A with Ada Lovelace AD106 Silicon

NVIDIA may be working on a new RTX 3050 A laptop GPU using an AD106 (Ada Lovelace) die, moving away from the Ampere chips used in other RTX 30-series GPUs. While not officially announced, the GPU is included in NVIDIA's latest driver release and the PCI ID database as GeForce RTX 3050 A Laptop GPU. The AD106 die choice is notable, as it has more transistors and CUDA cores than the GA107 in current RTX 3050s and the AD107 in RTX 4050 laptops. The AD106, used in RTX 4060 Ti desktop and RTX 4070 laptop GPUs, boasts 22.9 billion transistors and 4,608 CUDA cores, compared to GA107's 8.7 billion transistors and 2,560 CUDA cores, and AD107's 18.9 billion transistors and 3,072 CUDA cores.

While this could potentially improve performance, NVIDIA will likely use a cut-down version of the AD106 chip for the RTX 3050 A. The exact specifications and features, such as support for DLSS 3, remain unknown. The use of TSMC's 4N node in AD106, instead of the Samsung 8N node used for Ampere, could improve power efficiency and battery life. How the RTX 3050 A performs against existing RTX 3050 and RTX 4050 laptops remains to be seen; however, it will likely land close to existing Ampere-based parts, as NVIDIA tends to use similar names for comparable performance levels. It's unclear whether NVIDIA will bring this GPU to market, but adding new SKUs late in a product's lifespan isn't unprecedented.
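
For a side-by-side view, the snippet below collects the die figures mentioned above; note that these are full-die specifications, and the RTX 3050 A would almost certainly ship with a cut-down AD106 configuration.

```python
# Die specifications mentioned above, gathered for comparison. Full-die figures;
# any RTX 3050 A would likely use a cut-down AD106 configuration.
dies = {
    "GA107 (current RTX 3050 laptop)": {"transistors_b": 8.7,  "cuda_cores": 2560, "node": "Samsung 8N"},
    "AD107 (RTX 4050 laptop)":         {"transistors_b": 18.9, "cuda_cores": 3072, "node": "TSMC 4N"},
    "AD106 (RTX 3050 A candidate)":    {"transistors_b": 22.9, "cuda_cores": 4608, "node": "TSMC 4N"},
}
for die, spec in dies.items():
    print(f"{die}: {spec['transistors_b']} B transistors, "
          f"{spec['cuda_cores']} CUDA cores, {spec['node']}")
```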

Samsung Electro-Mechanics Collaborates with AMD to Supply High-Performance Substrates for Hyperscale Data Center Computing

Samsung Electro-Mechanics (SEMCO) today announced a collaboration with AMD to supply high-performance substrates for hyperscale data center compute applications. These substrates are made at SEMCO's key technology hub in Busan and its newly built, state-of-the-art factory in Vietnam. Market research firm Prismark predicts that the semiconductor substrate market will grow at an average annual rate of about 7%, increasing from 15.2 trillion KRW in 2024 to 20 trillion KRW in 2028. SEMCO's substantial investment of 1.9 trillion KRW in the FCBGA factory underscores its commitment to advancing substrate technology and manufacturing capabilities to meet the highest industry standards and future technology needs.
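
A quick sanity check shows the quoted forecast is internally consistent; the calculation below is straightforward compound-growth arithmetic on Prismark's numbers.

```python
# Sanity-checking the quoted substrate market forecast: ~7% average annual
# growth from 15.2 trillion KRW in 2024 to 20 trillion KRW in 2028.
start, end, years = 15.2, 20.0, 4
cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR 2024-2028: {cagr * 100:.1f}%")  # ~7.1%, matching the ~7% figure
```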

SEMCO's collaboration with AMD focuses on meeting the unique challenges of integrating multiple semiconductor chips (Chiplets) on a single large substrate. These high-performance substrates, essential for CPU/GPU applications, offer significantly larger surface areas and higher layer counts, providing the dense interconnections required for today's advanced data centers. Compared to standard computer substrates, data center substrates are ten times larger and feature three times more layers, ensuring efficient power delivery and lossless signal integrity between chips. Addressing these challenges, SEMCO's innovative manufacturing processes mitigate issues like warpage to ensure high yields during chip mounting.

NVIDIA Shifts Gears: Open-Source Linux GPU Drivers Take Center Stage

Just a few months after hiring Ben Skeggs, a lead maintainer of the open-source NVIDIA GPU driver for the Linux kernel, NVIDIA has announced a complete transition to open-source GPU kernel modules in its upcoming R560 driver release for Linux. This decision comes two years after the company's initial foray into open-source territory with the R515 driver in May 2022. That effort initially focused on data center compute GPUs, while GeForce and workstation GPU support remained in alpha. Now, after extensive development and optimization, NVIDIA reports that its open-source modules have achieved performance parity with, and in some cases surpassed, their closed-source counterparts. The transition brings a host of new capabilities, including heterogeneous memory management support, confidential computing features, and compatibility with the coherent memory architectures of NVIDIA's Grace platform.

The move to open source is expected to foster greater collaboration within the Linux ecosystem and potentially lead to faster bug fixes and feature improvements. However, not all GPUs will be compatible with the new open-source modules. While cutting-edge platforms like NVIDIA Grace Hopper and Blackwell will require the open-source drivers, older GPUs from the Maxwell, Pascal, or Volta architectures must stick with the proprietary drivers. NVIDIA has developed a detection helper script to guide driver selection for users who are unsure about compatibility. The shift also brings changes to NVIDIA's installation processes: the default driver version for most installation methods will now be the open-source variant. This affects package managers using the CUDA meta package, run-file installations, and even the Windows Subsystem for Linux.
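
To make the compatibility rules concrete, here is a small illustrative Python sketch of the selection logic described above. It is emphatically not NVIDIA's detection helper script, just a restatement of the architecture-based rules in code.

```python
# Illustrative only: NOT NVIDIA's detection helper script, just a sketch of the
# selection rule described above, keyed on GPU architecture names.
OPEN_REQUIRED = {"Grace Hopper", "Blackwell"}      # require open kernel modules
PROPRIETARY_ONLY = {"Maxwell", "Pascal", "Volta"}  # must stay on the closed driver

def recommended_driver(architecture: str) -> str:
    if architecture in OPEN_REQUIRED:
        return "open-source kernel modules (required)"
    if architecture in PROPRIETARY_ONLY:
        return "proprietary kernel modules"
    return "open-source kernel modules (default from R560 onward)"

for arch in ("Pascal", "Turing", "Ada Lovelace", "Blackwell"):
    print(f"{arch}: {recommended_driver(arch)}")
```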

Gigabyte AI TOP Utility Reinventing Your Local AI Fine-tuning

GIGABYTE TECHNOLOGY Co. Ltd, a leading manufacturer of motherboards, graphics cards, and hardware solutions, has released its exclusive AI TOP Utility. With reinvented workflows, a user-friendly interface, and real-time progress monitoring, AI TOP Utility reinvents local AI model training and fine-tuning. It features a variety of groundbreaking technologies that can be easily adopted by beginners and experts alike, supports most common open-source LLMs, and can be used anywhere - even on your desk.

GIGABYTE AI TOP is an all-round solution for local AI model fine-tuning. Running AI training and fine-tuning locally on sensitive data can provide greater privacy and security, along with maximum flexibility and real-time adjustment. By pairing GIGABYTE AI TOP hardware with the AI TOP Utility, the common constraint of insufficient GPU VRAM when fine-tuning locally can be addressed. With GIGABYTE AI TOP series motherboards, PSUs, and SSDs, as well as a graphics card lineup covering the NVIDIA GeForce RTX 40 Series, AMD Radeon RX 7900 Series, and Radeon Pro W7900 and W7800 series, open-source LLM fine-tuning can now scale up to 236B parameters and beyond.

Global AI Server Demand Surge Expected to Drive 2024 Market Value to US$187 Billion; Represents 65% of Server Market

TrendForce's latest industry report on AI servers reveals that high demand for advanced AI servers from major CSPs and brand clients is expected to continue in 2024. Meanwhile, TSMC, SK hynix, Samsung, and Micron's gradual production expansion has significantly eased shortages in 2Q24. Consequently, the lead time for NVIDIA's flagship H100 solution has decreased from the previous 40-50 weeks to less than 16 weeks.

TrendForce estimates that AI server shipments in the second quarter will increase by nearly 20% QoQ, and has revised the annual shipment forecast up to 1.67 million units—marking a 41.5% YoY growth.
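
Working backward from those figures gives the prior-year baseline the forecast implies; the calculation below uses only the numbers TrendForce quotes.

```python
# Back-calculating the prior-year base implied by TrendForce's revised forecast:
# 1.67 million AI servers in 2024 at 41.5% year-over-year growth.
units_2024, yoy_growth = 1.67e6, 0.415
implied_2023 = units_2024 / (1 + yoy_growth)
print(f"Implied 2023 shipments: ~{implied_2023 / 1e6:.2f} million units")  # ~1.18 million
```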

Intel Core Ultra 300 Series "Panther Lake" Leaks: 16 CPU Cores, 12 Xe3 GPU Cores, and Five-Tile Package

Intel is preparing to launch its next generation of mobile CPUs with the Core Ultra 200 series "Lunar Lake" leading the charge. However, as these processors are about to hit the market, leakers have revealed Intel's plans for the next-generation Core Ultra 300 series "Panther Lake". According to rumors, Panther Lake will double the core count of Lunar Lake, which capped out at eight cores. Several configurations of Panther Lake are in the making, based on different combinations of performance (P) "Cougar Cove," efficiency (E) "Skymont," and low power (LP) cores. First is PTL-U, with 4P+0E+4LP cores and four Xe3 "Celestial" GPU cores, delivered within a 15 W envelope. Next is the PTL-H variant, with 4P+8E+4LP cores for a total of 16 cores and four Xe3 GPU cores inside a 25 W package. Last but not least, Intel will also make PTL-P SKUs with 4P+8E+4LP cores and 12 Xe3 cores, creating a potentially decent gaming chip at 25 W.
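
The rumored configurations are easier to compare side by side; the snippet below simply tabulates the leaked (and unconfirmed) SKU details listed above.

```python
# The leaked Panther Lake configurations as reported above (unconfirmed by Intel).
configs = {
    "PTL-U": {"p": 4, "e": 0, "lp": 4, "xe3_cores": 4,  "tdp_w": 15},
    "PTL-H": {"p": 4, "e": 8, "lp": 4, "xe3_cores": 4,  "tdp_w": 25},
    "PTL-P": {"p": 4, "e": 8, "lp": 4, "xe3_cores": 12, "tdp_w": 25},
}
for sku, c in configs.items():
    total = c["p"] + c["e"] + c["lp"]
    print(f"{sku}: {c['p']}P+{c['e']}E+{c['lp']}LP = {total} cores, "
          f"{c['xe3_cores']} Xe3 cores, {c['tdp_w']} W")
```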

Intel's Panther Lake CPU architecture uses a multi-tile design. The processor incorporates five distinct tiles, three of which play active roles. Central compute operations are handled by the "Die 4" tile containing the CPU and NPU, while "Die 1" is dedicated to platform control (PCD). Graphics processing is managed by "Die 5", leveraging Intel's Xe3 technology. Interestingly, the remaining two tiles serve a primarily structural purpose: these passive elements are strategically placed to achieve a balanced, rectangular form factor for the chip, echoing a similar strategy employed in Intel's Lunar Lake processors. Panther Lake is poised to offer greater versatility than its Lunar Lake counterpart and is expected to cater to a wider range of market segments and use cases. One notable advancement is the potential for increased memory capacity compared to Lunar Lake, which capped out at 32 GB of LPDDR5X memory running at 8533 MT/s. We may hear more at Intel's upcoming Innovation event in September, while general availability of Panther Lake is expected in late 2025 or early 2026.

NVIDIA GeForce RTX 50 Series "Blackwell" TDPs Leaked, All Powered by 16-Pin Connector

In the run-up to NVIDIA's upcoming GeForce RTX 50 series of GPUs, codenamed "Blackwell," one power supply manufacturer accidentally leaked the power configurations of all SKUs. Seasonic operates a power supply wattage calculator that lets users configure their systems online and get power supply recommendations, and its database is routinely filled with CPU and GPU SKUs to cover the massive variety of components. This time it lists the upcoming GeForce RTX 50 series, from the RTX 5050 all the way up to the top RTX 5090. Starting with the GeForce RTX 5050, this SKU is expected to carry a 100 W TDP. Its bigger brother, the RTX 5060, bumps the TDP to 170 W, 55 W higher than the previous-generation "Ada Lovelace" RTX 4060.

The GeForce RTX 5070, with a 220 W TDP, sits in the middle of the stack, a 20 W increase over its Ada counterpart. For the higher-end SKUs, NVIDIA has prepared the GeForce RTX 5080 and RTX 5090, with 350 W and 500 W TDPs, respectively - increases of 30 W for the RTX 5080 and 50 W for the RTX 5090 over the Ada generation. Interestingly, NVIDIA this time wants to unify power delivery across the entire family with a 16-pin 12V-2x6 connector, updated to the PCIe 6.0 CEM specification. The increase in power requirements across the "Blackwell" SKUs is notable, and we are eager to see whether the performance gains are enough to preserve efficiency.
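
Putting the leaked figures together, the snippet below lists each SKU's rumored TDP alongside the stated increase over its Ada counterpart, where one was given; all values come from the calculator leak and remain unconfirmed.

```python
# The leaked "Blackwell" TDPs and the increases over Ada quoted above.
# All figures come from the calculator leak and are unconfirmed by NVIDIA.
leaked_tdps_w = {"RTX 5050": 100, "RTX 5060": 170, "RTX 5070": 220,
                 "RTX 5080": 350, "RTX 5090": 500}
increase_over_ada_w = {"RTX 5060": 55, "RTX 5070": 20, "RTX 5080": 30, "RTX 5090": 50}

for sku, tdp in leaked_tdps_w.items():
    delta = increase_over_ada_w.get(sku)
    note = f"(+{delta} W vs. Ada counterpart)" if delta else "(no Ada comparison given)"
    print(f"{sku}: {tdp} W {note}")
```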

2.1 Billion Pixels in Las Vegas Sphere are Powered by 150 NVIDIA RTX A6000 GPUs

Late last year, the city of Las Vegas added another attraction: the Sphere, a venue famous for its massive size, its 1.2-million-pixel outdoor display, and its 18,600-seat auditorium. The auditorium space is a feat of its own, with features like a 16K x 16K wraparound interior LED screen, speakers with beamforming and wave field synthesis technologies, and 4D physical effects. We have recently learned that NVIDIA GPUs power the Sphere - and not just a handful of them: 150 NVIDIA RTX A6000 cards drive the 1.2 million outside pixels spread across 54,000 m², as well as 16 16K inner displays with a total output of 2.1 billion pixels. Interestingly, the 150 RTX A6000 cards together provide 600 DisplayPort 1.4a outputs.

With each card carrying 48 GB of memory, that adds up to 7.2 TB of GDDR6 ECC memory across the entire system. Given that the Sphere is a $2.3 billion project, it is expected to have an infotainment system capable of driving the massive venue, and it certainly delivers on that. Most large media installations are powered by only a handful of cards, so a deployment at this scale is something we are seeing for the first time outside of AI processing. The only comparable scale today is the massive thousand-GPU clusters used for AI, so seeing such a different and interesting application is refreshing.
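
The aggregate numbers are simple to derive from the figures above; the short calculation below confirms the memory total and the per-card output count.

```python
# The aggregate numbers behind the Sphere installation, derived from the figures
# above: 150 RTX A6000 cards, 48 GB each, 600 DisplayPort 1.4a outputs in total.
cards = 150
vram_gb_per_card = 48
displayport_outputs = 600

total_vram_tb = cards * vram_gb_per_card / 1000
outputs_per_card = displayport_outputs / cards
print(f"Total GDDR6 ECC memory: {total_vram_tb:.1f} TB")             # 7.2 TB
print(f"DisplayPort 1.4a outputs per card: {outputs_per_card:.0f}")  # 4 per card
```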