News Posts matching #HBM3

Return to Keyword Browsing

SK hynix Showcases AI-Driven Innovations for a Sustainable Tomorrow at CES 2025

SK hynix has returned to Las Vegas for Consumer Electronics Show (CES) 2025, showcasing its latest AI memory innovations reshaping the industry. Held from January 7-10, CES 2025 brings together the brightest minds and groundbreaking technologies from the world's leading tech companies. This year, the event's theme is "Dive In," inviting attendees to immerse themselves in the next wave of technological advancement. SK hynix is emphasizing how it is driving this wave through a display of leading AI memory technologies at the SK Group exhibit. Along with SK Telecom, SKC, and SK Enmove, the company is highlighting how the Group's AI infrastructure brings about true change under the theme "Innovative AI, Sustainable Tomorrow."

Groundbreaking Memory Tech Driving Change in the AI Era
Visitors enter SK Group's exhibit through the Innovation Gate, greeted by a video of dynamic wave-inspired visuals which symbolize the power of AI. The video shows the transformation of binary data into a wave which flows through the exhibition, highlighting how data and AI drives change across industries. Continuing deeper into the exhibit, attendees make their way into the AI Data Center area, the focal point of SK hynix's display. This area features the company's transformative memory products driving progress in the AI era. Among the cutting-edge AI memory technologies on display are SK hynix's HBM, server DRAM, eSSD, CXL, and PIM products.

NVIDIA GB300 "Blackwell Ultra" Will Feature 288 GB HBM3E Memory, 1400 W TDP

NVIDIA "Blackwell" series is barely out with B100, B200, and GB200 chips shipping to OEMs and hyperscalers, but the company is already setting in its upgraded "Blackwell Ultra" plans with its upcoming GB300 AI server. According to UDN, the next generation NVIDIA system will be powered by the B300 GPU chip, operating at 1400 W and delivering a remarkable 1.5x improvement in FP4 performance per card compared to its B200 predecessor. One of the most notable upgrades is the memory configuration, with each GPU now sporting 288 GB of HBM3e memory, a substantial increase from the previous 192 GB of GB200. The new design implements a 12-layer stack architecture, advancing from the GB200's 8-layer configuration. The system's cooling infrastructure has been completely reimagined, incorporating advanced water cooling plates and enhanced quick disconnects in the liquid cooling system.

Networking capabilities have also seen a substantial upgrade, with the implementation of ConnectX 8 network cards replacing the previous ConnectX 7 generation, while optical modules have been upgraded from 800G to 1.6T, ensuring faster data transmission. Regarding power management and reliability, the GB300 NVL72 cabinet will standardize capacitor tray implementation, with an optional Battery Backup Unit (BBU) system. Each BBU module costs approximately $300 to manufacture, with a complete GB300 system's BBU configuration totaling around $1,500. The system's supercapacitor requirements are equally substantial, with each NVL72 rack requiring over 300 units, priced between $20-25 per unit during production due to its high-power nature. The GB300, carrying Grace CPU and Blackwell Ultra GPU, also introduces the implementation of LPCAMM on its computing boards, indicating that the LPCAMM memory standard is about to take over servers, not just laptops and desktops. We have to wait for the official launch before seeing LPCAMM memory configurations.

U.S. Unveils Massive Export Restrictions on China's Chip Industry Targeting 140 Firms

The Biden administration is rolling out a third major export control package aimed at China's semiconductor industry, as per a report from Reuters. Estimated to affect 140 companies, including China's chip equipment maker Naura Technology Group, Piotek, and Huawei Technologies, the effort aims to limit China's access to advanced chip making technology. In particular, technology that could be used in military products and artificial intelligence. Important sanctions include export controls to specific chip equipment manufacturers, blocking the delivery of high-performance memory chips and the addition of several semiconductor investment companies to the list of export-restricted entities.

The package expands U.S. regulatory authority through foreign direct product rules. It regulates chip manufacturing equipment manufactured around the world with U.S. technology, Japan and the Netherlands are exempt. However, the rules could have an impact on manufacturers outside U.S. such as those based in Israel, Malaysia, Singapore, South Korea, Taiwan and non-U.S. firms (i.e. ASML) due to the complexity of the technological and supply chain. This continues the Biden administration's strategy to limit China's semiconductor capabilities and comes just weeks before the Trump administration made changes. When asked about US new restrictions Chinese Foreign Ministry spokesperson Lin Jian said at a regular press conference on Monday that such behavior undermines the international economic and trade system, and disrupts global supply chains. China will take measures to protect companies' rights and interests.

Server DRAM and HBM Boost 3Q24 DRAM Industry Revenue by 13.6% QoQ

TrendForce's latest investigations reveal that the global DRAM industry revenue reached US$26.02 billion in 3Q24, marking a 13.6% QoQ increase. The rise was driven by growing demand for DDR5 and HBM in data centers, despite a decline in LPDDR4 and DDR4 shipments due to inventory reduction by Chinese smartphone brands and capacity expansion by Chinese DRAM suppliers. ASPs continued their upward trend from the previous quarter, with contract prices rising by 8% to 13%, further supported by HBM's displacement of conventional DRAM production.

Looking ahead to 4Q24, TrendForce projects a QoQ increase in overall DRAM bit shipments. However, the capacity constraints caused by HBM production are expected to have a weaker-than-anticipated impact on pricing. Additionally, capacity expansions by Chinese suppliers may prompt PC OEMs and smartphone brands to aggressively deplete inventory to secure lower-priced DRAM products. As a result, contract prices for conventional DRAM and blended prices for conventional DRAM and HBM are expected to decline.

AMD Custom Makes CPUs for Azure: 88 "Zen 4" Cores and HBM3 Memory

Microsoft has announced its new Azure HBv5 virtual machines, featuring unique custom hardware made by AMD. CEO Satya Nadella made the announcement during Microsoft Ignite, introducing a custom-designed AMD processor solution that achieves remarkable performance metrics. The new HBv5 virtual machines deliver an extraordinary 6.9 TB/s of memory bandwidth, utilizing four specialized AMD processors equipped with HBM3 technology. This represents an eightfold improvement over existing cloud alternatives and a staggering 20-fold increase compared to previous Azure HBv3 configurations. Each HBv5 virtual machine boasts impressive specifications, including up to 352 AMD EPYC "Zen4" CPU cores capable of reaching 4 GHz peak frequencies. The system provides users with 400-450 GB of HBM3 RAM and features doubled Infinity Fabric bandwidth compared to any previous AMD EPYC server platform. Given that each VM had four CPUs, this yields 88 Zen 4 cores per CPU socket, with 9 GB of memory per core.

The architecture includes 800 Gb/s of NVIDIA Quantum-2 InfiniBand connectivity and 14 TB of local NVMe SSD storage. The development marks a strategic shift in addressing memory performance limitations, which Microsoft identifies as a critical bottleneck in HPC applications. This custom design particularly benefits sectors requiring intensive computational resources, including automotive design, aerospace simulation, weather modeling, and energy research. While the CPU appears custom-designed for Microsoft's needs, it bears similarities to previously rumored AMD processors, suggesting a possible connection to the speculated MI300C chip architecture. The system's design choices, including disabled SMT and single-tenant configuration, clearly focus on optimizing performance for specific HPC workloads. If readers can recall, Intel also made customized Xeons for AWS and their needs, which is normal in the hyperscaler space, given they drive most of the revenue.

NVIDIA B200 "Blackwell" Records 2.2x Performance Improvement Over its "Hopper" Predecessor

We know that NVIDIA's latest "Blackwell" GPUs are fast, but how much faster are they over the previous generation "Hopper"? Thanks to the latest MLPerf Training v4.1 results, NVIDIA's HGX B200 Blackwell platform has demonstrated massive performance gains, measuring up to 2.2x improvement per GPU compared to its HGX H200 Hopper. The latest results, verified by MLCommons, reveal impressive achievements in large language model (LLM) training. The Blackwell architecture, featuring HBM3e high-bandwidth memory and fifth-generation NVLink interconnect technology, achieved double the performance per GPU for GPT-3 pre-training and a 2.2x boost for Llama 2 70B fine-tuning compared to the previous Hopper generation. Each benchmark system incorporated eight Blackwell GPUs operating at a 1,000 W TDP, connected via NVLink Switch for scale-up.

The network infrastructure utilized NVIDIA ConnectX-7 SuperNICs and Quantum-2 InfiniBand switches, enabling high-speed node-to-node communication for distributed training workloads. While previous Hopper-based systems required 256 GPUs to optimize performance for the GPT-3 175B benchmark, Blackwell accomplished the same task with just 64 GPUs, leveraging its larger HBM3e memory capacity and bandwidth. One thing to look out for is the upcoming GB200 NVL72 system, which promises even more significant gains past the 2.2x. It features expanded NVLink domains, higher memory bandwidth, and tight integration with NVIDIA Grace CPUs, complemented by ConnectX-8 SuperNIC and Quantum-X800 switch technologies. With faster switching and better data movement with Grace-Blackwell integration, we could see even more software optimization from NVIDIA to push the performance envelope.

Samsung Electronics Announces Results for Third Quarter of 2024, 7 Percent Revenue Increase

Samsung Electronics today reported financial results for the third quarter ended Sept. 30, 2024. The Company posted KRW 79.1 trillion in consolidated revenue, an increase of 7% from the previous quarter, on the back of the launch effects of new smartphone models and increased sales of high-end memory products. Operating profit declined to KRW 9.18 trillion, largely due to one-off costs, including the provision of incentives in the Device Solutions (DS) Division. The strength of the Korean won against the U.S. dollar resulted in a negative impact on company-wide operating profit of about KRW 0.5 trillion compared to the previous quarter.

In the fourth quarter, while memory demand for mobile and PC may encounter softness, growth in AI will keep demand at robust levels. Against this backdrop, the Company will concentrate on driving sales of High Bandwidth Memory (HBM) and high-density products. The Foundry Business aims to increase order volumes by enhancing advanced process technologies. Samsung Display Corporation (SDC) expects the demand of flagship products from major customers to continue, while maintaining a quite conservative outlook on its performance. The Device eXperience (DX) Division will continue to focus on premium products, but sales are expected to decline slightly compared to the previous quarter.

HBM5 20hi Stack to Adopt Hybrid Bonding Technology, Potentially Transforming Business Models

TrendForce reports that the focus on HBM products in the DRAM industry is increasingly turning attention toward advanced packaging technologies like hybrid bonding. Major HBM manufacturers are considering whether to adopt hybrid bonding for HBM4 16hi stack products but have confirmed plans to implement this technology in the HBM5 20hi stack generation.

Hybrid bonding offers several advantages when compared to the more widely used micro-bumping. Since it does not require bumps, it allows for more stacked layers and can accommodate thicker chips that help address warpage. Hybrid-bonded chips also benefit from faster data transmission and improved heat dissipation.

Micron Updates Corporate Logo with "Ahead of The Curve" Design

Today, Micron updated its corporate logo with new symbolism. The redesign comes as Micron celebrates over four decades of technological advancement in the semiconductor industry. The new logo features a distinctive silicon color, paying homage to the wafers at the core of Micron's products. Its curved lettering represents the company's ability to stay ahead of industry trends and adapt to rapid technological changes. The design also incorporates vibrant gradient colors inspired by light reflections on wafers, which are the core of Mircorn's memory and storage products.

This rebranding effort coincides with Micron's expanding role in AI, where memory and storage innovations are increasingly crucial. The company has positioned itself beyond a commodity memory supplier, now offering leadership in solutions for AI data centers, high-performance computing, and AI-enabled devices. The company has come far from its original 64K DRAM in 1981 to HBM3E DRAM today. Micron offers different HBM memory products, graphics memory powering consumer GPUs, CXL memory modules, and DRAM components and modules.

NVIDIA Cancels Dual-Rack NVL36x2 in Favor of Single-Rack NVL72 Compute Monster

NVIDIA has reportedly discontinued its dual-rack GB200 NVL36x2 GPU model, opting to focus on the single-rack GB200 NVL72 and NVL36 models. This shift, revealed by industry analyst Ming-Chi Kuo, aims to simplify NVIDIA's offerings in the AI and HPC markets. The decision was influenced by major clients like Microsoft, who prefer the NVL72's improved space efficiency and potential for enhanced inference performance. While both models perform similarly in AI large language model (LLM) training, the NVL72 is expected to excel in non-parallelizable inference tasks. As a reminder, the NVL72 features 36 Grace CPUs, delivering 2,592 Arm Neoverse V2 cores with 17 TB LPDDR5X memory with 18.4 TB/s aggregate bandwidth. Additionally, it includes 72 Blackwell GB200 SXM GPUs that have a massive 13.5 TB of HBM3e combined, running at 576 TB/s aggregate bandwidth.

However, this shift presents significant challenges. The NVL72's power consumption of around 120kW far exceeds typical data center capabilities, potentially limiting its immediate widespread adoption. The discontinuation of the NVL36x2 has also sparked concerns about NVIDIA's execution capabilities and may disrupt the supply chain for assembly and cooling solutions. Despite these hurdles, industry experts view this as a pragmatic approach to product planning in the dynamic AI landscape. While some customers may be disappointed by the dual-rack model's cancellation, NVIDIA's long-term outlook in the AI technology market remains strong. The company continues to work with clients and listen to their needs, to position itself as a leader in high-performance computing solutions.

Micron Announces 12-high HBM3E Memory, Bringing 36 GB Capacity and 1.2 TB/s Bandwidth

As AI workloads continue to evolve and expand, memory bandwidth and capacity are increasingly critical for system performance. The latest GPUs in the industry need the highest performance high bandwidth memory (HBM), significant memory capacity, as well as improved power efficiency. Micron is at the forefront of memory innovation to meet these needs and is now shipping production-capable HBM3E 12-high to key industry partners for qualification across the AI ecosystem.

Micron's industry-leading HBM3E 12-high 36 GB delivers significantly lower power consumption than our competitors' 8-high 24 GB offerings, despite having 50% more DRAM capacity in the package
Micron HBM3E 12-high boasts an impressive 36 GB capacity, a 50% increase over current HBM3E 8-high offerings, allowing larger AI models like Llama 2 with 70 billion parameters to run on a single processor. This capacity increase allows faster time to insight by avoiding CPU offload and GPU-GPU communication delays. Micron HBM3E 12-high 36 GB delivers significantly lower power consumption than the competitors' HBM3E 8-high 24 GB solutions. Micron HBM3E 12-high 36 GB offers more than 1.2 terabytes per second (TB/s) of memory bandwidth at a pin speed greater than 9.2 gigabits per second (Gb/s). These combined advantages of Micron HBM3E offer maximum throughput with the lowest power consumption can ensure optimal outcomes for power-hungry data centers. Additionally, Micron HBM3E 12-high incorporates fully programmable MBIST that can run system representative traffic at full spec speed, providing improved test coverage for expedited validation and enabling faster time to market and enhancing system reliability.

Spot Market for Memory Struggles in First Half of 2024; Price Challenges Loom in Second Half

TrendForce reports that memory module makers have been aggressively increasing their DRAM inventories since 3Q23, with inventory levels rising to 11-17 weeks by 2Q24. However, demand for consumer electronics has not rebounded as expected. For instance, smartphone inventories in China have reached excessive levels, and notebook purchases have been delayed as consumers await new AI-powered PCs, leading to continued market contraction.

This has led to a weakening in spot prices for memory products primarily used in consumer electronics, with Q2 prices dropping over 30% compared to Q1. Although spot prices remained disconnected from contract prices through August, this divergence may signal potential future trends for contract pricing.

AMD MI300X Accelerators are Competitive with NVIDIA H100, Crunch MLPerf Inference v4.1

The MLCommons consortium on Wednesday posted MLPerf Inference v4.1 benchmark results for popular AI inferencing accelerators available in the market, across brands that include NVIDIA, AMD, and Intel. AMD's Instinct MI300X accelerators emerged competitive to NVIDIA's "Hopper" H100 series AI GPUs. AMD also used the opportunity to showcase the kind of AI inferencing performance uplifts customers can expect from its next-generation EPYC "Turin" server processors powering these MI300X machines. "Turin" features "Zen 5" CPU cores, sporting a 512-bit FPU datapath, and improved performance in AI-relevant 512-bit SIMD instruction-sets, such as AVX-512, and VNNI. The MI300X, on the other hand, banks on the strengths of its memory sub-system, FP8 data format support, and efficient KV cache management.

The MLPerf Inference v4.1 benchmark focused on the 70 billion-parameter LLaMA2-70B model. AMD's submissions included machines featuring the Instinct MI300X, powered by the current EPYC "Genoa" (Zen 4), and next-gen EPYC "Turin" (Zen 5). The GPUs are backed by AMD's ROCm open-source software stack. The benchmark evaluated inference performance using 24,576 Q&A samples from the OpenORCA dataset, with each sample containing up to 1024 input and output tokens. Two scenarios were assessed: the offline scenario, focusing on batch processing to maximize throughput in tokens per second, and the server scenario, which simulates real-time queries with strict latency limits (TTFT ≤ 2 seconds, TPOT ≤ 200 ms). This lets you see the chip's mettle in both high-throughput and low-latency queries.

GIGABYTE Introduces Accelerated Computing Servers With NVIDIA HGX H200

Giga Computing, a subsidiary of GIGABYTE and an industry leader in generative AI servers and advanced cooling technologies, today added two new 8-GPU baseboard servers to the GIGABYTE G593 series that support the NVIDIA HGX H200, a GPU memory platform ideal for large AI datasets, as well as scientific simulations and other memory-intensive workloads.

G593 Series for Scale-up Computing in AI & HPC
With dedicated real estate for cooling GPUs, the G593 series achieves stable, demanding performance in its compact 5U chassis with high airflow for incredible compute density. Maintaining the same power requirements as the air-cooled NVIDIA HGX H100-based systems, the NVIDIA H200 Tensor Core GPU optimally pairs with the road-tested GIGABYTE G593 series server that is purpose-built for an 8-GPU baseboard. To alleviate the memory bandwidth constraints on AI, including AI inference, the NVIDIA H200 GPU offers a sizable increase in memory capacity and bandwidth compared to the NVIDIA H100 Tensor Core GPU. The H200 GPU has up to 141 GB of HBM3e memory and 4.8 TB/s of memory bandwidth, translating to a 1.7X increase in memory capacity and 1.4X increase in throughput.

SK hynix Presents Extensive AI Memory Lineup at Expanded FMS 2024

SK hynix has returned to Santa Clara, California to present its full array of groundbreaking AI memory technologies at FMS: the Future of Memory and Storage (FMS) 2024 from August 6-8. Previously known as Flash Memory Summit, the conference changed its name to reflect its broader focus on all types of memory and storage products amid growing interest in AI. Bringing together industry leaders, customers, and IT professionals, FMS 2024 covers the latest trends and innovations shaping the memory industry.

Participating in the event under the slogan "Memory, The Power of AI," SK hynix is showcasing its outstanding memory capabilities through a keynote presentation, multiple technology sessions, and product exhibits.

Samsung's 8-layer HBM3E Chips Pass NVIDIA's Tests

Samsung Electronics has achieved a significant milestone in its pursuit of supplying advanced memory chips for AI systems. Their latest fifth-generation high-bandwidth memory (HBM) chips, known as HBM3E, have finally passed all NVIDIA's tests. This approval will help Samsung in catching up with competitors SK Hynix and Micron in the race to provide HBM memory chips to NVIDIA. While a supply deal hasn't been finalized yet, deliveries are expected to start in late 2024.

However, it's worth noting that Samsung passed NVIDIA's tests for the eight-layer HBM3E chips while the more advanced twelve-layer version of the HBM3E chips is still struggling pass those tests. Both Samsung and NVIDIA declined to comment on these developments. Industry expert Dylan Patel notes that while Samsung is making progress, they're still behind SK Hynix, which is already preparing to ship its own twelve-layer HBM3E chips.

NVIDIA's New B200A Targets OEM Customers; High-End GPU Shipments Expected to Grow 55% in 2025

Despite recent rumors speculating on NVIDIA's supposed cancellation of the B100 in favor of the B200A, TrendForce reports that NVIDIA is still on track to launch both the B100 and B200 in the 2H24 as it aims to target CSP customers. Additionally, a scaled-down B200A is planned for other enterprise clients, focusing on edge AI applications.

TrendForce reports that NVIDIA will prioritize the B100 and B200 for CSP customers with higher demand due to the tight production capacity of CoWoS-L. Shipments are expected to commence after 3Q24. In light of yield and mass production challenges with CoWoS-L, NVIDIA is also planning the B200A for other enterprise clients, utilizing CoWoS-S packaging technology.

Samsung Electronics Announces Results for Second Quarter of 2024

Samsung Electronics today reported financial results for the second quarter ended June 30, 2024. The Company posted KRW 74.07 trillion in consolidated revenue and operating profit of KRW 10.44 trillion as favorable memory market conditions drove higher average sales price (ASP), while robust sales of OLED panels also contributed to the results.

Memory Market Continues To Recover; Solid Second Half Outlook Centered on Server Demand
The DS Division posted KRW 28.56 trillion in consolidated revenue and KRW 6.45 trillion in operating profit for the second quarter. Driven by strong demand for HBM as well as conventional DRAM and server SSDs, the memory market as a whole continued its recovery. This increased demand is a result of the continued AI investments by cloud service providers and growing demand for AI from businesses for their on-premise servers.

Samsung's HBM3 Chips Approved by NVIDIA for Limited Use

Samsung Electronics' latest high bandwidth memory (HBM) chips have reportedly passed NVIDIA's suitability tests, according to Reuters. This development comes two months after initial reports suggested the chips had failed due to heat and power consumption issues. Despite this approval, NVIDIA plans to use Samsung's memory chips only in its H20 GPUs, a less advanced version of the H100 processors designed for the Chinese market to comply with US export restrictions.

The future of Samsung's HBM3 chips in NVIDIA's other GPU models remains uncertain, with potential additional testing required. Reuters also reported that Samsung's upcoming fifth-generation HBM3E chips are still undergoing NVIDIA's evaluation process. When approached for comment, neither company responded to Reuters. It's worth noting that Samsung previously denied the initial claims of chip failure.

Global AI Server Demand Surge Expected to Drive 2024 Market Value to US$187 Billion; Represents 65% of Server Market

TrendForce's latest industry report on AI servers reveals that high demand for advanced AI servers from major CSPs and brand clients is expected to continue in 2024. Meanwhile, TSMC, SK hynix, Samsung, and Micron's gradual production expansion has significantly eased shortages in 2Q24. Consequently, the lead time for NVIDIA's flagship H100 solution has decreased from the previous 40-50 weeks to less than 16 weeks.

TrendForce estimates that AI server shipments in the second quarter will increase by nearly 20% QoQ, and has revised the annual shipment forecast up to 1.67 million units—marking a 41.5% YoY growth.

OPENEDGES Successfully Validated Its 7nm HBM3 Testchip

OPENEDGES Technology, Inc the leading provider of memory subsystem IP, is pleased to announce that its subsidiary, The Six Semiconductor Inc (TSS), has successfully brought-up and validated its HBM3 testchip in 7 nm process technology. The IP validation testchip and the HBM3 PHY were brought up within the first month to 6.4 Gbps, and further tuning has resulted in successful operation of the HBM3 memory subsystem overclocked to 7.2 Gbps.

To date, there are only a handful of IP vendors that have taped out and demonstrated HBM3 memory subsystems, as test shuttle and HBM3 DRAM die stack sample availability are both highly limited. OPENEDGES is thrilled to be amongst one of the few companies to have demonstrated an HBM3 memory subsystem in silicon.

NVIDIA to Sell Over One Million H20 GPUs to China, Taking Home $12 Billion

When NVIDIA started preparing the H20 GPU for China, the company anticipated great demand from sanction-obeying GPUs. However, we now know precisely what the company makes from its Chinese venture: an astonishing $12 billion in take-home revenue. Due to the massive demand for NVIDIA GPUs, Chinese AI research labs are acquiring as many as they can get their hands on. According to a report from Financial Times, citing SemiAnalysis as its source, NVIDIA will sell over one million H20 GPUs in China. This number far outweighs the number of home-grown Huawei Ascend 910B accelerators that the Chinese companies plan to source, with numbers being "only" 550,000 Ascend 910B chips. While we don't know if Chinese semiconductor makers like SMIC are capable of producing more chips or if the demand isn't as high, we know why NVIDIA H20 chips are the primary target.

The Huawei Ascend 910B features Total Processing Performance (TPP), a metric developed by US Govt. to track GPU performance measuring TeraFLOPS times bit-length of over 5,000, while the NVIDIA H20 comes to 2,368 TPP, which is half of the Huawei accelerator. That is the performance on paper, where SemiAnalysis notes that the real-world performance is actually ahead for the H20 GPU due to better memory configuration of the GPU, including higher HBM3 memory bandwidth. All of this proves to be a better alternative than Ascend 910B accelerator, accounting for an estimate of over one million GPUs shipped this year in China. With an average price of $12,000 per NVIDIA H20 GPU, China's $12 billion revenue will undoubtedly help raise NVIDIA's 2024 profits even further.

AI Startup Etched Unveils Transformer ASIC Claiming 20x Speed-up Over NVIDIA H100

A new startup emerged out of stealth mode today to power the next generation of generative AI. Etched is a company that makes an application-specific integrated circuit (ASIC) to process "Transformers." The transformer is an architecture for designing deep learning models developed by Google and is now the powerhouse behind models like OpenAI's GPT-4o in ChatGPT, Anthropic Claude, Google Gemini, and Meta's Llama family. Etched wanted to create an ASIC for processing only the transformer models, making a chip called Sohu. The claim is Sohu outperforms NVIDIA's latest and greatest by an entire order of magnitude. Where a server configuration with eight NVIDIA H100 GPU clusters pushes Llama-3 70B models at 25,000 tokens per second, and the latest eight B200 "Blackwell" GPU cluster pushes 43,000 tokens/s, the eight Sohu clusters manage to output 500,000 tokens per second.

Why is this important? Not only does the ASIC outperform Hopper by 20x and Blackwell by 10x, but it also serves so many tokens per second that it enables an entirely new fleet of AI applications requiring real-time output. The Sohu architecture is so efficient that 90% of the FLOPS can be used, while traditional GPUs boast a 30-40% FLOP utilization rate. This translates into inefficiency and waste of power, which Etched hopes to solve by building an accelerator dedicated to power transformers (the "T" in GPT) at massive scales. Given that the frontier model development costs more than one billion US dollars, and hardware costs are measured in tens of billions of US Dollars, having an accelerator dedicated to powering a specific application can help advance AI faster. AI researchers often say that "scale is all you need" (resembling the legendary "attention is all you need" paper), and Etched wants to build on that.

SK hynix Showcases Its New AI Memory Solutions at HPE Discover 2024

SK hynix has returned to Las Vegas to showcase its leading AI memory solutions at HPE Discover 2024, Hewlett Packard Enterprise's (HPE) annual technology conference. Held from June 17-20, HPE Discover 2024 features a packed schedule with more than 150 live demonstrations, as well as technical sessions, exhibitions, and more. This year, attendees can also benefit from three new curated programs on edge computing and networking, hybrid cloud technology, and AI. Under the slogan "Memory, The Power of AI," SK hynix is displaying its latest memory solutions at the event including those supplied to HPE. The company is also taking advantage of the numerous networking opportunities to strengthen its relationship with the host company and its other partners.

The World's Leading Memory Solutions Driving AI
SK hynix's booth at HPE Discover 2024 consists of three product sections and a demonstration zone which showcase the unprecedented capabilities of its AI memory solutions. The first section features the company's groundbreaking memory solutions for AI, including HBM solutions. In particular, the industry-leading HBM3E has emerged as a core product to meet the growing demands of AI systems due to its exceptional processing speed, capacity, and heat dissipation. A key solution from the company's CXL lineup, CXL Memory Module-DDR5 (CMM-DDR5), is also on display in this section. In the AI era where high performance and capacity are vital, CMM-DDR5 has gained attention for its ability to expand system bandwidth by up to 50% and capacity by up to 100% compared to systems only equipped with DDR5 DRAM.

SK hynix Showcases Its Next-Gen Solutions at Computex 2024

SK hynix presented its leading AI memory solutions at COMPUTEX Taipei 2024 from June 4-7. As one of Asia's premier IT shows, COMPUTEX Taipei 2024 welcomed around 1,500 global participants including tech companies, venture capitalists, and accelerators under the theme "Connecting AI". Making its debut at the event, SK hynix underlined its position as a first mover and leading AI memory provider through its lineup of next-generation products.

"Connecting AI" With the Industry's Finest AI Memory Solutions
Themed "Memory, The Power of AI," SK hynix's booth featured its advanced AI server solutions, groundbreaking technologies for on-device AI PCs, and outstanding consumer SSD products. HBM3E, the fifth generation of HBM1, was among the AI server solutions on display. Offering industry-leading data processing speeds of 1.18 terabytes (TB) per second, vast capacity, and advanced heat dissipation capability, HBM3E is optimized to meet the requirements of AI servers and other applications. Another technology which has become crucial for AI servers is CXL as it can increase system bandwidth and processing capacity. SK hynix highlighted the strength of its CXL portfolio by presenting its CXL Memory Module-DDR5 (CMM-DDR5), which significantly expands system bandwidth and capacity compared to systems only equipped with DDR5. Other AI server solutions on display included the server DRAM products DDR5 RDIMM and MCR DIMM. In particular, SK hynix showcased its tall 128-gigabyte (GB) MCR DIMM for the first time at an exhibition.
Return to Keyword Browsing
Jan 20th, 2025 21:23 EST change timezone

New Forum Posts

Popular Reviews

Controversial News Posts