News Posts matching #AI

Samsung Galaxy Z Fold6 and Z Flip6 Elevate Galaxy AI to New Heights

Samsung Electronics today announced its all-new Galaxy Z Fold6 and Galaxy Z Flip6, along with Galaxy Buds3 and Galaxy Buds3 Pro at Galaxy Unpacked in Paris.

Earlier this year, Samsung ushered in the era of mobile AI through the power of Galaxy AI. With the introduction of the new Galaxy Z series, Samsung is opening the next chapter of Galaxy AI by leveraging its most versatile and flexible form factor, one perfectly designed to enable a range of unique mobile experiences. Whether using Galaxy Z Fold's large screen, Galaxy Z Flip's FlexWindow, or the iconic FlexMode, the Galaxy Z Fold6 and Z Flip6 will provide more opportunities to maximize AI capabilities. Built on Samsung's history of form factor innovation, Galaxy AI delivers a powerful, intelligent, and durable foldable experience that accelerates a new era of communication, productivity, and creativity.

Global PC Market Recovery Continues with 3% Growth in Q2 2024, Report

The PC market gathered momentum in Q2 2024, with worldwide shipments of desktops and notebooks up 3.4% year-on-year, reaching 62.8 million units. Shipments of notebooks (including mobile workstations) hit 50 million units, growing 4%. Desktops (including desktop workstations), which constitute 20% of the total PC market, experienced a slight 1% growth, totaling 12.8 million units. The stage is now set for accelerated growth as the refresh cycle driven by the Windows 11 transition and AI PC adoption ramps up over the next four quarters.

"The PC industry is going from strength to strength with a third consecutive quarter of growth," said Ishan Dutt, Principal Analyst at Canalys. "The market turnaround is coinciding with exciting announcements from vendors and chipset manufacturers as their AI PC roadmaps transition from promise to reality. The quarter culminated with the launch of the first Copilot+ PCs powered by Snapdragon processors and more clarity around Apple's AI strategy with the announcement of the Apple Intelligence suite of features for Mac, iPad and iPhone. Beyond these innovations, the market will start to benefit even more from its biggest tailwind - a ramp-up in PC demand driven by the Windows 11 refresh cycle. The vast majority of channel partners surveyed by Canalys in June indicated that Windows 10 end-of-life is likely to impact customer refresh plans most in either the second half of 2024 or the first half of 2025, suggesting that shipment growth will only gather steam in upcoming quarters."

CPU-Z v2.10 Changelog Confirms Core-Config of Ryzen AI 300-series Processors

CPUID this week released the latest version of CPU-Z, and its changelog confirms the core configurations of upcoming AMD Ryzen AI 300-series "Strix Point" processor SKUs. On paper, "Strix Point" packs a 12-core CPU based on the latest "Zen 5" microarchitecture, but there's more to this number. We've known since June 2024 that the chip has a heterogeneous multicore configuration of four full-sized "Zen 5" cores and eight compacted "Zen 5c" cores. Only the "Zen 5" cores can reach the maximum boost frequencies rated for the chip, while the "Zen 5c" cores only go a few notches above the base frequency, although the gap in boost frequencies between the two core types is expected to narrow slightly compared to that between the "Zen 4" and "Zen 4c" cores in chips such as the "Phoenix 2."

The series is led by the AMD Ryzen AI 9 HX 375, an enthusiast segment chip that maxes out all 12 cores on the chip—that's 4x "Zen 5" and 8x "Zen 5c." This model is closely followed by the Ryzen AI 9 365, which AMD marked in its presentations as being simply a 10-core/20-thread chip. We're now learning that it has 4x "Zen 5" and 6x "Zen 5c," meaning that AMD hasn't touched the counts of its faster "Zen 5" cores. It's important to note here that "Zen 5c" is not an E-core. It supports SMT, and at base frequency, it has an identical IPC to "Zen 5." It also supports the entire ISA that "Zen 5" does.
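CPU-Z's detection internals aren't public, but a rough way to spot such a heterogeneous layout yourself on Linux is to group logical CPUs by their rated maximum frequency, since the "Zen 5c" cores top out lower than the full-sized "Zen 5" cores. A minimal sketch (standard sysfs paths; the grouping heuristic is our assumption, not CPU-Z's method):

```python
# Group Linux logical CPUs by rated max frequency -- one crude way to spot
# a heterogeneous "Zen 5" + "Zen 5c" layout (SMT doubles the logical count).
from collections import defaultdict
from pathlib import Path

groups = defaultdict(list)
for cpu in sorted(Path("/sys/devices/system/cpu").glob("cpu[0-9]*")):
    freq_file = cpu / "cpufreq" / "cpuinfo_max_freq"
    if freq_file.exists():
        groups[int(freq_file.read_text())].append(cpu.name)

for khz, cpus in sorted(groups.items(), reverse=True):
    print(f"{khz / 1e6:.2f} GHz max: {len(cpus)} logical CPUs -> {cpus}")
```

On a Ryzen AI 9 HX 375, this should print two frequency buckets: 8 logical CPUs (four SMT-enabled "Zen 5" cores) at the full boost clock, and 16 at the lower "Zen 5c" ceiling.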

Battery Life is Driving Sales of Qualcomm Snapdragon Copilot+ PCs, Not AI

The recent launch of Copilot+ PCs, a collaboration between Microsoft and Qualcomm, has taken an unexpected turn in the market. While these devices were promoted for their artificial intelligence capabilities, a Bloomberg report reveals that consumers are primarily drawn to them for their impressive battery life. The Snapdragon X-powered Copilot+ PCs have made a significant impact, securing 20% of global PC sales during their launch week. However, industry analyst Avi Greengart points out that the extended battery life, not the AI features, is driving these sales. Microsoft introduced three AI-powered features exclusive to these PCs: Cocreator, Windows Studio Effects, and Live Captions with Translation. Despite these innovations, many users find these features non-essential for daily use. The delay of the anticipated Recall feature due to privacy concerns has further dampened enthusiasm for the AI aspects of these devices.

The slow reception of on-device AI capabilities extends beyond consumer preferences to the software industry. Major companies like Adobe, Salesforce, and SentinelOne declined Microsoft's request to optimize their apps for the new hardware, citing resource constraints and the limited market share of AI-capable PCs. Gregor Stewart, SentinelOne's VP for AI, suggests it could take years before AI PCs are widespread enough to justify app optimization. Analysts project that by 2028, only 40% of new computers will be AI-capable. Despite these challenges, Qualcomm remains optimistic about the future of AI PCs. While the AI PC concept may currently be more marketing than substance, the introduction of Arm-based Windows laptops offers a welcome alternative to the Intel-AMD duopoly. As the technology evolves and adoption increases, on-device AI features may become more prevalent and useful. The imminent arrival of AMD Ryzen AI 300 series and Intel Lunar Lake chips promises to expand the Copilot+ PC space further. For now, however, superior battery life remains the primary selling point for consumers.

Moore Threads MTLink Scales Up to 10,000 Home-Grown GPUs in AI Cluster

Chinese GPU manufacturer Moore Threads has announced a significant upgrade to its KUAE data center server. The company now has the ability to connect up to 10,000 GPUs in a single cluster, marking a huge leap in its scale-out capabilities for artificial intelligence and high-performance computing applications. The enhanced KUAE server incorporates eight MTT S4000 GPUs, leveraging Moore Threads' proprietary MTLink interconnect technology. These GPUs, based on the MUSA architecture, each feature 128 tensor cores and 48 GB of GDDR6 memory, delivering a bandwidth of 768 GB/s. While the full performance metrics of a 10,000-GPU cluster remain undisclosed, the sheer scale of 1,280,000 tensor cores suggests decent computing potential. Moore Threads' GPUs currently lag behind NVIDIA's GPU offerings in terms of performance. However, the company claims its MTT S4000 remains competitive against certain NVIDIA models, particularly in large language model training and inference tasks.
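Taking the quoted per-GPU specs at face value, the aggregate numbers for a full 10,000-GPU cluster are easy to sanity-check (simple arithmetic; sustained real-world figures would depend on MTLink scaling efficiency, which hasn't been disclosed):

```python
# Back-of-the-envelope aggregates for a 10,000-GPU KUAE cluster,
# using the MTT S4000 specs quoted above.
gpus = 10_000
tensor_cores = 128    # per MTT S4000
vram_gb = 48          # GDDR6 per GPU
bw_gbs = 768          # memory bandwidth per GPU, GB/s

print(f"Tensor cores: {gpus * tensor_cores:,}")                       # 1,280,000
print(f"Pooled GDDR6: {gpus * vram_gb / 1_000:.0f} TB")               # 480 TB
print(f"Aggregate bandwidth: {gpus * bw_gbs / 1_000_000:.2f} PB/s")   # 7.68 PB/s
```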

The Chinese company is facing significant challenges due to its inclusion on the U.S. Department of Commerce's Entity List, which restricts its access to advanced manufacturing processes. Despite these obstacles, the firm has secured partnerships with major Chinese state-run telecom operators and technology companies, focusing on developing new computing cluster projects. A recent financing round that raised approximately $343.7 million will help fund Moore Threads' ambitious expansion plans. However, limited access to cutting-edge semiconductor fabrication technologies may constrain the company's future growth. Nonetheless, creating a scale-out server infrastructure with up to 10,000 GPUs is vital for LLM training and inference, especially as Chinese AI labs catch up to Western labs in terms of the performance of their AI models.

AMD "Strix Halo" a Large Rectangular BGA Package the Size of an LGA1700 Processor

Apparently the AMD "Strix Halo" processor is real, and it's large. The chip is designed to square off against the likes of the Apple M3 Pro and M3 Max by letting ultraportable notebooks have powerful graphics performance. A chiplet-based processor, not unlike the desktop socketed "Raphael" and mobile BGA "Dragon Range," the "Strix Halo" processor consists of one or two CCDs containing CPU cores, wired to a large die that's technically the cIOD (client I/O die) but contains an oversized iGPU and an NPU. The point behind "Strix Halo" is to eliminate the need for a performance-segment discrete GPU, and to conserve PCB real estate.

According to leaks by Harukaze5719, a reliable source for AMD leaks, "Strix Halo" comes in a BGA package dubbed FP11, measuring 37.5 mm x 45 mm, which is significantly larger than the 25 mm x 40 mm FP8 BGA package that the regular "Strix Point," "Hawk Point," and "Phoenix" mobile processors are built on. It is even larger in area than the 40 mm x 40 mm FL1 BGA package of "Dragon Range" and upcoming "Fire Range" gaming notebook processors. "Strix Halo" features one or two of the same 4 nm "Zen 5" CCDs featured on the "Granite Ridge" desktop and "Fire Range" mobile processors, but connected to a much larger I/O die, as we mentioned.
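The headline comparison checks out with simple arithmetic: 37.5 mm x 45 mm is exactly the footprint of an LGA1700 desktop processor package. A quick comparison of the packages named above:

```python
# Package footprints mentioned in the leak (width x height in mm).
packages = {
    "FP11 (Strix Halo, leaked)": (37.5, 45.0),   # same dims as an LGA1700 package
    "FP8 (Strix Point / Hawk Point / Phoenix)": (25.0, 40.0),
    "FL1 (Dragon Range / Fire Range)": (40.0, 40.0),
}
for name, (w, h) in packages.items():
    print(f"{name}: {w} x {h} mm = {w * h:,.1f} mm^2")
# FP11 works out to 1,687.5 mm^2 -- about 69% larger than FP8
# and about 5% larger than FL1.
```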

Samsung Electronics To Provide Turnkey Semiconductor Solutions With 2nm GAA Process and 2.5D Package to Preferred Networks

Samsung Electronics, a world leader in advanced semiconductor technology, today announced that it will provide turnkey semiconductor solutions using the 2-nanometer (nm) foundry process and the advanced 2.5D packaging technology Interposer-Cube S (I-Cube S) to Preferred Networks, a leading Japanese AI company.

By leveraging Samsung's leading-edge foundry and advanced packaging products, Preferred Networks aims to develop powerful AI accelerators that meet the ever-growing demand for computing power driven by generative AI.

AAEON MAXER-2100 Inference Server Integrates Both Intel CPU and NVIDIA GPU Tech

AAEON (Stock Code: 6579), a leading provider of advanced AI solutions, has released the inaugural offering of its AI Inference Server product line, the MAXER-2100. The MAXER-2100 is a 2U rackmount AI inference server powered by the Intel Core i9-13900 processor, designed to meet high-performance computing needs.
The MAXER-2100 also supports both 12th and 13th Generation Intel Core LGA 1700 socket-type CPUs up to 125 W, and features an integrated NVIDIA GeForce RTX 4080 SUPER GPU. While the product ships with the NVIDIA GeForce RTX 4080 SUPER by default, it is also compatible with, and an NVIDIA-Certified Edge System for, both the NVIDIA L4 Tensor Core and NVIDIA RTX 6000 Ada GPUs.

Given the MAXER-2100 is equipped with both a high-performance CPU and industry-leading GPU, a key feature highlighted by AAEON upon the product's launch is its capacity to execute complex AI algorithms and datasets, process multiple high-definition video streams simultaneously, and utilize machine learning to refine large language models (LLMs) and inferencing models.

AMD is Becoming a Software Company. Here's the Plan

Just a few weeks ago, AMD invited us to Barcelona as part of a roundtable, to share their vision for the future of the company, and to get our feedback. On site were prominent AMD leaders, including Phil Guido, Executive Vice President & Chief Commercial Officer, and Jack Huynh, Senior VP & GM, Computing and Graphics Business Group. AMD is making big changes to how it approaches technology, shifting its focus from hardware development to emphasizing software, APIs, and AI experiences. Software is no longer just a complement to hardware; it's the core of modern technological ecosystems, and AMD is finally aligning its strategy accordingly.

The major difference between AMD and NVIDIA is that AMD is a hardware company that makes software on the side to support its hardware; while NVIDIA is a software company that designs hardware on the side to accelerate its software. This is about to change, as AMD is making a pivot toward software. They believe that they now have the full stack of computing hardware—all the way from CPUs, to AI accelerators, to GPUs, to FPGAs, to data-processing and even server architecture. The only frontier left for AMD is software.

NVIDIA to Sell Over One Million H20 GPUs to China, Taking Home $12 Billion

When NVIDIA started preparing the H20 GPU for China, the company anticipated great demand for a sanctions-compliant GPU. However, we now know precisely what the company makes from its Chinese venture: an astonishing $12 billion in take-home revenue. Due to the massive demand for NVIDIA GPUs, Chinese AI research labs are acquiring as many as they can get their hands on. According to a report from the Financial Times, citing SemiAnalysis as its source, NVIDIA will sell over one million H20 GPUs in China. This number far outweighs the number of home-grown Huawei Ascend 910B accelerators that Chinese companies plan to source, at "only" 550,000 Ascend 910B chips. While we don't know if Chinese semiconductor makers like SMIC are capable of producing more chips or if the demand isn't as high, we do know why NVIDIA H20 chips are the primary target.

The Huawei Ascend 910B achieves a Total Processing Performance (TPP)—a metric developed by the US government for export controls that multiplies TeraFLOPS by the bit length of the operation—of over 5,000, while the NVIDIA H20 comes in at 2,368 TPP, less than half that of the Huawei accelerator. That is the performance on paper; SemiAnalysis notes that real-world performance is actually ahead for the H20 GPU due to its better memory configuration, including higher HBM3 memory bandwidth. All of this makes the H20 a better alternative to the Ascend 910B accelerator, accounting for an estimated one million-plus GPUs shipped to China this year. With an average price of $12,000 per NVIDIA H20 GPU, the $12 billion in Chinese revenue will undoubtedly help raise NVIDIA's 2024 profits even further.
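TPP itself is straightforward arithmetic: TeraFLOPS multiplied by the bit length of the operation. A quick illustration (the FP16 throughput figures below are approximate public specs we're assuming for the example, not numbers from the report):

```python
# TPP (Total Processing Performance) = TeraFLOPS x bit length of the op.
def tpp(tflops: float, bits: int) -> float:
    return tflops * bits

print(tpp(148, 16))   # NVIDIA H20: ~148 FP16 TFLOPS -> 2,368 TPP
print(tpp(320, 16))   # Huawei Ascend 910B: ~320 FP16 TFLOPS -> 5,120 TPP (over 5,000)
```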

Intel Arrow Lake CPU Refresh May Include Upgraded NPU, Increasing Die Size

Intel's upcoming Arrow Lake "S" Desktop and "HX" laptop CPUs are reported to launch without dedicated NPU hardware. NPUs will be limited to Arrow Lake-H/U and Lunar Lake chips, with Core Ultra 200V chips offering up to 48 TOPS of AI performance. Currently, AMD is the only manufacturer offering desktop chips with dedicated NPUs in their Ryzen 8000G "Hawk Point" series for the AM5 platform. However, according to Jaykihn, an active Intel-related leaker, Intel may be planning to incorporate NPUs in future Arrow Lake-S and Arrow Lake-HX refreshes.

The potential refresh could include an NPU within the SOC tile, possibly increasing the die size by 2.8 mm compared to current Arrow Lake designs. The package size is expected to remain unchanged, maintaining socket compatibility; however, motherboard manufacturers would need to enable Fast Voltage Mode (FVM) on VccSA rails to support the NPU functionality. While it's early to discuss an Arrow Lake refresh before the initial launch, this development could impact Intel's roadmap and the "AI PC" market segment, and could have implications for the release schedule of future architectures like Panther Lake.

Intel Arc "Battlemage" Xe2 GPUs with 448 EUs (56 Xe cores) Spotted in Transit

Intel very much does intend to make discrete gaming GPUs based on its Xe2 "Battlemage" graphics architecture, which made its debut with the Core Ultra 200V "Lunar Lake-MX" processor as an iGPU. With its next generation, Intel plans to capture an even bigger share of the gaming graphics market, on both the notebook and desktop platforms. "Battlemage" will be crucial for Intel, as it will let the company make its case with Microsoft and Sony for semi-custom chips for their next-generation consoles. Intel has all the pieces of the console SoC puzzle that AMD does. A Xe2 "Battlemage" discrete GPU sample, codenamed "Churchill Falls," has been spotted making its transit in and out of locations known for Intel SoC development, such as Bangalore in India and Shanghai in China.

Such shipping manifests tend to be incredibly descriptive, and speak of Arc "Battlemage" X3 and Arc "Battlemage" X4 SKUs, each with 448 execution units (EU) across 56 Xe cores. Assuming an Xe core continues to have 128 unified shaders in the "Battlemage" architecture, you're looking at 7,168 unified shaders for this GPU, a staggering 75% increase in the raw shader count alone, before accounting for IPC gains and other architecture-level features. The descriptions also speak of a 256-bit wide memory bus, although they don't specify memory type or speed. Given that at launch the Arc A770 "Alchemist" was a 1440p-class GPU, we predict Intel might take a crack at a 4K-class GPU. Besides raster 3D performance, Intel is expected to significantly improve the ray tracing and AI performance of its Xe2 discrete GPUs, making them powerful options for creative professionals.
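The shader math behind that 75% figure, under the stated assumption of 128 shaders per Xe core:

```python
# Shader-count math implied by the manifests.
xe_cores = 56
shaders_per_xe_core = 128     # assumption: carried over from "Alchemist"
bmg_shaders = xe_cores * shaders_per_xe_core
a770_shaders = 4096           # Arc A770 "Alchemist"

print(bmg_shaders)                                        # 7168
print(f"{bmg_shaders / a770_shaders - 1:.0%} increase")   # 75% increase
```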

Demand from AMD and NVIDIA Drives FOPLP Development, Mass Production Expected in 2027-2028

In 2016, TSMC developed and named its InFO FOWLP technology, and applied it to the A10 processor used in the iPhone 7. TrendForce points out that since then, OSAT providers have been striving to develop FOWLP and FOPLP technologies to offer more cost-effective packaging solutions.

Starting in the second quarter, chip companies like AMD have actively engaged with TSMC and OSAT providers to explore the use of FOPLP technology for chip packaging, helping drive industry interest in FOPLP. TrendForce observes three main models for introducing FOPLP packaging technology: first, OSAT providers transitioning consumer IC packaging from traditional methods to FOPLP; second, foundries and OSAT providers transitioning 2.5D packaging of AI GPUs from wafer level to panel level; and third, panel makers packaging consumer ICs.

Panmnesia Uses CXL Protocol to Expand GPU Memory with Add-in DRAM Card or Even SSD

South Korean startup Panmnesia has unveiled an interesting solution to address the memory limitations of modern GPUs. The company has developed a low-latency Compute Express Link (CXL) IP that could help expand GPU memory with an external add-in card. Current GPU-accelerated applications in AI and HPC are constrained by the set amount of memory built into GPUs. With data sizes growing by 3x yearly, GPU networks must keep getting larger just to fit applications in local memory, which benefits latency and token generation. Panmnesia's proposed fix leverages the CXL protocol to expand GPU memory capacity using PCIe-connected DRAM or even SSDs. The company has overcome significant technical hurdles, including the absence of CXL logic fabric in GPUs and the limitations of existing unified virtual memory (UVM) systems.

At the heart of Panmnesia's solution is a CXL 3.1-compliant root complex with multiple root ports and a host bridge featuring a host-managed device memory (HDM) decoder. This sophisticated system effectively tricks the GPU's memory subsystem into treating PCIe-connected memory as native system memory. Extensive testing has demonstrated impressive results. Panmnesia's CXL solution, CXL-Opt, achieved two-digit-nanosecond round-trip latency, significantly outperforming both UVM and earlier CXL prototypes. In GPU kernel execution tests, CXL-Opt showed execution times up to 3.22 times faster than UVM. Older CXL memory extenders recorded around 250 nanoseconds of round-trip latency, while CXL-Opt potentially achieves less than 80 nanoseconds. The usual problem with CXL is that memory pools add latency, degrading performance, while the extenders add to the cost model as well. If those issues are contained, Panmnesia's CXL-Opt could find a use case, and we are waiting to see if anyone adopts it in their infrastructure.
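To see why the round-trip number matters so much, consider a toy model in which a kernel issues a chain of dependent far-memory accesses, so each access pays the full round trip. The 250 ns and sub-80 ns figures come from the article; the UVM page-fault cost is our rough assumption for comparison:

```python
# Toy latency model: total stall time = accesses x round-trip latency.
round_trip_ns = {
    "UVM (page-fault path, assumed)": 20_000,
    "Older CXL extender": 250,
    "Panmnesia CXL-Opt": 80,
}
dependent_accesses = 1_000_000
for name, ns in round_trip_ns.items():
    print(f"{name}: {dependent_accesses * ns / 1e6:,.0f} ms stalled")
```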

Opera GX Browser AI Gets New Features

Opera GX, the browser for gamers, is bringing a significant update to the browser's built-in AI, Aria. This update provides users with the latest AI features that Opera has been releasing as part of its experimental AI Feature Drops program in the Developer stream of the Opera One browser. The features arriving in Opera GX expand Aria's capabilities with image generation and understanding, voice output, a chat summary option, and links to sources.

Image Generation and Voice Output
Images are crucial to today's web, so this Opera GX update places a strong focus on the visual. With this update, Aria gains the ability to turn text prompts and descriptions into unique images using Google's Imagen 2 image generation model. Aria identifies the user's intention to generate an image based on conversational prompts, and users can hit "regenerate" to have Aria come up with a new image. Aria allows each user to generate up to 30 images per day.

Intel Core Ultra "Arrow Lake" Desktop Platform Map Leaked: Two CPU-attached M.2 Slots

Intel's upcoming Core Ultra "Arrow Lake-S" desktop processor introduces a new socket, the LGA1851, alongside the new Intel 800-series desktop chipset. We now have some idea what the 151 additional pins on the new socket are used for, thanks to a leaked platform map on the ChipHell forums, discovered by HXL. Intel is expanding the number of PCIe lanes from the processor. It now puts out a total of 32 PCIe lanes.

Of the 32 PCIe lanes put out by the "Arrow Lake-S" processor's system agent, 16 are meant for the PCI-Express 5.0 x16 PEG slot used for discrete graphics. Eight serve as the chipset bus, technically DMI 4.0 x8 (eight lanes operating at Gen 4 speed, for 128 Gbps per direction of bandwidth). There are now not one but two possible CPU-attached M.2 NVMe slots, just like on the AMD "Raphael" and "Granite Ridge" processors. What's interesting, though, is that the two aren't equal: one is Gen 5 x4, while the other is Gen 4 x4.
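The lane budget and the chipset-link bandwidth both check out with quick arithmetic (128b/130b is the Gen 4 line encoding; the 128 Gbps figure above rounds the per-lane rate up to 2 GB/s):

```python
# CPU lane breakdown per the leaked platform map.
lanes = {
    "PEG Gen 5 x16 (graphics)": 16,
    "DMI 4.0 x8 (chipset link)": 8,
    "M.2 Gen 5 x4": 4,
    "M.2 Gen 4 x4": 4,
}
assert sum(lanes.values()) == 32

# DMI 4.0 = eight lanes at Gen 4 speed: 16 GT/s with 128b/130b encoding.
per_lane_gbps = 16 * 128 / 130
print(f"Chipset link: {8 * per_lane_gbps:.0f} Gbps per direction")  # ~126 Gbps
```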

SK Hynix to Invest $75 Billion by 2028 in Memory Solutions for AI

South Korean giant SK Group has unveiled plans for substantial investments in AI and semiconductor technologies worth almost $75 billion. SK Group subsidiary SK Hynix will lead this initiative with a staggering 103 trillion won ($74.6 billion) investment, to be realized by 2028. This commitment is in addition to the ongoing construction of a $90 billion mega fab complex in Gyeonggi Province for cutting-edge memory production. SK Group has further pledged an additional $58 billion, bringing the total investment to a whopping $133 billion. This capital infusion aims to enhance the group's competitiveness in the AI value chain while funding operations across its 175 subsidiaries, including SK Hynix.

While specific details remain undisclosed, SK Group is reportedly exploring various options, including potential mergers and divestments. SK Group has signaled that its business practices need to change amid shifting geopolitical situations and the massive boost that AI is bringing to the overall economy. We may see more interesting products from SK Group in the coming years as it potentially enters new markets centered around AI. This strategic pivot comes after SK Hynix reported its first loss in a decade in 2022. However, the company has since shown signs of recovery, fueled by the surging demand for memory solutions for AI chips. The company currently holds a 35% share of the global DRAM market and plans to build an even stronger presence in the coming years. The massive investment aligns with the South Korean government's recently announced $19 billion support package for the domestic semiconductor industry, which will be distributed across companies like SK Hynix and Samsung.

AMD Designs Neural Block Compression Tech for Games: Smaller Downloads and Updates

AMD is developing a new technology that promises to significantly reduce the size on disk of games, as well as the size of game patches and updates. Today's AAA games tend to be over 100 GB in size, with game updates running into tens of gigabytes, and some major updates practically download the game all over again. Upcoming games like Call of Duty: Black Ops 6 are reportedly over 300 GB in size, which puts them out of reach for anyone without an Internet connection running at hundreds of Mbps. Much of the bulk of a game is made up of visual assets—textures, sprites, and cutscene videos. A modern AAA title can have hundreds of thousands of individual game assets, and sometimes even redundant sets of textures for different image quality settings.
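To put that 300 GB figure in perspective, here is the plain download-time arithmetic at a few common connection speeds:

```python
# Download time for a 300 GB install: size in megabits / speed in Mbps.
size_gb = 300
for mbps in (50, 100, 300, 1000):
    hours = size_gb * 8_000 / mbps / 3_600
    print(f"{mbps:>4} Mbps: {hours:5.1f} h")
# 50 Mbps -> ~13.3 h; 300 Mbps -> ~2.2 h; 1 Gbps -> ~0.7 h.
```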

AMD's solution to this problem is its Neural Block Compression technology. The company will get into the nuts and bolts of the tech in its presentation at the 2024 Eurographics Symposium on Rendering (July 3-5), but we have a vague idea of what it could be. Modern games don't just drape the surfaces of a wireframe with a texture; they apply additional layers, such as specular maps, normal maps, and roughness maps. AMD's idea is to "flatten" all these layers, including the base texture, into a single asset format, which the game engine could disaggregate into the individual layers using an AI neural network. This is not to be confused with mega-textures—something entirely different that relies on a single large texture covering all objects in a scene. The idea here is to flatten the various data layers of individual textures and their maps into a single asset type. In theory, this should yield significant file-size savings, even if it results in some additional compute cost on the client's end.
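AMD hasn't published the details yet, so the following is only a conceptual sketch of that "flatten, then neurally disaggregate" idea as we read it: the game ships one latent block per texture, and a small per-texel decoder network (weights here are random placeholders; in practice they would be trained offline) reconstructs the individual layers at load time.

```python
# Conceptual sketch only -- our reading of the idea, not AMD's actual method.
import numpy as np

rng = np.random.default_rng(0)
tile_h = tile_w = 4            # toy texture tile
latent_ch, out_ch = 4, 9       # stored channels vs. albedo(3)+normal(3)+rough/spec(3)

latent = rng.standard_normal((tile_h, tile_w, latent_ch))  # the single shipped asset
W1 = rng.standard_normal((latent_ch, 16))  # decoder weights: random placeholders,
W2 = rng.standard_normal((16, out_ch))     # would be trained offline per material set

hidden = np.maximum(latent @ W1, 0.0)      # tiny per-texel ReLU MLP
layers = hidden @ W2
albedo, normal, rough_spec = np.split(layers, [3, 6], axis=-1)
print(albedo.shape, normal.shape, rough_spec.shape)   # (4,4,3) (4,4,3) (4,4,3)
```

The file-size saving would come from storing only the latent channels (4 per texel in this toy) instead of all 9 layer channels, at the cost of running the decoder on the client.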

Report: US PC Market Set for 5% Growth in 2024 Amid a Healthy Recovery Trajectory

PC (excluding tablets) shipments to the United States grew 5% year-on-year to 14.8 million units in Q1 2024. The consumer and SMB segments were the key growth drivers, both witnessing shipment increases above 9% year-on-year in the first quarter. With a strong start to the year, the market is now poised for a healthy recovery trajectory amid the ongoing Windows refresh cycle. Total PC shipments to the US are expected to hit 69 million units in 2024 before growing another 8% to 75 million units in 2025.

For the third consecutive quarter, the consumer segment showed the best performance in the US market. "Continued discounting after the holiday season boosted consumer demand for PCs into the start of 2024," said Greg Davis, Analyst at Canalys. "However, the first quarter also saw an uptick in commercial sector performance. Shipment growth in small and medium businesses indicates that the anticipated refresh brought by the Windows 10 end-of-life is underway. With enterprise customers set to follow suit, the near-term outlook for the market remains highly positive."

Intel Xeon Processors Accelerate GenAI Workloads with Aible

Intel and Aible, an end-to-end serverless generative AI (GenAI) and augmented analytics enterprise solution, now offer solutions to shared customers to run advanced GenAI and retrieval-augmented generation (RAG) use cases on multiple generations of Intel Xeon CPUs. The collaboration, which includes engineering optimizations and a benchmarking program, enhances Aible's ability to deliver GenAI results at a low cost for enterprise customers and helps developers embed AI intelligence into applications. Together, the companies offer scalable and efficient AI solutions that draw on high-performing Intel hardware to help customers solve their AI challenges.

"Customers are looking for efficient, enterprise-grade solutions to harness the power of AI. Our collaboration with Aible shows how we're closely working with the industry to deliver innovation in AI and lowering the barrier to entry for many customers to run the latest GenAI workloads using Intel Xeon processors," said Mishali Naik, Intel senior principal engineer, Data Center and AI Group.

Intel Demonstrates First Fully Integrated Optical IO Chiplet

Intel Corporation has achieved a revolutionary milestone in integrated photonics technology for high-speed data transmission. At the Optical Fiber Communication Conference (OFC) 2024, Intel's Integrated Photonics Solutions (IPS) Group demonstrated the industry's most advanced and first-ever fully integrated optical compute interconnect (OCI) chiplet co-packaged with an Intel CPU and running live data. Intel's OCI chiplet represents a leap forward in high-bandwidth interconnect by enabling co-packaged optical input/output (I/O) in emerging AI infrastructure for data centers and high performance computing (HPC) applications.

"The ever-increasing movement of data from server to server is straining the capabilities of today's data center infrastructure, and current solutions are rapidly approaching the practical limits of electrical I/O performance. However, Intel's groundbreaking achievement empowers customers to seamlessly integrate co-packaged silicon photonics interconnect solutions into next-generation compute systems. Our OCI chiplet boosts bandwidth, reduces power consumption and increases reach, enabling ML workload acceleration that promises to revolutionize high-performance AI infrastructure," said Thomas Liljeberg, senior director, Product Management and Strategy, Integrated Photonics Solutions (IPS) Group.

ByteDance and Broadcom to Collaborate on Advanced AI Chip

ByteDance, TikTok's parent company, is reportedly working with American chip designer Broadcom to develop a cutting-edge AI processor. This collaboration could secure a stable supply of high-performance chips for ByteDance, according to Reuters. Sources claim the joint project involves a 5 nm Application-Specific Integrated Circuit (ASIC), designed to comply with U.S. export regulations. TSMC is slated to manufacture the chip, though production is not expected to begin this year.

This partnership marks a significant development in U.S.-China tech relations, as no public announcements of such collaborations on advanced chips have been made since Washington implemented stricter export controls in 2022. For ByteDance, this move could reduce procurement costs and ensure a steady chip supply, crucial for powering its array of popular apps, including TikTok and the ChatGPT-like AI chatbot "Doubao." The company has already invested heavily in AI chips, reportedly spending $2 billion on NVIDIA processors in 2023.

AI Startup Etched Unveils Transformer ASIC Claiming 20x Speed-up Over NVIDIA H100

A new startup emerged out of stealth mode today to power the next generation of generative AI. Etched makes an application-specific integrated circuit (ASIC) built to process "Transformers." The transformer is an architecture for designing deep learning models, developed by Google, and is now the powerhouse behind models like OpenAI's GPT-4o in ChatGPT, Anthropic Claude, Google Gemini, and Meta's Llama family. Etched set out to create an ASIC that processes only transformer models, a chip called Sohu. The claim is that Sohu outperforms NVIDIA's latest and greatest by an entire order of magnitude: where a server with eight NVIDIA H100 GPUs pushes Llama-3 70B at 25,000 tokens per second, and the latest eight-GPU B200 "Blackwell" server pushes 43,000 tokens/s, an eight-chip Sohu server manages to output 500,000 tokens per second.

Why is this important? Not only does the ASIC outperform Hopper by 20x and Blackwell by 10x, but it also serves so many tokens per second that it enables an entirely new fleet of AI applications requiring real-time output. The Sohu architecture is reportedly so efficient that 90% of its FLOPS can be used, while traditional GPUs manage a 30-40% FLOPS utilization rate. That low utilization translates into inefficiency and wasted power, which Etched hopes to solve by building an accelerator dedicated to powering transformers (the "T" in GPT) at massive scale. Given that frontier model development costs more than one billion US dollars, and hardware costs run into the tens of billions, an accelerator dedicated to a specific application can help advance AI faster. AI researchers often say that "scale is all you need" (echoing the legendary "attention is all you need" paper), and Etched wants to build on that.
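The headline multipliers follow directly from the quoted tokens-per-second figures:

```python
# Llama-3 70B throughput per 8-chip server, as quoted above (tokens/s).
sohu, h100, b200 = 500_000, 25_000, 43_000
print(f"vs H100: {sohu / h100:.0f}x")    # 20x
print(f"vs B200: {sohu / b200:.1f}x")    # ~11.6x, rounded to ~10x above
# Claimed FLOPS utilization: 90% (Sohu) vs 30-40% (general-purpose GPUs).
```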

CSPs to Expand into Edge AI, Driving Average NB DRAM Capacity Growth by at Least 7% in 2025

TrendForce has observed that in 2024, major CSPs such as Microsoft, Google, Meta, and AWS will continue to be the primary buyers of high-end AI servers, which are crucial for LLM and AI modeling. After establishing significant AI training server infrastructure in 2024, these CSPs are expected to actively expand into edge AI in 2025. This expansion will include developing smaller LLM models and setting up edge AI servers to facilitate AI applications across various sectors, such as manufacturing, finance, healthcare, and business.

Moreover, AI PCs or notebooks share a similar architecture to AI servers, offering substantial computational power and the ability to run smaller LLM and generative AI applications. These devices are anticipated to serve as the final bridge between cloud AI infrastructure and edge AI for small-scale training or inference applications.

QNAP Thunderbolt 4 NAS TBS-h574TX and TVS-h874T Win the Red Dot Award 2024

Amid a field of over 20,000 submissions from 60 countries, the QNAP Thunderbolt 4 NAS TBS-h574TX and TVS-h874T won the Red Dot Award: Product Design 2024. The TBS-h574TX Thunderbolt 4 all-flash NASbook is designed for film sets, small studios, small-scale video production teams, and SOHO users. Powered by an Intel Core i9 16-core / i7 12-core processor, the TVS-h874T Thunderbolt 4 NAS is a great sidekick for your creative talents. The Red Dot jury recognized both the TBS-h574TX and the TVS-h874T with distinction, signifying high-quality design.

The TBS-h574TX packs high-speed I/O and Intel Core performance required by video production, allowing creators using Mac or Windows to enjoy the smoothest experience ever in real-time video editing, large file transfer, video transcoding, and backup. The TBS-h574TX, acting as the bridge between pre-production and post-production, takes video projects and team collaboration to the next level. The TBS-h574TX runs the ZFS-based QuTS hero operating system that ensures data integrity. You can also switch to the QTS operating system based on your needs.