News Posts matching #training

Google Teams up with MediaTek for Next-Generation TPU v7 Design

According to Reuters, citing The Information, Google will collaborate with MediaTek to develop its seventh-generation Tensor Processing Unit (TPU), which is also known as TPU v7. Google maintains its existing partnership with Broadcom despite the new MediaTek collaboration. The AI accelerator is scheduled for production in 2026, and TSMC is handling manufacturing duties. Google will lead the core architecture design while MediaTek manages I/O and peripheral components, as Economic Daily News reports. This differs from Google's ongoing relationship with Broadcom, which co-develops core TPU architecture. The MediaTek partnership reportedly stems from the company's strong TSMC relationship and lower costs compared to Broadcom.

There is also a possibility that MediaTek could design inference-focused TPU v7 chips while Broadcom focuses on the training architecture. In any case, the TPU program is a massive effort: Google deploys so many chips that it could, hypothetically, bring in a third design partner. TPU development continues Google's vertical integration strategy for AI infrastructure. By designing proprietary AI chips for internal R&D and cloud operations, Google reduces its dependency on NVIDIA hardware, while competitors like OpenAI, Anthropic, and Meta rely heavily on NVIDIA's processors for AI training and inference. At Google's scale, serving billions of queries a day, designing custom chips makes sense both financially and technologically. Google has spent years shaping hardware acceleration around its own specific workloads, and this is a continuation of that long game.

Ubisoft Summarizes Rainbow Six Siege X Showcase, Announces June 10 Release

The next evolution of Rainbow Six Siege was revealed today at the Siege X Showcase. Launching on June 10, Siege X will introduce Dual Front, a dynamic new 6v6 game mode, as well as deliver foundational upgrades to the core game (including visual enhancements, an audio overhaul, rappel upgrades, and more) alongside revamped player protection systems, and free access that will allow players to experience the unique tactical action of Rainbow Six Siege at no cost. Plus, from now through March 19, a free Dual Front closed beta is live on PC via Ubisoft Connect, PS5, and Xbox Series X|S, giving players a first chance to play the exciting new mode. Read on to find out how to get into the beta and try Dual Front for yourself.

Dual Front
Taking place on an entirely new map called District, Dual Front is a new mode that pits two teams of six Operators against each other in a fight to attack enemy sectors while defending their own. Players can choose from a curated roster of 35 Operators (both Attackers and Defenders) that will rotate twice per season. During each match, two objective points are live at all times, one in each team's lane; teams must plant a sabotage kit (akin to a defuser) in the opposing team's objective room and defend it in order to capture the sector and progress towards the final objective: the Base. Sabotage the Base to claim victory, but don't forget to defend your own sector, lest your foes progress faster than you and beat you to it.

Meta Reportedly Reaches Test Phase with First In-house AI Training Chip

According to a Reuters technology report, Meta's engineering department is testing the company's "first in-house chip for training artificial intelligence systems." Two inside sources describe this as a significant development milestone, involving a small-scale deployment of early samples. The owner of Facebook could ramp up production once the initial batches pass muster. Despite recently showcasing an open-architecture NVIDIA "Blackwell" GB200 system for enterprise, Meta leadership is reportedly pursuing proprietary solutions. Multiple big players in the field of artificial intelligence are attempting to break away from a total reliance on Team Green. Last month, press outlets concentrated on OpenAI's alleged finalization of an in-house design, with rumored involvement from Broadcom and TSMC.

One of the Reuters industry moles believes that Meta has signed up with TSMC—supposedly, the Taiwanese foundry was responsible for the production of test batches. Tom's Hardware reckons that Meta and Broadcom worked together on the tape-out of the social media giant's "first AI training accelerator." Development of the company's "Meta Training and Inference Accelerator" (MTIA) series stretches back a couple of years. According to Reuters, this multi-part project "had a wobbly start for years, and at one point scrapped a chip at a similar phase of development... Meta last year started using an MTIA chip to perform inference, or the process involved in running an AI system as users interact with it, for the recommendation systems that determine which content shows up on Facebook and Instagram news feeds." Leadership is reportedly aiming to get custom silicon solutions up and running for AI training by next year. Past examples of MTIA hardware were deployed with open-source RISC-V cores (for inference tasks), but it is not clear whether this architecture will form the basis of Meta's latest AI chip design.

PGA TOUR 2K25 Out Now on PC & Consoles

2K announces that PGA TOUR 2K25, the newest entry in the golf simulation franchise from HB Studios, is now available worldwide on Xbox Series X|S, PlayStation 5, and PC via Steam. Players can step onto the tee box and take their best shot at golf glory with several franchise advancements, including upgraded graphics and the addition of new EvoSwing mechanics, which offer a host of new shot types, ball flights, roll physics, and visual improvements. The new Perfect Swing difficulty setting offers beginner players a more forgiving, intuitive experience and veteran players a more relaxing round. The most immersive and customizable PGA TOUR 2K MyPLAYER and MyCAREER experience to date offers a diverse suite of customization options, while the franchise's signature Course Designer offers new tools to allow players more freedom to create their dream courses and share them with the global community.

"We've reached a major milestone in the evolution of the franchise with PGA TOUR 2K25," said Dennis Ceccarelli, Senior Vice President and General Manager, Sports at 2K. "From the moment players step onto the digital tee box, they'll feel the difference—whether it's the precision of EvoSwing, the ease of Perfect Swing or the depth of our all-new progression systems. With enhanced realism, refined gameplay and three Major Championships, PGA TOUR 2K25 delivers the most immersive and rewarding golf experience ever made."

NVIDIA Explains How CUDA Libraries Bolster Cybersecurity With AI

Traditional cybersecurity measures are proving insufficient for addressing emerging cyber threats such as malware, ransomware, phishing and data access attacks. Moreover, future quantum computers pose a security risk to today's data through "harvest now, decrypt later" attack strategies. Cybersecurity technology powered by NVIDIA accelerated computing and high-speed networking is transforming the way organizations protect their data, systems and operations. These advanced technologies not only enhance security but also drive operational efficiency, scalability and business growth.

Accelerated AI-Powered Cybersecurity
Modern cybersecurity relies heavily on AI for predictive analytics and automated threat mitigation. NVIDIA GPUs are essential for training and deploying AI models due to their exceptional computational power.
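As a rough illustration of the kind of workload involved, the sketch below trains a tiny autoencoder on a GPU and flags traffic whose reconstruction error is unusually high. It is a minimal, hypothetical example using PyTorch with synthetic data; NVIDIA's actual cybersecurity stack (such as the Morpheus framework and CUDA-X libraries) provides its own higher-level, accelerated pipelines.

```python
# Minimal sketch: GPU-accelerated anomaly detection on network-flow features.
# Illustrative only; the feature count and threshold rule are hypothetical and
# the data is synthetic, not from any NVIDIA library.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

class FlowAutoencoder(nn.Module):
    """Small autoencoder: high reconstruction error flags anomalous traffic."""
    def __init__(self, n_features: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 16), nn.ReLU(), nn.Linear(16, 4))
        self.decoder = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, n_features))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = FlowAutoencoder().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Train on (synthetic) benign traffic so the model learns "normal" behaviour.
benign = torch.randn(10_000, 32, device=device)
for _ in range(5):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(benign), benign)
    loss.backward()
    optimizer.step()

# Score new flows: large reconstruction error suggests a potential threat.
new_flows = torch.randn(256, 32, device=device)
errors = (model(new_flows) - new_flows).pow(2).mean(dim=1)
suspicious = (errors > errors.mean() + 3 * errors.std()).nonzero().flatten()
print(f"{len(suspicious)} flows flagged for review")
```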

NVIDIA & Partners Will Discuss Supercharging of AI Development at GTC 2025

Generative AI is redefining computing, unlocking new ways to build, train and optimize AI models on PCs and workstations. From content creation and large and small language models to software development, AI-powered PCs and workstations are transforming workflows and enhancing productivity. At GTC 2025, running March 17-21 in the San Jose Convention Center, experts from across the AI ecosystem will share insights on deploying AI locally, optimizing models and harnessing cutting-edge hardware and software to enhance AI workloads—highlighting key advancements in RTX AI PCs and workstations.

Develop and Deploy on RTX
RTX GPUs are built with specialized AI hardware called Tensor Cores that provide the compute performance needed to run the latest and most demanding AI models. These high-performance GPUs can help build digital humans, chatbots, AI-generated podcasts and more. With more than 100 million GeForce RTX and NVIDIA RTX GPU users, developers have a large audience to target when new AI apps and features are deployed. In the session "Build Digital Humans, Chatbots, and AI-Generated Podcasts for RTX PCs and Workstations," Annamalai Chockalingam, senior product manager at NVIDIA, will showcase the end-to-end suite of tools developers can use to streamline development and deploy incredibly fast AI-enabled applications.
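For developers targeting those Tensor Cores, the usual entry point is mixed precision. The sketch below is a minimal, hypothetical PyTorch example (the model and data are placeholders, and a CUDA-capable GPU is assumed) showing how autocast routes matrix math through FP16, which Tensor Cores accelerate on RTX-class hardware.

```python
# Minimal sketch: letting Tensor Cores accelerate a training loop via mixed precision.
# Assumes an RTX-class CUDA GPU; model and data here are placeholders.
import torch
import torch.nn as nn

device = "cuda"
model = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()

x = torch.randn(16, 128, 512, device=device)   # (batch, sequence, features)

for _ in range(10):
    optimizer.zero_grad()
    # autocast runs matmuls in FP16, which maps onto Tensor Cores.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        out = model(x)
        loss = out.pow(2).mean()
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```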

AMD to Showcase Ryzen AI Max PRO Series at 3DExperience World 2025

It's that time again! 3DExperience World 2025 kicks off on February 23 and runs through February 26 at the George R. Brown Convention Center in Houston, Texas. The show is hosted by Dassault Systèmes and highlights annual advances and improvements throughout its product ecosystem. It's a great opportunity to meet the engineers, students, and industry professionals who use SolidWorks and other Dassault Systèmes applications across browsers, local workstations, and the cloud.

One of the best parts of the event for me is showcasing how advances in silicon engineering can lead to transformational products - systems that offer performance, features, and efficiency that weren't possible before. In 2024, the AMD Ryzen Threadripper PRO 7000 WX-Series processor stole the proverbial show with its excellent single-thread performance, support for multi-GPU configurations for AI training, and up to 96 cores and 2 TB of memory for the largest and most demanding projects. This year, AMD has complemented these full-size tower systems with compact and mobile workstations based on the new AMD Ryzen AI Max PRO Series processors. Drop by booth #919 and see the array of systems and demos on exhibit.

CoreWeave Launches Debut Wave of NVIDIA GB200 NVL72-based Cloud Instances

AI reasoning models and agents are set to transform industries, but delivering their full potential at scale requires massive compute and optimized software. The "reasoning" process involves multiple models, generating many additional tokens, and demands infrastructure with a combination of high-speed communication, memory and compute to ensure real-time, high-quality results. To meet this demand, CoreWeave has launched NVIDIA GB200 NVL72-based instances, becoming the first cloud service provider to make the NVIDIA Blackwell platform generally available. With rack-scale NVIDIA NVLink across 72 NVIDIA Blackwell GPUs and 36 NVIDIA Grace CPUs, scaling to up to 110,000 GPUs with NVIDIA Quantum-2 InfiniBand networking, these instances provide the scale and performance needed to build and deploy the next generation of AI reasoning models and agents.

NVIDIA GB200 NVL72 on CoreWeave
NVIDIA GB200 NVL72 is a liquid-cooled, rack-scale solution with a 72-GPU NVLink domain, which enables the six dozen GPUs to act as a single massive GPU. NVIDIA Blackwell features many technological breakthroughs that accelerate inference token generation, boosting performance while reducing service costs. For example, fifth-generation NVLink enables 130 TB/s of GPU bandwidth in one 72-GPU NVLink domain, and the second-generation Transformer Engine enables FP4 for faster AI performance while maintaining high accuracy. CoreWeave's portfolio of managed cloud services is purpose-built for Blackwell. CoreWeave Kubernetes Service optimizes workload orchestration by exposing NVLink domain IDs, ensuring efficient scheduling within the same rack. Slurm on Kubernetes (SUNK) supports the topology block plug-in, enabling intelligent workload distribution across GB200 NVL72 racks. In addition, CoreWeave's Observability Platform provides real-time insights into NVLink performance, GPU utilization and temperatures.
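Some quick back-of-the-envelope math, based only on the figures quoted above, shows what those rack-scale numbers mean per GPU; the snippet is illustrative arithmetic, not an NVIDIA or CoreWeave calculation.

```python
# Figures quoted above for one GB200 NVL72 rack; arithmetic is illustrative.
gpus_per_domain = 72
nvlink_domain_bw_tb_s = 130        # total NVLink bandwidth in the 72-GPU domain
max_cluster_gpus = 110_000         # Quantum-2 InfiniBand scale-out ceiling cited above

print(f"NVLink bandwidth per GPU: ~{nvlink_domain_bw_tb_s / gpus_per_domain:.2f} TB/s")
print(f"NVL72 racks needed at the quoted ceiling: ~{max_cluster_gpus / gpus_per_domain:.0f}")
```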

ASUS AI POD With NVIDIA GB200 NVL72 Platform Ready to Ramp Up Production for Scheduled Shipment in March

ASUS is proud to announce that ASUS AI POD, featuring the NVIDIA GB200 NVL72 platform, is ready to ramp up production for a scheduled shipping date of March 2025. ASUS remains dedicated to providing comprehensive end-to-end solutions and software services, encompassing everything from AI supercomputing to cloud services. With a strong focus on fostering AI adoption across industries, ASUS is positioned to empower clients in accelerating their time to market by offering a full spectrum of solutions.

Proof of concept, funded by ASUS
Honoring the commitment to delivering exceptional value to clients, ASUS is set to launch a proof of concept (POC) for the groundbreaking ASUS AI POD, powered by the NVIDIA Blackwell platform. This exclusive opportunity is now open to a select group of innovators who are eager to harness the full potential of AI computing. Innovators and enterprises can experience firsthand the full potential of AI and deep learning solutions at exceptional scale. To take advantage of this limited-time offer, please complete this survey at: forms.office.com/r/FrAbm5BfH2. The expert ASUS team of NVIDIA GB200 specialists will guide users through the next steps.

ADLINK Launches the DLAP Supreme Series

ADLINK Technology Inc., a global leader in edge computing, unveiled its new "DLAP Supreme Series", an edge generative AI platform. By integrating Phison's innovative aiDAPTIV+ AI solution, this series overcomes memory limitations in edge generative AI applications, significantly enhancing AI computing capabilities on edge devices. Without incurring high additional hardware costs, the DLAP Supreme series achieves notable AI performance improvements, helping enterprises reduce the cost barriers of AI deployment and accelerating the adoption of generative AI across various industries, especially in edge computing.

Lower AI Computing Costs and Significantly Improved Performance
As generative AI continues to penetrate various industries, many edge devices encounter performance bottlenecks due to insufficient DRAM capacity when executing large language models, affecting model operation and even causing issues such as inadequate token length. The DLAP Supreme series, leveraging aiDAPTIV+ technology, effectively overcomes these limitations and significantly enhances computing performance. Additionally, it supports generative language model training on edge devices, giving them AI model training capabilities and improving their autonomous learning and adaptability.
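Phison's aiDAPTIV+ middleware is proprietary, but the underlying idea of extending scarce DRAM with flash can be sketched in a few lines. The example below is a hypothetical, simplified PyTorch illustration: it stages layer weights on disk and pulls only the active layer into GPU memory, and it covers the forward pass only (real training must also juggle gradients and optimizer state). None of the names here come from the aiDAPTIV+ API, and the swap path is an assumption.

```python
# Conceptual sketch of flash-assisted execution: keep only the active layer's
# weights in GPU memory and stage the rest on an SSD. Not Phison's actual API.
# Assumes a CUDA GPU and a writable staging directory on fast NVMe storage.
import torch
import torch.nn as nn
from pathlib import Path

SWAP_DIR = Path("/tmp/layer_swap")   # hypothetical staging area on flash
SWAP_DIR.mkdir(exist_ok=True)

layers = [nn.Linear(4096, 4096) for _ in range(8)]

# Stage all layer weights to flash up front.
for i, layer in enumerate(layers):
    torch.save(layer.state_dict(), SWAP_DIR / f"layer_{i}.pt")

x = torch.randn(4, 4096, device="cuda")
for i in range(len(layers)):
    # Pull one layer at a time into GPU memory, run it, then release it.
    layer = nn.Linear(4096, 4096)
    layer.load_state_dict(torch.load(SWAP_DIR / f"layer_{i}.pt"))
    layer.to("cuda")
    x = layer(x)
    del layer
    torch.cuda.empty_cache()

print(x.shape)
```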

Intel Co-CEO Dampens Expectations for First-Gen "Falcon Shores" GPU

Intel's ambitious plan to challenge AMD and NVIDIA in the AI accelerator market may still be a little questionable, according to recent comments from interim co-CEO Michelle Johnston Holthaus at the Barclays 22nd Annual Global Technology Conference. The company's "Falcon Shores" project, which aims to merge Gaudi AI capabilities with Intel's data center GPU technology for HPC workloads, received surprising commentary from Holthaus. "We really need to think about how we go from Gaudi to our first generation of Falcon Shores, which is a GPU," she stated, before acknowledging potential limitations. "And I'll tell you right now, is it going to be wonderful? No, but it is a good first step."

Intel's pragmatic approach to AI hardware development was further highlighted when Holthaus addressed the company's product strategy. Rather than completely overhauling their development pipeline, she emphasized the value of iterative progress: "If you just stop everything and you go back to doing like all new product, products take a really long time to come to market. And so, you know, you're two years to three years out from having something." The co-CEO advocated for a more agile approach, stating, "I'd rather have something that I can do in smaller volume, learn, iterate, and get better so that we can get there." She acknowledged the enduring nature of AI market opportunities, particularly noting the current focus on training while highlighting the potential in other areas: "Obviously, AI is not going away. Obviously training is, you know, the focus today, but there's inference opportunities in other places where there will be different needs from a hardware perspective."

Google Genie 2 Promises AI-Generated Interactive Worlds With Realistic Physics and AI-Powered NPCs

For better or worse, generative AI has been a disruptive force in many industries, although its reception in video games has been lukewarm at best, with attempts at integrating AI-powered NPCs into games failing to impress most gamers. Now, Google DeepMind has a new model called Genie 2, which can supposedly be used to generate "action-controllable, playable, 3D environments for training and evaluating embodied agents." All the environments generated by Genie 2 can supposedly be interacted with, whether by a human piloting a character with a mouse and keyboard or an AI-controlled NPC, although it's unclear what the behind-the-scenes code and optimizations look like, both of which will be key to any real-world applications of the tech. Google says worlds created by Genie 2 can simulate the consequences of actions in addition to the world itself, all in real time. This means that when a player interacts with a world generated by Genie 2, the AI will respond with what its model suggests is the result of that action (like stepping on a leaf resulting in the destruction of said leaf). This extends to things like lighting, reflections, and physics, with Google showing off some impressively accurate water, volumetric effects, and gravity.

In a demo video, Google showed a number of different AI-generated worlds, each with their own interactive characters, from a spaceship interior being explored by an astronaut to a robot taking a stroll in a futuristic cyberpunk urban environment, and even a sailboat sailing over water and a cowboy riding through some grassy plains on horseback. What's perhaps most interesting about Genie 2's generated environments is that Genie has apparently given each world a different perspective and camera control scheme. Some of the examples shown are first-person, while others are third-person with the camera either locked to the character or free-floating around the character. Of course, being generative AI, there is some weirdness, and Google clearly chose its demo clips carefully to keep graphical anomalies from taking center stage. What's more, at least a few clips seem to strongly resemble worlds from popular video games: Assassin's Creed, Red Dead Redemption, Sony's Horizon franchise, and what appears to be a mix of various sci-fi games, including Warframe, Destiny, Mass Effect, and Subnautica. This isn't surprising, since the worlds Google used to showcase the AI are all generated from an image and text prompt as inputs, and, given what Google says it used as training data, it seems likely that gaming clips from those games made it into the model's training data.

Amazon AWS Announces General Availability of Trainium2 Instances, Reveals Details of Next Gen Trainium3 Chip

At AWS re:Invent, Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company, today announced the general availability of AWS Trainium2-powered Amazon Elastic Compute Cloud (Amazon EC2) instances and introduced new Trn2 UltraServers, enabling customers to train and deploy today's latest AI models as well as future large language models (LLMs) and foundation models (FMs) with exceptional levels of performance and cost efficiency. The company also unveiled its next-generation Trainium3 chips.

"Trainium2 is purpose built to support the largest, most cutting-edge generative AI workloads, for both training and inference, and to deliver the best price performance on AWS," said David Brown, vice president of Compute and Networking at AWS. "With models approaching trillions of parameters, we understand customers also need a novel approach to train and run these massive workloads. New Trn2 UltraServers offer the fastest training and inference performance on AWS and help organizations of all sizes to train and deploy the world's largest models faster and at a lower cost."

NVIDIA B200 "Blackwell" Records 2.2x Performance Improvement Over its "Hopper" Predecessor

We know that NVIDIA's latest "Blackwell" GPUs are fast, but how much faster are they than the previous-generation "Hopper"? Thanks to the latest MLPerf Training v4.1 results, NVIDIA's HGX B200 Blackwell platform has demonstrated massive performance gains, measuring up to a 2.2x improvement per GPU compared to the HGX H200 Hopper platform. The latest results, verified by MLCommons, reveal impressive achievements in large language model (LLM) training. The Blackwell architecture, featuring HBM3e high-bandwidth memory and fifth-generation NVLink interconnect technology, achieved double the performance per GPU for GPT-3 pre-training and a 2.2x boost for Llama 2 70B fine-tuning compared to the previous Hopper generation. Each benchmark system incorporated eight Blackwell GPUs operating at a 1,000 W TDP, connected via NVLink Switch for scale-up.

The network infrastructure utilized NVIDIA ConnectX-7 SuperNICs and Quantum-2 InfiniBand switches, enabling high-speed node-to-node communication for distributed training workloads. While previous Hopper-based systems required 256 GPUs to optimize performance for the GPT-3 175B benchmark, Blackwell accomplished the same task with just 64 GPUs, leveraging its larger HBM3e memory capacity and bandwidth. One thing to look out for is the upcoming GB200 NVL72 system, which promises even more significant gains beyond the 2.2x figure. It features expanded NVLink domains, higher memory bandwidth, and tight integration with NVIDIA Grace CPUs, complemented by ConnectX-8 SuperNIC and Quantum-X800 switch technologies. With faster switching and better data movement through Grace-Blackwell integration, we could see even more software optimization from NVIDIA to push the performance envelope.
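The quoted numbers can be put side by side with a little arithmetic; the snippet below simply restates the MLPerf figures cited in this post and is illustrative, not an independent benchmark.

```python
# Rough comparison from the MLPerf Training v4.1 figures quoted above.
hopper_gpus_for_gpt3 = 256     # GPUs Hopper systems used for the GPT-3 175B benchmark
blackwell_gpus_for_gpt3 = 64   # GPUs Blackwell needed for the same task
per_gpu_gain_llama2 = 2.2      # per-GPU speedup on Llama 2 70B fine-tuning

print(f"GPU-count reduction for GPT-3 175B: {hopper_gpus_for_gpt3 / blackwell_gpus_for_gpt3:.0f}x")
print(f"Per-GPU Llama 2 70B fine-tuning speedup: {per_gpu_gain_llama2}x")
```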

Meta Shows Open-Architecture NVIDIA "Blackwell" GB200 System for Data Center

During the Open Compute Project (OCP) Summit 2024, Meta, one of the prime members of the OCP project, showed its NVIDIA "Blackwell" GB200 systems for its massive data centers. We previously covered Microsoft's Azure server rack with GB200 GPUs featuring one-third of the rack space for computing and two-thirds for cooling. A few days later, Google showed off its smaller GB200 system, and today, Meta is showing off its GB200 system—the smallest of the bunch. To train a dense transformer large language model with 405B parameters and a context window of up to 128k tokens, like Llama 3.1 405B, Meta must redesign its data center infrastructure to run a distributed training job on two 24,000-GPU clusters. That is 48,000 GPUs used for training a single AI model.
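A rough memory estimate helps explain why training a 405B-parameter dense model needs clusters of this size. The sketch below uses a commonly cited rule of thumb of roughly 16 bytes per parameter for mixed-precision training state and an assumed 80 GB of HBM per GPU; the real cluster size is driven by throughput and data parallelism as much as by capacity, and Meta's exact parallelism recipe is not described in this post.

```python
# Rough memory math for a 405B-parameter dense model (illustrative assumptions).
params = 405e9
weights_bf16_tb = params * 2 / 1e12      # BF16 weights only
training_state_tb = params * 16 / 1e12   # ~16 B/param rule of thumb:
                                         # weights + gradients + Adam optimizer states
hbm_per_gpu_gb = 80                      # assumed H100-class GPU

print(f"Weights alone (BF16): {weights_bf16_tb:.2f} TB")
print(f"Full training state: {training_state_tb:.2f} TB")
print(f"GPUs needed just to hold that state: {training_state_tb * 1000 / hbm_per_gpu_gb:.0f}")
# The two 24,000-GPU clusters are sized for throughput (data parallelism and
# activations), far beyond this bare capacity floor.
```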

Called "Catalina," it is built on the NVIDIA Blackwell platform, emphasizing modularity and adaptability while incorporating the latest NVIDIA GB200 Grace Blackwell Superchip. To address the escalating power requirements of GPUs, Catalina introduces the Orv3, a high-power rack capable of delivering up to 140kW. The comprehensive liquid-cooled setup encompasses a power shelf supporting various components, including a compute tray, switch tray, the Orv3 HPR, Wedge 400 fabric switch with 12.8 Tbps switching capacity, management switch, battery backup, and a rack management controller. Interestingly, Meta also upgraded its "Grand Teton" system for internal usage, such as deep learning recommendation models (DLRMs) and content understanding with AMD Instinct MI300X. Those are used to inference internal models, and MI300X appears to provide the best performance per Dollar for inference. According to Meta, the computational demand stemming from AI will continue to increase exponentially, so more NVIDIA and AMD GPUs is needed, and we can't wait to see what the company builds.

Accenture to Train 30,000 of Its Employees on NVIDIA AI Full Stack

Accenture and NVIDIA today announced an expanded partnership, including Accenture's formation of a new NVIDIA Business Group, to help the world's enterprises rapidly scale their AI adoption. With generative AI demand driving $3 billion in Accenture bookings in its recently-closed fiscal year, the new group will help clients lay the foundation for agentic AI functionality using Accenture's AI Refinery, which uses the full NVIDIA AI stack—including NVIDIA AI Foundry, NVIDIA AI Enterprise and NVIDIA Omniverse—to advance areas such as process reinvention, AI-powered simulation and sovereign AI.

Accenture AI Refinery will be available on all public and private cloud platforms and will integrate seamlessly with other Accenture Business Groups to accelerate AI across the SaaS and Cloud AI ecosystem.

NVIDIA Cancels Dual-Rack NVL36x2 in Favor of Single-Rack NVL72 Compute Monster

NVIDIA has reportedly discontinued its dual-rack GB200 NVL36x2 GPU model, opting to focus on the single-rack GB200 NVL72 and NVL36 models. This shift, revealed by industry analyst Ming-Chi Kuo, aims to simplify NVIDIA's offerings in the AI and HPC markets. The decision was influenced by major clients like Microsoft, who prefer the NVL72's improved space efficiency and potential for enhanced inference performance. While both models perform similarly in AI large language model (LLM) training, the NVL72 is expected to excel in non-parallelizable inference tasks. As a reminder, the NVL72 features 36 Grace CPUs, delivering 2,592 Arm Neoverse V2 cores and 17 TB of LPDDR5X memory with 18.4 TB/s of aggregate bandwidth. Additionally, it includes 72 Blackwell GB200 SXM GPUs with a massive 13.5 TB of HBM3e combined, running at 576 TB/s of aggregate bandwidth.
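Divided across the rack, those headline numbers break down as follows; this is simple arithmetic on the specs quoted above, not additional disclosed detail.

```python
# Per-unit figures implied by the NVL72 specs quoted above.
grace_cpus, total_cores = 36, 2592
gpus, hbm3e_total_tb, hbm_bw_total_tb_s = 72, 13.5, 576

print(f"Cores per Grace CPU: {total_cores // grace_cpus}")                    # 72
print(f"HBM3e per Blackwell GPU: {hbm3e_total_tb / gpus * 1000:.0f} GB")      # ~188 GB
print(f"HBM bandwidth per GPU: {hbm_bw_total_tb_s / gpus:.0f} TB/s")          # 8 TB/s
```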

However, this shift presents significant challenges. The NVL72's power consumption of around 120 kW far exceeds typical data center capabilities, potentially limiting its immediate widespread adoption. The discontinuation of the NVL36x2 has also sparked concerns about NVIDIA's execution capabilities and may disrupt the supply chain for assembly and cooling solutions. Despite these hurdles, industry experts view this as a pragmatic approach to product planning in the dynamic AI landscape. While some customers may be disappointed by the dual-rack model's cancellation, NVIDIA's long-term outlook in the AI technology market remains strong. The company continues to work with clients and listen to their needs as it positions itself as a leader in high-performance computing solutions.

Huawei Starts Shipping "Ascend 910C" AI Accelerator Samples to Large NVIDIA Customers

Huawei has reportedly started shipping its Ascend 910C accelerator—the company's domestic alternative to NVIDIA's H100 accelerator for AI training and inference. As the South China Morning Post reports, Huawei is shipping samples of its accelerator to large NVIDIA customers, including companies like Alibaba, Baidu, and Tencent, which have ordered massive amounts of NVIDIA accelerators. Huawei is reportedly on track to deliver 70,000 chips, potentially worth $2 billion. With NVIDIA working on a B20 accelerator SKU that complies with US government export regulations, some analysts expect the Ascend 910C to outperform that cut-down NVIDIA part.

If the Ascend 910C receives positive results from Chinese tech giants, it could be the start of Huawei's expansion into data center accelerators, an expansion once hindered by the company's inability to manufacture advanced chips. Now, with foundries like SMIC printing 7 nm designs and possibly 5 nm coming soon, Huawei will leverage this technology to satisfy the domestic demand for more AI processing power. Competing on a global scale, though, remains a challenge. Companies like NVIDIA, AMD, and Intel have access to advanced nodes, which gives their AI accelerators more efficiency and performance.

Intel Targets 35% Cost Reduction in Sales and Marketing Group, Bracing for Tough Times Ahead

Intel's Sales and Marketing Group (SMG) has announced a 35% reduction in costs as the company looks to streamline operations and adapt to challenging market conditions. The cuts, revealed during an all-hands meeting on August 5th, will impact both jobs and marketing expenses within the SMG. Intel has directed the group to "simplify programs end-to-end" by the end of the year, a directive that comes on the heels of the company's announcement that it would lay off 15% of its global workforce to save $10 billion in operating expenses. "We are becoming a simpler, leaner, and more agile company that's easier for partners and customers to work with while ensuring we focus our investments on areas where we see the greatest opportunities for innovation and growth," Intel said in a statement to CRN. The company emphasized that this restructuring is about "building a stronger Intel for the future," with partners integral to its plans.

The job cuts within the SMG are expected to target overlapping responsibilities, such as account managers and industry-focused teams, which can confuse customers navigating Intel's complex organization. Additionally, the company plans to significantly reduce its marketing budget and simplify programs, aiming to save at least $100 million in the latter half of 2024 and an additional $300 million in the first half of 2025. The impact will also be felt in Intel's market development fund (MDF), a crucial tool for supporting OEMs and other partners through events, training, and more. An ex-Intel executive warned that the MDF had become vital as the company's product leadership waned, allowing it to maintain valuable relationships with partners. As Intel navigates these changes, its partners are bracing for the impact, with one CEO describing the situation as everyone "hunkering down and just waiting to hear something." Another partner executive expressed concerns about Intel's ability to maintain the level of service and support its customers have come to expect.

Microsoft Prepares MAI-1 In-House AI Model with 500B Parameters

According to The Information, Microsoft is developing a new AI model, internally named MAI-1, designed to compete with the leading models from Google, Anthropic, and OpenAI. This significant step forward in the tech giant's AI capabilities is boosted by Mustafa Suleyman, the former Google AI leader who previously served as CEO of Inflection AI before Microsoft acquired the majority of its staff and intellectual property for $650 million in March. MAI-1 is a custom Microsoft creation that utilizes training data and technology from Inflection but is not a transferred model. It is also distinct from Inflection's previously released Pi models, as confirmed by two Microsoft insiders familiar with the project. With approximately 500 billion parameters, MAI-1 will be significantly larger than its predecessors, surpassing the capabilities of Microsoft's smaller, open-source models.

For comparison, OpenAI's GPT-4 reportedly uses 1.8 trillion parameters in a sparse Mixture of Experts design, while open-source models from Meta and Mistral feature around 70 billion parameters in dense configurations. Microsoft's investment in MAI-1 highlights its commitment to staying competitive in the rapidly evolving AI landscape. The development of this large-scale model represents a significant step forward for the tech giant as it seeks to challenge industry leaders in the field. The increased computing power, training data, and financial resources required for MAI-1 demonstrate Microsoft's dedication to pushing the boundaries of AI capabilities and its intention to compete on its own. With the involvement of Mustafa Suleyman, a renowned expert in AI, the company is well-positioned to make significant strides in this field.
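To put the 500-billion-parameter figure in perspective, the snippet below computes the raw weight storage at common precisions; this is a rough illustration only and ignores activations, optimizer state, and KV caches.

```python
# Rough weight-storage footprint of a 500B-parameter model at common precisions.
params = 500e9
for name, bytes_per_param in [("FP32", 4), ("FP16/BF16", 2), ("INT8", 1), ("INT4", 0.5)]:
    print(f"{name}: {params * bytes_per_param / 1e12:.2f} TB just for the weights")
```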

US Government Wants Nuclear Plants to Offload AI Data Center Expansion

The expansion of AI technology affects not only the production and demand for graphics cards but also the electricity grid that powers them. Data centers hosting thousands of GPUs are becoming more common, and the industry has been building new facilities for GPU-enhanced servers to serve the need for more AI. However, these powerful GPUs often consume over 500 W per card, and NVIDIA's latest Blackwell B200 GPU has a TGP of 1,000 W, a full kilowatt. These kilowatt-class GPUs will populate data centers with tens of thousands of cards, resulting in multi-megawatt facilities. To manage the load on the national electricity grid, US President Joe Biden's administration has been discussing with big tech a re-evaluation of their power sources, possibly using smaller nuclear plants. In an Axios interview, Energy Secretary Jennifer Granholm noted that "AI itself isn't a problem because AI could help to solve the problem." The problem is the load on the national electricity grid, which can't sustain the rapid expansion of AI data centers.
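The grid-level concern follows from simple multiplication. The sketch below assumes a hypothetical 50,000-card deployment and a typical data-center overhead factor; both numbers are illustrative assumptions, not figures from the article.

```python
# Rough power math for a GPU farm built from kilowatt-class accelerators.
gpus = 50_000               # hypothetical card count
gpu_power_kw = 1.0          # e.g. Blackwell B200 at ~1 kW TGP
pue = 1.3                   # assumed facility overhead (cooling, networking, losses)

facility_mw = gpus * gpu_power_kw * pue / 1000
print(f"~{facility_mw:.0f} MW facility")   # roughly 65 MW for 50,000 GPUs
```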

The Department of Energy (DOE) has reportedly been talking with firms, most notably hyperscalers like Microsoft, Google, and Amazon, about considering nuclear fusion and fission power plants to satisfy the need for AI expansion. We have already discussed Microsoft's plan to embed a nuclear reactor near one of its data center facilities to help manage the load of thousands of GPUs running AI training and inference. This time, however, it is not just Microsoft; other tech giants are reportedly considering nuclear as well. They all need to offload their AI expansion from the US national power grid and develop a nuclear solution. Nuclear power supplies a mere 20% of US electricity, and the DOE is currently financing the restoration and resumption of service of the 800-MW Holtec Palisades nuclear generating station with $1.52 billion in funds. Microsoft is investing in a small modular reactor (SMR) energy strategy, which could be an example for other big tech companies to follow.

Altair SimSolid Transforms Simulation for Electronics Industry

Altair, a global leader in computational intelligence, announced the upcoming release of Altair SimSolid for electronics, bringing game-changing speed, ease, and precision to multi-physics scenario exploration for electronics, from chips, PCBs, and ICs to full system design. "As the electronics industry pushes the boundaries of complexity and miniaturization, engineers have struggled with simulations that often compromise on detail for expediency. Altair SimSolid will empower engineers to capture the intricate complexities of PCBs and ICs without simplification," said James R. Scapa, founder and chief executive officer, Altair. "Traditional simulation methods often require approximations when analyzing PCB structures due to their complexity. Altair SimSolid eliminates these approximations to run more accurate simulations for complex problems with vast dimensional disparities."

Altair SimSolid has revolutionized conventional analysis with its ability to accurately solve complex structural problems at blazing-fast speed while eliminating hours of laborious modeling. It eliminates geometry simplification and meshing, the two most time-consuming and expertise-intensive tasks in traditional finite element analysis. As a result, it delivers results in seconds to minutes—up to 25x faster than traditional finite element solvers—and effortlessly handles complex assemblies. Having experienced fast adoption in the aerospace and automotive industries, two sectors that typically face challenges associated with massive structures, Altair SimSolid is poised to play a significant role in the electronics market. The initial release, expected in Q2 2024, will support structural and thermal analysis for PCBs and ICs, with full electromagnetics analysis coming in a future release.

Dell Expands Generative AI Solutions Portfolio, Selects NVIDIA Blackwell GPUs

Dell Technologies is strengthening its collaboration with NVIDIA to help enterprises adopt AI technologies. By expanding the Dell Generative AI Solutions portfolio, including with the new Dell AI Factory with NVIDIA, organizations can accelerate integration of their data, AI tools and on-premises infrastructure to maximize their generative AI (GenAI) investments. "Our enterprise customers are looking for an easy way to implement AI solutions—that is exactly what Dell Technologies and NVIDIA are delivering," said Michael Dell, founder and CEO, Dell Technologies. "Through our combined efforts, organizations can seamlessly integrate data with their own use cases and streamline the development of customized GenAI models."

"AI factories are central to creating intelligence on an industrial scale," said Jensen Huang, founder and CEO, NVIDIA. "Together, NVIDIA and Dell are helping enterprises create AI factories to turn their proprietary data into powerful insights."

MAINGEAR Introduces PRO AI Workstations Featuring aiDAPTIV+ For Cost-Effective Large Language Model Training

MAINGEAR, a leading provider of high-performance custom PC systems, and Phison, a global leader in NAND controllers and storage solutions, today unveiled groundbreaking MAINGEAR PRO AI workstations with Phison's aiDAPTIV+ technology. Specifically engineered to democratize Large Language Model (LLM) development and training for small and medium-sized businesses (SMBs), these ultra-powerful workstations incorporate aiDAPTIV+ technology to deliver supercomputer LLM training capabilities at a fraction of the cost of traditional AI training servers.

As the demand for large-scale generative AI models continues to surge and their complexity increases, the potential for LLMs also expands. However, this rapid advancement in LLM AI technology has led to a notable boost in hardware requirements, making model training cost-prohibitive and inaccessible for many small to medium businesses.

Cerebras & G42 Break Ground on Condor Galaxy 3 - an 8 exaFLOPs AI Supercomputer

Cerebras Systems, the pioneer in accelerating generative AI, and G42, the Abu Dhabi-based leading technology holding group, today announced the build of Condor Galaxy 3 (CG-3), the third cluster in their constellation of AI supercomputers, the Condor Galaxy. Featuring 64 of Cerebras' newly announced CS-3 systems - all powered by the industry's fastest AI chip, the Wafer-Scale Engine 3 (WSE-3) - Condor Galaxy 3 will deliver 8 exaFLOPs of AI compute with 58 million AI-optimized cores. The Cerebras and G42 strategic partnership has already delivered 8 exaFLOPs of AI supercomputing performance via Condor Galaxy 1 and Condor Galaxy 2, each among the largest AI supercomputers in the world. Located in Dallas, Texas, Condor Galaxy 3 brings the current total of the Condor Galaxy network to 16 exaFLOPs.

"With Condor Galaxy 3, we continue to achieve our joint vision of transforming the worldwide inventory of AI compute through the development of the world's largest and fastest AI supercomputers," said Kiril Evtimov, Group CTO of G42. "The existing Condor Galaxy network has trained some of the leading open-source models in the industry, with tens of thousands of downloads. By doubling the capacity to 16exaFLOPs, we look forward to seeing the next wave of innovation Condor Galaxy supercomputers can enable." At the heart of Condor Galaxy 3 are 64 Cerebras CS-3 Systems. Each CS-3 is powered by the new 4 trillion transistor, 900,000 AI core WSE-3. Manufactured at TSMC at the 5-nanometer node, the WSE-3 delivers twice the performance at the same power and for the same price as the previous generation part. Purpose built for training the industry's largest AI models, WSE-3 delivers an astounding 125 petaflops of peak AI performance per chip.