News Posts matching #ML

Interview with RISC-V International: High-Performance Chips, AI, Ecosystem Fragmentation, and The Future

RISC-V is an industry-standard instruction set architecture (ISA) born at UC Berkeley. RISC-V is the fifth iteration in the lineage of historic RISC processors. The core value of the RISC-V ISA is the freedom of usage it offers. Any organization can leverage the ISA to design the best possible core for its specific needs, with no regional restrictions or licensing costs. It attracts a massive ecosystem of developers and companies building systems using the RISC-V ISA. To support these efforts and grow the ecosystem, the brains behind RISC-V decided to form RISC-V International, a non-profit foundation that governs the ISA and guides the ecosystem.

We had the privilege of talking with Andrea Gallo, Vice President of Technology at RISC-V International. Andrea oversees the technological advancement of RISC-V, collaborating with vendors and institutions to overcome challenges and expand its global presence. Andrea's career in technology spans several influential roles at major companies. Before joining RISC-V International, he worked at Linaro, where he pioneered Arm data center engineering initiatives, later overseeing diverse technological sectors as Vice President of Segment Groups, and ultimately managing crucial business development activities as executive Vice President. During his earlier tenure as a Fellow at ST-Ericsson, he focused on smartphone and application processor technology, and at STMicroelectronics he optimized hardware-software architectures and established international development teams.

Emteq Labs Unveils World's First Emotion-Sensing Eyewear

Emteq Labs, the market leader in emotion-recognition wearable technology, today announced the forthcoming introduction of Sense, the world's first emotion-sensing eyewear. Alongside the unveiling of Sense, the company is pleased to announce the appointment of Steen Strand, former head of the hardware division of Snap Inc., as its new Chief Executive Officer.

Over the past decade, Emteq Labs - led by renowned surgeon and facial musculature expert, Dr. Charles Nduka - has been at the forefront of engineering advanced technologies for sensing facial movements and emotions. This data has significant implications on health and well-being, but has never been available outside of a laboratory, healthcare facility, or other controlled setting. Now, Emteq Labs has developed Sense: a patented, AI-powered eyewear platform that provides lab-quality insights in real life and in real time. This includes comprehensive measurement and analysis of the wearer's facial expressions, dietary habits, mood, posture, attention levels, physical activity, and additional health-related metrics.

Western Digital Enterprise SSDs Certified to Support NVIDIA GB200 NVL72 System for Compute-Intensive AI Environments

Western Digital Corp. today announced that its PCIe Gen 5 DC SN861 E1.S enterprise-class NVMe SSDs have been certified to support the NVIDIA GB200 NVL72 rack-scale system.

The rapid rise of AI, ML, and large language models (LLMs) is creating a challenge for companies with two opposing forces. Data generation and consumption are accelerating, while organizations face pressure to quickly derive value from this data. Performance, scalability, and efficiency are essential for AI technology stacks as storage demands rise. Certified to be compatible with the GB200 NVL72 system, Western Digital's enterprise SSD addresses the growing needs of the AI market for high-speed accelerated computing combined with low latency to serve compute-intensive AI environments.

AMD Launches New Slim Form Factor Alveo UL3422 Accelerator Card

AMD today announced the AMD Alveo UL3422 accelerator card, the latest addition to its record-breaking family of accelerators designed for ultra-low latency electronic trading applications. AMD Alveo UL3422 provides trading firms, market makers and financial institutions with a slim form factor accelerator optimized for rack space and cost, and designed for a fast path to deployment in a wide range of servers. The Alveo UL3422 accelerator is powered by an AMD Virtex UltraScale+ FPGA that features a novel transceiver architecture with hardened, optimized network connectivity cores, custom built for high-speed trading. It enables ultra-low latency trade execution, achieving less than 3ns FPGA transceiver latency and breakthrough 'tick-to-trade' performance not achievable with standard off-the-shelf FPGAs.

"Speed is the ultimate advantage in the increasingly competitive world of high-speed trading," said Yousef Khalilollahi, corporate vice president & general manager, Adaptive Computing Group, AMD. "The Alveo UL3422 card provides a lower-cost entry point while still delivering cutting-edge latency performance, making it accessible to firms of all sizes that want to stay competitive in the ultra-low latency trading space."

Lenovo Accelerates Business Transformation with New ThinkSystem Servers Engineered for Optimal AI and Powered by AMD

Today, Lenovo announced its industry-leading ThinkSystem infrastructure solutions powered by AMD EPYC 9005 Series processors, as well as AMD Instinct MI325X accelerators. Backed by 225 of AMD's world-record performance benchmarks, the Lenovo ThinkSystem servers deliver an unparalleled combination of AMD technology-based performance and efficiency to tackle today's most demanding edge-to-cloud workloads, including AI training, inferencing and modeling.

"Lenovo is helping organizations of all sizes and across various industries achieve AI-powered business transformations," said Vlad Rozanovich, Senior Vice President, Lenovo Infrastructure Solutions Group. "Not only do we deliver unmatched performance, we offer the right mix of solutions to change the economics of AI and give customers faster time-to-value and improved total value of ownership."

Supermicro Currently Shipping Over 100,000 GPUs Per Quarter in its Complete Rack Scale Liquid Cooled Servers

Supermicro, Inc., a Total IT Solution Provider for Cloud, AI/ML, Storage, and 5G/Edge, is announcing a complete liquid cooling solution that includes powerful Coolant Distribution Units (CDUs), cold plates, Coolant Distribution Manifolds (CDMs), cooling towers and end to end management software. This complete solution reduces ongoing power costs and Day 0 hardware acquisition and data center cooling infrastructure costs. The entire end-to-end data center scale liquid cooling solution is available directly from Supermicro.

"Supermicro continues to innovate, delivering full data center plug-and-play rack scale liquid cooling solutions," said Charles Liang, CEO and president of Supermicro. "Our complete liquid cooling solutions, including SuperCloud Composer for the entire life-cycle management of all components, are now cooling massive, state-of-the-art AI factories, reducing costs and improving performance. The combination of Supermicro deployment experience and delivering innovative technology is resulting in data center operators coming to Supermicro to meet their technical and financial goals for both the construction of greenfield sites and the modernization of existing data centers. Since Supermicro supplies all the components, the time to deployment and online are measured in weeks, not months."

Apple Introduces the iPhone 16 and iPhone 16 Plus

Apple today announced iPhone 16 and iPhone 16 Plus, built for Apple Intelligence, the easy-to-use personal intelligence system that understands personal context to deliver intelligence that is helpful and relevant while protecting user privacy. The iPhone 16 lineup also introduces Camera Control, which brings new ways to capture memories, and will help users quickly access visual intelligence to learn about objects or places around them faster than ever before. The powerful camera system features a 48MP Fusion camera with a 2x Telephoto option, giving users two cameras in one, while a new Ultra Wide camera enables macro photography. Next-generation Photographic Styles help users personalize their images, and spatial photo and video capture allows users to relive life's precious memories with remarkable depth on Apple Vision Pro. The new A18 chip delivers a huge leap in performance and efficiency, enabling demanding AAA games, as well as a big boost in battery life.

iPhone 16 and iPhone 16 Plus will be available in five bold colors: black, white, pink, teal, and ultramarine. Pre-orders begin Friday, September 13, with availability beginning Friday, September 20.

Apple Debuts the iPhone 16 Pro and iPhone 16 Pro Max - Now with a Camera Button

Apple today introduced iPhone 16 Pro and iPhone 16 Pro Max, featuring Apple Intelligence, larger display sizes, new creative capabilities with innovative pro camera features, stunning graphics for immersive gaming, and more—all powered by the A18 Pro chip. With Apple Intelligence, powerful Apple-built generative models come to iPhone in the easy-to-use personal intelligence system that understands personal context to deliver intelligence that is helpful and relevant while protecting user privacy. Camera Control unlocks a fast, intuitive way to tap into visual intelligence and easily interact with the advanced camera system. Featuring a new 48MP Fusion camera with a faster quad-pixel sensor that enables 4K120 FPS video recording in Dolby Vision, these new Pro models achieve the highest resolution and frame-rate combination ever available on iPhone. Additional advancements include a new 48MP Ultra Wide camera for higher-resolution photography, including macro; a 5x Telephoto camera on both Pro models; and studio-quality mics to record more true-to-life audio. The durable titanium design is strong yet lightweight, with larger display sizes, the thinnest borders on any Apple product, and a huge leap in battery life—with iPhone 16 Pro Max offering the best battery life on iPhone ever.

iPhone 16 Pro and iPhone 16 Pro Max will be available in four stunning finishes: black titanium, natural titanium, white titanium, and desert titanium. Pre-orders begin Friday, September 13, with availability beginning Friday, September 20.

Efficient Teams Up with GlobalFoundries to Develop Ultra-Low Power MRAM Processors

Today, Efficient announced a strategic partnership with GlobalFoundries (GF) to bring to market a new high-performance computer processor that is up to 166x more energy-efficient than industry-standard embedded CPUs. Efficient is already working with select customers for early access and customer sampling by summer 2025. The official introduction of the category-creating processor will mark a new era in computing, free from restrictive energy limitations.

The partnership will combine Efficient's novel architecture and technology with GF's U.S.-based manufacturing, global reach and market expertise to enable a quantum leap in edge device capabilities and battery lifetime. Through this partnership, Efficient will provide the computing power to smarter, longer-lasting devices and applications across the Internet of Things, wearable and implantable health devices, space systems, and security and defense.

NVIDIA Blackwell Sets New Standard for Generative AI in MLPerf Inference Benchmark

As enterprises race to adopt generative AI and bring new services to market, the demands on data center infrastructure have never been greater. Training large language models is one challenge, but delivering LLM-powered real-time services is another. In the latest round of MLPerf industry benchmarks, Inference v4.1, NVIDIA platforms delivered leading performance across all data center tests. The first-ever submission of the upcoming NVIDIA Blackwell platform revealed up to 4x more performance than the NVIDIA H100 Tensor Core GPU on MLPerf's biggest LLM workload, Llama 2 70B, thanks to its use of a second-generation Transformer Engine and FP4 Tensor Cores.

The NVIDIA H200 Tensor Core GPU delivered outstanding results on every benchmark in the data center category - including the latest addition to the benchmark, the Mixtral 8x7B mixture of experts (MoE) LLM, which features a total of 46.7 billion parameters, with 12.9 billion parameters active per token. MoE models have gained popularity as a way to bring more versatility to LLM deployments, as they're capable of answering a wide variety of questions and performing more diverse tasks in a single deployment. They're also more efficient since they only activate a few experts per inference - meaning they deliver results much faster than dense models of a similar size.
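
The two parameter counts quoted above give a rough sense of why an MoE model does less work per token than a dense model of the same total size. The following is a minimal sketch that uses only those two figures; the function name is illustrative, and parameter share is only a proxy for compute, not a measured speedup.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of a model's parameters that are used for each token."""
    return active_params_b / total_params_b


# Mixtral 8x7B figures quoted above, in billions of parameters.
mixtral_fraction = active_fraction(46.7, 12.9)
print(f"{mixtral_fraction:.1%} of parameters are active per token")
```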

Cerebras Launches the World's Fastest AI Inference

Today, Cerebras Systems, the pioneer in high performance AI compute, announced Cerebras Inference, the fastest AI inference solution in the world. Delivering 1,800 tokens per second for Llama3.1 8B and 450 tokens per second for Llama3.1 70B, Cerebras Inference is 20 times faster than NVIDIA GPU-based solutions in hyperscale clouds. Starting at just 10c per million tokens, Cerebras Inference is priced at a fraction of GPU solutions, providing 100x higher price-performance for AI workloads.

Unlike alternative approaches that compromise accuracy for performance, Cerebras offers the fastest performance while maintaining state of the art accuracy by staying in the 16-bit domain for the entire inference run. Cerebras Inference is priced at a fraction of GPU-based competitors, with pay-as-you-go pricing of 10 cents per million tokens for Llama 3.1 8B and 60 cents per million tokens for Llama 3.1 70B.
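
To put the quoted throughput and pricing side by side, here is a back-of-the-envelope sketch. It assumes the tokens-per-second figures apply to a single output stream and that pricing is linear in tokens; neither detail is spelled out above, and the dictionary and function names are illustrative.

```python
# Throughput and pay-as-you-go prices quoted in the announcement.
MODELS = {
    "Llama 3.1 8B": {"tokens_per_s": 1800, "usd_per_m_tokens": 0.10},
    "Llama 3.1 70B": {"tokens_per_s": 450, "usd_per_m_tokens": 0.60},
}


def estimate(model: str, tokens: int) -> tuple[float, float]:
    """Return (seconds, dollars) to generate `tokens` on one stream."""
    spec = MODELS[model]
    seconds = tokens / spec["tokens_per_s"]
    dollars = tokens / 1_000_000 * spec["usd_per_m_tokens"]
    return seconds, dollars


for name in MODELS:
    seconds, dollars = estimate(name, 1_000_000)
    print(f"{name}: {seconds / 60:.1f} min, ${dollars:.2f} per million tokens")
```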

Samsung to Install High-NA EUV Machines Ahead of TSMC in Q4 2024 or Q1 2025

Samsung Electronics is set to make a significant leap in semiconductor manufacturing technology with the introduction of its first High-NA 0.55 EUV lithography tool. The company plans to install the ASML Twinscan EXE:5000 system at its Hwaseong campus between Q4 2024 and Q1 2025, marking a crucial step in developing next-generation process technologies for logic and DRAM production. This move positions Samsung about a year behind Intel but ahead of rivals TSMC and SK Hynix in adopting High-NA EUV technology. The system is expected to be operational by mid-2025, primarily for research and development purposes. Samsung is not just focusing on the lithography equipment itself but is building a comprehensive ecosystem around High-NA EUV technology.

The company is collaborating with several key partners like Lasertec (developing inspection equipment for High-NA photomasks), JSR (working on advanced photoresists), Tokyo Electron (enhancing etching machines), and Synopsys (shifting to curvilinear patterns on photomasks for improved circuit precision). The High-NA EUV technology promises significant advancements in chip manufacturing. With an 8 nm resolution capability, it could make transistors about 1.7 times smaller and increase transistor density by nearly three times compared to current Low-NA EUV systems. However, the transition to High-NA EUV comes with challenges. The tools are more expensive, costing up to $380 million each, and have a smaller imaging field. Their larger size also requires chipmakers to reconsider fab layouts. Despite these hurdles, Samsung aims for commercial implementation of High-NA EUV by 2027.
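
The "1.7 times smaller" and "nearly three times" density figures are two views of the same number, as a quick check shows. The sketch below assumes that density scales with the square of the linear shrink, and it uses the 0.33 numerical aperture of current Low-NA EUV tools, which is not stated in the text above.

```python
# Figure quoted in the text above.
linear_shrink = 1.7  # transistors "about 1.7 times smaller"

# Density counts features per unit area, so a linear shrink applies
# in both the x and y directions.
density_gain = linear_shrink ** 2

# Assumption (not from the text): Low-NA EUV tools use NA = 0.33.
# Resolution scales roughly with 1/NA, so the NA ratio is a second
# estimate of the linear shrink.
na_ratio = 0.55 / 0.33

print(f"density gain from a 1.7x shrink: {density_gain:.2f}x")
print(f"NA ratio 0.55 / 0.33: {na_ratio:.2f}x")
```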

Geekbench AI Hits 1.0 Release: CPUs, GPUs, and NPUs Finally Get AI Benchmarking Solution

Primate Labs, the developer behind the popular Geekbench benchmarking suite, has launched Geekbench AI—a comprehensive benchmark tool designed to measure the artificial intelligence capabilities of various devices. Geekbench AI, previously known as Geekbench ML during its preview phase, has now reached version 1.0. The benchmark is available on multiple operating systems, including Windows, Linux, macOS, Android, and iOS, making it accessible to many users and developers. One of Geekbench AI's key features is its multifaceted approach to scoring. The benchmark utilizes three distinct precision levels: single-precision, half-precision, and quantized data. This evaluation aims to provide a more accurate representation of AI performance across different hardware designs.
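
To illustrate why the three precision levels are scored separately, the sketch below rounds one value through each representation and compares the error. It is not how Geekbench AI implements its workloads: the symmetric 8-bit scheme and the helper names are assumptions made for the example.

```python
import struct


def to_float32(x: float) -> float:
    """Round-trip a Python float through IEEE 754 single precision."""
    return struct.unpack("f", struct.pack("f", x))[0]


def to_float16(x: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_int8_and_back(x: float, max_abs: float = 1.0) -> float:
    """Quantize to a signed 8-bit step size, then convert back (assumed scheme)."""
    scale = max_abs / 127
    q = max(-127, min(127, round(x / scale)))
    return q * scale


value = 0.1
errors = {
    "single": abs(to_float32(value) - value),
    "half": abs(to_float16(value) - value),
    "quantized": abs(to_int8_and_back(value) - value),
}
for name, err in errors.items():
    print(f"{name:>9}: absolute error {err:.2e}")
```

For this input the error grows from single to half to quantized, which is the speed-versus-precision trade-off the benchmark tries to expose.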

In addition to speed, Geekbench AI places a strong emphasis on accuracy. The benchmark assesses how closely each test's output matches the expected results, offering insights into the trade-offs between performance and precision. The release of Geekbench AI 1.0 brings support for new frameworks, including OpenVINO, ONNX, and Qualcomm QNN, expanding its compatibility across various platforms. Primate Labs has also implemented measures to ensure fair comparisons, such as enforcing minimum runtime durations for each workload. The company noted that Samsung and NVIDIA are already utilizing the software to measure their chip performance in-house, showing that adoption is already strong. While the benchmark provides valuable insights, real-world AI applications are still limited, and reliance on a few benchmarks may paint a partial picture. Nevertheless, Geekbench AI represents a significant step forward in standardizing AI performance measurement, potentially influencing future consumer choices in the AI-driven tech market. Results from the benchmark runs can be seen here.

NVIDIA MLPerf Training Results Showcase Unprecedented Performance and Elasticity

The full-stack NVIDIA accelerated computing platform has once again demonstrated exceptional performance in the latest MLPerf Training v4.0 benchmarks. NVIDIA more than tripled the performance on the large language model (LLM) benchmark, based on GPT-3 175B, compared to the record-setting NVIDIA submission made last year. Using an AI supercomputer featuring 11,616 NVIDIA H100 Tensor Core GPUs connected with NVIDIA Quantum-2 InfiniBand networking, NVIDIA achieved this remarkable feat through larger scale - more than triple that of the 3,584 H100 GPU submission a year ago - and extensive full-stack engineering.

Thanks to the scalability of the NVIDIA AI platform, Eos can now train massive AI models like GPT-3 175B even faster, and this great AI performance translates into significant business opportunities. For example, in NVIDIA's recent earnings call, we described how LLM service providers can turn a single dollar invested into seven dollars in just four years running the Llama 3 70B model on NVIDIA HGX H200 servers. This return assumes an LLM service provider serving Llama 3 70B at $0.60/M tokens, with an HGX H200 server throughput of 24,000 tokens/second.
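
The return quoted above can be partly reconstructed from the two stated inputs. The sketch below assumes the server runs at the quoted throughput around the clock for four 365-day years, and it derives the investment implied by a seven-to-one return; the text gives neither the utilization nor the server cost, so both are inferences rather than NVIDIA's figures.

```python
# Inputs quoted in the text above.
tokens_per_second = 24_000
usd_per_million_tokens = 0.60
years = 4

# Assumption: 100% utilization, 365-day years.
seconds = years * 365 * 24 * 3600
total_tokens = tokens_per_second * seconds
revenue_usd = total_tokens / 1_000_000 * usd_per_million_tokens

# Inference: the total investment that would make this a 7x return.
implied_investment_usd = revenue_usd / 7

print(f"tokens served: {total_tokens:,}")
print(f"revenue over {years} years: ${revenue_usd:,.0f}")
print(f"implied investment at 7x: ${implied_investment_usd:,.0f}")
```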

Arm Also Announces Three New GPUs for Consumer Devices

In addition to its two new CPU cores, Arm has announced three new GPU cores, namely the Immortalis-G925, Mali-G725 and Mali-G625. Starting from the top, the Immortalis-G925 is said to bring up to 37 percent better performance at 30 percent lower power usage compared to last year's Immortalis-G720 GPU core, whilst having two additional GPU cores in the test scenario. It's also said to bring up to 36 percent improved inference in AI/ML workloads. Because it is aimed at gaming phones, it has been given a big overhaul when it comes to ray tracing: Arm claims that it can offer either up to 52 percent increased performance by reducing the accuracy in scenes with intricate objects, or 27 percent more performance with maintained accuracy.

The Immortalis-G925 supports 50 percent more shader cores and it supports configurations of up to 24 cores, compared to 16 cores for the Immortalis-G720. The Mali-G725 will be available with between six and nine cores, whereas the Mali-G625 will sport between one and five cores. The Mali-G625 is intended for smartwatches and entry-level mobile devices where a more complex GPU might not be suitable due to power draw. The Mali-G725 on the other hand is targeting upper mid-range devices and the Immortalis-G925 is aimed towards flagship devices or gaming phones as mentioned above. In related news, Arm said it's working with Epic Games to get its Unreal Engine 5 desktop renderer up and running on Android, which could lead to more complex games on mobile devices.

Micron First to Achieve Qualification Sample Milestone to Accelerate Ecosystem Adoption of CXL 2.0 Memory

Micron Technology, a leader in innovative data center solutions, today announced it has achieved its qualification sample milestone for the Micron CZ120 memory expansion modules using Compute Express Link (CXL). Micron is the first in the industry to achieve this milestone, which accelerates the adoption of CXL solutions within the data center to tackle the growing memory challenges stemming from existing data-intensive workloads and emerging artificial intelligence (AI) and machine learning (ML) workloads.

Using a new and emerging CXL standard, the CZ120 required substantial hardware testing for reliability, quality and performance across CPU providers and OEMs, along with comprehensive software testing for compatibility and compliance with OS and hypervisor vendors. This achievement reflects the collaboration and commitment across the data center ecosystem to validate the advantages of CXL memory. By testing the combined products for interoperability and compatibility across hardware and software, the Micron CZ120 memory expansion modules satisfy the rigorous standards for reliability, quality and performance required by customers' data centers.

New Performance Optimizations Supercharge NVIDIA RTX AI PCs for Gamers, Creators and Developers

NVIDIA today announced at Microsoft Build new AI performance optimizations and integrations for Windows that help deliver maximum performance on NVIDIA GeForce RTX AI PCs and NVIDIA RTX workstations. Large language models (LLMs) power some of the most exciting new use cases in generative AI and now run up to 3x faster with ONNX Runtime (ORT) and DirectML using the new NVIDIA R555 Game Ready Driver. ORT and DirectML are high-performance tools used to run AI models locally on Windows PCs.

WebNN, an application programming interface for web developers to deploy AI models, is now accelerated with RTX via DirectML, enabling web apps to incorporate fast, AI-powered capabilities. And PyTorch will support DirectML execution backends, enabling Windows developers to train and infer complex AI models on Windows natively. NVIDIA and Microsoft are collaborating to scale performance on RTX GPUs. These advancements build on NVIDIA's world-leading AI platform, which accelerates more than 500 applications and games on over 100 million RTX AI PCs and workstations worldwide.

ChatGPT Comes to Desktop with OpenAI's Latest GPT-4o Model That Talks With Users

At OpenAI's spring update, a lot of eyes were fixed on the company that spurred the AI boom with the ChatGPT application. Now almost a must-have app for consumers and prosumers alike, ChatGPT is the de facto application for the latest AI innovation, backed by researchers and scientists from OpenAI. Today, OpenAI announced a new model called GPT-4o (Omni), which aims to bring advanced intelligence, improved overall capabilities, and real-time voice interaction with users. The ChatGPT application is now meant to work like a personal assistant that actively communicates with users and provides much broader capabilities. OpenAI claims that it can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, similar to human response time in conversations.

OpenAI states that it wants ChatGPT's latest GPT-4o model to be available to free users as well as Plus and Team subscribers, with paid subscribers getting 5x higher usage and early access to the model. Interestingly, the GPT-4o model is much improved across a variety of standard benchmarks like MMLU, Math, HumanEval, GPQA, and others, where it now surpasses almost all models except Claude 3 Opus in MGSM. It now understands more than 50 languages and can do real-time translation. In addition to the new model, OpenAI announced that it is launching a desktop ChatGPT app, which can act as a personal assistant and see what is happening on the screen, but only when the user commands it. This is supposed to bring a much more refined user experience and enable users to use AI as a third person to help understand the screen's content. The app is initially only available on macOS, and we are waiting for OpenAI to launch the Windows ChatGPT application so everyone can experience the new technology.

SpiNNcloud Systems Announces First Commercially Available Neuromorphic Supercomputer

Today, in advance of ISC High Performance 2024, SpiNNcloud Systems announced the commercial availability of its SpiNNaker2 platform, a supercomputer-level hybrid AI high-performance computer system based on principles of the human brain. Pioneered by Steve Furber, designer of the original ARM and SpiNNaker1 architectures, the SpiNNaker2 supercomputing platform uses a large number of low-power processors for efficiently computing AI and other workloads.

First-generation SpiNNaker1 architecture is currently used in dozens of research groups across 23 countries worldwide. Sandia National Laboratories, Technical University of München and Universität Göttingen are among the first customers placing orders for SpiNNaker2, which was developed around commercialized IP invented in the Human Brain Project, a billion-euro research project funded by the European Union to design intelligent, efficient artificial systems.

Apple Introduces the M4 Chip

Apple today announced M4, the latest chip delivering phenomenal performance to the all-new iPad Pro. Built using second-generation 3-nanometer technology, M4 is a system on a chip (SoC) that advances the industry-leading power efficiency of Apple silicon and enables the incredibly thin design of iPad Pro. It also features an entirely new display engine to drive the stunning precision, color, and brightness of the breakthrough Ultra Retina XDR display on iPad Pro. A new CPU has up to 10 cores, while the new 10-core GPU builds on the next-generation GPU architecture introduced in M3, and brings Dynamic Caching, hardware-accelerated ray tracing, and hardware-accelerated mesh shading to iPad for the first time. M4 has Apple's fastest Neural Engine ever, capable of up to 38 trillion operations per second, which is faster than the neural processing unit of any AI PC today. Combined with faster memory bandwidth, along with next-generation machine learning (ML) accelerators in the CPU, and a high-performance GPU, M4 makes the new iPad Pro an outrageously powerful device for artificial intelligence.

"The new iPad Pro with M4 is a great example of how building best-in-class custom silicon enables breakthrough products," said Johny Srouji, Apple's senior vice president of Hardware Technologies. "The power-efficient performance of M4, along with its new display engine, makes the thin design and game-changing display of iPad Pro possible, while fundamental improvements to the CPU, GPU, Neural Engine, and memory system make M4 extremely well suited for the latest applications leveraging AI. Altogether, this new chip makes iPad Pro the most powerful device of its kind."

Intel Builds World's Largest Neuromorphic System to Enable More Sustainable AI

Today, Intel announced that it has built the world's largest neuromorphic system. Code-named Hala Point, this large-scale neuromorphic system, initially deployed at Sandia National Laboratories, utilizes Intel's Loihi 2 processor, aims at supporting research for future brain-inspired artificial intelligence (AI), and tackles challenges related to the efficiency and sustainability of today's AI. Hala Point advances Intel's first-generation large-scale research system, Pohoiki Springs, with architectural improvements to achieve over 10 times more neuron capacity and up to 12 times higher performance.

"The computing cost of today's AI models is rising at unsustainable rates. The industry needs fundamentally new approaches capable of scaling. For that reason, we developed Hala Point, which combines deep learning efficiency with novel brain-inspired learning and optimization capabilities. We hope that research with Hala Point will advance the efficiency and adaptability of large-scale AI technology." -Mike Davies, director of the Neuromorphic Computing Lab at Intel Labs

Intel Announces New Program for AI PC Software Developers and Hardware Vendors

Intel Corporation today announced the creation of two new artificial intelligence (AI) initiatives as part of the AI PC Acceleration Program: the AI PC Developer Program and the addition of independent hardware vendors to the program. These are critical milestones in Intel's pursuit of enabling the software and hardware ecosystem to optimize and maximize AI on more than 100 million Intel-based AI PCs through 2025.

"We have made great strides with our AI PC Acceleration Program by working with the ecosystem. Today, with the addition of the AI PC Developer Program, we are expanding our reach to go beyond large ISVs and engage with small- and medium-sized players and aspiring developers. Our goal is to drive a frictionless experience by offering a broad set of tools including the new AI-ready Developer Kit," said Carla Rodriguez, Intel vice president and general manager of Client Software Ecosystem Enabling.

UL Announces the Procyon AI Image Generation Benchmark Based on Stable Diffusion

We're excited to announce we're expanding our AI Inference benchmark offerings with the UL Procyon AI Image Generation Benchmark, coming Monday, 25th March. AI has the potential to be one of the most significant new technologies hitting the mainstream this decade, and many industry leaders are competing to deliver the best AI Inference performance through their hardware. Last year, we launched the first of our Procyon AI Inference Benchmarks for Windows, which measured AI Inference performance with a workload using Computer Vision.

The upcoming UL Procyon AI Image Generation Benchmark provides a consistent, accurate and understandable workload for measuring the AI performance of high-end hardware, built with input from members of the industry to ensure fair and comparable results across all supported hardware.

Lenovo and Anaconda Announce Agreement to Accelerate AI Development and Deployment

Today, Lenovo announced a strategic partnership with Anaconda Inc., the leading provider of the world's most popular artificial intelligence (AI), machine learning (ML) and data science platform, to empower Lenovo's high performance data science workstations. The partnership will couple Lenovo's trusted ThinkStation and ThinkPad workstation product portfolio heritage and leadership with Anaconda's enterprise strengths for open-source leadership, security, and reliability.

The rapidly evolving world of artificial intelligence, deep learning and generative AI is opening up new opportunities for businesses and data scientists. Much of the AI innovation taking place today is driven by open-source software and cloud-based solutions, with Python being a leading software language for AI applications. However, the data security risks associated with utilizing open-source software at an enterprise level, privacy concerns and the often prohibitive cost of cloud-based AI solutions are causing many organizations to rethink their approach to investment in AI development. With Intel-powered Lenovo workstations architected with the latest generations of professional NVIDIA GPUs built for large-language model fine-tuning, and the Anaconda Navigator's ability to enable businesses to leverage open-source and AI with enhanced security, scale, and governance mechanisms in place, the partnership allows data scientists to create and deploy AI solutions with first class hardware and enterprise-grade AI software support within a more manageable investment framework.

Ethernet Switch Chips are Now Infected with AI: Broadcom Announces Trident 5-X12

Artificial intelligence has been a hot topic this year, and everything is now an AI processor, from CPUs to GPUs, NPUs, and many others. However, it was only a matter of time before we saw an integration of AI processing elements into the networking chips. Today, Broadcom announced its new Ethernet switching silicon called Trident 5-X12. The Trident 5-X12 delivers 16 Tb/s of bandwidth, double that of the previous Trident generation while adding support for fast 800G ports for connection to Tomahawk 5 spine switch chips. The 5-X12 is software-upgradable and optimized for dense 1RU top-of-rack designs, enabling configurations with up to 48x200G downstream server ports and 8x800G upstream fabric ports. The 800G support is added using 100G-PAM4 SerDes, which enables up to 4 m DAC and linear optics.
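
The quoted top-of-rack configuration adds up to the chip's headline bandwidth, which a few lines of arithmetic confirm. The sketch uses only the port counts and speeds given above; the variable names are illustrative.

```python
# Port configuration quoted in the text above: (count, speed in Gb/s).
downstream_ports = (48, 200)  # server-facing ports
upstream_ports = (8, 800)     # fabric-facing ports

downstream_gbps = downstream_ports[0] * downstream_ports[1]
upstream_gbps = upstream_ports[0] * upstream_ports[1]
total_tbps = (downstream_gbps + upstream_gbps) / 1000

print(f"downstream: {downstream_gbps} Gb/s, upstream: {upstream_gbps} Gb/s")
print(f"total: {total_tbps} Tb/s")
```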

However, this is not only a switch chip on its own. Broadcom has added AI processing elements in an inference engine called NetGNT (Networking General-purpose Neural-network Traffic-analyzer). It can detect common traffic patterns and optimize data movement across the chip. Specifically, the company has listed an example of the system doing AI/ML workloads. In that case, NetGNT performs intelligent traffic analysis to avoid network congestion in these workloads. For example, it can detect the so-called "incast" patterns in real-time, where many flows converge simultaneously on the same port. By recognizing the start of incast early, NetGNT can invoke hardware-based congestion control techniques to prevent performance degradation without added latency.
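
To make the "incast" pattern concrete, the sketch below flags egress ports that receive traffic from many distinct flows within the same short time window. It is a plain counting heuristic written for illustration, not Broadcom's NetGNT, which is described as a neural-network inference engine; the function name, the packet tuple layout, and the thresholds are all hypothetical.

```python
from collections import defaultdict


def find_incast_ports(packets, window_us=1000, min_flows=8):
    """Return egress ports hit by at least `min_flows` distinct flows
    within one time window.

    `packets` is an iterable of (timestamp_us, flow_id, egress_port).
    """
    flows_by_bucket = defaultdict(set)
    for timestamp_us, flow_id, egress_port in packets:
        bucket = timestamp_us // window_us
        flows_by_bucket[(bucket, egress_port)].add(flow_id)
    return sorted(
        {port for (_, port), flows in flows_by_bucket.items()
         if len(flows) >= min_flows}
    )


# Ten flows converge on port 7 within one millisecond; port 3 sees one flow.
sample = [(100 * i, f"flow-{i}", 7) for i in range(10)]
sample.append((200, "flow-x", 3))
incast_ports = find_incast_ports(sample)
print(incast_ports)
```

A fixed window and threshold keep the example short; a real switch would act on such a signal by invoking its congestion control mechanisms, as the text describes.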