News Posts matching #NVIDIA NeMo

Return to Keyword Browsing

NVIDIA Blackwell Delivers Breakthrough Performance in Latest MLPerf Training Results

Press Release by

Jun 6th, 2025 07:50 Discuss (0 Comments)

NVIDIA is working with companies worldwide to build out AI factories—speeding the training and deployment of next-generation AI applications that use the latest advancements in training and inference. The NVIDIA Blackwell architecture is built to meet the heightened performance requirements of these new applications. In the latest round of MLPerf Training—the 12th since the benchmark's introduction in 2018—the NVIDIA AI platform delivered the highest performance at scale on every benchmark and powered every result submitted on the benchmark's toughest large language model (LLM)-focused test: Llama 3.1 405B pretraining.

The NVIDIA platform was the only one that submitted results on every MLPerf Training v5.0 benchmark—underscoring its exceptional performance and versatility across a wide array of AI workloads, spanning LLMs, recommendation systems, multimodal LLMs, object detection and graph neural networks. The at-scale submissions used two AI supercomputers powered by the NVIDIA Blackwell platform: Tyche, built using NVIDIA GB200 NVL72 rack-scale systems, and Nyx, based on NVIDIA DGX B200 systems. In addition, NVIDIA collaborated with CoreWeave and IBM to submit GB200 NVL72 results using a total of 2,496 Blackwell GPUs and 1,248 NVIDIA Grace CPUs.

Read full story

Dell Technologies Unveils Next Generation Enterprise AI Solutions with NVIDIA

Press Release by

May 19th, 2025 12:31 Discuss (0 Comments)

The world's top provider of AI-centric infrastructure, Dell Technologies, announces innovations across the Dell AI Factory with NVIDIA - all designed to help enterprises accelerate AI adoption and achieve faster time to value.

Why it matters
As enterprises make AI central to their strategy and progress from experimentation to implementation, their demand for accessible AI skills and technologies grows exponentially. Dell and NVIDIA continue the rapid pace of innovation with updates to the Dell AI Factory with NVIDIA, including robust AI infrastructure, solutions and services that streamline the path to full-scale implementation.

Read full story

NVIDIA & ServiceNow CEOs Jointly Present "Super Genius" Open-source Apriel Nemotron 15B LLM

Press Release by

May 7th, 2025 09:55 Discuss (0 Comments)

ServiceNow is accelerating enterprise AI with a new reasoning model built in partnership with NVIDIA—enabling AI agents that respond in real time, handle complex workflows and scale functions like IT, HR and customer service teams worldwide. Unveiled today at ServiceNow's Knowledge 2025—where NVIDIA CEO and founder Jensen Huang joined ServiceNow chairman and CEO Bill McDermott during his keynote address—Apriel Nemotron 15B is compact, cost-efficient and tuned for action. It's designed to drive the next step forward in enterprise large language models (LLMs).

Apriel Nemotron 15B was developed with NVIDIA NeMo, the open NVIDIA Llama Nemotron Post-Training Dataset and ServiceNow domain-specific data, and was trained on NVIDIA DGX Cloud running on Amazon Web Services (AWS). The news follows the April release of the NVIDIA Llama Nemotron Ultra model, which harnesses the NVIDIA open dataset that ServiceNow used to build its Apriel Nemotron 15B model. Ultra is among the strongest open-source models at reasoning, including scientific reasoning, coding, advanced math and other agentic AI tasks.

Read full story

NVIDIA Anticipates Another Leap Forward for Cybersecurity - Enabled by Agentic AI

Press Release by

May 1st, 2025 11:27 Discuss (6 Comments)

Agentic AI is redefining the cybersecurity landscape—introducing new opportunities that demand rethinking how to secure AI while offering the keys to addressing those challenges. Unlike standard AI systems, AI agents can take autonomous actions—interacting with tools, environments, other agents and sensitive data. This provides new opportunities for defenders but also introduces new classes of risks. Enterprises must now take a dual approach: defend both with and against agentic AI.

Building Cybersecurity Defense With Agentic AI
Cybersecurity teams are increasingly overwhelmed by talent shortages and growing alert volume. Agentic AI offers new ways to bolster threat detection, response and AI security—and requires a fundamental pivot in the foundations of the cybersecurity ecosystem. Agentic AI systems can perceive, reason and act autonomously to solve complex problems. They can also serve as intelligent collaborators for cyber experts to safeguard digital assets, mitigate risks in enterprise environments and boost efficiency in security operations centers. This frees up cybersecurity teams to focus on high-impact decisions, helping them scale their expertise while potentially reducing workforce burnout. For example, AI agents can cut the time needed to respond to software security vulnerabilities by investigating the risk of a new common vulnerability or exposure in just seconds. They can search external resources, evaluate environments and summarize and prioritize findings so human analysts can take swift, informed action.

Read full story

NVIDIA Will Bring Agentic AI Reasoning to Enterprises with Google Cloud

Press Release by

Apr 9th, 2025 10:45 Discuss (2 Comments)

NVIDIA is collaborating with Google Cloud to bring agentic AI to enterprises seeking to locally harness the Google Gemini family of AI models using the NVIDIA Blackwell HGX and DGX platforms and NVIDIA Confidential Computing for data safety. With the NVIDIA Blackwell platform on Google Distributed Cloud, on-premises data centers can stay aligned with regulatory requirements and data sovereignty laws by locking down access to sensitive information, such as patient records, financial transactions and classified government information. NVIDIA Confidential Computing also secures sensitive code in the Gemini models from unauthorized access and data leaks.

"By bringing our Gemini models on premises with NVIDIA Blackwell's breakthrough performance and confidential computing capabilities, we're enabling enterprises to unlock the full potential of agentic AI," said Sachin Gupta, vice president and general manager of infrastructure and solutions at Google Cloud. "This collaboration helps ensure customers can innovate securely without compromising on performance or operational ease." Confidential computing with NVIDIA Blackwell provides enterprises with the technical assurance that their user prompts to the Gemini models' application programming interface—as well as the data they used for fine-tuning—remain secure and cannot be viewed or modified. At the same time, model owners can protect against unauthorized access or tampering, providing dual-layer protection that enables enterprises to innovate with Gemini models while maintaining data privacy.

Read full story

NVIDIA & Storage Industry Leaders Unveil New Class of Enterprise Infrastructure for the AI Era

Press Release by

Mar 19th, 2025 15:32 Discuss (0 Comments)

At GTC 2025, NVIDIA announced the NVIDIA AI Data Platform, a customizable reference design that leading providers are using to build a new class of AI infrastructure for demanding AI inference workloads: enterprise storage platforms with AI query agents fueled by NVIDIA accelerated computing, networking and software. Using the NVIDIA AI Data Platform, NVIDIA-Certified Storage providers can build infrastructure to speed AI reasoning workloads with specialized AI query agents. These agents help businesses generate insights from data in near real time, using NVIDIA AI Enterprise software—including NVIDIA NIM microservices for the new NVIDIA Llama Nemotron models with reasoning capabilities—as well as the new NVIDIA AI-Q Blueprint.

Storage providers can optimize their infrastructure to power these agents with NVIDIA Blackwell GPUs, NVIDIA BlueField DPUs, NVIDIA Spectrum-X networking and the NVIDIA Dynamo open-source inference library. Leading data platform and storage providers—including DDN, Dell Technologies, Hewlett Packard Enterprise, Hitachi Vantara, IBM, NetApp, Nutanix, Pure Storage, VAST Data and WEKA—are collaborating with NVIDIA to create customized AI data platforms that can harness enterprise data to reason and respond to complex queries. "Data is the raw material powering industries in the age of AI," said Jensen Huang, founder and CEO of NVIDIA. "With the world's storage leaders, we're building a new class of enterprise infrastructure that companies need to deploy and scale agentic AI across hybrid data centers."

Read full story

CoreWeave Launches Debut Wave of NVIDIA GB200 NVL72-based Cloud Instances

Press Release by

Feb 10th, 2025 11:29 Discuss (1 Comment)

AI reasoning models and agents are set to transform industries, but delivering their full potential at scale requires massive compute and optimized software. The "reasoning" process involves multiple models, generating many additional tokens, and demands infrastructure with a combination of high-speed communication, memory and compute to ensure real-time, high-quality results. To meet this demand, CoreWeave has launched NVIDIA GB200 NVL72-based instances, becoming the first cloud service provider to make the NVIDIA Blackwell platform generally available. With rack-scale NVIDIA NVLink across 72 NVIDIA Blackwell GPUs and 36 NVIDIA Grace CPUs, scaling to up to 110,000 GPUs with NVIDIA Quantum-2 InfiniBand networking, these instances provide the scale and performance needed to build and deploy the next generation of AI reasoning models and agents.

NVIDIA GB200 NVL72 on CoreWeave
NVIDIA GB200 NVL72 is a liquid-cooled, rack-scale solution with a 72-GPU NVLink domain, which enables the six dozen GPUs to act as a single massive GPU. NVIDIA Blackwell features many technological breakthroughs that accelerate inference token generation, boosting performance while reducing service costs. For example, fifth-generation NVLink enables 130 TB/s of GPU bandwidth in one 72-GPU NVLink domain, and the second-generation Transformer Engine enables FP4 for faster AI performance while maintaining high accuracy. CoreWeave's portfolio of managed cloud services is purpose-built for Blackwell. CoreWeave Kubernetes Service optimizes workload orchestration by exposing NVLink domain IDs, ensuring efficient scheduling within the same rack. Slurm on Kubernetes (SUNK) supports the topology block plug-in, enabling intelligent workload distribution across GB200 NVL72 racks. In addition, CoreWeave's Observability Platform provides real-time insights into NVLink performance, GPU utilization and temperatures.

Read full story

DeepSeek-R1 Goes Live on NVIDIA NIM

Press Release by

Jan 31st, 2025 09:59 Discuss (9 Comments)

DeepSeek-R1 is an open model with state-of-the-art reasoning capabilities. Instead of offering direct responses, reasoning models like DeepSeek-R1 perform multiple inference passes over a query, conducting chain-of-thought, consensus and search methods to generate the best answer. Performing this sequence of inference passes—using reason to arrive at the best answer—is known as test-time scaling. DeepSeek-R1 is a perfect example of this scaling law, demonstrating why accelerated computing is critical for the demands of agentic AI inference.

As models are allowed to iteratively "think" through the problem, they create more output tokens and longer generation cycles, so model quality continues to scale. Significant test-time compute is critical to enable both real-time inference and higher-quality responses from reasoning models like DeepSeek-R1, requiring larger inference deployments. R1 delivers leading accuracy for tasks demanding logical inference, reasoning, math, coding and language understanding while also delivering high inference efficiency.

Read full story

NVIDIA NeMo AI Guardrails Upgraded with Latest NIM Microservices

Press Release by

Jan 16th, 2025 10:29 Discuss (2 Comments)

AI agents are poised to transform productivity for the world's billion knowledge workers with "knowledge robots" that can accomplish a variety of tasks. To develop AI agents, enterprises need to address critical concerns like trust, safety, security and compliance. New NVIDIA NIM microservices for AI guardrails—part of the NVIDIA NeMo Guardrails collection of software tools—are portable, optimized inference microservices that help companies improve the safety, precision and scalability of their generative AI applications.

Central to the orchestration of the microservices is NeMo Guardrails, part of the NVIDIA NeMo platform for curating, customizing and guardrailing AI. NeMo Guardrails helps developers integrate and manage AI guardrails in large language model (LLM) applications. Industry leaders Amdocs, Cerence AI and Lowe's are among those using NeMo Guardrails to safeguard AI applications. Developers can use the NIM microservices to build more secure, trustworthy AI agents that provide safe, appropriate responses within context-specific guidelines and are bolstered against jailbreak attempts. Deployed in customer service across industries like automotive, finance, healthcare, manufacturing and retail, the agents can boost customer satisfaction and trust.

Read full story

NVIDIA & IQVIA Build Specialized Agentic AI for Life Science & Healthcare Applications

Press Release by

Jan 13th, 2025 10:19 Discuss (2 Comments)

IQVIA, the world's leading provider of clinical research services, commercial insights and healthcare intelligence, is working with NVIDIA to build custom foundation models and agentic AI workflows that can accelerate research, clinical development and access to new treatments. AI applications trained on the organization's vast healthcare-specific information and guided by its deep domain expertise will help the industry boost the efficiency of clinical trials and optimize planning for the launch of therapies and medical devices—ultimately improving patient outcomes.

Operating in over 100 countries, IQVIA has built the largest global healthcare network and is uniquely connected to the ecosystem with the most comprehensive and granular set of information, analytics and technologies in the industry. Announced today at the J.P. Morgan Conference in San Francisco, IQVIA's collection of models, AI agents and reference workflows will be developed with the NVIDIA AI Foundry platform for building custom models, allowing IQVIA's thousands of pharmaceutical, biotech and medical device customers to benefit from NVIDIA's agentic AI capabilities and IQVIA's technologies, life sciences information and expertise.

Read full story

NVIDIA Puts Grace Blackwell on Every Desk and at Every AI Developer's Fingertips

CES Press Release by

Jan 6th, 2025 22:27 Discuss (14 Comments)

NVIDIA today unveiled NVIDIA Project DIGITS, a personal AI supercomputer that provides AI researchers, data scientists and students worldwide with access to the power of the NVIDIA Grace Blackwell platform. Project DIGITS features the new NVIDIA GB10 Grace Blackwell Superchip, offering a petaflop of AI computing performance for prototyping, fine-tuning and running large AI models.

With Project DIGITS, users can develop and run inference on models using their own desktop system, then seamlessly deploy the models on accelerated cloud or data center infrastructure. "AI will be mainstream in every application for every industry. With Project DIGITS, the Grace Blackwell Superchip comes to millions of developers," said Jensen Huang, founder and CEO of NVIDIA. "Placing an AI supercomputer on the desks of every data scientist, AI researcher and student empowers them to engage and shape the age of AI."

Read full story

NVIDIA Digital Human Technologies Bring AI Characters to Life

Press Release by

Mar 19th, 2024 04:57 Discuss (3 Comments)

NVIDIA announced today that leading AI application developers across a wide range of industries are using NVIDIA digital human technologies to create lifelike avatars for commercial applications and dynamic game characters. The results are on display at GTC, the global AI conference held this week in San Jose, Calif., and can be seen in technology demonstrations from Hippocratic AI, Inworld AI, UneeQ and more.

NVIDIA Avatar Cloud Engine (ACE) for speech and animation, NVIDIA NeMo for language, and NVIDIA RTX for ray-traced rendering are the building blocks that enable developers to create digital humans capable of AI-powered natural language interactions, making conversations more realistic and engaging.

Read full story

Jensen Huang Celebrates Rise of Portable AI Workstations

Press Release by

Mar 8th, 2024 14:14 Discuss (18 Comments)

2024 will be the year generative AI gets personal, the CEOs of NVIDIA and HP said today in a fireside chat, unveiling new laptops that can build, test and run large language models. "This is a renaissance of the personal computer," said NVIDIA founder and CEO Jensen Huang at HP Amplify, a gathering in Las Vegas of about 1,500 resellers and distributors. "The work of creators, designers and data scientists is going to be revolutionized by these new workstations."

Greater Speed and Security
"AI is the biggest thing to come to the PC in decades," said HP's Enrique Lores, in the runup to the announcement of what his company billed as "the industry's largest portfolio of AI PCs and workstations." Compared to running their AI work in the cloud, the new systems will provide increased speed and security while reducing costs and energy, Lores said in a keynote at the event. New HP ZBooks provide a portfolio of mobile AI workstations powered by a full range of NVIDIA RTX Ada Generation GPUs. Entry-level systems with the NVIDIA RTX 500 Ada Generation Laptop GPU let users run generative AI apps and tools wherever they go. High-end models pack the RTX 5000 to deliver up to 682 TOPS, so they can create and run LLMs locally, using retrieval-augmented generation (RAG) to connect to their content for results that are both personalized and private.

Read full story

ServiceNow, Hugging Face & NVIDIA Release StarCoder2 - a New Open-Access LLM Family

Press Release by

Feb 28th, 2024 14:05 Discuss (2 Comments)

ServiceNow, Hugging Face, and NVIDIA today announced the release of StarCoder2, a family of open-access large language models for code generation that sets new standards for performance, transparency, and cost-effectiveness. StarCoder2 was developed in partnership with the BigCode Community, managed by ServiceNow, the leading digital workflow company making the world work better for everyone, and Hugging Face, the most-used open-source platform, where the machine learning community collaborates on models, datasets, and applications. Trained on 619 programming languages, StarCoder2 can be further trained and embedded in enterprise applications to perform specialized tasks such as application source code generation, workflow generation, text summarization, and more. Developers can use its code completion, advanced code summarization, code snippets retrieval, and other capabilities to accelerate innovation and improve productivity.

StarCoder2 offers three model sizes: a 3-billion-parameter model trained by ServiceNow; a 7-billion-parameter model trained by Hugging Face; and a 15-billion-parameter model built by NVIDIA with NVIDIA NeMo and trained on NVIDIA accelerated infrastructure. The smaller variants provide powerful performance while saving on compute costs, as fewer parameters require less computing during inference. In fact, the new 3-billion-parameter model matches the performance of the original StarCoder 15-billion-parameter model. "StarCoder2 stands as a testament to the combined power of open scientific collaboration and responsible AI practices with an ethical data supply chain," emphasized Harm de Vries, lead of ServiceNow's StarCoder2 development team and co-lead of BigCode. "The state-of-the-art open-access model improves on prior generative AI performance to increase developer productivity and provides developers equal access to the benefits of code generation AI, which in turn enables organizations of any size to more easily meet their full business potential."

Read full story

NVIDIA Announces Q4 and Fiscal 2024 Results, Clocks 126% YoY Revenue Growth, Gaming Just 1/6th of Data Center Revenues

Press Release by

Feb 21st, 2024 19:11 Discuss (40 Comments)

NVIDIA (NASDAQ: NVDA) today reported revenue for the fourth quarter ended January 28, 2024, of $22.1 billion, up 22% from the previous quarter and up 265% from a year ago. For the quarter, GAAP earnings per diluted share was $4.93, up 33% from the previous quarter and up 765% from a year ago. Non-GAAP earnings per diluted share was $5.16, up 28% from the previous quarter and up 486% from a year ago.

For fiscal 2024, revenue was up 126% to $60.9 billion. GAAP earnings per diluted share was $11.93, up 586% from a year ago. Non-GAAP earnings per diluted share was $12.96, up 288% from a year ago. "Accelerated computing and generative AI have hit the tipping point. Demand is surging worldwide across companies, industries and nations," said Jensen Huang, founder and CEO of NVIDIA.

Read full story

AWS and NVIDIA Partner to Deliver 65 ExaFLOP AI Supercomputer, Other Solutions

Press Release by

Nov 28th, 2023 13:37 Discuss (5 Comments)

Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company (NASDAQ: AMZN), and NVIDIA (NASDAQ: NVDA) today announced an expansion of their strategic collaboration to deliver the most-advanced infrastructure, software and services to power customers' generative artificial intelligence (AI) innovations. The companies will bring together the best of NVIDIA and AWS technologies—from NVIDIA's newest multi-node systems featuring next-generation GPUs, CPUs and AI software, to AWS Nitro System advanced virtualization and security, Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability—that are ideal for training foundation models and building generative AI applications.

The expanded collaboration builds on a longstanding relationship that has fueled the generative AI era by offering early machine learning (ML) pioneers the compute performance required to advance the state-of-the-art in these technologies.

Read full story

Dropbox and NVIDIA Team to Bring Personalized Generative AI to Millions of Customers

Press Release by

Nov 17th, 2023 08:32 Discuss (21 Comments)

Today, Dropbox, Inc. and NVIDIA announced a collaboration to supercharge knowledge work and improve productivity for millions of Dropbox customers through the power of AI. The companies' collaboration will expand Dropbox's extensive AI functionality with new uses for personalized generative AI to improve search accuracy, provide better organization, and simplify workflows for its customers across their cloud content.

Dropbox plans to leverage NVIDIA's AI foundry consisting of NVIDIA AI Foundation Models, NVIDIA AI Enterprise software and NVIDIA accelerated computing to enhance its latest AI-powered product experiences. These include Dropbox Dash, universal search that connects apps, tools, and content in a single search bar to help customers find what they need; Dropbox AI, a tool that allows customers to ask questions and get summaries on large files across their entire Dropbox; among other AI capabilities in Dropbox.

Read full story

NVIDIA Introduces Generative AI Foundry Service on Microsoft Azure for Enterprises and Startups Worldwide

Press Release by

Nov 15th, 2023 10:28 Discuss (0 Comments)

NVIDIA today introduced an AI foundry service to supercharge the development and tuning of custom generative AI applications for enterprises and startups deploying on Microsoft Azure.

The NVIDIA AI foundry service pulls together three elements—a collection of NVIDIA AI Foundation Models, NVIDIA NeMo framework and tools, and NVIDIA DGX Cloud AI supercomputing services—that give enterprises an end-to-end solution for creating custom generative AI models. Businesses can then deploy their customized models with NVIDIA AI Enterprise software to power generative AI applications, including intelligent search, summarization and content generation.

Read full story

NVIDIA Turbocharges Generative AI Training in MLPerf Benchmarks

Press Release by

Nov 8th, 2023 13:03 Discuss (2 Comments)

NVIDIA's AI platform raised the bar for AI training and high performance computing in the latest MLPerf industry benchmarks. Among many new records and milestones, one in generative AI stands out: NVIDIA Eos - an AI supercomputer powered by a whopping 10,752 NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking - completed a training benchmark based on a GPT-3 model with 175 billion parameters trained on one billion tokens in just 3.9 minutes. That's a nearly 3x gain from 10.9 minutes, the record NVIDIA set when the test was introduced less than six months ago.

The benchmark uses a portion of the full GPT-3 data set behind the popular ChatGPT service that, by extrapolation, Eos could now train in just eight days, 73x faster than a prior state-of-the-art system using 512 A100 GPUs. The acceleration in training time reduces costs, saves energy and speeds time-to-market. It's heavy lifting that makes large language models widely available so every business can adopt them with tools like NVIDIA NeMo, a framework for customizing LLMs. In a new generative AI test ‌this round, 1,024 NVIDIA Hopper architecture GPUs completed a training benchmark based on the Stable Diffusion text-to-image model in 2.5 minutes, setting a high bar on this new workload. By adopting these two tests, MLPerf reinforces its leadership as the industry standard for measuring AI performance, since generative AI is the most transformative technology of our time.

Read full story

Return to Keyword Browsing

Jul 19th, 2025 00:02 CDT change timezone

Latest GPU Drivers

New Forum Posts

23:55 by A Computer Guy
What's your latest tech purchase? (24311)
23:24 by GodisanAtheist
9060 XT 8GB or 5060 8GB? (35)
23:23 by esserpain
question for everyone about google play games beta (1)
23:17 by esserpain
Gacha Games - Discussions, Pulls, Updates, etc. (0)
22:53 by Chrispy_
Idle issue since 5060 ti installed (28)
22:38 by thesmokingman
Windows 11 General Discussion (6151)
21:47 by swhite4784
Have you got pie today? (16795)
21:19 by Fokker
Anime Nation (13054)
21:08 by Event Horizon
Stalker 2 is looking great. (214)
21:06 by sepheronx
Intel Core i9-13980HX Undervolt - Observations and Questions (62)

Popular Reviews

Jul 14th, 2025 MSI GeForce RTX 5060 Gaming OC Review
Jul 17th, 2025 Razer Blade 16 (2025) Review - Thin, Light, Punchy, and Efficient
Jul 18th, 2025 Thermal Grizzly WireView Pro Review
Jul 16th, 2025 Pulsar X2 Crazylight Review
Jul 15th, 2025 SilverStone SETA H2 Review
Jul 18th, 2025 AVerMedia Live Gamer Ultra S (GC553Pro) Review
May 13th, 2025 Upcoming Hardware Launches 2025 (Updated May 2025)
Jun 20th, 2025 Sapphire Radeon RX 9060 XT Pulse OC 16 GB Review - An Excellent Choice
Jul 4th, 2025 NVIDIA GeForce RTX 5050 8 GB Review
Jul 11th, 2025 Our Visit to the Hunter Super Computer

TPU on YouTube

Controversial News Posts