News Posts matching #LLM


NVIDIA Espouses Generative AI for Improved Productivity Across Industries

A watershed moment on Nov. 30, 2022, was mostly virtual, yet it shook the foundations of nearly every industry on the planet. On that day, OpenAI released ChatGPT, the most advanced artificial intelligence chatbot ever developed. This set off demand for generative AI applications that help businesses become more efficient, from providing consumers with answers to their questions to accelerating the work of researchers as they seek scientific breakthroughs, and much, much more.

Businesses that previously dabbled in AI are now rushing to adopt and deploy the latest applications. Generative AI—the ability of algorithms to create new text, images, sounds, animations, 3D models and even computer code—is moving at warp speed, transforming the way people work and play. By employing large language models (LLMs) to handle queries, the technology can dramatically reduce the time people devote to manual tasks like searching for and compiling information.

AMD CEO Lisa Su Notes: AI to Dominate Chip Design

Artificial intelligence (AI) has emerged as a transformative force in chip design, with recent examples from China and the United States showcasing its potential. Jensen Huang, CEO of Nvidia, believes that AI can empower individuals to become programmers, while Lisa Su, CEO of AMD, predicts an era where AI dominates chip design. During the 2023 World Artificial Intelligence Conference (WAIC) in Shanghai, Su emphasized the importance of interdisciplinary collaboration for the next generation of chip designers. To excel in this field, engineers must possess a holistic understanding of hardware, software, and algorithms, enabling them to create superior chip designs that meet system usage, customer deployment, and application requirements.

The integration of AI into chip design processes has gained momentum, fueled by the AI revolution catalyzed by large language models (LLMs). Both Huang and Mark Papermaster, CTO of AMD, acknowledge the benefits of AI in accelerating computation and facilitating chip design. AMD has already started leveraging AI in semiconductor design, testing, and verification, with plans to expand its use of generative AI in chip design applications. Companies are now actively exploring the fusion of AI technology with Electronic Design Automation (EDA) tools to streamline complex tasks and minimize manual intervention in chip design. Despite limited data and accuracy challenges, the "EDA+AI" approach holds great promise. For instance, Synopsys has invested significantly in AI tool research and recently launched Synopsys.ai, the industry's first end-to-end AI-driven EDA solution. This comprehensive solution empowers developers to harness AI at every stage of chip development, from system architecture and design to manufacturing, marking a significant leap forward in AI's integration into chip design workflows.

Oracle Fusion Cloud HCM Enhanced with Generative AI, Projected to Boost HR Productivity

Oracle today announced the addition of generative AI-powered capabilities within Oracle Fusion Cloud Human Capital Management (HCM). Supported by the Oracle Cloud Infrastructure (OCI) generative AI service, the new capabilities are embedded in existing HR processes to drive faster business value, improve productivity, enhance the candidate and employee experience, and streamline HR processes.

"Generative AI is boosting productivity and unlocking a new world of skills, ideas, and creativity that can have an immediate impact in the workplace," said Chris Leone, executive vice president, applications development, Oracle Cloud HCM. "With the ability to summarize, author, and recommend content, generative AI helps to reduce friction as employees complete important HR functions. For example, with the new embedded generative AI capabilities in Oracle Cloud HCM, our customers will be able to take advantage of large language models to drastically reduce the time required to complete tasks, improve the employee experience, enhance the accuracy of workforce insights, and ultimately increase business value."

NVIDIA Cambridge-1 AI Supercomputer Hooked up to DGX Cloud Platform

Scientific researchers need massive computational resources that can support exploration wherever it happens. Whether they're conducting groundbreaking pharmaceutical research, exploring alternative energy sources or discovering new ways to prevent financial fraud, accessible state-of-the-art AI computing resources are key to driving innovation. This new model of computing can solve the challenges of generative AI and power the next wave of innovation. Cambridge-1, a supercomputer NVIDIA launched in the U.K. during the pandemic, has powered discoveries from some of the country's top healthcare researchers. The system is now becoming part of NVIDIA DGX Cloud to accelerate the pace of scientific innovation and discovery - across almost every industry.

As a cloud-based resource, it will broaden access to AI supercomputing for researchers in climate science, autonomous machines, worker safety and other areas, delivered with the simplicity and speed of the cloud and ideally located for U.K. and European access. DGX Cloud is a multinode AI training service that makes it possible for any enterprise to access leading-edge supercomputing resources from a browser. The original Cambridge-1 infrastructure included 80 NVIDIA DGX systems; now it will join DGX Cloud, giving customers access to world-class infrastructure.

ASUS Demonstrates Liquid Cooling and AI Solutions at ISC High Performance 2023

ASUS today announced a showcase of the latest HPC solutions to empower innovation and push the boundaries of supercomputing, at ISC High Performance 2023 in Hamburg, Germany on May 21-25, 2023. The ASUS exhibition, at booth H813, will reveal the latest supercomputing advances, including liquid-cooling and AI solutions, as well as outlining a slew of sustainability breakthroughs - plus a whole lot more besides.

Comprehensive Liquid-Cooling Solutions
ASUS is working with Submer, the industry-leading liquid-cooling provider, to demonstrate immersion-cooling solutions at ISC High Performance 2023, focused on the ASUS RS720-E11-IM, the Intel-based 2U4N server that leverages our trusted legacy server architecture and popular features to create a compact new design. This fresh outlook improves the accessibility of I/O ports, storage and cable routing, and strengthens the structure to allow the server to be placed vertically in the tank, with durability assured.

NVIDIA A800 China-Tailored GPU Performance within 70% of A100

The recent growth in demand for training Large Language Models (LLMs) like Generative Pre-trained Transformer (GPT) has prompted many companies to invest in the GPU solutions used to train these models. However, countries like China are subject to US sanctions, so NVIDIA has had to create custom models that meet US export regulations. The two resulting GPUs, H800 and A800, are cut-down versions of the original H100 and A100, respectively. We previously reported on the H800; however, it has remained as mysterious as the A800 we are discussing today. Thanks to MyDrivers, we have information that the A800's performance is about 70% of that of the regular A100.

The regular A100 GPU manages 9.7 TeraFLOPS of FP64, 19.5 TeraFLOPS of FP64 Tensor, and up to 624 TeraFLOPS of BF16/FP16 with sparsity. Rough napkin math suggests that 70% of the original's performance (a 30% cut) would equal about 6.8 TeraFLOPS of FP64, 13.7 TeraFLOPS of FP64 Tensor, and 437 TeraFLOPS of BF16/FP16 with sparsity. MyDrivers notes that the A800 can be had for 100,000 Yuan, which translates to about 14,462 USD at the time of writing. This is not the most capable GPU that Chinese companies can acquire, as the H800 exists. However, we don't have any information about its performance for now.
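The napkin math above can be reproduced in a few lines of Python. This is only a sketch: it assumes the reported 70% figure applies uniformly to every precision mode, which the MyDrivers report does not confirm, and the names `A100_TFLOPS` and `scale_specs` are made up for this illustration.

```python
# A100 peak throughput figures quoted in the text, in TFLOPS.
A100_TFLOPS = {
    "FP64": 9.7,
    "FP64 Tensor": 19.5,
    "BF16/FP16 with sparsity": 624.0,
}


def scale_specs(specs, fraction):
    """Scale every throughput figure by the same fraction.

    Assumption: the performance cut is uniform across precision modes.
    """
    return {name: value * fraction for name, value in specs.items()}


# Hypothetical A800 estimate at 70% of the A100 (a 30% cut).
a800_estimate = scale_specs(A100_TFLOPS, 0.70)

for name, value in a800_estimate.items():
    print(f"{name}: {value:.2f} TFLOPS")
```

The unrounded results (6.79, 13.65 and 436.8 TFLOPS) agree with the rounded estimates quoted in the article.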

NVIDIA H100 Compared to A100 for Training GPT Large Language Models

NVIDIA's H100 has recently become available to use via Cloud Service Providers (CSPs), and it was only a matter of time before someone decided to benchmark its performance and compare it to the previous generation's A100 GPU. Today, thanks to the benchmarks of MosaicML, a startup company led by the ex-CEO of Nervana and GM of Artificial Intelligence (AI) at Intel, Naveen Rao, we have some comparison between these two GPUs with a fascinating insight about the cost factor. Firstly, MosaicML has taken Generative Pre-trained Transformer (GPT) models of various sizes and trained them using bfloat16 and FP8 Floating Point precision formats. All training occurred on CoreWeave cloud GPU instances.

Regarding performance, the NVIDIA H100 GPU achieved a speedup of anywhere from 2.2x to 3.3x over the A100. However, an interesting finding emerges when comparing the cost of running these GPUs in the cloud. CoreWeave prices the H100 SXM GPUs at $4.76/hr/GPU, while the A100 80 GB SXM is priced at $2.21/hr/GPU. While the H100 is roughly 2.2x more expensive per hour, its performance makes up for it, resulting in less time to train a model and a lower price for the training process. This makes the H100 more attractive for researchers and companies wanting to train Large Language Models (LLMs) and makes choosing the newer GPU more viable, despite the higher hourly cost. Below, you can see tables of comparison between two GPUs in training time, speedup, and cost of training.
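The price-versus-speedup trade-off can be checked with simple arithmetic. The sketch below uses the CoreWeave hourly prices and the speedup range quoted above. It assumes the same number of GPUs on both sides and that cost is simply hourly price times training time; `relative_training_cost` is a name invented for this illustration, not something from MosaicML's benchmark.

```python
# Hourly prices quoted in the text, in USD per GPU-hour.
H100_PRICE = 4.76
A100_PRICE = 2.21


def relative_training_cost(speedup, h100_price=H100_PRICE, a100_price=A100_PRICE):
    """Cost of a training run on H100 divided by the cost of the same run on A100.

    Assumptions: same GPU count on both sides, and the H100 run finishes
    `speedup` times faster, so cost = hourly price * training hours.
    """
    return (h100_price / a100_price) / speedup


price_ratio = H100_PRICE / A100_PRICE
print(f"H100 is {price_ratio:.2f}x the hourly price of A100")

# Evaluate both ends of the reported speedup range.
for speedup in (2.2, 3.3):
    ratio = relative_training_cost(speedup)
    print(f"speedup {speedup}x -> H100 run costs {ratio:.2f}x the A100 run")
```

Under these assumptions the H100 breaks even once its speedup exceeds the hourly price ratio of about 2.15x, so even the low end of the reported range comes out slightly cheaper.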

NVIDIA Wants to Set Guardrails for Large Language Models Such as ChatGPT

ChatGPT has surged in popularity over just a few months and is regarded as one of the fastest-growing apps ever. Based on a Large Language Model (LLM), GPT-3.5 or GPT-4, ChatGPT forms answers to user input from what the model learned from its extensive training data. With billions of parameters, the GPT models behind ChatGPT can give precise answers; however, these models sometimes hallucinate. Given a question about a non-existent topic or subject, ChatGPT can hallucinate and make up information. To curb these hallucinations, NVIDIA, the maker of GPUs used for training and inferencing LLMs, has released a software library called NeMo Guardrails to keep AI in check.

As the NVIDIA repository states: "NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems. Guardrails (or "rails" for short) are specific ways of controlling the output of a large language model, such as not talking about politics, responding in a particular way to specific user requests, following a predefined dialog path, using a particular language style, extracting structured data, and more." These guardrails are easily programmable and can stop LLMs from outputting unwanted content. For a company that invests heavily in the hardware and software landscape, this launch is a logical decision to keep the lead in setting the infrastructure for future LLM-based applications.
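To make the idea of a "rail" concrete, here is a toy sketch in plain Python. It does not use the NeMo Guardrails API and is far simpler than the real toolkit: it only shows a keyword-based topic rail wrapped around a stubbed model call. `fake_llm`, `with_topic_rail` and `BLOCKED_TOPICS` are all hypothetical names invented for this illustration.

```python
# Hypothetical list of topics the assistant should refuse to discuss.
BLOCKED_TOPICS = ("politics", "election")

REFUSAL = "Sorry, I can't discuss that topic."


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: a real system would query a model here)."""
    return f"Model answer to: {prompt}"


def with_topic_rail(llm, blocked_topics, refusal=REFUSAL):
    """Wrap an LLM callable so prompts mentioning a blocked topic get a fixed refusal."""

    def guarded(prompt: str) -> str:
        lowered = prompt.lower()
        # Naive substring match; a real toolkit would use far more robust matching.
        if any(topic in lowered for topic in blocked_topics):
            return refusal
        return llm(prompt)

    return guarded


guarded_llm = with_topic_rail(fake_llm, BLOCKED_TOPICS)

print(guarded_llm("Who should I vote for in the election?"))
print(guarded_llm("What is a GPU?"))
```

The first prompt is intercepted by the rail and never reaches the model, while the second passes through unchanged.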