News Posts matching #ChatGPT

NVIDIA Launches Cosmos World Foundation Model Platform to Accelerate Physical AI Development

NVIDIA today announced NVIDIA Cosmos, a platform comprising state-of-the-art generative world foundation models, advanced tokenizers, guardrails and an accelerated video processing pipeline built to advance the development of physical AI systems such as autonomous vehicles (AVs) and robots.

Physical AI models are costly to develop, and require vast amounts of real-world data and testing. Cosmos world foundation models, or WFMs, offer developers an easy way to generate massive amounts of photoreal, physics-based synthetic data to train and evaluate their existing models. Developers can also build custom models by fine-tuning Cosmos WFMs. Cosmos models will be available under an open model license to accelerate the work of the robotics and AV community. Developers can preview the first models on the NVIDIA API catalog, or download the family of models and fine-tuning framework from the NVIDIA NGC catalog or Hugging Face.

Microsoft Brings Copilot AI Assistant to Windows Terminal

Microsoft has taken another significant step in its AI integration strategy by introducing "Terminal Chat," an AI assistant now available in Windows Terminal. This latest feature brings conversational AI capabilities directly to the command-line interface, marking a notable advancement in making terminal operations more accessible to users of all skill levels. The new feature, currently available in Windows Terminal (Canary), leverages various AI services, including ChatGPT, GitHub Copilot, and Azure OpenAI, to provide interactive assistance for command-line operations. What sets Terminal Chat apart is its context-aware functionality, which automatically recognizes the specific shell environment being used—whether it's PowerShell, Command Prompt, WSL Ubuntu, or Azure Cloud Shell—and tailors its responses accordingly.

Users can interact with Terminal Chat through a dedicated interface within Windows Terminal, where they can ask questions, troubleshoot errors, and request guidance on specific commands. The system provides shell-specific suggestions, automatically adjusting its recommendations based on whether a user is working in Windows PowerShell, Linux, or other environments. For example, when asked about creating a directory, Terminal Chat will suggest "New-Item -ItemType Directory" for PowerShell users while providing "mkdir" as the appropriate command for Linux environments. This intelligent adaptation helps bridge the knowledge gap between different command-line interfaces. Below are some examples courtesy of Windows Latest and their testing:
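
The shell-specific behavior described here can be pictured as a lookup keyed on the active shell. The sketch below is purely illustrative and is not taken from Microsoft, Windows Latest, or Terminal Chat itself: the `MKDIR_COMMANDS` table and the `suggest_mkdir` helper are hypothetical names, and only the two commands come from the example in the text.

```python
# Hypothetical lookup table: shell name -> command for creating a directory.
# Only these two entries are taken from the example in the text.
MKDIR_COMMANDS = {
    "powershell": "New-Item -ItemType Directory",
    "linux": "mkdir",
}


def suggest_mkdir(shell: str, path: str) -> str:
    """Return a directory-creation command for the given shell (illustrative only)."""
    key = shell.strip().lower()
    if key not in MKDIR_COMMANDS:
        raise ValueError(f"no suggestion known for shell: {shell!r}")
    command = MKDIR_COMMANDS[key]
    # PowerShell's New-Item takes the target through -Path; mkdir takes it positionally.
    if key == "powershell":
        return f'{command} -Path "{path}"'
    return f'{command} "{path}"'


print(suggest_mkdir("PowerShell", "logs"))
print(suggest_mkdir("Linux", "logs"))
```

A real assistant would generate its answer with a language model rather than a fixed table; the table only shows why the same request maps to different commands in different shells.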

OpenAI Designs its First AI Chip in Collaboration with Broadcom and TSMC

According to a recent Reuters report, OpenAI is continuing its push into custom silicon, expanding beyond its reported talks with Broadcom to a broader strategy involving multiple industry leaders. Broadcom is a fabless chip designer known for a wide range of silicon solutions, spanning networking, PCIe, SSD controllers, and PHYs all the way up to custom ASICs. The company behind ChatGPT is actively working with both Broadcom and TSMC to develop its first proprietary AI chip, specifically focused on inference operations. Getting a custom chip to handle training runs is a more complex task, and OpenAI is leaving that to its current partners until the company figures out all the details. Even for an inference-only chip, the scale at which OpenAI serves its models makes it financially sensible for the company to develop custom solutions tailored to its infrastructure needs.

This time, the initiative represents a more concrete and nuanced approach than previously understood. Rather than just holding exploratory discussions, OpenAI has assembled a dedicated chip team of approximately 20 people, led by former Google TPU engineers Thomas Norrie and Richard Ho. The company has secured manufacturing capacity with TSMC, targeting a 2026 timeline for its first custom-designed chip. Broadcom's involvement leverages its expertise in helping companies optimize chip designs for manufacturing and manage data movement between chips, which is crucial for AI systems running thousands of processors in parallel. At the same time, OpenAI is diversifying its compute strategy by adding AMD's Instinct MI300X chips to its infrastructure alongside its existing NVIDIA deployments. Meta takes a similar approach: it trains its models on NVIDIA GPUs and serves them to the public (inference) using AMD Instinct MI300X.

Interview with AMD's Senior Vice President and Chief Software Officer Andrej Zdravkovic: UDNA, ROCm for Radeon, AI Everywhere, and Much More!

A few days ago, we reported on AMD's newest expansion plans for Serbia. The company opened two new engineering design centers with offices in Belgrade and Nis. We were invited to join the opening ceremony and got an exclusive interview with one of AMD's top executives, Andrej Zdravkovic, who is the senior vice president and Chief Software Officer. Previously, we reported on AMD's transition to become a software company. The company has recently tripled its software engineering workforce and is moving some of its best people to support these teams. AMD's plan is spread over a three to five-year timeframe to improve its software ecosystem, accelerating hardware development to launch new products more frequently and to react to changes in software demand. AMD found that to help these expansion efforts, opening new design centers in Serbia would be very advantageous.

We sat down with Andrej Zdravkovic to discuss the purpose of AMD's establishment in Serbia and the future of some products. Zdravkovic is actually an engineer from Serbia, where he completed his Bachelor's and Master's degrees in electrical engineering from Belgrade University. In 1998, Zdravkovic joined ATI and quickly rose through the ranks, eventually becoming a senior director. During his decade-long tenure, Zdravkovic witnessed a significant industry shift as AMD acquired ATI in 2006. After a brief stint at another company, Zdravkovic returned to AMD in 2015, bringing with him a wealth of experience and a unique perspective on the evolution of the graphics and computing industry.

AI Startup Etched Unveils Transformer ASIC Claiming 20x Speed-up Over NVIDIA H100

A new startup emerged out of stealth mode today to power the next generation of generative AI. Etched is a company that makes an application-specific integrated circuit (ASIC) to process "Transformers." The transformer is an architecture for designing deep learning models developed by Google and is now the powerhouse behind models like OpenAI's GPT-4o in ChatGPT, Anthropic Claude, Google Gemini, and Meta's Llama family. Etched wanted to create an ASIC for processing only transformer models, resulting in a chip called Sohu. The claim is that Sohu outperforms NVIDIA's latest and greatest by an entire order of magnitude. Where a server with eight NVIDIA H100 GPUs pushes Llama-3 70B at 25,000 tokens per second, and a server with eight of the latest B200 "Blackwell" GPUs pushes 43,000 tokens/s, a server with eight Sohu chips manages to output 500,000 tokens per second.

Why is this important? Not only does the ASIC outperform Hopper by 20x and Blackwell by more than 10x, but it also serves so many tokens per second that it enables an entirely new class of AI applications requiring real-time output. The Sohu architecture is claimed to be so efficient that 90% of its FLOPS can be used, while traditional GPUs reach only a 30-40% FLOP utilization rate. That low utilization translates into inefficiency and wasted power, which Etched hopes to solve by building an accelerator dedicated to powering transformers (the "T" in GPT) at massive scale. Given that frontier model development costs more than one billion US dollars, and hardware costs are measured in tens of billions of US dollars, having an accelerator dedicated to powering a specific application can help advance AI faster. AI researchers often say that "scale is all you need" (resembling the legendary "attention is all you need" paper), and Etched wants to build on that.
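
The speed-up multiples follow directly from the quoted throughput figures. The short sketch below reproduces that arithmetic; the dictionary and function names are our own, and the numbers are Etched's claims as quoted above, not independent measurements.

```python
# Throughput figures quoted in the text (tokens per second, Llama-3 70B,
# eight-chip server configurations). These are vendor claims.
TOKENS_PER_SECOND = {
    "8x H100": 25_000,
    "8x B200": 43_000,
    "8x Sohu": 500_000,
}


def sohu_speedup(baseline: str) -> float:
    """Ratio of the claimed Sohu throughput to the baseline throughput."""
    return TOKENS_PER_SECOND["8x Sohu"] / TOKENS_PER_SECOND[baseline]


for name in ("8x H100", "8x B200"):
    print(f"Sohu vs {name}: {sohu_speedup(name):.1f}x")
```

This prints 20.0x against the H100 server and 11.6x against the B200 server, so the Blackwell comparison is slightly better than the rounded "10x" figure.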

macOS Sequoia Takes Productivity and Intelligence on Mac to New Heights

Apple today previewed macOS Sequoia, the next version of the world's most advanced desktop operating system, bringing entirely new ways of working and transformative intelligence features to Mac. macOS Sequoia is full of exciting new capabilities, including iPhone Mirroring, which expands Continuity by enabling full access to and control of iPhone directly from macOS. Safari gets another big update with the new Highlights feature for effortless information discovery on webpages while browsing. The new Passwords app makes it even easier to access passwords and organize credentials all in one place. Gaming advances with a more immersive experience, as well as a breadth of new titles, including Assassin's Creed Shadows, Frostpunk 2, and more.

macOS Sequoia also introduces Apple Intelligence, the personal intelligence system for Mac, iPhone, and iPad that combines the power of generative models with personal context to deliver intelligence that's incredibly useful and relevant. Built with privacy from the ground up, Apple Intelligence is deeply integrated into macOS Sequoia, iOS 18, and iPadOS 18. It understands and creates language and images, takes action across apps, and draws from personal context, simplifying and accelerating everyday tasks. Taking full advantage of the power of Apple Silicon and the Neural Engine, Apple Intelligence will be supported by every Mac with an M-series chip.

NVIDIA Announces Project G-Assist: An AI Chatbot that's Situationally Aware of Your Game

Imagine you're playing a game, and you're stuck somewhere, don't know how to craft something, can't manage your inventory, or need to get out of some tricky puzzle. You normally pause and minimize your game, pull up a web browser, and sift through dozens of search results to find the right Wiki or community thread that solves your problem. Wouldn't it be great if you could pull up an in-game chat window and interact with an AI chat assistant (not unlike ChatGPT), except that the assistant is situationally aware of your entire game state? That is NVIDIA Project G-Assist.

NVIDIA is training an AI on the vast ocean of information available on the Internet about countless games. You will soon be able to pull up an AI chat assistant and tell it, through text or natural voice input, how you want it to help with the situation you're in, and it will guide you. G-Assist will not play the game for you; it will just give you precise and concise answers based on your game state (something no Google search or game wiki can do). Besides assisting your gameplay, G-Assist will also help you with performance optimization and system tuning, the way the NVIDIA App and GeForce Experience already do.

ChatGPT Comes to Desktop with OpenAI's Latest GPT-4o Model That Talks With Users

At OpenAI's spring update, many eyes were fixed on the company that spurred the AI boom with its ChatGPT application. Now almost a must-have app for consumers and prosumers alike, ChatGPT is the de facto showcase for the latest AI innovation, backed by researchers and scientists from OpenAI. Today, OpenAI announced a new model called GPT-4o (Omni), which aims to bring advanced intelligence, improved overall capabilities, and real-time voice interaction with users. With it, the ChatGPT application wants to become more like a personal assistant that actively communicates with users and provides much broader capabilities. OpenAI claims that the model can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, similar to human response time in conversations.

OpenAI states that it wants ChatGPT's latest GPT-4o model to be available to free users as well as Plus and Team subscribers, with paid subscribers getting 5x higher usage limits and early access to the model. The GPT-4o model is much improved across a variety of standard benchmarks like MMLU, MATH, HumanEval, GPQA, and others, where it now surpasses almost all models, with the exception of Claude 3 Opus in MGSM. It now understands more than 50 languages and can do real-time translation. In addition to the new model, OpenAI announced a desktop ChatGPT app, which can act as a personal assistant and see what is happening on the screen, but only when the user allows it. This is supposed to bring a much more refined user experience and let users treat the AI as a third party that helps them understand the screen's content. The app is initially available only on macOS, and we are waiting for OpenAI to launch the Windows ChatGPT application so everyone can experience the new technology.

Apple Inches Closer to a Deal with OpenAI to Bring ChatGPT Technology to iPhone

To bring cutting-edge artificial intelligence capabilities to its flagship product, Apple is said to be finalizing a deal with OpenAI to integrate the ChatGPT technology into the upcoming iOS 18 for iPhones. According to Bloomberg, multiple sources report that after months of negotiations, the two tech giants are putting the finishing touches on a partnership that would be an important moment for consumer AI. However, OpenAI may not be Apple's only AI ally. The company has also reportedly been in talks with Google over licensing the Gemini chatbot, though no known agreement has been reached yet. The rare team-up between the fiercely competitive firms underscores the intense focus on AI integration across the industry.

Apple's strategic moves are a clear indication of its recognition of the transformative potential of advanced AI capabilities for the iPhone experience. The integration of OpenAI's language model could empower Siri to understand and respond to complex voice queries with deep contextual awareness. This could revolutionize the way Apple's customers interact with devices, offering hope for a more intuitive and advanced iPhone experience. Potential Gemini integration opens up another realm of possibilities around Google's image and multimodal AI capabilities. Future iPhones may be able to analyze and describe visual scenes, annotate images, generate custom imagery from natural language prompts, and even synthesize audio using AI vocals - all within a conversational interface. As the AI arms race intensifies, Apple wants to position itself at the forefront through these partnerships.

We Tested NVIDIA's new ChatRTX: Your Own GPU-accelerated AI Assistant with Photo Recognition, Speech Input, Updated Models

NVIDIA today unveiled ChatRTX, an AI assistant that runs locally on your machine and is accelerated by your GeForce RTX GPU. NVIDIA originally launched it as "Chat with RTX" in February 2024, when it was regarded more as a public tech demo. We reviewed the application in our feature article. The ChatRTX rebranding is probably aimed at making the name sound more like ChatGPT, which is what the application aims to be, except that it runs completely on your machine and is exhaustively customizable. The most obvious advantage of a locally run AI assistant is privacy: your prompts are processed locally and accelerated by your GPU. The second is that you're not held back by the performance bottlenecks of cloud-based assistants.

ChatRTX is a major update over the Chat with RTX tech demo from February. To begin with, the application has several stability refinements over Chat with RTX, which felt a little rough around the edges. NVIDIA has significantly updated the LLMs included with the application, including Mistral 7B INT4 and Llama 2 7B INT4. Support is also added for additional LLMs, including Gemma, a local LLM developed by Google, based on the same technology used to make Google's flagship Gemini model. ChatRTX now also supports ChatGLM3, for both English and Chinese prompts. Perhaps the biggest upgrade in ChatRTX is its ability to recognize images on your machine, as it incorporates CLIP (contrastive language-image pre-training) from OpenAI. CLIP is a vision-language model that matches images to text descriptions, which lets ChatRTX recognize what is in your image collections. Using this feature, you can interact with your image library without the need for metadata. ChatRTX doesn't just take text input; you can also speak to it. It now accepts natural voice input, as it integrates the Whisper speech-to-text model.

Groq LPU AI Inference Chip is Rivaling Major Players like NVIDIA, AMD, and Intel

AI workloads are split into two different categories: training and inference. While training requires large computing and memory capacity, access speeds are not a significant contributor; inference is another story. With inference, the AI model must run extremely fast to serve the end-user with as many tokens (words) as possible, hence giving the user answers to their prompts faster. An AI chip startup, Groq, which was in stealth mode for a long time, has been making major moves in providing ultra-fast inference speeds using its Language Processing Unit (LPU), designed for large language models (LLMs) like GPT, Llama, and Mistral. The Groq LPU is a single-core unit based on the Tensor-Streaming Processor (TSP) architecture, which achieves 750 TOPS at INT8 and 188 TeraFLOPS at FP16, with 320x320 fused dot product matrix multiplication, in addition to 5,120 Vector ALUs.

Offering massive concurrency with 80 TB/s of bandwidth, the Groq LPU has 230 MB of local SRAM. All of this works together to give Groq outstanding performance, which has been making waves on the internet over the past few days. Serving the Mixtral 8x7B model at 480 tokens per second, the Groq LPU provides one of the leading inference numbers in the industry. In models like Llama 2 70B with a 4096-token context length, Groq can serve 300 tokens/s, while in the smaller Llama 2 7B with 2048 tokens of context, the Groq LPU can output 750 tokens/s. According to the LLMPerf Leaderboard, the Groq LPU is beating the GPU-based cloud providers at inferencing Llama LLMs in configurations ranging from 7 to 70 billion parameters. In token throughput (output) and time to first token (latency), Groq is leading the pack, achieving the highest throughput and second lowest latency.
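
To put the quoted token rates in perspective, the sketch below converts them into the time needed to stream a reply. The 500-token reply length is an assumption chosen only for illustration, the names are our own, and the calculation ignores time to first token.

```python
# Output rates quoted in the text (tokens per second).
GROQ_TOKENS_PER_SECOND = {
    "Mixtral 8x7B": 480,
    "Llama 2 70B": 300,
    "Llama 2 7B": 750,
}

REPLY_TOKENS = 500  # assumed reply length, not a figure from the article


def generation_seconds(model: str, tokens: int = REPLY_TOKENS) -> float:
    """Seconds needed to stream `tokens` output tokens at the quoted rate."""
    return tokens / GROQ_TOKENS_PER_SECOND[model]


for model in GROQ_TOKENS_PER_SECOND:
    print(f"{model}: {generation_seconds(model):.2f} s for {REPLY_TOKENS} tokens")
```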

OpenAI CEO Reportedly Seeking Funds for Purpose-built Chip Foundries

OpenAI CEO Sam Altman had a turbulent career moment in winter 2023, but appears to be going all in on his company's future interests. A Bloomberg report suggests that the tech visionary has initiated a major fundraising effort for the construction of OpenAI-specific semiconductor production plants. The AI evangelist reckons that his industry will become prevalent enough to demand a dedicated network of manufacturing facilities, and the U.S.-based artificial intelligence (AI) research organization is (reportedly) exploring custom AI chip designs. Proprietary AI-focused GPUs and accelerators are not novelties at this stage: many top tech companies rely on NVIDIA solutions, but are keen to deploy custom-built hardware in the near future.

OpenAI's popular ChatGPT system is reliant on NVIDIA H100 and A100 GPUs, but tailor-made alternatives seem to be the desired route for Altman & Co. The "on their own terms" pathway seemingly skips the expected, traditional chip manufacturing process, as the big foundries could struggle to keep up with demand for AI-oriented silicon. G42 (an Abu Dhabi-based AI development holding company) and SoftBank Group are mentioned as prime investment partners in OpenAI's fledgling scheme, and Bloomberg reports that Altman's team is negotiating an $8 to $10 billion deal with top brass at G42. OpenAI's planned creation of its own foundry network is certainly a lofty and costly goal. The report does not specify whether existing facilities will be purchased and overhauled, or new plants constructed entirely from scratch.

Microsoft Copilot Becomes a Dedicated Key on Windows-Powered PC Keyboards

Microsoft today announced the introduction of a new Copilot key devoted to its AI assistant on Windows PC keyboards. The key will provide instant access to Microsoft's conversational Copilot feature, offering a ChatGPT-style AI bot at the press of a button. The Copilot key represents the first significant change to the Windows keyboard in nearly 30 years, since the addition of the Windows key itself in the 90s. Microsoft sees it as similarly transformative, making AI an integrated part of devices. The company expects broad adoption from PC manufacturers starting this spring. The Copilot key will likely replace keys like Menu or Office on standard layouts. While the key currently just launches Copilot, Microsoft could also enable key combinations in the future.

The physical keyboard button helps make AI feel native rather than an add-on, as Microsoft aggressively pushes Copilot into Windows 11 and Edge. The company declared its aim to make 2024 the "year of the AI PC", with Copilot as the entry point. Microsoft envisions AI eventually becoming seamlessly woven into computing through system, silicon, and hardware advances. The Copilot key may appear minor, but it signals that profound change is on the horizon. However, users will only embrace the vision if Copilot proves consistently beneficial rather than gimmicky. Microsoft is betting that injecting AI deeper into PCs will provide usefulness, justifying the disruption. With major OS and hardware partners already committed to adopting the Copilot key, Microsoft's vision of AI-first computing is materializing rapidly. The button press that invokes Copilot may soon feel as natural as hitting the Windows key or spacebar. As we await the reported launch of Windows 12, we can expect deeper integration with Copilot to appear.

OpenAI Names Emmett Shear as CEO, Sam Altman Joins Microsoft and Drags Hundreds of Employees With Him

On Friday, the AI world was taken by storm as the board of directors of OpenAI, the maker of ChatGPT and other AI software, fired its CEO, Sam Altman. According to multiple sources reporting on the state of OpenAI, Sam Altman was stunned by the board's decision to remove him. The company published a public statement, primarily informing the public that "Mr. Altman's departure follows a deliberative review process by the board, which concluded that he was not consistently candid in his communications with the board, hindering its ability to exercise its responsibilities. The board no longer has confidence in his ability to continue leading OpenAI."

After Sam Altman's departure, Greg Brockman, president and co-founder of OpenAI, announced that he was also leaving the company. Satya Nadella, CEO of Microsoft, and other investors have stepped in to lead negotiations between the OpenAI board and Sam Altman over his return as the CEO of the non-profit company. However, according to The Information, Sam Altman will not be returning as the CEO, and instead, Emmett Shear will be appointed as the interim CEO of OpenAI. It is also reported that three senior researchers, Jakub Pachocki, Aleksander Madry, and Szymon Sidor, have left the company to follow Sam Altman to his next venture. They wanted to go back to OpenAI if Mr. Altman returned; however, with Emmett Shear now appointed as interim CEO, the company is in disarray, with the future of its senior staff in question.

Update 15:30 UTC: Sam Altman has joined Microsoft alongside Greg Brockman to lead Microsoft's advanced AI research efforts, and hundreds of OpenAI staff want to work on projects under Sam Altman's lead. Reportedly, OpenAI has about 700 staff members, and 505 of them plan to follow Mr. Altman and Mr. Brockman to Microsoft.

SK hynix Showcases Next-Gen AI and HPC Solutions at SC23

SK hynix presented its leading AI and high-performance computing (HPC) solutions at Supercomputing 2023 (SC23), held in Denver, Colorado from November 12 to 17. Organized by the Association for Computing Machinery and IEEE Computer Society since 1988, the annual SC conference showcases the latest advancements in HPC, networking, storage, and data analysis. SK hynix marked its first appearance at the conference by introducing its groundbreaking memory solutions to the HPC community. During the six-day event, several SK hynix employees also made presentations revealing the impact of the company's memory solutions on AI and HPC.

Displaying Advanced HPC & AI Products
At SC23, SK hynix showcased its products tailored for AI and HPC to underline its leadership in the AI memory field. Among these next-generation products, HBM3E attracted attention as the HBM solution meets the industry's highest standards of speed, capacity, heat dissipation, and power efficiency. These capabilities make it particularly suitable for data-intensive AI server systems. HBM3E was presented alongside NVIDIA's H100, a high-performance GPU for AI that uses HBM3 for its memory.

Microsoft Introduces 128-Core Arm CPU for Cloud and Custom AI Accelerator

During its Ignite conference, Microsoft introduced a duo of custom-designed silicon made to accelerate AI and excel in cloud workloads. First of the two is Microsoft's Azure Cobalt 100 CPU, a 128-core design that features a 64-bit Armv9 instruction set, implemented in a cloud-native design that is set to become a part of Microsoft's offerings. While there aren't many details regarding the configuration, the company claims up to 40% higher performance compared to the current generation of Arm servers running on Azure cloud. The SoC uses Arm's Neoverse CSS platform customized for Microsoft, presumably with Arm Neoverse N2 cores.

The next and hottest topic in the server space is AI acceleration, which is needed for running today's large language models. Microsoft hosts OpenAI's ChatGPT, Microsoft's Copilot, and many other AI services. To help make them run as fast as possible, Microsoft's project Athena is now named the Maia 100 AI accelerator, which is manufactured on TSMC's 5 nm process. It features 105 billion transistors and supports various MX data formats, even those smaller than 8-bit, for maximum performance. It is currently being tested on GPT 3.5 Turbo, and we have yet to see performance figures and comparisons with competing hardware such as NVIDIA's H100/H200 and AMD's MI300X. The Maia 100 has an aggregate bandwidth of 4.8 Terabits per accelerator, using a custom Ethernet-based networking protocol for scaling. These chips are expected to appear in Microsoft data centers early next year, and we hope to get some performance numbers soon.

NVIDIA Announces up to 5x Faster TensorRT-LLM for Windows, and ChatGPT API-like Interface

Even as CPU vendors work to mainstream accelerated AI for client PCs, and Microsoft sets the pace for more AI in everyday applications with the Windows 11 23H2 Update, NVIDIA is out there reminding you that every GeForce RTX GPU is an AI accelerator. This is thanks to its Tensor cores, and the SIMD muscle of the ubiquitous CUDA cores. NVIDIA has been making these for over 5 years now, and has an install base of over 100 million. The company is hence focusing on bringing generative AI acceleration to more client- and enthusiast-relevant use cases, such as large language models.

At the Microsoft Ignite event, NVIDIA announced new optimizations, models, and resources to bring accelerated AI to everyone with an NVIDIA GPU that meets the hardware requirements. To begin with, the company introduced an update to TensorRT-LLM for Windows, a library that leverages NVIDIA RTX architecture for accelerating large language models (LLMs). The new TensorRT-LLM version 0.6.0 will release later this month, and improve LLM inference performance by up to 5 times in terms of tokens per second, when compared to the initial release of TensorRT-LLM from October 2023. In addition, TensorRT-LLM 0.6.0 will introduce support for popular LLMs, including Mistral 7B and Nemotron-3 8B. Accelerating these two will require a GeForce RTX 30-series "Ampere" or 40-series "Ada" GPU with at least 8 GB of main memory.

NVIDIA Turbocharges Generative AI Training in MLPerf Benchmarks

NVIDIA's AI platform raised the bar for AI training and high performance computing in the latest MLPerf industry benchmarks. Among many new records and milestones, one in generative AI stands out: NVIDIA Eos - an AI supercomputer powered by a whopping 10,752 NVIDIA H100 Tensor Core GPUs and NVIDIA Quantum-2 InfiniBand networking - completed a training benchmark based on a GPT-3 model with 175 billion parameters trained on one billion tokens in just 3.9 minutes. That's a nearly 3x gain from 10.9 minutes, the record NVIDIA set when the test was introduced less than six months ago.
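
The "nearly 3x" figure is simply the ratio of the two quoted training times. A minimal check, using a helper name of our own choosing:

```python
def training_speedup(previous_minutes: float, current_minutes: float) -> float:
    """How many times faster the current run is than the previous one."""
    return previous_minutes / current_minutes


# Times quoted in the text for the GPT-3 175B MLPerf training benchmark.
gain = training_speedup(previous_minutes=10.9, current_minutes=3.9)
print(f"Speed-up: {gain:.2f}x")
```

The ratio works out to about 2.79x, which the announcement rounds to "nearly 3x".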

The benchmark uses a portion of the full GPT-3 data set behind the popular ChatGPT service that, by extrapolation, Eos could now train in just eight days, 73x faster than a prior state-of-the-art system using 512 A100 GPUs. The acceleration in training time reduces costs, saves energy and speeds time-to-market. It's heavy lifting that makes large language models widely available so every business can adopt them with tools like NVIDIA NeMo, a framework for customizing LLMs. In a new generative AI test this round, 1,024 NVIDIA Hopper architecture GPUs completed a training benchmark based on the Stable Diffusion text-to-image model in 2.5 minutes, setting a high bar on this new workload. By adopting these two tests, MLPerf reinforces its leadership as the industry standard for measuring AI performance, since generative AI is the most transformative technology of our time.

Gigabyte Announces AI Strategy for Consumer Products to Map the Future of AI

GIGABYTE, a leader in cloud computing and AI server markets, announced a new strategic framework for AI outlining a blueprint for the company's direction in the AI-driven future of the consumer PC market. The framework features three fundamental pillars: offering a comprehensive AI operating platform, implementing AI-based product design, and engaging in the AI ecosystem with the goal of introducing consumers to a new AI-driven experience.

Providing a comprehensive AI operating platform to meet all-end computing applications
GIGABYTE's AI operating platform caters to all-end computing applications, spanning from the cloud to the edge. In the cloud, GIGABYTE's AI servers deliver robust computing power for demanding AI workloads, encompassing generative AI services and machine learning applications like ChatGPT. At the edge, GIGABYTE's consumer products, such as high-performance graphics cards and gaming laptops, furnish users with instant and reliable AI computing power for a diverse array of applications, ranging from real-time video processing to AI-driven gaming. In scenarios involving AI collaboration systems like Microsoft Copilot, GIGABYTE offers a power-saving, secure, and user-friendly AI operating platform explicitly engineered for the next-generation AI processors like NPUs.

OpenAI Could Make Custom Chips to Power Next-Generation AI Models

OpenAI, the company behind ChatGPT and the GPT-4 large language model, is reportedly exploring the possibility of creating custom silicon to power its next-generation AI models. According to Reuters, insider sources have even alluded to the firm evaluating potential acquisitions of chip design firms. While a final decision has yet to be made, conversations from as early as last year highlighted OpenAI's struggle with the growing scarcity and escalating costs of AI chips, with NVIDIA being its primary supplier. The CEO of OpenAI, Sam Altman, has been rather vocal about the shortage of GPUs, a sector dominated by NVIDIA, which controls an astounding 80% of the global market for AI-optimized chips.

Back in 2020, OpenAI banked on a colossal supercomputer built by Microsoft, a significant investor in OpenAI, which harnesses the power of 10,000 NVIDIA GPUs. This setup is instrumental in driving the operations of ChatGPT, which, according to Bernstein analyst Stacy Rasgon, comes with its own hefty price tag. Each interaction with ChatGPT is estimated to cost around 4 cents. Using Google search as a yardstick, if ChatGPT queries ever grew to just a tenth of Google's search volume, the initial GPU investment would rise to $48.1 billion, with a recurring annual expenditure of approximately $16 billion for sustained operations. OpenAI declined to comment. The potential entry into the world of custom silicon signals a strategic move towards greater self-reliance and cost optimization so that further development of AI can be sustained.

Run AI on Your PC? NVIDIA GeForce Users Are Ahead of the Curve

Generative AI is no longer just for tech giants. With GeForce, it's already at your fingertips. Gone are the days when AI was the domain of sprawling data centers or elite researchers. For GeForce RTX users, AI is now running on your PC. It's personal, enhancing every keystroke, every frame and every moment. Gamers are already enjoying the benefits of AI in over 300 RTX games. Meanwhile, content creators have access to over 100 RTX creative and design apps, with AI enhancing everything from video and photo editing to asset generation. And for GeForce enthusiasts, it's just the beginning. RTX is the platform for today and the accelerator that will power the AI of tomorrow.

How Did AI and Gaming Converge?
NVIDIA pioneered the integration of AI and gaming with DLSS, a technique that uses AI to generate pixels in video games automatically and which has increased frame rates by up to 4x. And with the recent introduction of DLSS 3.5, NVIDIA has enhanced the visual quality in some of the world's top titles, setting a new standard for visually richer and more immersive gameplay. But NVIDIA's AI integration doesn't stop there. Tools like RTX Remix empower game modders to remaster classic content using high-quality textures and materials generated by AI.

Intel Shows Strong AI Inference Performance

Today, MLCommons published results of its MLPerf Inference v3.1 performance benchmark for GPT-J, the 6 billion parameter large language model, as well as computer vision and natural language processing models. Intel submitted results for Habana Gaudi 2 accelerators, 4th Gen Intel Xeon Scalable processors, and Intel Xeon CPU Max Series. The results show Intel's competitive performance for AI inference and reinforce the company's commitment to making artificial intelligence more accessible at scale across the continuum of AI workloads - from client and edge to the network and cloud.

"As demonstrated through the recent MLCommons results, we have a strong, competitive AI product portfolio, designed to meet our customers' needs for high-performance, high-efficiency deep learning inference and training, for the complete spectrum of AI models - from the smallest to the largest - with leading price/performance." -Sandra Rivera, Intel executive vice president and general manager of the Data Center and AI Group

Strong Cloud AI Server Demand Propels NVIDIA's FY2Q24 Data Center Business to Surpass 76% for the First Time

NVIDIA's latest financial report for FY2Q24 reveals that its data center business reached US$10.32 billion, up 141% QoQ and 171% YoY. The company remains optimistic about its future growth. TrendForce believes that the primary driver behind NVIDIA's robust revenue growth is its AI server-related data center solutions. Key products include AI-accelerated GPUs and the AI server HGX reference architecture, which serve as the foundational AI infrastructure for large data centers.
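As a rough illustration of what the growth rates above imply, the earlier quarters' data center revenue can be backed out from the US$10.32 billion figure. This is only a sketch: the function name is hypothetical, and the results are implied values derived from the percentages quoted here, not figures reported in the text.

```python
def implied_prior_revenue(current_billion: float, growth_pct: float) -> float:
    """Back out an earlier period's revenue from the current figure and a growth rate."""
    return current_billion / (1 + growth_pct / 100)


# US$ billion, FY2Q24 data center revenue as quoted above
current = 10.32

# 141% quarter-over-quarter growth and 171% year-over-year growth, as quoted above
prior_quarter = implied_prior_revenue(current, 141)
prior_year = implied_prior_revenue(current, 171)

print(f"Implied prior quarter: US${prior_quarter:.2f} billion")
print(f"Implied year-ago quarter: US${prior_year:.2f} billion")
```

Under these assumptions the implied figures come out at roughly US$4.28 billion for the previous quarter and US$3.81 billion for the year-ago quarter.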

TrendForce further anticipates that NVIDIA will integrate its software and hardware resources. Utilizing a refined approach, NVIDIA will align its high-end, mid-tier, and entry-level GPU AI accelerator chips with various ODMs and OEMs, establishing a collaborative system certification model. Beyond accelerating the deployment of CSP cloud AI server infrastructures, NVIDIA is also partnering with entities like VMware on solutions including the Private AI Foundation. This strategy extends NVIDIA's reach into the edge enterprise AI server market, underpinning steady growth in its data center business for the next two years.

Newegg's PC Builder ChatGPT Plugin Tested

Newegg released its PC Builder ChatGPT Plugin yesterday, and former TPU writer Francisco Pires decided to put it through the proverbial wringer. His hands-on adventures with AI-assisted PC build suggestions were documented in a Tom's Hardware article. Initial impressions are a mixed bag, and he uses a metaphor to describe his experience: "it was akin to entering Alice in Wonderland (the Tim Burton version): everything's interesting and somewhat faithful, but laid out in just the wrong way." The beta version (released back in March) proved to be a confusing mess, according to Avram Piltch, Editor-in-Chief at Tom's Hardware.

Pires suggested that the tool is decent enough for novice PC builders, but the chatbot was found to overvalue certain components: "the typical price for the Radeon RX 6700 XT hovers around the $330-$370 range so the $558.99 MSI card the bot recommends is overpriced by more than $230!" The assistant also struggled to keep a suggested system build within a specified $1000 budget; the total was stretched to $1123.09. He also discovered some quirks related to the assistant's (apparently) incomplete GPU model database: "why did ChatGPT suggest a GeForce RTX 4060 for the build, if its knowledge cut-off is set at September 2021?" The plugin seems to have scraped information about newer products from Newegg's store, but the bot's full text answer (see the attached screenshot) provides a comparison between older generations.
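The price figures quoted above can be sanity-checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the inputs are simply the numbers quoted in this post, not data pulled from Newegg or the plugin.

```python
def overpricing_range(listed: float, typical_low: float, typical_high: float) -> tuple[float, float]:
    """Return the (smallest, largest) gap between a listed price and a typical price range."""
    return round(listed - typical_high, 2), round(listed - typical_low, 2)


def budget_overrun(total: float, budget: float) -> tuple[float, float]:
    """Return the overrun in dollars and as a percentage of the budget."""
    over = round(total - budget, 2)
    return over, round(100 * over / budget, 1)


# Radeon RX 6700 XT: $558.99 recommended card vs. the quoted $330-$370 typical range
gap_low, gap_high = overpricing_range(558.99, 330.0, 370.0)
print(f"Card is above the typical range by ${gap_low} to ${gap_high}")

# Suggested build total vs. the specified budget
over_dollars, over_pct = budget_overrun(1123.09, 1000.0)
print(f"Build exceeds the budget by ${over_dollars} ({over_pct}%)")
```

On these inputs the card sits between $188.99 and $228.99 above the quoted typical range, so the "more than $230" in the quote appears to be a slight round-up, and the build overshoots the $1000 budget by $123.09, or about 12.3%.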

OpenAI Degrades GPT-4 Performance While GPT-3.5 Gets Better

When OpenAI announced its GPT-4 model, it first became a part of ChatGPT, behind the paywall for premium users. GPT-4 is the latest installment in the Generative Pre-trained Transformer (GPT) family of large language models (LLMs). It aims to be more capable than GPT-3.5, which powered ChatGPT at first and was already capable at launch. However, it seems that the performance of GPT-4 has been steadily dropping since its introduction. Many users noted the regression, and today researchers from Stanford University and UC Berkeley have benchmarked GPT-4's performance in March 2023 against its performance in June 2023 on tasks such as solving math problems, visual reasoning, code generation, and answering sensitive questions.

The results? The paper shows that GPT-4 performance has degraded significantly in all of the tasks. This could be attributed to efforts to improve stability, lower the massive compute demand, and much more. Unexpectedly, GPT-3.5 experienced a significant uplift in the same period. Below, you can see the examples that were benchmarked by the researchers, which also compare GPT-4 and GPT-3.5 performance in all cases.