News Posts matching #Large Language Model

Return to Keyword Browsing

NVIDIA & ServiceNow CEOs Jointly Present "Super Genius" Open-source Apriel Nemotron 15B LLM

Press Release by

May 7th, 2025 09:55 Discuss (0 Comments)

ServiceNow is accelerating enterprise AI with a new reasoning model built in partnership with NVIDIA—enabling AI agents that respond in real time, handle complex workflows and scale functions like IT, HR and customer service teams worldwide. Unveiled today at ServiceNow's Knowledge 2025—where NVIDIA CEO and founder Jensen Huang joined ServiceNow chairman and CEO Bill McDermott during his keynote address—Apriel Nemotron 15B is compact, cost-efficient and tuned for action. It's designed to drive the next step forward in enterprise large language models (LLMs).

Apriel Nemotron 15B was developed with NVIDIA NeMo, the open NVIDIA Llama Nemotron Post-Training Dataset and ServiceNow domain-specific data, and was trained on NVIDIA DGX Cloud running on Amazon Web Services (AWS). The news follows the April release of the NVIDIA Llama Nemotron Ultra model, which harnesses the NVIDIA open dataset that ServiceNow used to build its Apriel Nemotron 15B model. Ultra is among the strongest open-source models at reasoning, including scientific reasoning, coding, advanced math and other agentic AI tasks.

Read full story

NVIDIA's Project G-Assist Plug-In Builder Explained: Anyone Can Customize AI on GeForce RTX AI PCs

Press Release by

Apr 23rd, 2025 12:48 Discuss (3 Comments)

AI is rapidly reshaping what's possible on a PC—whether for real-time image generation or voice-controlled workflows. As AI capabilities grow, so does their complexity. Tapping into the power of AI can entail navigating a maze of system settings, software and hardware configurations. Enabling users to explore how on-device AI can simplify and enhance the PC experience, Project G-Assist—an AI assistant that helps tune, control and optimize GeForce RTX systems—is now available as an experimental feature in the NVIDIA app. Developers can try out AI-powered voice and text commands for tasks like monitoring performance, adjusting settings and interacting with supporting peripherals. Users can even summon other AIs powered by GeForce RTX AI PCs.

And it doesn't stop there. For those looking to expand Project G-Assist capabilities in creative ways, the AI supports custom plug-ins. With the new ChatGPT-based G-Assist Plug-In Builder, developers and enthusiasts can create and customize G-Assist's functionality, adding new commands, connecting external tools and building AI workflows tailored to specific needs. With the plug-in builder, users can generate properly formatted code with AI, then integrate the code into G-Assist—enabling quick, AI-assisted functionality that responds to text and voice commands.

Read full story

MediaTek Unveils New Flagship Dimensity 9400+ Mobile Platform; with Enhanced AI Performance

Press Release by

Apr 10th, 2025 10:51 Discuss (2 Comments)

MediaTek today announced the Dimensity 9400+ SoC, the latest addition to MediaTek's Dimensity flagship chipset family. Providing exceptional Generative and agentic AI capabilities as well as other performance enhancements, the Dimensity 9400+ supports the latest Large Language Models (LLM) while sustaining a super power-efficient design. The Dimensity 9400+ features an All Big Core design, integrating one Arm Cortex-X925 core operating up to 3.73 GHz, combined with 3x Cortex-X4 and 4x Cortex-A720 cores. This powerful configuration accelerates single and multithreaded performance for top-tier Android UX experiences.

"The MediaTek Dimensity 9400+ will make it easier to deliver innovative, personalized AI experiences on-device, combined with enhanced overall performance to ensure your device can handle all tasks with ease," said JC Hsu, Corporate Senior Vice President at MediaTek. "We are working closely with developers and manufacturers to continue building a robust ecosystem of AI applications and other features that will bring a number of speed and privacy benefits to consumers."

Read full story

UGREEN Showcases the New AI-Powered NASync iDX Series at NAB Show 2025

Press Release by

Apr 7th, 2025 07:24 Discuss (0 Comments)

From April 6-9th, UGREEN, a global leader in consumer electronics and charging technology, is showcasing its innovative NASync series at the NAB Show in Las Vegas. The UGREEN NASync iDX6011 and iDX6011 Pro have been the highlights of the display at Booth SL9210 in the Las Vegas Convention Center. These latest UGREEN NASync iDX models revolutionize data management and security for content creators through advanced AI technology, setting a new standard as the world-first AI-powered NAS.

UGREEN NASync is a series of network-attached storage devices tailored for personal, home, or business use. In March 2024, UGREEN launched a 44-day crowdfunding campaign on Kickstarter for the NASync DXP series, successfully raising over $6.6 million achieved No.1 in the NAS category. This remarkable support highlights the strong demand for advanced storage solutions.

Read full story

AMD Introduces GAIA - an Open-Source Project That Runs Local LLMs on Ryzen AI NPUs

Press Release by

Mar 21st, 2025 14:34 Discuss (30 Comments)

AMD has launched a new open-source project called, GAIA (pronounced /ˈɡaɪ.ə/), an awesome application that leverages the power of Ryzen AI Neural Processing Unit (NPU) to run private and local large language models (LLMs). In this blog, we'll dive into the features and benefits of GAIA, while introducing how you can take advantage of GAIA's open-source project to adopt into your own applications.

Introduction to GAIA
GAIA is a generative AI application designed to run local, private LLMs on Windows PCs and is optimized for AMD Ryzen AI hardware (AMD Ryzen AI 300 Series Processors). This integration allows for faster, more efficient processing - i.e. lower power- while keeping your data local and secure. On Ryzen AI PCs, GAIA interacts with the NPU and iGPU to run models seamlessly by using the open-source Lemonade (LLM-Aid) SDK from ONNX TurnkeyML for LLM inference. GAIA supports a variety of local LLMs optimized to run on Ryzen AI PCs. Popular models like Llama and Phi derivatives can be tailored for different use cases, such as Q&A, summarization, and complex reasoning tasks.

Read full story

Tencent Will Launch Hunyuan T1 Inference Model on March 21

by

Mar 19th, 2025 12:59 Discuss (0 Comments)

Tencent's large language model (LLM) specialist division has announced the imminent launch of their T1 AI inference model. The Chinese technology giant's Hunyuan social media accounts revealed a grand arrival, scheduled to take place on Friday (March 21). A friendly reminder was issued to interested parties, regarding the upcoming broadcast/showcase: "please set aside your valuable time. Let's step into T1 together." Earlier in the week, the Tencent AI team started to tease their "first ultra-large Mamba-powered reasoning model." Local news reports have highlighted Hunyuan's claim of Mamba architecture being applied losslessly to a super-large Mixture of Experts (MoE) model.

Late last month, the company released its Hunyuan Turbo S AI model—advertised as offering faster replies than DeepSeek's R1 system. Tencent's plucky solution has quickly climbed up the Chatbot Arena LLM Leaderboard. The Hunyuan team was in a boastful mood earlier today, and loudly proclaimed that their proprietary Turbo S model had charted in fifteenth place. At the time of writing, DeepSeek R1 is ranked seventh on the leaderboard. As explained by ITHome, this community-driven platform is driven by users interactions: "with multiple models anonymously, voting to decide which model is better, and then generating a ranking list based on the scores. This kind of evaluation is also seen as an arena for big models to compete directly, which is simple and direct."

Phison Expands aiDAPTIV+ GPU Memory Extension Capabilities

Press Release by

Mar 18th, 2025 23:33 Discuss (0 Comments)

Phison Electronics (8299TT), a leading innovator in NAND flash technologies, today announced an array of expanded capabilities on aiDAPTIV+, the affordable AI training and inferencing solution for on-premises environments. aiDAPTIV+ will be integrated into a ML-series Maingear laptop, the first AI laptop PC capable of LLMOps, utilizing NVIDIA GPUs and available for concept demonstration and registration this week at NVIDIA GTC 2025. Customers will be able to fine-tune Large Language Models (LLMs) up to 8 billion parameters using their own data.

Phison also expanded aiDAPTIV+ capabilities to run on edge computing devices powered by the NVIDIA Jetson platform, for enhanced generative AI inference at the edge and robotics deployments. With today's announcement, new and current aiDAPTIV+ users can look forward to the new aiDAPTIVLink 3.0 middleware, which will provide faster Time to First Token (TTFT) recall and extend the token length for greater context, improving inferencing performance and accuracy. These expansions will unlock access for users ranging from university students and AI industry professionals learning to train LLMs, or researchers uncovering deeper insights within their own data using a PC, all the way to manufacturing engineers automating factory floor enhancements via edge devices.

Read full story

ASUS Introduces New "AI Cache Boost" BIOS Feature - R&D Team Claims Performance Uplift

Press Release by

Mar 18th, 2025 13:38 Discuss (9 Comments)

Large language models (LLMs) love large quantities of memory—so much so, in fact, that AI enthusiasts are turning to multi-GPU setups to make even more VRAM available for their AI apps. But since many current LLMs are extremely large, even this approach has its limits. At times, the GPU will decide to make use of CPU processing power for this data, and when it does, the performance of your CPU cache and DRAM comes into play. All this means that when it comes to the performance of AI applications, it's not just the GPU that matters, but the entire pathway that connects the GPU to the CPU to the I/O die to the DRAM modules. It stands to reason, then, that there are opportunities to boost AI performance by optimizing these elements.

That's exactly what we've found as we've spent time in our R&D labs with the latest AMD Ryzen CPUs. AMD just launched two new Ryzen CPUs with AMD 3D V-Cache Technology, the AMD Ryzen 9 9950X3D and Ryzen 9 9900X3D, pushing the series into new performance territory. After testing a wide range of optimizations in a variety of workloads, we uncovered a range of settings that offer tangible benefits for AI enthusiasts. Now, we're ready to share these optimizations with you through a new BIOS feature: AI Cache Boost. Available through an ASUS AMD 800 Series motherboard and our most recent firmware update, AI Cache Boost can accelerate performance up to 12.75% when you're working with massive LLMs.

Read full story

MSI - Micro-Star International

MSI GeForce RTX 50 Laptops Are Prepped for High-end Gaming & Local AI Applications

Press Release by

Mar 17th, 2025 11:41 Discuss (0 Comments)

MSI's latest high-end gaming laptops, including the Titan 18 HX AI, Raider 18 HX AI, and Vector 16 HX AI, feature Intel Core Ultra 200 HX series CPUs and NVIDIA RTX 50 series GPUs, while Raider A18 HX and Vector A18 HX run on AMD Ryzen 9000 series. With NVIDIA's last major GPU upgrade in over two years, these laptops deliver top-tier performance for ultra-high-resolution gaming. Beyond gaming, MSI's Titan, Raider, Vector, and Stealth series excel in AI applications, particularly Small Language Models (SLM), making them ideal for both gaming and AI-driven tasks.

Next-Gen GPUs: A Breakthrough for AI Applications
NVIDIA's latest RTX 50 series GPUs, built on the cutting-edge Blackwell architecture, introduce 5th-generation Tensor Cores, 4th-generation RT Cores, and Neural Rendering technology for the first time. With expanded memory capacity and GDDR7, these GPUs optimize AI-enhanced neural computations, reducing memory usage while boosting graphics rendering and AI processing efficiency. This results in unmatched performance for both gaming and creative workloads, enabling smoother, more efficient execution of complex tasks.

Read full story

Niantic Offloads Games Division to Scopely - Deal Valued at $3.5 Billion

Press Release by

Mar 13th, 2025 11:11 Discuss (2 Comments)

We're announcing changes at Niantic that will set us on a bold new course. Nearly a decade ago, we spun out as a small team from Google with a bold vision: to use technology to overlay the world with rich digital experiences. Our goal: to inspire people to explore their surroundings and foster real-world connections, especially at a time when relationships were becoming increasingly digital. To bring this mission and technology to life, we started building games; today, more than 100 million people play our games annually, with more than a billion friend connections made across the world.

People have discovered their neighborhoods, explored new places, and moved more than 30 billion miles. They've also come together at our live events - where everyone is a participant, not just a spectator—contributing over a billion dollars in economic impact in the cities that host them. As we grew, the company naturally evolved along two complementary paths - one focused on creating games and bringing them to the world, and the other dedicated to advancing augmented reality, artificial intelligence, and geospatial technology. Meanwhile, the rapid progress in AI reinforces our belief in the future of geospatial computing to unlock new possibilities for both consumer experiences and enterprise applications. At the same time, we remain committed to creating "forever games" that will last for generations.

Read full story

Moore Threads Teases Excellent Performance of DeepSeek-R1 Model on MTT GPUs

by

Feb 6th, 2025 10:36 Discuss (4 Comments)

Moore Threads, a Chinese manufacturer of proprietary GPU designs is (reportedly) the latest company to jump onto the DeepSeek-R1 bandwagon. Since late January, NVIDIA, Microsoft and AMD have swooped in with their own interpretations/deployments. By global standards, Moore Threads GPUs trail behind Western-developed offerings—early 2024 evaluations presented the firm's MTT S80 dedicated desktop graphics card struggling against an AMD integrated solution: Radeon 760M. The recent emergence of DeepSeek's open source models has signalled a shift away from reliance on extremely powerful and expensive AI-crunching hardware (often accessed via the cloud)—widespread excitement has been generated by DeepSeek solutions being relatively frugal, in terms of processing requirements. Tom's Hardware has observed cases of open source AI models running (locally) on: "inexpensive hardware, like the Raspberry Pi."

According to recent Chinese press coverage, Moore Threads has announced a successful deployment of DeepSeek's R1-Distill-Qwen-7B distilled model on the aforementioned MTT S80 GPU. The company also revealed that it had taken similar steps with its MTT S4000 datacenter-oriented graphics hardware. On the subject of adaptation, a Moore Threads spokesperson stated: "based on the Ollama open source framework, Moore Threads completed the deployment of the DeepSeek-R1-Distill-Qwen-7B distillation model and demonstrated excellent performance in a variety of Chinese tasks, verifying the versatility and CUDA compatibility of Moore Threads' self-developed full-featured GPU." Exact performance figures, benchmark results and technical details were not disclosed to the Chinese public, so Moore Threads appears to be teasing the prowess of its MTT GPU designs. ITHome reported that: "users can also perform inference deployment of the DeepSeek-R1 distillation model based on MTT S80 and MTT S4000. Some users have previously completed the practice manually on MTT S80." Moore Threads believes that its: "self-developed high-performance inference engine, combined with software and hardware co-optimization technology, significantly improves the model's computing efficiency and resource utilization through customized operator acceleration and memory management. This engine not only supports the efficient operation of the DeepSeek distillation model, but also provides technical support for the deployment of more large-scale models in the future."

Aetina & Qualcomm Collaborate on Flagship MegaEdge AIP-FR68 Edge AI Solution

Press Release by

Jan 10th, 2025 12:42 Discuss (6 Comments)

Aetina, a leading provider of edge AI solutions and a subsidiary of Innodisk Group, today announced a collaboration with Qualcomm Technologies, Inc., who unveiled a revolutionary Qualcomm AI On-Prem Appliance Solution and Qualcomm AI Inference Suite for On-Prem. This collaboration combines Qualcomm Technologies' cutting-edge inference accelerators and advanced software with Aetina's edge computing hardware to deliver unprecedented computing power and ready-to-use AI applications for enterprises and industrial organizations.

The flagship offering, the Aetina MegaEdge AIP-FR68, sets a new industry benchmark by integrating Qualcomm Cloud AI family of accelerator cards. Each Cloud AI 100 Ultra card delivers an impressive 870 TOPS of AI computing power at 8-bit integer (INT8) while maintaining remarkable energy efficiency at just 150 W power consumption. The system supports dual Cloud AI 100 Ultra cards in a single desktop workstation. This groundbreaking combination of power and efficiency in a compact form factor revolutionizes on-premises AI processing, making enterprise-grade computing more accessible than ever.

Read full story

Gigabyte Demonstrates Omni-AI Capabilities at CES 2025

CES Press Release by

Jan 7th, 2025 01:31 Discuss (0 Comments)

GIGABYTE Technology, internationally renowned for its R&D capabilities and a leading innovator in server and data center solutions, continues to lead technological innovation during this critical period of AI and computing advancement. With its comprehensive AI product portfolio, GIGABYTE will showcase its complete range of AI computing solutions at CES 2025, from data center infrastructure to IoT applications and personal computing, demonstrating how its extensive product line enables digital transformation across all sectors in this AI-driven era.

Powering AI from the Cloud
With AI Large Language Models (LLMs) now routinely featuring parameters in the hundreds of billions to trillions, robust training environments (data centers) have become a critical requirement in the AI race. GIGABYTE offers three distinctive solutions for AI infrastructure.
⁠

Read full story

IBM Develops Co-Packaged Optical Interconnect for Data Center

by

Dec 17th, 2024 03:58 Discuss (0 Comments)

IBM Research has unveiled a significant advancement in optical interconnect technology for advanced data center communications. The breakthrough centers on a novel co-packaged optics (CPO) system featuring a sophisticated Polymer Optical Waveguide (PWG) design, marking a potential shift from traditional copper-based interconnects. The innovation introduces a Photonic Integrated Circuit (PIC) measuring 8x10mm, mounted on a 17x17mm substrate, capable of converting electrical signals to optical ones and vice versa. The system's waveguide, spanning 12 mm in width, efficiently channels light waves through precisely engineered pathways, with channels converging from 250 to 50 micrometers.

While current copper-based solutions like NVIDIA's NVLink offer impressive 1.8 TB/s bandwidth rates, and Intel's Optical Compute Interconnect achieves 4 TBit/s bidirectional throughput, IBM's technology focuses on scalability and efficiency. The company plans to implement 12 carrier waves initially, with the potential to accommodate up to 32 waves by reducing spacing to 18 micrometers. Furthermore, the design allows for vertical stacking of up to four PWGs, potentially enabling 128 transmission channels. The technology has undergone rigorous JEDEC-standard testing, including 1,000 cycles of thermal stress between -40°C and 125°C, and extended exposure to extreme conditions including 85% humidity at 85°C. The components have also proven reliable during thousand-hour storage tests at various temperature extremes. The bandwidth of the CPO is currently unknown, but we expect it to surpass current solutions.

Read full story

SK Hynix Begins Mass-production of 12-layer HBM3E Memory

Press Release by

Sep 25th, 2024 21:43 Discuss (0 Comments)

SK hynix Inc. announced today that it has begun mass production of the world's first 12-layer HBM3E product with 36 GB, the largest capacity of existing HBM to date. The company plans to supply mass-produced products to customers within the year, proving its overwhelming technology once again six months after delivering the HBM3E 8-layer product to customers for the first time in the industry in March this year.

SK hynix is the only company in the world that has developed and supplied the entire HBM lineup from the first generation (HBM1) to the fifth generation (HBM3E), since releasing the world's first HBM in 2013. The company plans to continue its leadership in the AI memory market, addressing the growing needs of AI companies by being the first in the industry to mass-produce the 12-layer HBM3E.

Read full story

AMD Completes Acquisition of Silo AI

Press Release by

Aug 12th, 2024 08:15 Discuss (34 Comments)

AMD today announced the completion of its acquisition of Silo AI, the largest private AI lab in Europe. The all-cash transaction valued at approximately $665 million furthers the company's commitment to deliver end-to-end AI solutions based on open standards and in strong partnership with the global AI ecosystem. Silo AI brings a team of world-class AI scientists and engineers to AMD experienced in developing cutting-edge AI models, platforms and solutions for large enterprise customers including Allianz, Philips, Rolls-Royce and Unilever. Their expertise spans diverse markets and they have created state-of-the-art open source multilingual Large Language Models (LLMs) including Poro and Viking on AMD platforms. The Silo AI team will join the AMD Artificial Intelligence Group (AIG), led by AMD Senior Vice President Vamsi Boppana.

"AI is our number one strategic priority, and we continue to invest in both the talent and software capabilities to support our growing customer deployments and roadmaps," said Vamsi Boppana, AMD senior vice president, AIG. "The Silo AI team has developed state-of-the-art language models that have been trained at scale on AMD Instinct accelerators and they have broad experience developing and integrating AI models to solve critical problems for end customers. We expect their expertise and software capabilities will directly improve the experience for customers in delivering the best performing AI solutions on AMD platforms."

Ubisoft Exploring Generative AI, Could Revolutionize NPC Narratives

Press Release by

Mar 19th, 2024 13:45 Discuss (13 Comments)

Have you ever dreamed of having a real conversation with an NPC in a video game? Not just one gated within a dialogue tree of pre-determined answers, but an actual conversation, conducted through spontaneous action and reaction? Lately, a small R&D team at Ubisoft's Paris studio, in collaboration with Nvidia's Audio2Face application and Inworld's Large Language Model (LLM), have been experimenting with generative AI in an attempt to turn this dream into a reality. Their project, NEO NPC, uses GenAI to prod at the limits of how a player can interact with an NPC without breaking the authenticity of the situation they are in, or the character of the NPC itself.

Considering that word—authenticity—the project has had to be a hugely collaborative effort across artistic and scientific disciplines. Generative AI is a hot topic of conversation in the videogame industry, and Senior Vice President of Production Technology Guillemette Picard is keen to stress that the goal behind all genAI projects at Ubisoft is to bring value to the player; and that means continuing to focus on human creativity behind the scenes. "The way we worked on this project, is always with our players and our developers in mind," says Picard. "With the player in mind, we know that developers and their creativity must still drive our projects. Generative AI is only of value if it has value for them."

Read full story

MAINGEAR Introduces PRO AI Workstations Featuring aiDAPTIV+ For Cost-Effective Large Language Model Training

Press Release by

Mar 18th, 2024 10:06 Discuss (3 Comments)

MAINGEAR, a leading provider of high-performance custom PC systems, and Phison, a global leader in NAND controllers and storage solutions, today unveiled groundbreaking MAINGEAR PRO AI workstations with Phison's aiDAPTIV+ technology. Specifically engineered to democratize Large Language Model (LLM) development and training for small and medium-sized businesses (SMBs), these ultra-powerful workstations incorporate aiDAPTIV+ technology to deliver supercomputer LLM training capabilities at a fraction of the cost of traditional AI training servers.

As the demand for large-scale generative AI models continues to surge and their complexity increases, the potential for LLMs also expands. However, this rapid advancement in LLM AI technology has led to a notable boost in hardware requirements, making model training cost-prohibitive and inaccessible for many small to medium businesses.

Read full story

AMD Publishes User Guide for LM Studio - a Local AI Chatbot

by

Mar 8th, 2024 13:17 Discuss (18 Comments)

AMD has caught up with NVIDIA and Intel in the race to get a locally run AI chatbot up and running on its respective hardware. Team Red's community hub welcomed a new blog entry on Wednesday—AI staffers published a handy "How to run a Large Language Model (LLM) on your AMD Ryzen AI PC or Radeon Graphics Card" step-by-step guide. They recommend that interested parties are best served by downloading the correct version of LM Studio. Their CPU-bound Windows variant—designed for higher-end Phoenix and Hawk Point chips—compatible Ryzen AI PCs can deploy instances of a GPT based LLM-powered AI chatbot. The LM Studio ROCm technical preview functions similarly, but is reliant on Radeon RX 7000 graphics card ownership. Supported GPU targets include: gfx1100, gfx1101 and gfx1102.

AMD believes that: "AI assistants are quickly becoming essential resources to help increase productivity, efficiency or even brainstorm for ideas." Their blog also puts a spotlight on LM Studio's offline functionality: "Not only does the local AI chatbot on your machine not require an internet connection—but your conversations stay on your local machine." The six-step guide invites curious members to experiment with a handful of large language models—most notably Mistral 7b and LLAMA v2 7b. They thoroughly recommend that you select options with "Q4 K M" (AKA 4-bit quantization). You can learn about spooling up "your very own AI chatbot" here.

Apple Wants to Store LLMs on Flash Memory to Bring AI to Smartphones and Laptops

by

Dec 21st, 2023 08:47 Discuss (24 Comments)

Apple has been experimenting with Large Language Models (LLMs) that power most of today's AI applications. The company wants these LLMs to serve the users best and deliver them efficiently, which is a difficult task as they require a lot of resources, including compute and memory. Traditionally, LLMs have required AI accelerators in combination with large DRAM capacity to store model weights. However, Apple has published a paper that aims to bring LLMs to devices with limited memory capacity. By storing LLMs on NAND flash memory (regular storage), the method involves constructing an inference cost model that harmonizes with the flash memory behavior, guiding optimization in two critical areas: reducing the volume of data transferred from flash and reading data in larger, more contiguous chunks. Instead of storing the model weights on DRAM, Apple wants to utilize flash memory to store weights and only pull them on-demand to DRAM once it is needed.

Two principal techniques are introduced within this flash memory-informed framework: "windowing" and "row-column bundling." These methods collectively enable running models up to twice the size of the available DRAM, with a 4-5x and 20-25x increase in inference speed compared to native loading approaches on CPU and GPU, respectively. Integrating sparsity awareness, context-adaptive loading, and a hardware-oriented design pave the way for practical inference of LLMs on devices with limited memory, such as SoCs with 8/16/32 GB of available DRAM. Especially with DRAM prices outweighing NAND Flash, setups such as smartphone configurations could easily store and inference LLMs with multi-billion parameters, even if the DRAM available isn't sufficient. For a more technical deep dive, read the paper on arXiv here.

AMD Reports Third Quarter 2023 Financial Results, Revenue Up 4% YoY

Press Release by

Oct 31st, 2023 15:28 Discuss (23 Comments)

AMD (NASDAQ:AMD) today announced revenue for the third quarter of 2023 of $5.8 billion, gross margin of 47%, operating income of $224 million, net income of $299 million and diluted earnings per share of $0.18. On a non-GAAP basis, gross margin was 51%, operating income was $1.3 billion, net income was $1.1 billion and diluted earnings per share was $0.70.

"We delivered strong revenue and earnings growth driven by demand for our Ryzen 7000 series PC processors and record server processor sales," said AMD Chair and CEO Dr. Lisa Su. "Our data center business is on a significant growth trajectory based on the strength of our EPYC CPU portfolio and the ramp of Instinct MI300 accelerator shipments to support multiple deployments with hyperscale, enterprise and AI customers."

Read full story

Lenovo Group Releases First Quarter Results 2023/24

Press Release by

Aug 17th, 2023 03:04 Discuss (7 Comments)

Lenovo Group today announced first quarter results, reporting Group revenue of US$12.9 billion and net income of US$191 million on a non-Hong Kong Financial Reporting Standards (HKFRS) basis. Revenue from the non-PC businesses accounted for 41% of Group revenue, with the service-led business achieving strong growth and sustained profitability - further demonstrating the effectiveness of Lenovo's intelligent transformation strategy.

The Group continues to take proactive actions to keep its Expenses-to-Revenue (E/R) ratio resilient and drive sustainable profitability, whilst also investing for growth and transformation. It remains committed to doubling investment in innovation in the mid-term, including an additional US$1 billion investment over three years to accelerate artificial intelligence (AI) deployment for businesses around the world - specifically AI devices, AI infrastructure, and AI solutions.

Read full story

Cerebras and G42 Unveil World's Largest Supercomputer for AI Training with 4 ExaFLOPS

Press Release by

Jul 21st, 2023 02:05 Discuss (16 Comments)

Cerebras Systems, the pioneer in accelerating generative AI, and G42, the UAE-based technology holding group, today announced Condor Galaxy, a network of nine interconnected supercomputers, offering a new approach to AI compute that promises to significantly reduce AI model training time. The first AI supercomputer on this network, Condor Galaxy 1 (CG-1), has 4 exaFLOPs and 54 million cores. Cerebras and G42 are planning to deploy two more such supercomputers, CG-2 and CG-3, in the U.S. in early 2024. With a planned capacity of 36 exaFLOPs in total, this unprecedented supercomputing network will revolutionize the advancement of AI globally.

"Collaborating with Cerebras to rapidly deliver the world's fastest AI training supercomputer and laying the foundation for interconnecting a constellation of these supercomputers across the world has been enormously exciting. This partnership brings together Cerebras' extraordinary compute capabilities, together with G42's multi-industry AI expertise. G42 and Cerebras' shared vision is that Condor Galaxy will be used to address society's most pressing challenges across healthcare, energy, climate action and more," said Talal Alkaissi, CEO of G42 Cloud, a subsidiary of G42.

Read full story

OpenAI Degrades GPT-4 Performance While GPT-3.5 Gets Better

by

Jul 19th, 2023 08:17 Discuss (9 Comments)

When OpenAI announced its GPT-4 model, it first became a part of ChatGPT, behind the paywall for premium users. The GPT-4 is the latest installment in the Generative Pretrained Transformer (GPT) Large Language Models (LLMs). The GPT-4 aims to be a more capable version than the GPT-3.5 that powered ChatGPT at first, which was capable once it launched. However, it seems like the performance of GPT-4 has been steadily dropping since its introduction. Many users noted the regression, and today we have researchers from Stanford University and UC Berkeley, who benchmarked the GPT-4 performance in March 2023, and the model's performance in June 2023 in tasks like solving math problems, visual reasoning, code generation, and answering sensitive questions.

The results? The paper shows that GPT-4 performance has been significantly degraded in all the tasks. This could be attributed to improving stability, lowering the massive compute demand, and much more. What is unexpected, GPT-3.5 experienced a significant uplift in the same period. Below, you can see the examples that were benchmarked by the researchers, which also compare GTP-4 and GPT-3.5 performance in all cases.

Oracle Fusion Cloud HCM Enhanced with Generative AI, Projected to Boost HR Productivity

Press Release by

Jun 28th, 2023 09:57 Discuss (2 Comments)

Oracle today announced the addition of generative AI-powered capabilities within Oracle Fusion Cloud Human Capital Management (HCM). Supported by the Oracle Cloud Infrastructure (OCI) generative AI service, the new capabilities are embedded in existing HR processes to drive faster business value, improve productivity, enhance the candidate and employee experience, and streamline HR processes.

"Generative AI is boosting productivity and unlocking a new world of skills, ideas, and creativity that can have an immediate impact in the workplace," said Chris Leone, executive vice president, applications development, Oracle Cloud HCM. "With the ability to summarize, author, and recommend content, generative AI helps to reduce friction as employees complete important HR functions. For example, with the new embedded generative AI capabilities in Oracle Cloud HCM, our customers will be able to take advantage of large language models to drastically reduce the time required to complete tasks, improve the employee experience, enhance the accuracy of workforce insights, and ultimately increase business value."

Read full story

Return to Keyword Browsing

Jul 16th, 2025 04:31 CDT change timezone

Latest GPU Drivers

New Forum Posts

04:29 by de.das.dude
AI Job Losses: let's count the losses up, total losses to AI so far 94,000 and counting (11)
04:07 by avidgamer121
What's your latest tech purchase? (24282)
04:05 by Greenslade
TPU's Nostalgic Hardware Club (20515)
03:54 by Greenslade
Best motherboards for XP gaming (159)
03:31 by _roman_
Choosing the right motherboard (13)
03:23 by AVATARAT
[GPU-Z Test Build] New Kernel Driver, Everyone: Please Test (98)
03:20 by Rover4444
Is this dual channel or async? (4)
03:20 by leeamtheone
Need a new webhost, any suggestions? (7)
03:17 by AVATARAT
6400c30 vs 8000c36 Ryzen 9800X3D (28)
03:13 by Rover4444
Game Informer magazine is back! (17)

Popular Reviews

Jul 14th, 2025 MSI GeForce RTX 5060 Gaming OC Review
Jul 15th, 2025 SilverStone SETA H2 Review
Jul 11th, 2025 Our Visit to the Hunter Super Computer
Jul 11th, 2025 Lexar NM1090 Pro 4 TB Review
Jul 4th, 2025 NVIDIA GeForce RTX 5050 8 GB Review
Jul 9th, 2025 Fractal Design Epoch RGB TG Review
Jun 20th, 2025 Sapphire Radeon RX 9060 XT Pulse OC 16 GB Review - An Excellent Choice
Nov 6th, 2024 AMD Ryzen 7 9800X3D Review - The Best Gaming Processor
May 13th, 2025 Upcoming Hardware Launches 2025 (Updated May 2025)
Jul 8th, 2025 Corsair FRAME 5000D RS Review

TPU on YouTube

Controversial News Posts