News Posts matching #NVIDIA


US Government Wants Nuclear Plants to Offload AI Data Center Expansion

The expansion of AI technology affects not only the production of and demand for graphics cards but also the electricity grid that powers them. Data centers hosting thousands of GPUs are becoming more common, and the industry has been building new facilities packed with GPU-accelerated servers to serve the need for more AI. These powerful GPUs often consume over 500 W per card, and NVIDIA's latest Blackwell B200 GPU has a TGP of 1,000 W, a full kilowatt. Such kilowatt-class GPUs will be deployed in data centers housing tens of thousands of cards, resulting in multi-megawatt facilities. To manage the load on the national electricity grid, US President Joe Biden's administration has been discussing with big tech companies a re-evaluation of their power sources, possibly including smaller nuclear plants. In an Axios interview, Energy Secretary Jennifer Granholm noted that "AI itself isn't a problem because AI could help to solve the problem." The real problem is the national electricity grid, which can't sustain the rapid expansion of AI data centers.
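For a rough sense of the scale involved, the figures above can be turned into a back-of-the-envelope estimate. This is a minimal sketch: the 1 kW per-GPU figure comes from the B200 TGP cited above, while the GPU count and the cooling/infrastructure overhead factor are assumptions chosen purely for illustration.

gpu_tgp_watts = 1000        # NVIDIA B200 TGP, per the article
gpu_count = 20_000          # assumed size of a large AI data center (hypothetical)
overhead_factor = 1.3       # assumed PUE-style cooling/infrastructure overhead (hypothetical)

facility_watts = gpu_tgp_watts * gpu_count * overhead_factor
print(f"Estimated facility load: {facility_watts / 1e6:.1f} MW")   # ~26 MW

Even with modest assumptions, a single facility lands in the tens of megawatts, which is the kind of load the DOE and the hyperscalers are trying to take off the grid.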

The Department of Energy (DOE) has reportedly been talking with firms, most notably hyperscalers like Microsoft, Google, and Amazon, about considering nuclear fusion and fission power plants to satisfy the needs of AI expansion. We have already covered Microsoft's plan to embed a nuclear reactor near one of its data center facilities to help carry the load of thousands of GPUs running AI training and inference. This time, however, it is not just Microsoft: other tech giants are reportedly considering nuclear as well. They all need to offload their AI expansion from the US national power grid and develop a nuclear solution. Nuclear power accounts for only about 20% of US electricity generation, and the DOE is currently providing $1.52 billion in financing to restore and resume service at Holtec's 800 MW Palisades nuclear generating station. Microsoft is investing in a small modular reactor (SMR) and microreactor energy strategy, which could serve as an example for other big tech companies to follow.

ASUS ROG Strix GeForce RTX 4090 D Tweaked to Match RTX 4090 FE Performance

NVIDIA's GeForce RTX 4090 D GPU was launched late last year in China—this weakened variant of the standard RTX 4090 was designed with US trade regulations in mind. Chinese media outlets have toyed around with various custom models for several months—January 2024 evaluations indicated a 5% performance disadvantage when lined up against unrestricted models. The GeForce RTX 4090 D remains a potent beast despite its reduced core count and restricted TDP limit, but Chinese enthusiasts have continued to struggle to implement worthwhile overclocks. HKEPC—a Hong Kong-based PC hardware review outlet—has bucked that trend.

The mega-sized flagship ZOTAC RTX 4090 D PGF model has the technical credentials to break beyond the expected overclock gain of "2 to 5%," courtesy of a powerful 28-phase power-delivery PCB design and a 530 W maximum TGP limit. Even so, the Expreview team pulled only a paltry 3.7% of extra performance from ZOTAC China's behemoth. In contrast, HKEPC wrangled bigger numbers out of a sampled ASUS ROG Strix RTX 4090 D GAMING OC graphics card—matching unrestricted variants: "it turns out that NVIDIA only prohibits AIC manufacturers from presetting overclocks; it does not restrict users from overclocking themselves. After a high degree of overclocking adjustment, the ROG Strix RTX 4090 D actually has a way to achieve the performance level of the RTX 4090 FE."

Latest Dragon's Dogma 2 Update Improves DLSS Quality, Fixes Bugs, and More

Capcom has released the newest update for Dragon's Dogma 2 on PC and PlayStation 5, while Xbox will get it in the next few days. On PC, the update improves image quality when DLSS Super Resolution is enabled and fixes various bugs. The Dragon's Dogma 2 developers had earlier announced an investigation into PC performance and stability issues (specifically frame rate problems, various crashes and bugs, and the missing option to start a new game), which this update addresses.

According to the release notes published by Capcom, PC-specific updates include improved quality when NVIDIA DLSS Super Resolution is enabled and a fix for an issue related to the display of models under some specific settings. On both PlayStation 5 and PC, the update adds the option to start a new game when save data already exists, changes the number of "Art of Metamorphosis" items available in Pawn Shops to 99, makes the dwelling quest available earlier in the game, and fixes various bugs and text display issues. On PlayStation 5, the update also adds options to switch Motion Blur and Ray Tracing on or off, as well as to cap the frame rate at 30 FPS. Capcom notes that the motion blur and ray tracing options should not affect frame rate significantly, and that improvements to frame rate are planned for future updates. The release notes also say that updates to the Xbox Series X/S version of the game are planned for the next few days.

PGL Investigating GeForce RTX 4080 GPU Driver Crash, Following Esports Event Disruption

The Professional Gamers League (PGL) showcased its newly upgraded tournament rig specification prior to the kick-off of its (still ongoing) CS2 Major Copenhagen 2024 esports event. As reported over a week ago, competitors have been treated to modern systems decked out with AMD's popular gaming-oriented Ryzen 7 7800X3D CPU and NVIDIA GeForce RTX 4080 graphics cards, while BenQ's ZOWIE XL2566K 24.5" 360 Hz gaming monitor delivers a superfast visual feed. A hefty chunk of change has been spent on new hardware, but even expensive cutting-edge tech can falter. Virtus.pro team member Jame experienced a major software crash during a match against rival group G2.

PCGamesN noted that this frustrating incident ended the affected team's chance to grab a substantial cash reward. Their report put a spotlight on this unfortunate moment: "in the second round of a best of three, Virtus Pro were a few rounds away from qualifying for the playoffs, only for their aspirations to be squashed through no fault of their own...Jame experiences a graphics card driver crash that irrecoverably steers the round in G2's favor, culminating in Virtus Pro losing the match 11-13. Virtus Pro would then go on to lose the subsequent tie-break match as the round was not replayed. In effect, the graphics card driver crash partly cost the team their chance at winning an eventual $1.25 million prize pool." PGL revealed, via a social media post, that officials are doing some detective work: "we wish to clarify the situation involving Jame during the second map, Inferno, in the series against G2. A technical malfunction occurred due to an NVIDIA driver crash, resulting in a game crash. We are continuing our investigation into the matter." The new tournament rigs were "meticulously optimized" and tested in the weeks leading up to CS2 Major Copenhagen 2024—it is believed that the driver crash was a random anomaly. PGL and NVIDIA are currently working on a way to "identify and fix the issue."

Lenovo Anticipates Great Demand for AMD Instinct MI300X Accelerator Products

Ryan McCurdy, President of Lenovo North America, revealed an ambitious, forward-thinking product roadmap during an interview with CRN magazine. A hybrid strategic approach will create an anticipated AI fast lane on future hardware—McCurdy, an Intel veteran, stated: "there will be a steady stream of product development to add (AI PC) hardware capabilities in a chicken-and-egg scenario for the OS and for the (independent software vendor) community to develop their latest AI capabilities on top of that hardware...So we are really paving the AI autobahn from a hardware perspective so that we can get the AI software cars to go faster on them." Lenovo—as expected—is jumping on the AI-on-device train, but it will be diversifying its range of AI server systems with new AMD and Intel-powered options. The company has reacted to recent Team Green AI GPU supply issues—alternative units are now in the picture: "with NVIDIA, I think there's obviously lead times associated with it, and there's some end customer identification, to make sure that the products are going to certain identified end customers. As we showcased at Tech World with NVIDIA on stage, AMD on stage, Intel on stage and Microsoft on stage, those industry partnerships are critical to not only how we operate on a tactical supply chain question but also on a strategic what's our value proposition."

McCurdy did not go into detail about upcoming Intel-based server equipment, but seemed excited about AMD's Instinct MI300X accelerator—Lenovo was (previously) announced as one of the early OEM takers of Team Red's latest CDNA 3.0 tech. CRN asked about the firm's outlook for upcoming MI300X-based inventory—McCurdy responded with: "I won't comment on an unreleased product, but the partnership I think illustrates the larger point, which is the industry is looking for a broad array of options. Obviously, when you have any sort of lead times, especially six-month, nine-month and 12-month lead times, there is interest in this incredible technology to be more broadly available. I think you could say in a very generic sense, demand is as high as we've ever seen for the product. And then it comes down to getting the infrastructure launched, getting testing done, and getting workloads validated, and all that work is underway. So I think there is a very hungry end customer-partner user base when it comes to alternatives and a more broad, diverse set of solutions."

GeForce NOW Thursday: Get Cozy With "Palia" & Five New Titles

Ease into spring with the warm, cozy vibes of Palia, coming to the cloud this GFN Thursday. It's part of six new titles joining the GeForce NOW library of over 1,800 games. Welcome home: escape to a cozy world with Palia, a free-to-play massively multiplayer online game from Singularity 6 Corporation. The game, which has made its way onto more than 200,000 wishlists on Steam, has launched in the cloud this week.

Farm, fish, craft and explore with friendly villagers across a stunning variety of biomes—from sprawling flower fields to hilly forests and rocky beaches—in the world of Palia. Inhabit the land, furnish a dream home, unravel ancient mysteries and interact with a vibrant online community. Get ready for a captivating adventure across devices by streaming Palia from the cloud. GeForce NOW Ultimate and Priority members get faster access to servers and longer gaming sessions than Free members.

NVIDIA Hopper Leaps Ahead in Generative AI at MLPerf

It's official: NVIDIA delivered the world's fastest platform in industry-standard tests for inference on generative AI. In the latest MLPerf benchmarks, NVIDIA TensorRT-LLM—software that speeds and simplifies the complex job of inference on large language models—boosted the performance of NVIDIA Hopper architecture GPUs on the GPT-J LLM nearly 3x over their results just six months ago. The dramatic speedup demonstrates the power of NVIDIA's full-stack platform of chips, systems and software to handle the demanding requirements of running generative AI. Leading companies are using TensorRT-LLM to optimize their models. And NVIDIA NIM—a set of inference microservices that includes inferencing engines like TensorRT-LLM—makes it easier than ever for businesses to deploy NVIDIA's inference platform.

Raising the Bar in Generative AI
TensorRT-LLM running on NVIDIA H200 Tensor Core GPUs—the latest, memory-enhanced Hopper GPUs—delivered the fastest performance running inference in MLPerf's biggest test of generative AI to date. The new benchmark uses the largest version of Llama 2, a state-of-the-art large language model packing 70 billion parameters. The model is more than 10x larger than the GPT-J LLM first used in the September benchmarks. The memory-enhanced H200 GPUs, in their MLPerf debut, used TensorRT-LLM to produce up to 31,000 tokens/second, a record on MLPerf's Llama 2 benchmark. The H200 GPU results include up to 14% gains from a custom thermal solution. It's one example of innovations beyond standard air cooling that systems builders are applying to their NVIDIA MGX designs to take the performance of Hopper GPUs to new heights.

NVIDIA GeForce RTX 4060 Slides Down to $279

With competition in the performance segment of graphics cards heating up, the GeForce RTX 4060 "Ada" finds itself embattled at its $299 price point, with the Radeon RX 7600 XT at $325 and the RX 7600 (non-XT) down to $250. This has prompted a retailer-level price cut for a Zotac-branded RTX 4060 graphics card. The Zotac RTX 4060 Twin Edge OC White is listed on Newegg for $279, which puts it $20 below the NVIDIA MSRP. The RTX 4060 is squarely a 1080p-class GPU, designed for AAA gameplay with maxed-out settings and ray tracing. The one ace the RTX 4060 holds over similarly priced GPUs from the previous generation is DLSS 3 Frame Generation. Our most recent testing puts the RX 7600 within 2% of the RTX 4060 in 1080p raster workloads, although the RTX 4060 is significantly ahead in ray tracing performance, by around 16%.

Outpost: Infinity Siege Launches With DLSS 3 & New DLSS 2 Games Out Now

Over 500 games and applications feature RTX technologies, and barely a week goes by without new blockbuster games and incredible indie releases integrating NVIDIA DLSS, NVIDIA Reflex, and advanced ray-traced effects to deliver the definitive PC experience for GeForce RTX gamers.

This week, we're highlighting the DLSS 3-accelerated release of Outpost: Infinity Siege, and the launch of Alone In The Dark and Lightyear Frontier, which both feature DLSS 2. This batch of great new RTX releases follows Horizon Forbidden West Complete Edition, which boasted day-one support for NVIDIA DLSS 3, NVIDIA DLAA, and NVIDIA Reflex. Additionally, Diablo IV's ray tracing update is out now—learn more about each new announcement below.

CyberpowerPC Releases New Tracer VIII Gaming Laptops

CyberPowerPC, a leading name in the gaming PC industry, today announced its latest gaming laptops, the Tracer VIII Series. These gaming laptops come in three models with different feature sets for all levels of gaming. The first model features a crisp 17.4" WQXGA 2560x1600 240 Hz screen complemented by 14th Gen Intel mobile processors and NVIDIA GeForce RTX 40 Series graphics. The keyboard is mechanical with RGB backlighting, making for a comfortable gaming and typing experience, and the laptop comes ready for an optional detachable liquid cooler.

The second model features a clear 16" WQXGA 2560x1600 240 Hz 100% sRGB display with 14th Gen Intel Core mobile processors and NVIDIA GeForce RTX 40 Series graphics, plus a mechanical RGB-backlit keyboard for a pleasant tactile typing experience. The third model gives you the best of the Edge and Gaming models, such as 14th Gen Intel mobile processors and NVIDIA GeForce RTX 40 Series graphics, in a smaller, thinner package. A minimalistic design keeps things lightweight and portable, and a 180-degree screen hinge allows the laptop to lie completely flat for ultimate viewing flexibility. The charger also gets a makeover, with a 240 W ultra-slim adapter for better charging on the go. This model is available with a 15.3" WQXGA 2560x1600 120 Hz 100% sRGB or a 16" WQXGA 2560x1600 165 Hz 100% sRGB display.

NVIDIA Modulus & Omniverse Drive Physics-informed Models and Simulations

A manufacturing plant near Hsinchu, Taiwan's Silicon Valley, is among facilities worldwide boosting energy efficiency with AI-enabled digital twins. A virtual model can help streamline operations, maximizing throughput for its physical counterpart, say engineers at Wistron, a global designer and manufacturer of computers and electronics systems. In the first of several use cases, the company built a digital copy of a room where NVIDIA DGX systems undergo thermal stress tests. Early results were impressive.

Making Smart Simulations
Using NVIDIA Modulus, a framework for building AI models that understand the laws of physics, Wistron created digital twins that let them accurately predict the airflow and temperature in test facilities that must remain between 27 and 32 degrees C. A simulation that would've taken nearly 15 hours with traditional methods on a CPU took just 3.3 seconds on an NVIDIA GPU running inference with an AI model developed using Modulus, a whopping 15,000x speedup. The results were fed into tools and applications built by Wistron developers with NVIDIA Omniverse, a platform for creating 3D workflows and applications based on OpenUSD.
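The quoted speedup is easy to sanity-check with the two figures Wistron reports, treated as approximate values:

cpu_simulation_hours = 15        # traditional CPU-based simulation, per the article
gpu_inference_seconds = 3.3      # Modulus-trained AI surrogate on an NVIDIA GPU, per the article

speedup = (cpu_simulation_hours * 3600) / gpu_inference_seconds
print(f"Speedup: ~{speedup:,.0f}x")   # roughly 16,000x, consistent with the quoted ~15,000x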

Product Pages of Samsung 28 Gbps and 32 Gbps GDDR7 Chips Go Live

Samsung is ready with a GDDR7 memory chip rated at an oddly specific 28 Gbps. This speed aligns with the reported default memory speeds of next-generation NVIDIA GeForce RTX "Blackwell" GPUs. The Samsung GDDR7 memory chip bearing model number K4VAF325ZC-SC28 ticks at 3500 MHz, yielding 28 Gbps (GDDR7-effective) memory speeds, and comes with a density of 16 Gbit (2 GB). This isn't Samsung's only GDDR7 chip at launch; the company also has a 32 Gbps high-performance part, built in the hope that certain high-end SKUs or professional graphics cards may implement it. The 32 Gbps GDDR7 chip, bearing model number K4VAF325ZC-SC32, offers the same 16 Gbit density, but at a higher 4000 MHz clock. Samsung's part-identification pages for both chips say that the parts are sampling to customers, a stage that usually comes just before mass production, and both are marked "shipping."
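To see what those per-pin data rates mean at the card level, the aggregate bandwidth can be computed for an assumed bus width. The sketch below uses a 256-bit bus as a hypothetical example; it is not a confirmed specification for any Blackwell SKU.

def bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Aggregate memory bandwidth in GB/s for a given per-pin rate and bus width."""
    return data_rate_gbps * bus_width_bits / 8

for rate in (28, 32):   # Samsung's two launch parts, per the article
    print(f"{rate} Gbps on a 256-bit bus: {bandwidth_gb_s(rate, 256):.0f} GB/s")
# 28 Gbps -> 896 GB/s, 32 Gbps -> 1024 GB/s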

Tiny Corp. Prepping Separate AMD & NVIDIA GPU-based AI Compute Systems

George Hotz and his startup operation (Tiny Corporation) appeared ready to completely abandon AMD Radeon GPUs last week, after experiencing a period of firmware-related headaches. The original plan involved the development of a pre-orderable $15,000 TinyBox AI compute cluster that housed six XFX Speedster MERC310 RX 7900 XTX graphics cards, but software/driver issues prompted experimentation via alternative hardware routes. A lot of media coverage has focused on the unusual adoption of consumer-grade GPUs—Tiny Corp.'s struggles with RDNA 3 (rather than CDNA 3) were maneuvered further into public view, after top AMD brass pitched in.

The startup's social media feed is very transparent about showcasing everyday tasks, problem-solving and important decision-making. Several Acer Predator BiFrost Arc A770 OC cards were purchased and promptly integrated into a colorfully-lit TinyBox prototype, but Hotz & Co. swiftly moved on to Team Green pastures. Tiny Corp. has begrudgingly adopted NVIDIA GeForce RTX 4090 GPUs. Earlier today, it was announced that work on the AMD-based system has resumed—although customers were forewarned about anticipated teething problems. The surprising message arrived in the early hours: "a hard to find 'umr' repo has turned around the feasibility of the AMD TinyBox. It will be a journey, but it gives us an ability to debug. We're going to sell both, red for $15,000 and green for $25,000. When you realize your pre-order you'll choose your color. Website has been updated. If you like to tinker and feel pain, buy red. The driver still crashes the GPU and hangs sometimes, but we can work together to improve it."

Samsung Introduces "Petabyte SSD as a Service" at GTC 2024, "Petascale" Servers Showcased

Leaked Samsung PBSSD presentation material popped up online a couple of days prior to the kick-off day of NVIDIA's GTC 2024 conference (March 18)—reports (at the time) jumped on the potential introduction of a "petabyte (PB)-level SSD solution," alongside an enterprise subscription service for the US market. Tom's Hardware took the time to investigate this matter—in-person—on the showroom floor up in San Jose, California. It turns out that interpretations of pre-event information were slightly off—according to on-site investigations: "despite the name, PBSSD is not a petabyte-scale solid-state drive (Samsung's highest-capacity drive can store circa 240 TB), but rather a 'petascale' storage system that can scale-out all-flash storage capacity to petabytes."

Samsung showcased a Supermicro Petascale server design, but a lone unit is nowhere near capable of providing a petabyte of storage—the Tom's Hardware reporter found out that the demonstration model housed: "sixteen 15.36 TB SSDs, so for now the whole 1U unit can only pack up to 245.76 TB of 3D NAND storage (which is pretty far from a petabyte), so four of such units will be needed to store a petabyte of data." Company representatives also had another Supermicro product at their booth: "(an) H13 all-flash petascale system with CXL support that can house eight E3.S SSDs (with) four front-loading E3.S CXL bays for memory expansion."
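The capacity arithmetic behind the "petascale" label follows directly from the figures in the Tom's Hardware quote:

ssd_capacity_tb = 15.36     # per-drive capacity, per the quote
ssds_per_1u = 16            # drives per 1U unit, per the quote

per_unit_tb = ssd_capacity_tb * ssds_per_1u
print(f"Per 1U unit: {per_unit_tb:.2f} TB")       # 245.76 TB
print(f"Four units:  {4 * per_unit_tb:.2f} TB")   # 983.04 TB, i.e. roughly a petabyte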

NVIDIA's Bryan Catanzaro Discusses Future of AI Personal Computing

Imagine a world where you can whisper your digital wishes into your device, and poof, it happens. That world may be coming sooner than you think. But if you're worried about AI doing your thinking for you, you might be waiting for a while. In a fireside chat Wednesday (March 20) at NVIDIA GTC, the global AI conference, Kanjun Qiu, CEO of Imbue, and Bryan Catanzaro, VP of applied deep learning research at NVIDIA, challenged many of the clichés that have long dominated conversations about AI. Launched in October 2022, Imbue made headlines with its Series B fundraiser last year, raising over $200 million at a $1 billion valuation.

The Future of Personal Computing
Qiu and Catanzaro discussed the role that virtual worlds will play in this, and how they could serve as interfaces for human-technology interaction. "I think it's pretty clear that AI is going to help build virtual worlds," said Catanzaro. "I think the maybe more controversial part is virtual worlds are going to be necessary for humans to interact with AI." People have an almost primal fear of being displaced, Catanzaro said, but what's much more likely is that our capabilities will be amplified as the technology fades into the background. Catanzaro compared it to the adoption of electricity. A century ago, people talked a lot about electricity. Now that it's ubiquitous, it's no longer the focus of broader conversations, even as it makes our day-to-day lives better.

NVIDIA GeForce RTX 4060, 4060 Ti & 4070 GPU Refreshes Spotted in Leak

NVIDIA completed its last round of GeForce RTX 40-series GPU refreshes at the very end of January—new evidence suggests that another wave is scheduled for imminent release. MEGAsizeGPU has acquired and shared a tabulated list of new Ada Lovelace GPU variants—the trusted leaker's post presents a timetable that was supposed to kick off within the second half of this month. First up is the GeForce RTX 4070, with a current designation of AD104-251—the leaked table suggests that a new variant, AD103-175-KX, is due very soon (or is overdue). Wccftech pointed out that the new ID was previously linked to NVIDIA's GeForce RTX 4070 SUPER SKU. Moving into April, next up is the GeForce RTX 4060 Ti, jumping from the current AD106-351 die to a new unit, AD104-150-KX. The third adjustment (allegedly) affects the GeForce RTX 4060, going from AD107-400 to AD106-255, also timetabled for next month. MEGAsizeGPU reckons that Team Green will be swapping chips but not rolling out broadly adjusted specifications—a best-case scenario could include higher CUDA, RT, and Tensor core counts. According to VideoCardz, the new die designations have popped up in freshly released official driver notes—it is inferred that the variants are getting an "under the radar" launch treatment.

EMTEK Launches GeForce RTX 4070 SUPER MIRACLE X3 White 12 GB Graphics Card

EMTEK products rarely pop up in TPU's news section, but the GPU database contains a smattering of the South Korean manufacturer's Ampere-based GeForce RTX graphics cards. VideoCardz has discovered an updated MIRACLE X3 White model—EMTEK's latest release is a GeForce RTX 4070 SUPER 12 GB card. The triple-fan model seems to stick with NVIDIA's reference specifications—VideoCardz also noticed a physical similarity: "under the cooler shroud, the card boasts a non-standard U-shaped PCB, reminiscent of Team Green's Founders Edition. However, it remains uncertain whether EMTEK utilizes the same PCB as NVIDIA." The asking price of ₩919,990 converts to around $680 when factoring in regional taxes. EMTEK's MIRACLE X3 cooling solution seems fairly robust—featuring four 6 mm heat pipes—so the adherence to stock clocks is a slight surprise. The company's GAMING PRO line includes a couple of factory-overclocked options.

Nvidia CEO Reiterates Solid Partnership with TSMC

One key takeaway from the ongoing GTC is that Nvidia's AI empire has taken shape with strong partnerships with TSMC and other Taiwanese makers, such as the major server ODMs.

According to a news report from the technology-focused media outlet DIGITIMES Asia, during his keynote at GTC on March 18, Huang underscored his company's partnerships with TSMC, as well as the supply chain in Taiwan. Speaking to the press later, Huang said Nvidia will have very strong demand for CoWoS, the advanced packaging service TSMC offers.

Dragon's Dogma 2 Comes to NVIDIA GeForce NOW

Arise for a new adventure with Dragon's Dogma 2, leading two new titles joining the GeForce NOW library this week. Dragon's Dogma 2, the long-awaited sequel to Capcom's legendary action role-playing game, streams this week on GeForce NOW.

The game challenges players to choose their own experience, including their Arisen's appearance, vocation, party, approaches to different situations and more. Wield swords, bows and magick across an immersive fantasy world full of life and battle. But players won't be alone. Recruit Pawns - mysterious otherworldly beings - to aid in battle and work with other players' Pawns to fight the diverse monsters inhabiting the ever-changing lands. Upgrade to a GeForce NOW Ultimate membership to stream Dragon's Dogma 2 from NVIDIA GeForce RTX 4080 servers in the cloud for the highest performance, even on low-powered devices. Ultimate members also get exclusive access to servers to get right into gaming without waiting for any downloads.

NVIDIA to Implement GDDR7 Memory on Top-3 "Blackwell" GPUs

NVIDIA is confirmed to implement the GDDR7 memory standard with the top three GPU ASICs powering the next-generation "Blackwell" GeForce RTX 50-series, Tweaktown reports, citing XpeaGPU. By this, we mean the top three physical silicon types from which NVIDIA will carve out the majority of its SKUs. This would include the GB202, the GB203, and GB205; which will power successors to everything from the current RTX 4070 to the RTX 4090. NVIDIA is expected to build these chips on the TSMC 4N foundry node.

There will be certain GPU ASIC types in the "Blackwell" generation that stick to older memory standards such as GDDR6 or even GDDR6X. These would be successors to the current AD106 and AD107 ASICs, which power SKUs such as the RTX 4060 Ti and below. NVIDIA co-developed the GDDR6X standard with Micron Technology, which is the chip's exclusive supplier to NVIDIA. GDDR6X scales up to 23 Gbps and 16 Gbit densities, which means NVIDIA can extract plenty of performance for the lower end of its product stack using GDDR6X, especially considering that its GDDR7 implementation will only run at 28 Gbps, despite chips being available on the market at 32 Gbps, or even 36 Gbps. Even if NVIDIA chooses the regular GDDR6 standard for its entry-mainstream chips, that technology scales up to 20 Gbps.

Samsung Prepares Mach-1 Chip to Rival NVIDIA in AI Inference

During its 55th annual shareholders' meeting, Samsung Electronics announced its entry into the AI processor market with the upcoming launch of its Mach-1 AI accelerator chips in early 2025. The South Korean tech giant revealed its plans to compete with established players like NVIDIA in the rapidly growing AI hardware sector. The Mach-1 generation of chips is an application-specific integrated circuit (ASIC) design equipped with LPDDR memory and envisioned to excel in edge computing applications. While Samsung does not aim to directly rival NVIDIA's ultra-high-end AI solutions like the H100, B100, or B200, the company's strategy focuses on carving out a niche in the market by offering unique features and performance enhancements at the edge, where low-power, efficient computing matters most.

According to SeDaily, the Mach-1 chips boast a groundbreaking feature that significantly reduces memory bandwidth requirements for inference to approximately 0.125x of existing designs, an 87.5% reduction. This innovation could give Samsung a competitive edge in terms of efficiency and cost-effectiveness. As the demand for AI-powered devices and services continues to soar, Samsung's foray into the AI chip market is expected to intensify competition and drive innovation in the industry. While NVIDIA currently holds a dominant position, Samsung's cutting-edge technology and access to advanced semiconductor manufacturing nodes could make it a formidable contender. The Mach-1 has been field-verified on an FPGA, while the final design is currently going through physical design for the SoC, which includes placement, routing, and other layout optimizations.

NVIDIA CEO Jensen Huang: AGI Within Five Years, AI Hallucinations are Solvable

After giving a vivid GTC talk, NVIDIA CEO Jensen Huang held a Q&A session that raised many interesting ideas for debate. One of them addressed the pressing concerns surrounding AI hallucinations and the future of Artificial General Intelligence (AGI). With a tone of confidence, Huang reassured the tech community that the phenomenon of AI hallucinations—where AI systems generate plausible yet unfounded answers—is a solvable issue. His solution emphasizes the importance of well-researched and accurate data feeding into AI systems to mitigate these occurrences. "The AI shouldn't just answer; it should do research first to determine which of the answers are the best," noted Huang, adding that for every single question there should be a rule that makes the AI research the answer. This approach resembles Retrieval-Augmented Generation (RAG), where LLMs fetch data from external sources, such as additional databases, for fact-checking.
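For readers unfamiliar with the pattern, the sketch below shows the basic shape of RAG: retrieve supporting text first, then generate an answer grounded in it. This is a minimal, self-contained illustration; the toy corpus, keyword-overlap scoring, and generate_answer stub are hypothetical placeholders, not any specific NVIDIA or vendor API.

corpus = {
    "doc1": "The NVIDIA B200 'Blackwell' GPU has a TGP of roughly 1000 watts.",
    "doc2": "Samsung's launch GDDR7 parts are rated at 28 Gbps and 32 Gbps.",
}

def retrieve(query: str, k: int = 1) -> list[str]:
    """Naive keyword-overlap retrieval; production systems use vector embeddings instead."""
    query_terms = set(query.lower().split())
    scored = sorted(corpus.values(),
                    key=lambda text: len(query_terms & set(text.lower().split())),
                    reverse=True)
    return scored[:k]

def generate_answer(query: str) -> str:
    """Stub for the generation step: a real system would feed the retrieved context into an LLM prompt."""
    context = "\n".join(retrieve(query))
    return f"Answer grounded in: {context}"

print(generate_answer("What is the TGP of the Blackwell B200 GPU?"))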

Another interesting comment made by the CEO is that the pinnacle of AI evolution—Artificial General Intelligence—is just five years away. People working in AI are divided on the AGI timeline. While Huang predicted five years, some leading researchers, like Meta's Yann LeCun, think we are far from the AGI singularity threshold and will be stuck with dog- and cat-level AI systems first. AGI has long been a topic of both fascination and apprehension, with debates often revolving around its potential to exceed human intelligence and the ethical implications of such a development. Critics worry about the unpredictability and uncontrollability of AGI once it reaches a certain level of autonomy, raising questions about aligning its objectives with human values and priorities. Timeline-wise, no one knows; everyone makes their own prediction, so time will tell who was right.

Jensen Huang Discloses NVIDIA Blackwell GPU Pricing: $30,000 to $40,000

Jensen Huang has been talking to media outlets following the conclusion of his keynote presentation at NVIDIA's GTC 2024 conference—a CNBC TV "exclusive" interview with the Team Green boss has caused a stir in tech circles. Jim Cramer's long-running "Squawk on the Street" trade segment hosted Huang for just under five minutes—CNBC's presenter labelled the latest edition of GTC the "Woodstock of AI." NVIDIA's leader reckoned that around $1 trillion worth of industry was in attendance at this year's event—folks turned up to witness the unveiling of "Blackwell" B200 and GB200 AI GPUs. In the interview, Huang estimated that his company had invested around $10 billion into the research and development of its latest architecture: "we had to invent some new technology to make it possible."

Industry watchdogs have seized on a major revelation from the televised CNBC report—Huang disclosed that his next-gen AI GPUs "will cost between $30,000 and $40,000 per unit." NVIDIA (and its rivals) are not known to publicly announce price ranges for AI and HPC chips—leaks from hardware partners and individuals within industry supply chains are the usual sources. An investment bank has already delved into alleged Blackwell production costs—as shared by Tae Kim/firstadopter: "Raymond James estimates it will cost NVIDIA more than $6000 to make a B200 and they will price the GPU at a 50-60% premium to H100...(the bank) estimates it costs NVIDIA $3320 to make the H100, which is then sold to customers for $25,000 to $30,000." Huang's disclosure should be treated as an approximation, since his company (normally) deals in the supply of basic building blocks.
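As a quick cross-check, the price range implied by the Raymond James figures can be computed from the quoted numbers and compared with Huang's statement; this is simple arithmetic on published estimates, not insider pricing data.

h100_price_low, h100_price_high = 25_000, 30_000    # H100 street price, per the quote
premium_low, premium_high = 0.50, 0.60              # expected B200 premium, per the quote

b200_low = h100_price_low * (1 + premium_low)
b200_high = h100_price_high * (1 + premium_high)
print(f"Implied B200 price range: ${b200_low:,.0f} to ${b200_high:,.0f}")
# ~$37,500 to ~$48,000, somewhat above the $30,000 to $40,000 range Huang quoted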

SK hynix Unveils Highest-Performing SSD for AI PCs at NVIDIA GTC 2024

At GPU Technology Conference (GTC) 2024, SK hynix unveiled a new consumer product based on its latest solid-state drive (SSD), PCB01, which boasts industry-leading performance levels. Hosted by NVIDIA in San Jose, California from March 18-21, GTC is one of the world's leading conferences for AI developers. Aimed at on-device AI PCs, PCB01 is a PCIe fifth-generation SSD that recently had its performance and reliability verified by a major global customer. After completing product development in the first half of 2024, SK hynix plans to launch two versions of PCB01 by the end of the year, targeting both major technology companies and general consumers.

Optimized for AI PCs, Capable of Loading LLMs Within One Second
Offering the industry's highest sequential read speed of 14 gigabytes per second (GB/s) and a sequential write speed of 12 GB/s, PCB01 doubles the speed specifications of its previous generation. This enables the loading of LLMs required for AI learning and inference in less than one second. To make on-device AIs operational, PC manufacturers create a structure that stores an LLM in the PC's internal storage and quickly transfers the data to DRAMs for AI tasks. In this process, the PCB01 inside the PC efficiently supports the loading of LLMs. SK hynix expects these characteristics of its latest SSD to greatly increase the speed and quality of on-device AIs.
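The one-second claim lines up with a simple estimate. The sketch below assumes a roughly 7-billion-parameter model stored as FP16 weights; that model size is a hypothetical example, since SK hynix does not specify which LLMs it has in mind.

read_speed_gb_s = 14          # PCB01 sequential read speed, per the article
model_params_billion = 7      # assumed on-device model size (hypothetical)
bytes_per_param = 2           # FP16 weights

model_size_gb = model_params_billion * bytes_per_param                     # ~14 GB
print(f"Estimated load time: ~{model_size_gb / read_speed_gb_s:.1f} s")   # ~1.0 s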

PNY Technologies Unveils NVIDIA IGX Orin, NVIDIA Holoscan, and Magic Leap 2 Developer Platform

PNY Technologies, a pioneer in high-performance computing, proudly announces the launch of a groundbreaking developer platform, uniting the formidable capabilities of NVIDIA IGX Orin, NVIDIA Holoscan and Magic Leap 2. This visionary kit empowers software and technology vendors to pioneer cutting-edge solutions in healthcare and other industries, redefining the boundaries of innovation.

Key Features of the NVIDIA IGX + Magic Leap 2 XR Bundle:
  • Zero Physical World Latency for Mission-Critical Applications: Ensure zero physical world latency for mission-critical applications, offering unparalleled precision and real-time data processing.
  • AI Inference and Local Computation: Leverage NVIDIA IGX Orin for AI inference and local computation of complex models, using NVIDIA Holoscan as its real-time multimodal AI sensor processing platform and NVIDIA Metropolis software to offer XR use cases.
  • Ultra-Precise Augmented Reality Interface: Magic Leap 2 delivers an ultra-precise augmented reality interface for accurate and immersive experiences.