News Posts matching #AI


Next-Generation NVIDIA DGX Systems Could Launch Soon with Liquid Cooling

During the 2024 SIEPR Economic Summit, NVIDIA CEO Jensen Huang acknowledged that the company's next-generation DGX systems, designed for AI and high-performance computing workloads, will require liquid cooling due to their immense power consumption. Huang also hinted that these new systems are set to be released in the near future. The revelation comes as no surprise, given the ever-growing power draw of the GPUs needed to satisfy AI and machine-learning applications. As computational requirements continue to grow, so does the need for more powerful hardware. However, with great power comes great heat, necessitating advanced cooling solutions to maintain optimal performance and system stability. Liquid cooling has long been a staple of high-end computing, offering superior thermal management compared to traditional air cooling.

By implementing liquid cooling in the upcoming DGX systems, NVIDIA aims to push the boundaries of performance while ensuring the hardware remains reliable and efficient. Although Huang did not provide a specific release date for the new DGX systems, his statement suggests that they are on the horizon. Whether the next generation of DGX systems uses the current NVIDIA H200 or the upcoming Blackwell B100 GPU as its primary accelerator, a substantial uplift in performance is all but guaranteed. As the AI and high-performance computing landscape continues to evolve, NVIDIA's position continues to strengthen, and liquid-cooled systems will play a crucial role in shaping the future of these industries.

SK Hynix To Invest $1 Billion into Advanced Chip Packaging Facilities

Lee Kang-Wook, Vice President of Research and Development at SK Hynix, has discussed the increased importance of advanced chip packaging with Bloomberg News. In an interview with the media company's business section, Lee referred to a tradition of prioritizing the design and fabrication of chips: "the first 50 years of the semiconductor industry has been about the front-end." He believes that the latter half of production processes will take precedence in the future: "...but the next 50 years is going to be all about the back-end." He outlined a "more than $1 billion" investment into South Korean facilities—his department is hoping to "improve the final steps" of chip manufacturing.

SK Hynix's Head of Packaging Development pioneered a novel method of packaging the third generation of high-bandwidth memory (HBM2E)—that innovation secured NVIDIA as a high-profile, long-term customer. Demand for Team Green's AI GPUs has boosted the significance of HBM technologies—Micron and Samsung are attempting to play catch-up with new designs. South Korea's leading memory supplier is hoping to stay ahead in the next-gen HBM contest—12-layer fifth-generation (HBM3E) samples have reportedly been submitted to NVIDIA for approval. SK Hynix's Vice President recently revealed that HBM production volumes for 2024 have sold out—company leadership is currently considering the next steps for market dominance in 2025. The majority of the firm's newly announced $1 billion budget will be spent on advancing MR-MUF and TSV technologies, according to their R&D chief.

Tiny Corp. CEO Expresses "70% Confidence" in AMD Open-Sourcing Certain GPU Firmware

Lately Tiny Corp. CEO—George Hotz—has used his company's social media account to publicly criticize AMD Radeon RX 7900 XTX GPU firmware. The creator of Tinybox, a pre-orderable $15,000 AI compute cluster, has not selected "traditional" hardware for his systems—it is possible that AMD's Instinct MI300X accelerator is quite difficult to acquire, especially for a young startup operation. The decision to utilize gaming-oriented XFX-branded RDNA 3.0 GPUs instead of purpose-built CDNA 3.0 platforms—for local model training and AI inference—is certainly a peculiar one. Hotz and his colleagues have encountered roadblocks in the development of their Tinybox system—recently, public attention was drawn to an "LLVM spilling bug." AMD President/CEO/Chair, Dr. Lisa Su, swiftly stepped in and promised a "good solution." Earlier in the week, Tiny Corp. reported satisfaction with a delivery of fixes—courtesy of Team Red's software engineering department. They also disclosed that they would be discussing matters with AMD directly, regarding the possibility of open-sourcing Radeon GPU MES firmware.

Subsequently, Hotz documented his interactions with Team Red representatives—he expressed 70% confidence in AMD approving open-sourcing certain bits of firmware in a week's time: "Call went pretty well. We are gating the commitment to 6x Radeon RX 7900 XTX on a public release of a roadmap to get the firmware open source. (and obviously the MLPerf training bug being fixed). We aren't open source purists, it doesn't matter to us if the HDCP stuff is open for example. But we need the scheduler and the memory hierarchy management to be open. This is what it takes to push the performance of neural networks. The Groq 500 T/s mixtral demo should be possible on a tinybox, but it requires god tier software and deep integration with the scheduler. We also advised that the build process for amdgpu-dkms should be more open. While the driver itself is open, we haven't found it easy to rebuild and install. Easy REPL cycle is a key driver for community open source. We want the firmware to be easy to rebuild and install also." Prior to this week's cooperation, Tiny Corp. hinted that it could move on from the Radeon RX 7900 XTX in favor of Intel Alchemist graphics hardware—if AMD's decision-making does not favor them, Hotz & Co. could pivot to builds including Acer Predator BiFrost Arc A770 16 GB OC cards.

Jensen Huang Celebrates Rise of Portable AI Workstations

2024 will be the year generative AI gets personal, the CEOs of NVIDIA and HP said today in a fireside chat, unveiling new laptops that can build, test and run large language models. "This is a renaissance of the personal computer," said NVIDIA founder and CEO Jensen Huang at HP Amplify, a gathering in Las Vegas of about 1,500 resellers and distributors. "The work of creators, designers and data scientists is going to be revolutionized by these new workstations."

Greater Speed and Security
"AI is the biggest thing to come to the PC in decades," said HP's Enrique Lores, in the runup to the announcement of what his company billed as "the industry's largest portfolio of AI PCs and workstations." Compared to running their AI work in the cloud, the new systems will provide increased speed and security while reducing costs and energy, Lores said in a keynote at the event. New HP ZBooks provide a portfolio of mobile AI workstations powered by a full range of NVIDIA RTX Ada Generation GPUs. Entry-level systems with the NVIDIA RTX 500 Ada Generation Laptop GPU let users run generative AI apps and tools wherever they go. High-end models pack the RTX 5000 to deliver up to 682 TOPS, so they can create and run LLMs locally, using retrieval-augmented generation (RAG) to connect to their content for results that are both personalized and private.
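The retrieval-augmented generation workflow mentioned above is straightforward to sketch: retrieve the most relevant piece of a user's local content, then fold it into the prompt so the model's answer stays personalized and private. The snippet below is a deliberately minimal illustration (word-overlap scoring stands in for real embedding search, and every name in it is invented for the example), not any vendor's actual API.

```python
# Minimal retrieval-augmented generation (RAG) sketch: pick the local
# document most relevant to the query, then ground the prompt in it.
# Word-overlap scoring is a crude stand-in for embedding similarity.

def score(query: str, doc: str) -> int:
    """Relevance as the number of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str]) -> str:
    """Return the document with the highest overlap score."""
    return max(docs, key=lambda d: score(query, d))

def build_prompt(query: str, docs: list[str]) -> str:
    """Prepend retrieved local context so answers use the user's own data."""
    return f"Context: {retrieve(query, docs)}\nQuestion: {query}\nAnswer:"

docs = [
    "Q3 design review notes: the enclosure switched to aluminium.",
    "Travel policy: economy class for flights under six hours.",
]
prompt = build_prompt("What changed in the Q3 design review?", docs)
print(prompt)
```

In a real deployment the retrieval step would query a local vector index and the final prompt would be passed to an LLM running on the laptop GPU; nothing in the user's documents has to leave the machine.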

NVIDIA Introduces Generative AI Professional Certification

NVIDIA is offering a new professional certification in generative AI to enable developers to establish technical credibility in this important domain. Generative AI is revolutionizing industries worldwide, yet there's a critical skills gap and need to uplevel employees to more fully harness the technology. Available for the first time from NVIDIA, this new professional certification enables developers, career professionals, and others to validate and showcase their generative AI skills and expertise. Our new professional certification program introduces two associate-level generative AI certifications, focusing on proficiency in large language models and multimodal workflow skills.

"Generative AI has moved to center stage as governments, industries and organizations everywhere look to harness its transformative capabilities," NVIDIA founder and CEO Jensen Huang recently said. The certification will become available starting at GTC, where in-person attendees can also access recommended training to prepare for a certification exam. "Organizations in every industry need to increase their expertise in this transformative technology," said Greg Estes, VP of developer programs at NVIDIA. "Our goals are to assist in upskilling workforces, sharpen the skills of qualified professionals, and enable individuals to demonstrate their proficiency in order to gain a competitive advantage in the job market."

NVIDIA Data Center GPU Business Predicted to Generate $87 Billion in 2024

Omdia, an independent analyst and consultancy firm, has bestowed the title of "Kingmaker" on NVIDIA—thanks to impressive 2023 results in the data server market. The research firm predicts very buoyant numbers for the financial year of 2024—their February Cloud and Datacenter Market snapshot/report guesstimates that Team Green's data center GPU business group has the potential to rake in $87 billion of revenue. Omdia's forecast is based on last year's numbers—Jensen & Co. managed to pull in $34 billion, courtesy of an unmatched/dominant position in the AI GPU industry sector. Analysts have estimated a 150% rise in revenues in 2024—the majority of popular server manufacturers are reliant on NVIDIA's supply of chips. Super Micro Computer Inc. CEO—Charles Liang—disclosed that his business is experiencing strong demand for cutting-edge server equipment, but complications have slowed down production: "once we have more supply from the chip companies, from NVIDIA, we can ship more to customers."

Demand for AI inference in 2023 accounted for 40% of NVIDIA data center GPU revenue—according to Omdia's expert analysis—and they predict further growth this year. Team Green's comfortable AI-centric business model could expand to a greater extent—2023 market trends indicated that enterprise customers had spent less on acquiring/upgrading traditional server equipment. Instead, they prioritized the channeling of significant funds into "AI heavyweight hardware." Omdia's report discussed these shifted priorities: "This reaffirms our thesis that end users are prioritizing investment in highly configured server clusters for AI to the detriment of other projects, including delaying the refresh of older server fleets." Late February reports suggest that NVIDIA H100 GPU supply issues are largely resolved—with much improved production timeframes. Insiders at unnamed AI-oriented organizations have admitted that leadership has resorted to selling off excess stock. The Omdia forecast proposes—somewhat surprisingly—that H100 GPUs will continue to be "supply-constrained" throughout 2024.

HP Unveils Industry's Largest Portfolio of AI PCs

HP Inc. today announced the industry's largest portfolio of AI PCs leveraging the power of AI to enhance productivity, creativity, and user experiences in hybrid work settings.

In an ever-changing hybrid work landscape, workers are still struggling with disconnection and digital fatigue. HP's 2023 Work Relationship Index reveals that only 27% of knowledge workers have a healthy relationship with work, and 83% believe it's time to redefine our relationships with work. Most employees believe AI will open new opportunities to enjoy work and make their jobs easier, but they need the right AI tools and technology to succeed.

NVIDIA and HP Supercharge Data Science and Generative AI on Workstations

NVIDIA and HP Inc. today announced that NVIDIA CUDA-X data processing libraries will be integrated with HP AI workstation solutions to turbocharge the data preparation and processing work that forms the foundation of generative AI development.

Built on the NVIDIA CUDA compute platform, CUDA-X libraries speed data processing for a broad range of data types, including tables, text, images and video. They include the NVIDIA RAPIDS cuDF library, which accelerates the work of the nearly 10 million data scientists using pandas software by up to 110x using an NVIDIA RTX 6000 Ada Generation GPU instead of a CPU-only system, without requiring any code changes.
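The "without requiring any code changes" claim refers to cuDF's pandas accelerator mode: the pandas code itself stays untouched, and enabling GPU execution is a single guarded call at import time (or launching via `python -m cudf.pandas script.py`). A minimal sketch, which falls back to stock CPU pandas on machines without a RAPIDS install:

```python
# Sketch of cuDF's zero-code-change pandas accelerator mode. The
# try/except is the only GPU-specific part; the pandas code below it
# is identical whether it runs on CPU or on an NVIDIA GPU via cuDF.
try:
    import cudf.pandas           # requires RAPIDS and an NVIDIA GPU
    cudf.pandas.install()        # transparently backs pandas with cuDF
except ImportError:
    pass                         # fall back to plain CPU pandas

import pandas as pd

df = pd.DataFrame({"group": ["a", "b", "a", "b"],
                   "value": [1.0, 2.0, 3.0, 4.0]})
means = df.groupby("group")["value"].mean()   # same API either way
print(means)
```

Note this is an illustration of the mechanism only; the 110x figure is NVIDIA's own benchmark on an RTX 6000 Ada Generation GPU.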

Marvell Announces Industry's First 2 nm Platform for Accelerated Infrastructure Silicon

Marvell Technology, Inc., a leader in data infrastructure semiconductor solutions, is extending its collaboration with TSMC to develop the industry's first technology platform to produce 2 nm semiconductors optimized for accelerated infrastructure.

Behind the Marvell 2 nm platform is the company's industry-leading IP portfolio that covers the full spectrum of infrastructure requirements, including high-speed long-reach SerDes at speeds beyond 200 Gbps, processor subsystems, encryption engines, system-on-chip fabrics, chip-to-chip interconnects, and a variety of high-bandwidth physical layer interfaces for compute, memory, networking and storage architectures. These technologies will serve as the foundation for producing cloud-optimized custom compute accelerators, Ethernet switches, optical and copper interconnect digital signal processors, and other devices for powering AI clusters, cloud data centers and other accelerated infrastructure.

Dr. Lisa Su Responds to TinyBox's Radeon RX 7900 XTX GPU Firmware Problems

The TinyBox AI server system attracted plenty of media attention last week—its creator, George Hotz, decided to build with AMD RDNA 3.0 GPU hardware rather than the expected/traditional choice of CDNA 3.0. Tiny Corp. is a startup firm dealing in neural network frameworks—they currently "write and maintain tinygrad." Hotz & Co. are in the process of assembling rack-mounted 12U TinyBox systems for customers—an individual server houses an AMD EPYC 7532 processor and six XFX Speedster MERC310 Radeon RX 7900 XTX graphics cards. The Tiny Corp. social media account has engaged in numerous NVIDIA vs. AMD AI hardware debates/tirades—Hotz appears to favor the latter, as evidenced in his latest choice of components. ROCm support on Team Red AI Instinct accelerators is fairly mature at this point in time, but a much newer prospect on gaming-oriented graphics cards.

Tiny Corporation's unusual leveraging of Radeon RX 7900 XTX GPUs in a data center configuration has already hit a developmental roadblock. Yesterday, the company's social media account expressed driver-related frustrations in a public forum: "If AMD open sources their firmware, I'll fix their LLVM spilling bug and write a fuzzer for HSA. Otherwise, it's not worth putting tons of effort into fixing bugs on a platform you don't own." Hotz's latest complaint was taken on board by AMD's top brass—Dr. Lisa Su responded with the following message: "Thanks for the collaboration and feedback. We are all in to get you a good solution. Team is on it." Her software engineers—within a few hours—managed to fling out a set of fixes in Tiny Corporation's direction. Hotz appreciated the quick turnaround, and proceeded to run a model without encountering major stability issues: "AMD sent me an updated set of firmware blobs to try. They are responsive, and there have been big strides in the driver in the last year. It will be good! This training run is almost 5 hours in, hasn't crashed yet." Tiny Corp. drummed up speculation about AMD open-sourcing GPU MES firmware—Hotz disclosed that he will be talking (on the phone) to Team Red leadership.

Intel Gaudi 2 AI Accelerator Powers Through Llama 2 Text Generation

Intel's "AI Everywhere" hype campaign has generated the most noise in mainstream and enterprise segments. Team Blue's Gaudi—a family of deep learning accelerators—does not hit the headlines all that often. Their current generation model, Gaudi 2, is overshadowed by Team Green and Red alternatives—according to Intel's official marketing spiel: "it performs competitively on deep learning training and inference, with up to 2.4x faster performance than NVIDIA A100." Habana, an Intel subsidiary, has been working on optimizing Large Language Model (LLM) inference on Gaudi 1 and 2 for a while—their co-operation with Hugging Face has produced impressive results, as of late February. Siddhant Jagtap, an Intel Data Scientist, has demonstrated: "how easy it is to generate text with the Llama 2 family of models (7b, 13b and 70b) using Optimum Habana and a custom pipeline class."

Jagtap reckons that folks will be able to: "run the models with just a few lines of code" on Gaudi 2 accelerators—additionally, Intel's hardware is capable of accepting single and multiple prompts. The custom pipeline class: "has been designed to offer great flexibility and ease of use. Moreover, it provides a high level of abstraction and performs end-to-end text-generation which involves pre-processing and post-processing." His article/blog outlines various prerequisites and methods of getting Llama 2 text generation up and running on Gaudi 2. Jagtap concluded that Habana/Intel has: "presented a custom text-generation pipeline on Intel Gaudi 2 AI accelerator that accepts single or multiple prompts as input. This pipeline offers great flexibility in terms of model size as well as parameters affecting text-generation quality. Furthermore, it is also very easy to use and to plug into your scripts, and is compatible with LangChain." Hugging Face reckons that Gaudi 2 delivers roughly twice the throughput speed of NVIDIA A100 80 GB in both training and inference scenarios. Intel has teased third generation Gaudi accelerators—industry watchdogs believe that next-gen solutions are designed to compete with Team Green H100 AI GPUs.
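The blog's custom pipeline class is not reproduced here, but its general shape (pre-processing, generation, and post-processing behind one call, accepting single or multiple prompts) can be sketched generically. The stand-in "model" below is a trivial callable purely for illustration; the real Optimum Habana pipeline drives Llama 2 models on Gaudi 2 hardware.

```python
# Illustrative sketch of a text-generation pipeline class in the style
# described above: one callable object that normalizes single/multiple
# prompts, runs generation, and post-processes the output. The "model"
# here is a dummy lambda so the structure runs anywhere; it is NOT the
# Optimum Habana implementation.

class TextGenPipeline:
    def __init__(self, model):
        self.model = model  # any callable: prompt string -> generated string

    def _preprocess(self, prompts):
        # Accept a single prompt or a batch; always return a clean batch.
        if isinstance(prompts, str):
            prompts = [prompts]
        return [p.strip() for p in prompts]

    def _postprocess(self, outputs):
        return [o.strip() for o in outputs]

    def __call__(self, prompts):
        batch = self._preprocess(prompts)
        raw = [self.model(p) for p in batch]
        return self._postprocess(raw)

# Dummy stand-in "model" for demonstration purposes only.
pipe = TextGenPipeline(lambda p: p + " ... generated continuation")
print(pipe("Here is my prompt"))           # single prompt
print(pipe(["prompt one", "prompt two"]))  # multiple prompts
```

The abstraction is the point: callers never touch tokenization or device placement directly, which is what lets Jagtap claim models run "with just a few lines of code."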

Chinese Governing Bodies Reportedly Offering "Compute Vouchers" to AI Startups

Regional Chinese governments are attempting to prop up local AI startup companies with an intriguing "voucher" support system. A Financial Times article outlines "computing" support packages valued between "$140,000 to $280,000" for fledgling organizations involved in LLM training. Widespread shortages of AI chips and rising data center operation costs are cited as the main factors driving the rollout of strategic subsidies. The big three—Alibaba, Tencent, and ByteDance—are reportedly less willing to rent out their AI-crunching servers, due to internal operations demanding lengthy compute sessions. China's largest technology companies are believed to be hoarding the vast majority of NVIDIA AI hardware, while smaller competitors are believed to be fighting over table scraps. US trade restrictions have further escalated supply issues, with lower-performance/China-specific models being rejected—AMD's Instinct MI309 AI accelerator being the latest example.

The "compute voucher" initiative could be the first part of a wider scheme—reports suggest that regional governing bodies (including Shanghai) are devising another subsidy tier for domestic AI chips. Charlie Chai, an 86Research analyst, reckons that the initial support package is only a short-term solution. He shared this observation with FT: "the voucher is helpful to address the cost barrier, but it will not help with the scarcity of the resources." The Chinese government is reportedly looking into the creation of an alternative state-run system that would be less reliant on a "Big Tech" data center model. A proposed "East Data West Computing" project could produce a more energy-efficient cluster of AI data centers, combined with a centralized management system.

AMD Stalls on Instinct MI309 China AI Chip Launch Amid US Export Hurdles

According to the latest report from Bloomberg, AMD has hit a roadblock in offering its top-of-the-line AI accelerator in the Chinese market. The newest AI chip is called Instinct MI309, a lower-performance Instinct MI300 variant tailored to meet the latest US export rules for selling advanced chips to China-based entities. However, the Instinct MI309 still appears too powerful to gain unconditional approval from the US Department of Commerce, leaving AMD in need of an export license. Originally, the US Department of Commerce set a rule that a chip's Total Processing Performance (TPP) score must not exceed 4800, effectively capping AI performance at 600 FP8 TFLOPS. The rule allows processors with slightly lower performance to still be sold to Chinese customers, provided their performance density (PD) is sufficiently low.
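As a back-of-the-envelope check, TPP is understood to be peak throughput multiplied by the operand bit length, a reading consistent with the figures above: a 4800 cap divided by 8-bit operands gives exactly the quoted 600 FP8 TFLOPS ceiling.

```python
# Export-rule arithmetic, assuming TPP = peak throughput (TFLOPS/TOPS)
# multiplied by operand bit length, consistent with the article's
# 4800-cap / 600 FP8 TFLOPS figures.

TPP_CAP = 4800  # US Department of Commerce threshold

def tpp(tflops: float, bit_length: int) -> float:
    """Total Processing Performance at a given precision."""
    return tflops * bit_length

def max_tflops(bit_length: int, cap: float = TPP_CAP) -> float:
    """Highest throughput allowed at a given operand width."""
    return cap / bit_length

assert tpp(600, 8) == TPP_CAP   # the FP8 ceiling quoted above
print(max_tflops(16))           # implied FP16 ceiling
```

By this arithmetic, the same cap would allow only 300 TFLOPS at FP16, which is why trimming a chip like the MI300 down to compliance is such a blunt exercise.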

However, AMD's latest creation, the Instinct MI309, is anything but slow. Based on the powerful Instinct MI300, it apparently has not been cut down enough to qualify for a US export license from the Department of Commerce. It is still unknown which Chinese customer was trying to acquire AMD's Instinct MI309; however, it could be one of the Chinese AI labs trying to get hold of more training hardware for their domestic models. NVIDIA employed a similar tactic, selling A800 and H800 chips to China, until the US ended the export of those chips as well. AI labs located in China can otherwise only use domestic hardware, including accelerators from Alibaba, Huawei, and Baidu. Cloud services hosting GPUs in the US can still be accessed by Chinese companies, but that practice is currently on US regulators' watchlist.

AMD Hires Thomas Zacharia to Expand Strategic AI Relationships

AMD announced that Thomas Zacharia has joined AMD as senior vice president of strategic technology partnerships and public policy. Zacharia will lead the global expansion of AMD's public/private relationships with governments, non-governmental organizations (NGOs) and other organizations, to help fast-track the deployment of customized AMD-powered AI solutions for the rapidly growing number of global projects and applications targeting the deployment of AI for the public good.

"Thomas is a distinguished leader with decades of experience successfully creating public/private partnerships that have resulted in consistently deploying the world's most powerful and advanced computing solutions, including the world's fastest supercomputer Frontier," said AMD Chair and CEO Lisa Su. "As the former Director of the U.S.'s largest multi-program science and energy research lab, Thomas is uniquely positioned to leverage his extensive experience advancing the frontiers of science and technology to help countries around the world deploy AMD-powered AI solutions for the public good."

IBM Announces Availability of Open-Source Mistral AI Model on watsonx

IBM announced the availability of the popular open-source Mixtral-8x7B large language model (LLM), developed by Mistral AI, on its watsonx AI and data platform, as it continues to expand capabilities to help clients innovate with IBM's own foundation models and those from a range of open-source providers. IBM offers an optimized version of Mixtral-8x7B that, in internal testing, was able to increase throughput—or the amount of data that can be processed in a given time period—by 50 percent when compared to the regular model. This could potentially cut latency by 35-75 percent, depending on batch size—speeding time to insights. This is achieved through a process called quantization, which reduces model size and memory requirements for LLMs and, in turn, can speed up processing to help lower costs and energy consumption.
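Quantization, in its simplest form, maps full-precision weights to narrow integers plus a scale factor. The sketch below shows the principle with absmax INT8 quantization; IBM's actual optimization of Mixtral-8x7B is considerably more sophisticated, so treat this strictly as an illustration of why memory footprint and bandwidth shrink.

```python
# Minimal absmax INT8 quantization sketch: FP32 weights become INT8
# values plus one per-tensor scale, a 4x memory reduction at the cost
# of small rounding error. Illustrative only; not IBM's method.
import numpy as np

def quantize(weights: np.ndarray):
    scale = np.abs(weights).max() / 127.0          # absmax scaling
    q = np.round(weights / scale).astype(np.int8)  # 1 byte per weight
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 3.4, -0.01], dtype=np.float32)
q, s = quantize(w)
w_hat = dequantize(q, s)

print(w.nbytes, "->", q.nbytes, "bytes")  # 16 -> 4 bytes
print(np.max(np.abs(w - w_hat)))          # rounding error stays small
```

Smaller weights mean less memory traffic per token, which is where the throughput and latency gains IBM describes come from.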

The addition of Mixtral-8x7B expands IBM's open, multi-model strategy to meet clients where they are and give them choice and flexibility to scale enterprise AI solutions across their businesses. Through decades-long AI research and development, open collaboration with Meta and Hugging Face, and partnerships with model leaders, IBM is expanding its watsonx.ai model catalog and bringing in new capabilities, languages, and modalities. IBM's enterprise-ready foundation model choices and its watsonx AI and data platform can empower clients to use generative AI to gain new insights and efficiencies, and create new business models based on principles of trust. IBM enables clients to select the right model for the right use cases and price-performance goals for targeted business domains like finance.

CNET Demoted to Untrusted Sources by Wikipedia Editors Due to AI-Generated Content

Once trusted as a staple of technology journalism, CNET has been publicly demoted to the Untrusted Sources list on Wikipedia. CNET has faced public criticism since late 2022 for publishing AI-generated articles without disclosing that humans did not write them. The practice culminated in CNET being demoted from Trusted to Untrusted Sources on Wikipedia, following extensive debates between Wikipedia editors. CNET's reputation first declined in 2020, when it was acquired by publisher Red Ventures, which appeared to prioritize advertising and SEO traffic over editorial standards. However, the AI content scandal accelerated CNET's fall from grace. After discovering the AI-written articles, Wikipedia editors argued that CNET should be removed entirely as a reliable source, citing Red Ventures' pattern of misinformation.

One editor called for targeting Red Ventures as "a spam network." AI-generated content poses challenges familiar from spam bots: machine-created text that is frequently low quality or inaccurate. However, CNET claims it has stopped publishing AI content. The controversy highlights rising concerns about AI-generated text online. Publishing AI-written stories may seem attractive because it cuts production time; however, such stories tend to rank poorly in Google's search index, as the engine detects and penalizes content it judges to be machine-generated. Lawsuits such as The New York Times v. OpenAI also allege that AI models were trained on vast amounts of text scraped without permission. As AI capabilities advance, maintaining information quality on the web will require increased diligence, and demoting once-reputable sites like CNET when they disregard ethics and quality control helps set a necessary precedent. Below, you can see the Wikipedia table about CNET.

Google: CPUs are Leading AI Inference Workloads, Not GPUs

The AI infrastructure of today is mostly fueled by the expansion of GPU-accelerated servers. Google, one of the world's largest hyperscalers, has noted that CPUs are still a leading compute platform for AI/ML workloads, according to internal analysis of its Google Cloud services. During the TechFieldDay event, a talk by Brandon Royal, product manager at Google Cloud, explained the position of CPUs in today's AI landscape. The AI lifecycle is divided into two parts: training and inference. Training requires massive compute capacity, along with enormous memory capacity, to fit ever-expanding AI models into memory. The latest models, like GPT-4 and Gemini, contain billions of parameters and require thousands of GPUs or other accelerators working in parallel to train efficiently.

On the other hand, inference requires less compute intensity but still benefits from acceleration. During inference, the pre-trained model is optimized and deployed to make predictions on new data. While less compute is needed than for training, latency and throughput are essential for real-time inference. Google found that, while GPUs are ideal for the training phase, models are often optimized to run inference on CPUs, and many customers choose CPUs for AI inference for a wide variety of reasons.
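The latency/throughput tension mentioned above is easy to demonstrate: batching requests raises samples-per-second, but each request then waits on the whole batch. In the sketch below, a NumPy matrix multiply stands in for a real model's forward pass on a CPU (illustrative only, not Google's benchmark).

```python
# Toy latency-vs-throughput measurement for CPU "inference", where a
# 256x256 matmul stands in for a model forward pass. Larger batches
# amortize per-call overhead (higher throughput) but each request then
# shares the batch's latency.
import time
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256)).astype(np.float32)

def infer(batch: np.ndarray) -> np.ndarray:
    return batch @ W  # stand-in for a real forward pass

def measure(batch_size: int, iters: int = 50):
    x = rng.standard_normal((batch_size, 256)).astype(np.float32)
    start = time.perf_counter()
    for _ in range(iters):
        infer(x)
    elapsed = time.perf_counter() - start
    latency = elapsed / iters                  # seconds per batch
    throughput = batch_size * iters / elapsed  # samples per second
    return latency, throughput

for bs in (1, 64):
    lat, thr = measure(bs)
    print(f"batch={bs:3d}  latency={lat * 1e6:9.1f} us  throughput={thr:12.0f}/s")
```

This is exactly the trade-off a deployment has to tune: a real-time service caps batch size to bound latency, while an offline scoring job batches aggressively for throughput, and either mode can run acceptably on CPUs for small enough models.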

Elon Musk Sues OpenAI and Sam Altman for Breach of Founding Contract

Elon Musk, in his individual capacity, has sued Sam Altman, Gregory Brockman, OpenAI, and its affiliate companies for breach of founding contract and for deviating from the founding goal of being a non-profit tasked with developing AI for the benefit of humanity. The lawsuit comes in the wake of OpenAI's relationship with Microsoft, which Musk says compromises that founding contract. Musk alleges breach of contract, breach of fiduciary duty, and unfair business practices against OpenAI, and demands that the company revert to being open-source with all its technology and function as a non-profit.

Musk also requests an injunction to prevent OpenAI and the other defendants from profiting off OpenAI technology. In particular, Musk alleges that GPT-4 isn't open-source, claiming that only OpenAI and Microsoft know its inner workings, and that Microsoft stands to monetize GPT-4 "for a fortune." Microsoft, interestingly, was not named as a defendant in the lawsuit. Elon Musk sat on the original board of OpenAI until his departure in 2018, and is said to have been a key sponsor of the AI acceleration hardware used in OpenAI's pioneering work.

Intel Sets 100 Million CPU Supply Goal for AI PCs by 2025

Intel has been hyping up their artificial intelligence-augmented processor products since late last year—their "AI Everywhere" marketing push started with the official launch of Intel Core Ultra mobile CPUs, AKA the much-delayed Meteor Lake processor family. CEO Pat Gelsinger stated (mid-December 2023): "AI innovation is poised to raise the digital economy's impact up to as much as one-third of global gross domestic product...Intel is developing the technologies and solutions that empower customers to seamlessly integrate and effectively run AI in all their applications—in the cloud and, increasingly, locally at the PC and edge, where data is generated and used." Team Blue's presence at this week's MWC Barcelona 2024 event introduced "AI Everywhere Across Network, Edge, Enterprise."

Nikkei Asia sat down with Intel's David Feng—Vice President of Client Computing Group and General Manager of Client Segments. The impressively job-titled executive discussed the "future of AI PCs," and set some lofty sales goals for his firm. According to the Nikkei report, Intel leadership expects to "deliver 40 million AI PCs" this year and a further 60 million units next year—representing "more than 20% of the projected total global PC market in 2025." Feng and his colleagues predict that mainstream customers will prefer to use local "on-device" AI solutions (equipped with NPUs), rather than rely on remote cloud services. Significant Edge AI improvements are expected to arrive with the next-generation Lunar Lake and Arrow Lake processor families, with the latter bringing Team Blue's NPU technologies to desktop platforms—AMD's Ryzen 8000G series of AM5 APUs launched with XDNA engines last month.

Tiny Corp. Builds AI Platform with Six AMD Radeon RX 7900 XTX GPUs

Tiny Corp., a neural network framework specialist, has revealed intimate details about the ongoing development and building of its "tinybox" system: "I don't think there's much value in secrecy. We have the parts to build 12 boxes and a case that's pretty close to final. Beating back all the PCI-E AER errors was hard, as anyone knows who has tried to build a system like this. Our BOM cost is around $10k, and we are selling them for $15k. We've put a year of engineering into this, it's a lot harder than it first seemed. You are welcome to believe me or not, but unless you are building in huge quantity, you are getting a great deal for $15k." The startup has taken the unusual step of integrating Team Red's current flagship gaming GPU into its AI-crunching platform. Tiny Corp. founder—George Hotz—has documented his past rejections of NVIDIA AI hardware on social media, but TinyBox will not be running AMD's latest Instinct MI300X accelerators. RDNA 3.0 is seemingly favored over CDNA 3.0—perhaps due to growing industry demand for enterprise-grade GPUs.

The rack-mounted 12U TinyBox build houses an AMD EPYC 7532 processor with 128 GB of system memory. Five 1 TB SN850X SSDs take care of storage duties (four in RAID, one for boot), and an unoccupied 16x OCP 3.0 slot is designated for networking tasks. Two 1600 W PSUs provide the necessary electrical juice. The Tiny Corp. social media picture feed indicates that they have acquired a pile of XFX Speedster MERC310 RX 7900 XTX graphics cards—six units are hooked up inside of each TinyBox system. Hotz's young startup has ambitious plans: "The system image shipping with the box will be Ubuntu 22.04. It will only include tinygrad out of the box, but PyTorch and JAX support on AMD have come a long way, and your hardware is your hardware. We make money either way, you are welcome to buy it for any purpose. The goal of the tiny corp is to commoditize the petaflop, and we believe tinygrad is the best way to do it. Solving problems in software is cheaper than in hardware. tinygrad will elucidate the deep structure of what neural networks are. We have 583 preorders, and next week we'll place an order for 100 sets of parts. This is $1M in outlay. We will also ship five of the 12 boxes we have to a few early people who I've communicated with. For everyone else, they start shipping in April. The production line started running yesterday."
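Tiny Corp.'s own numbers are easy to sanity-check: a roughly $10k BOM sold at $15k, and 100 sets of parts lining up with the stated $1M outlay.

```python
# Sanity check of the figures Tiny Corp. quotes above (company's own
# numbers, rounded; not independently verified).
BOM_COST = 10_000   # approximate per-box bill of materials, per Hotz
PRICE = 15_000      # tinybox pre-order price
PART_SETS = 100     # next week's parts order

gross_margin = (PRICE - BOM_COST) / PRICE
outlay = PART_SETS * BOM_COST

assert outlay == 1_000_000  # matches the quoted "$1M in outlay"
print(f"gross margin = {gross_margin:.0%}")
```

A roughly one-third gross margin on hardware alone is thin by startup standards, which supports Hotz's claim that buyers "are getting a great deal for $15k."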

Global Server Shipments Expected to Increase by 2.05% in 2024, with AI Servers Accounting For Around 12.1%

TrendForce underscores that the primary momentum for server shipments this year remains with American CSPs. However, due to persistently high inflation and elevated corporate financing costs curtailing capital expenditures, overall demand has not yet returned to pre-pandemic growth levels. Global server shipments are estimated to reach approximately 13.654 million units in 2024, an increase of about 2.05% YoY. Meanwhile, the market continues to focus on the deployment of AI servers, with their shipment share estimated at around 12.1%.
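From TrendForce's figures, the implied absolute AI-server volume and the 2023 baseline are straightforward to derive; a quick sketch (the 2023 figure is back-computed from the stated growth rate, not quoted by TrendForce):

```python
total_2024 = 13.654e6    # estimated 2024 global server shipments (units)
yoy_growth = 0.0205      # stated 2.05% year-over-year increase
ai_share = 0.121         # AI servers' estimated share of shipments

total_2023 = total_2024 / (1 + yoy_growth)   # implied 2023 baseline
ai_servers_2024 = total_2024 * ai_share      # implied 2024 AI-server volume

print(f"Implied 2023 shipments:   {total_2023 / 1e6:.3f} million units")
print(f"Implied 2024 AI servers:  {ai_servers_2024 / 1e6:.3f} million units")
```

That works out to roughly 1.65 million AI servers in 2024, against an implied 2023 base of about 13.38 million total units.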

Foxconn is expected to see the highest growth rate, with an estimated annual increase of about 5-7%. This growth includes significant orders such as Dell's 16G platform, AWS Graviton 3 and 4, Google Genoa, and Microsoft Gen9. In terms of AI server orders, Foxconn has made notable inroads with Oracle and has also secured some AWS ASIC orders.

AAEON BOXER-8653AI & BOXER-8623AI Expand Vertical Market Potential in a More Compact Form

Leading provider of embedded PC solutions, AAEON, is delighted to announce the official launch of two new additions to its rich line of embedded AI systems, the BOXER-8653AI and BOXER-8623AI, which are powered by the NVIDIA Jetson Orin NX and Jetson Orin Nano, respectively. Measuring just 180 mm x 136 mm x 75 mm, both systems are compact and easily wall-mounted for discreet deployment, which AAEON indicates makes them ideal for use in both indoor and outdoor settings such as factories and parking lots. Adding to this is the systems' environmental resilience: the BOXER-8653AI sports a wide -15°C to 60°C temperature tolerance, the BOXER-8623AI can operate between -15°C and 65°C, and both support a 12 V ~ 24 V power input range via a 2-pin terminal block.

The BOXER-8653AI benefits from the NVIDIA Jetson Orin NX module, offering up to 70 TOPS of AI inference performance for applications that require extremely fast analysis of vast quantities of data. Meanwhile, the BOXER-8623AI utilizes the more efficient, yet still powerful NVIDIA Jetson Orin Nano module, capable of up to 40 TOPS. Both systems consequently make use of the 1024-core NVIDIA Ampere architecture GPU with 32 Tensor Cores.

ServiceNow, Hugging Face & NVIDIA Release StarCoder2 - a New Open-Access LLM Family

ServiceNow, Hugging Face, and NVIDIA today announced the release of StarCoder2, a family of open-access large language models for code generation that sets new standards for performance, transparency, and cost-effectiveness. StarCoder2 was developed in partnership with the BigCode Community, managed by ServiceNow, the leading digital workflow company making the world work better for everyone, and Hugging Face, the most-used open-source platform, where the machine learning community collaborates on models, datasets, and applications. Trained on 619 programming languages, StarCoder2 can be further trained and embedded in enterprise applications to perform specialized tasks such as application source code generation, workflow generation, text summarization, and more. Developers can use its code completion, advanced code summarization, code snippets retrieval, and other capabilities to accelerate innovation and improve productivity.

StarCoder2 offers three model sizes: a 3-billion-parameter model trained by ServiceNow; a 7-billion-parameter model trained by Hugging Face; and a 15-billion-parameter model built by NVIDIA with NVIDIA NeMo and trained on NVIDIA accelerated infrastructure. The smaller variants provide powerful performance while saving on compute costs, as fewer parameters require less computing during inference. In fact, the new 3-billion-parameter model matches the performance of the original StarCoder 15-billion-parameter model. "StarCoder2 stands as a testament to the combined power of open scientific collaboration and responsible AI practices with an ethical data supply chain," emphasized Harm de Vries, lead of ServiceNow's StarCoder2 development team and co-lead of BigCode. "The state-of-the-art open-access model improves on prior generative AI performance to increase developer productivity and provides developers equal access to the benefits of code generation AI, which in turn enables organizations of any size to more easily meet their full business potential."
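The claim that smaller variants save on compute can be made concrete with the common rule of thumb that decoder inference costs roughly 2 FLOPs per parameter per generated token. A hedged sketch (the 2·N estimate is a standard approximation, not a figure from the announcement):

```python
def inference_flops(params: float, tokens: int) -> float:
    """Rough decoder inference cost: ~2 FLOPs per parameter per token."""
    return 2 * params * tokens

TOKENS = 1_000  # illustrative generation length

flops_3b = inference_flops(3e9, TOKENS)    # StarCoder2 3B variant
flops_15b = inference_flops(15e9, TOKENS)  # StarCoder2 15B variant

print(f"3B model:  {flops_3b:.1e} FLOPs for {TOKENS} tokens")
print(f"15B model: {flops_15b:.1e} FLOPs ({flops_15b / flops_3b:.0f}x more)")
```

By this estimate the 3B model needs about one fifth of the 15B model's inference compute, which is why matching the original StarCoder 15B's quality at 3B parameters is a meaningful cost win.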

MiTAC Unleashes Revolutionary Server Solutions, Powering Ahead with 5th Gen Intel Xeon Scalable Processors Accelerated by Intel Data Center GPUs

MiTAC Computing Technology, a subsidiary of MiTAC Holdings Corp., proudly reveals its groundbreaking suite of server solutions that deliver unsurpassed capabilities with the 5th Gen Intel Xeon Scalable Processors. MiTAC introduces its cutting-edge signature platforms that seamlessly integrate Intel Data Center GPUs (both the Intel Max Series and the Intel Flex Series), unleashing an unparalleled leap in computing performance targeting HPC and AI applications.

MiTAC Announces its Full Array of Platforms Supporting the Latest 5th Gen Intel Xeon Scalable Processors
Last year, Intel transferred the right to manufacture and sell products based on Intel Data Center Solution Group designs to MiTAC. MiTAC confidently announces a transformative upgrade to its product offerings, unveiling advanced platforms that epitomize the future of computing. Featuring up to 64 cores, expanded shared cache, increased UPI speed, and DDR5 support, the latest 5th Gen Intel Xeon Scalable Processors deliver remarkable performance-per-watt gains across various workloads. MiTAC's Intel Server M50FCP Family and Intel Server D50DNP Family fully support the latest 5th Gen Intel Xeon Scalable Processors via a quick BIOS update and straightforward technical resource revisions, bringing this performance to diverse computing environments.

IBM Intros AI-enhanced Data Resilience Solution - a Cyberattack Countermeasure

Cyberattacks are an existential risk, with 89% of organizations ranking ransomware as one of the top five threats to their viability, according to a November 2023 report from TechTarget's Enterprise Strategy Group, a leading analyst firm. And this is just one of many risks to corporate data—insider threats, data exfiltration, hardware failures, and natural disasters also pose significant danger. Moreover, as the just-released 2024 IBM X-Force Threat Intelligence Index states, as the generative AI market becomes more established, it could trigger the maturity of AI as an attack surface, mobilizing even further investment in new tools from cybercriminals. The report notes that enterprises should also recognize that their existing underlying infrastructure is a gateway to their AI models that doesn't require novel tactics from attackers to target.

To help clients counter these threats with earlier and more accurate detection, we're announcing new AI-enhanced versions of the IBM FlashCore Module technology available inside new IBM Storage FlashSystem products and a new version of IBM Storage Defender software to help organizations improve their ability to detect and respond to ransomware and other cyberattacks that threaten their data. The newly available fourth generation of FlashCore Module (FCM) technology enables artificial intelligence capabilities within the IBM Storage FlashSystem family. FCM works with Storage Defender to provide end-to-end data resilience across primary and secondary workloads with AI-powered sensors designed for earlier notification of cyber threats to help enterprises recover faster.
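IBM has not published the FCM sensor internals, but inline ransomware detectors commonly flag writes whose byte entropy approaches that of encrypted data, since ciphertext is nearly incompressible. A minimal illustrative sketch of that general idea (the method and threshold here are assumptions for illustration, not IBM's actual implementation):

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte; ciphertext approaches 8.0."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_encrypted(block: bytes, threshold: float = 7.5) -> bool:
    # Hypothetical sensor rule: near-maximal entropy suggests ciphertext,
    # which in a primary-storage write stream can indicate ransomware.
    return shannon_entropy(block) >= threshold

text_block = b"The quick brown fox jumps over the lazy dog. " * 100
random_block = os.urandom(4096)  # stands in for encrypted (attacked) data

print(looks_encrypted(text_block))    # False: plaintext has low entropy
print(looks_encrypted(random_block))  # True, with overwhelming probability
```

Real inline sensors combine signals like this with write-pattern statistics and run in hardware to avoid slowing the data path; the value of doing it in the storage layer is that every write is observed, with no agent required on the host.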
