News Posts matching #AI


NVIDIA Introduces Generative AI Professional Certification

NVIDIA is offering a new professional certification in generative AI to enable developers to establish technical credibility in this important domain. Generative AI is revolutionizing industries worldwide, yet there's a critical skills gap and need to uplevel employees to more fully harness the technology. Available for the first time from NVIDIA, this new professional certification enables developers, career professionals, and others to validate and showcase their generative AI skills and expertise. Our new professional certification program introduces two associate-level generative AI certifications, focusing on proficiency in large language models and multimodal workflow skills.

"Generative AI has moved to center stage as governments, industries and organizations everywhere look to harness its transformative capabilities," NVIDIA founder and CEO Jensen Huang recently said. The certification will become available starting at GTC, where in-person attendees can also access recommended training to prepare for a certification exam. "Organizations in every industry need to increase their expertise in this transformative technology," said Greg Estes, VP of developer programs at NVIDIA. "Our goals are to assist in upskilling workforces, sharpen the skills of qualified professionals, and enable individuals to demonstrate their proficiency in order to gain a competitive advantage in the job market."

NVIDIA Data Center GPU Business Predicted to Generate $87 Billion in 2024

Omdia, an independent analyst and consultancy firm, has bestowed the title of "Kingmaker" on NVIDIA—thanks to impressive 2023 results in the data center server market. The research firm predicts very buoyant numbers for the 2024 financial year—their February Cloud and Datacenter Market snapshot/report guesstimates that Team Green's data center GPU business group has the potential to rake in $87 billion of revenue. Omdia's forecast is based on last year's numbers—Jensen & Co. managed to pull in $34 billion, courtesy of an unmatched/dominant position in the AI GPU industry sector. Analysts have estimated a 150% rise in revenues in 2024—the majority of popular server manufacturers are reliant on NVIDIA's supply of chips. Super Micro Computer Inc. CEO Charles Liang disclosed that his business is experiencing strong demand for cutting-edge server equipment, but complications have slowed down production: "once we have more supply from the chip companies, from NVIDIA, we can ship more to customers."

Demand for AI inference in 2023 accounted for 40% of NVIDIA data center GPU revenue—according to Omdia's expert analysis—and they predict further growth this year. Team Green's comfortable AI-centric business model could expand to a greater extent—2023 market trends indicated that enterprise customers spent less on acquiring/upgrading traditional server equipment. Instead, they prioritized the channeling of significant funds into "AI heavyweight hardware." Omdia's report discussed these shifted priorities: "This reaffirms our thesis that end users are prioritizing investment in highly configured server clusters for AI to the detriment of other projects, including delaying the refresh of older server fleets." Late February reports suggest that NVIDIA H100 GPU supply issues are largely resolved—with much improved production timeframes. Insiders at unnamed AI-oriented organizations have admitted that leadership has resorted to selling off excess stock. The Omdia forecast proposes—somewhat surprisingly—that H100 GPUs will continue to be "supply-constrained" throughout 2024.

HP Unveils Industry's Largest Portfolio of AI PCs

HP Inc. today announced the industry's largest portfolio of AI PCs leveraging the power of AI to enhance productivity, creativity, and user experiences in hybrid work settings.

In an ever-changing hybrid work landscape, workers are still struggling with disconnection and digital fatigue. HP's 2023 Work Relationship Index reveals that only 27% of knowledge workers have a healthy relationship with work, and 83% believe it's time to redefine our relationships with work. Most employees believe AI will open new opportunities to enjoy work and make their jobs easier, but they need the right AI tools and technology to succeed.

NVIDIA and HP Supercharge Data Science and Generative AI on Workstations

NVIDIA and HP Inc. today announced that NVIDIA CUDA-X data processing libraries will be integrated with HP AI workstation solutions to turbocharge the data preparation and processing work that forms the foundation of generative AI development.

Built on the NVIDIA CUDA compute platform, CUDA-X libraries speed data processing for a broad range of data types, including tables, text, images and video. They include the NVIDIA RAPIDS cuDF library, which accelerates the work of the nearly 10 million data scientists using pandas software by up to 110x using an NVIDIA RTX 6000 Ada Generation GPU instead of a CPU-only system, without requiring any code changes.
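
For readers who want a feel for the "without requiring any code changes" claim, here is a minimal sketch of cuDF's pandas accelerator mode; the input file and column names below are hypothetical placeholders, not anything from NVIDIA's announcement.

```python
# Illustrative sketch of RAPIDS cuDF's pandas accelerator mode
# (cudf.pandas): unmodified pandas code runs GPU-accelerated where
# supported and falls back to CPU pandas otherwise.
import cudf.pandas
cudf.pandas.install()  # must be called before pandas is imported

import pandas as pd  # now transparently backed by cuDF on the GPU

df = pd.read_csv("transactions.csv")              # hypothetical input file
summary = df.groupby("merchant")["amount"].sum()  # runs on the GPU
print(summary.head())
```

Equivalently, an existing script can be launched unmodified with `python -m cudf.pandas script.py`.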

Marvell Announces Industry's First 2 nm Platform for Accelerated Infrastructure Silicon

Marvell Technology, Inc., a leader in data infrastructure semiconductor solutions, is extending its collaboration with TSMC to develop the industry's first technology platform to produce 2 nm semiconductors optimized for accelerated infrastructure.

Behind the Marvell 2 nm platform is the company's industry-leading IP portfolio that covers the full spectrum of infrastructure requirements, including high-speed long-reach SerDes at speeds beyond 200 Gbps, processor subsystems, encryption engines, system-on-chip fabrics, chip-to-chip interconnects, and a variety of high-bandwidth physical layer interfaces for compute, memory, networking and storage architectures. These technologies will serve as the foundation for producing cloud-optimized custom compute accelerators, Ethernet switches, optical and copper interconnect digital signal processors, and other devices for powering AI clusters, cloud data centers and other accelerated infrastructure.

Dr. Lisa Su Responds to TinyBox's Radeon RX 7900 XTX GPU Firmware Problems

The TinyBox AI server system attracted plenty of media attention last week—its creator, George Hotz, decided to build with AMD RDNA 3.0 GPU hardware rather than the expected/traditional choice of CDNA 3.0. Tiny Corp. is a startup firm dealing in neural network frameworks—they currently "write and maintain tinygrad." Hotz & Co. are in the process of assembling rack-mounted 12U TinyBox systems for customers—an individual server houses an AMD EPYC 7532 processor and six XFX Speedster MERC310 Radeon RX 7900 XTX graphics cards. The Tiny Corp. social media account has engaged in numerous NVIDIA vs. AMD AI hardware debates/tirades—Hotz appears to favor the latter, as evidenced by his latest choice of components. ROCm support on Team Red's Instinct accelerators is fairly mature at this point in time, but it is a much newer prospect on gaming-oriented graphics cards.

Tiny Corporation's unusual leveraging of Radeon RX 7900 XTX GPUs in a data center configuration has already hit a developmental roadblock. Yesterday, the company's social media account expressed driver-related frustrations in a public forum: "If AMD open sources their firmware, I'll fix their LLVM spilling bug and write a fuzzer for HSA. Otherwise, it's not worth putting tons of effort into fixing bugs on a platform you don't own." Hotz's latest complaint was taken on board by AMD's top brass—Dr. Lisa Su responded with the following message: "Thanks for the collaboration and feedback. We are all in to get you a good solution. Team is on it." Her software engineers—within a few hours—managed to fling out a set of fixes in Tiny Corporation's direction. Hotz appreciated the quick turnaround, and proceeded to run a model without encountering major stability issues: "AMD sent me an updated set of firmware blobs to try. They are responsive, and there have been big strides in the driver in the last year. It will be good! This training run is almost 5 hours in, hasn't crashed yet." Tiny Corp. drummed up speculation about AMD open sourcing its GPU MES firmware—Hotz disclosed that he will be talking (on the phone) to Team Red leadership.

Intel Gaudi 2 AI Accelerator Powers Through Llama 2 Text Generation

Intel's "AI Everywhere" hype campaign has generated the most noise in mainstream and enterprise segments. Team Blue's Gaudi—a family of deep learning accelerators—does not hit the headlines all that often. Their current generation model, Gaudi 2, is overshadowed by Team Green and Red alternatives—according to Intel's official marketing spiel: "it performs competitively on deep learning training and inference, with up to 2.4x faster performance than NVIDIA A100." Habana, an Intel subsidiary, has been working on optimizing Large Language Model (LLM) inference on Gaudi 1 and 2 for a while—their co-operation with Hugging Face has produced impressive results, as of late February. Siddhant Jagtap, an Intel Data Scientist, has demonstrated: "how easy it is to generate text with the Llama 2 family of models (7b, 13b and 70b) using Optimum Habana and a custom pipeline class."

Jagtap reckons that folks will be able to: "run the models with just a few lines of code" on Gaudi 2 accelerators—additionally, Intel's hardware is capable of accepting single and multiple prompts. The custom pipeline class: "has been designed to offer great flexibility and ease of use. Moreover, it provides a high level of abstraction and performs end-to-end text-generation which involves pre-processing and post-processing." His article/blog outlines various prerequisites and methods of getting Llama 2 text generation up and running on Gaudi 2. Jagtap concluded that Habana/Intel has: "presented a custom text-generation pipeline on Intel Gaudi 2 AI accelerator that accepts single or multiple prompts as input. This pipeline offers great flexibility in terms of model size as well as parameters affecting text-generation quality. Furthermore, it is also very easy to use and to plug into your scripts, and is compatible with LangChain." Hugging Face reckons that Gaudi 2 delivers roughly twice the throughput speed of NVIDIA A100 80 GB in both training and inference scenarios. Intel has teased third generation Gaudi accelerators—industry watchdogs believe that next-gen solutions are designed to compete with Team Green H100 AI GPUs.
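
The actual pipeline class lives in the Optimum Habana blog post and carries Gaudi-specific (HPU) setup that is omitted here; as a rough, device-agnostic approximation of the "few lines of code" workflow it describes, a plain Hugging Face pipeline call might look like the sketch below. The gated Llama 2 checkpoint name and generation settings are assumptions for illustration.

```python
# Generic Hugging Face text-generation sketch approximating the
# workflow from the blog; the real Gaudi 2 version uses Optimum
# Habana's custom pipeline class rather than this plain call.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-hf",  # gated model; requires HF access approval
)

# Single and multiple prompts, mirroring what the custom pipeline accepts
prompts = ["Explain gradient descent in one sentence.",
           "Write a haiku about accelerators."]
for result in generator(prompts, max_new_tokens=64):
    print(result[0]["generated_text"])
```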

Chinese Governing Bodies Reportedly Offering "Compute Vouchers" to AI Startups

Regional Chinese governments are attempting to prop up local AI startup companies with an intriguing "voucher" support system. A Financial Times article outlines "computing" support packages valued between $140,000 and $280,000 for fledgling organizations involved in LLM training. Widespread shortages of AI chips and rising data center operation costs are cited as the main factors driving a rollout of strategic subsidizations. The big three—Alibaba, Tencent, and ByteDance—are reportedly less willing to rent out their AI-crunching servers, due to internal operations demanding lengthy compute sessions. China's largest technology companies are believed to be hoarding the vast majority of NVIDIA AI hardware, while smaller competitors are left fighting over table scraps. US trade restrictions have further escalated supply issues, with lower-performance/China-specific models being rejected—AMD's Instinct MI309 AI accelerator being the latest example.

The "computer voucher" initiative could be the first part of a wider scheme—reports suggest that regional governing bodies (including Shanghai) are devising another subsidy tier for domestic AI chips. Charlie Chai, an 86Research analyst, reckons that the initial support package is only a short-term solution. He shared this observation with FT: "the voucher is helpful to address the cost barrier, but it will not help with the scarcity of the resources." The Chinese government is reportedly looking into the creation of an alternative state-run system, that will become less reliant on a "Big Tech" data center model. A proposed "East Data West Computing" project could produce a more energy-efficient cluster of AI data centers, combined with a centralized management system.

AMD Stalls on Instinct MI309 China AI Chip Launch Amid US Export Hurdles

According to the latest report from Bloomberg, AMD has hit a roadblock in offering its top-of-the-line AI accelerator in the Chinese market. The newest AI chip is called Instinct MI309, a lower-performance Instinct MI300 variant tailored to meet the latest US export rules for selling advanced chips to China-based entities. However, the Instinct MI309 still appears too powerful to gain unconditional approval from the US Department of Commerce, leaving AMD in need of an export license. The US Department of Commerce's rule states that the Total Processing Performance (TPP) score must not exceed 4800, effectively capping AI performance at 600 FP8 TFLOPS. Processors with slightly lower performance may still be sold to Chinese customers, provided their performance density (PD) is sufficiently low.
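
The arithmetic behind the cap is straightforward if, as the figures above imply, TPP is taken as peak throughput multiplied by operand bit width; a small sketch:

```python
# TPP arithmetic implied by the article: TPP = peak throughput
# (TOPS/TFLOPS) x operand bit width. A 4800 cap therefore allows
# at most 600 TFLOPS at FP8 (8-bit) precision.
TPP_CAP = 4800

def max_tflops(bit_width: int, cap: int = TPP_CAP) -> float:
    """Maximum allowed peak throughput at a given precision."""
    return cap / bit_width

print(max_tflops(8))   # 600.0 TFLOPS at FP8, matching the article
print(max_tflops(16))  # 300.0 TFLOPS at FP16
```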

However, AMD's latest creation, the Instinct MI309, is anything but slow. Based on the powerful Instinct MI300, it has not been cut down enough to acquire a US export license from the Department of Commerce. It is still unknown which Chinese customer was trying to acquire AMD's Instinct MI309; however, it could be one of the Chinese AI labs trying to get ahold of more training hardware for their domestic models. NVIDIA employed a similar tactic, selling A800 and H800 chips to China, until the US ended the export of those chips as well. AI labs located in China can otherwise only use domestic hardware, including accelerators from Alibaba, Huawei, and Baidu. Cloud services hosting GPUs in the US can still be accessed by Chinese companies, but that route is currently on US regulators' watchlist.

AMD Hires Thomas Zacharia to Expand Strategic AI Relationships

AMD announced that Thomas Zacharia has joined AMD as senior vice president of strategic technology partnerships and public policy. Zacharia will lead the global expansion of AMD's public/private relationships with governments, non-governmental organizations (NGOs) and other organizations to help fast-track the deployment of customized AMD-powered AI solutions to meet the rapidly growing number of global projects and applications targeting the deployment of AI for the public good.

"Thomas is a distinguished leader with decades of experience successfully creating public/private partnerships that have resulted in consistently deploying the world's most powerful and advanced computing solutions, including the world's fastest supercomputer Frontier," said AMD Chair and CEO Lisa Su. "As the former Director of the U.S.'s largest multi-program science and energy research lab, Thomas is uniquely positioned to leverage his extensive experience advancing the frontiers of science and technology to help countries around the world deploy AMD-powered AI solutions for the public good."

IBM Announces Availability of Open-Source Mistral AI Model on watsonx

IBM announced the availability of the popular open-source Mixtral-8x7B large language model (LLM), developed by Mistral AI, on its watsonx AI and data platform, as it continues to expand capabilities to help clients innovate with IBM's own foundation models and those from a range of open-source providers. IBM offers an optimized version of Mixtral-8x7B that, in internal testing, was able to increase throughput—or the amount of data that can be processed in a given time period—by 50 percent when compared to the regular model. This could potentially cut latency by 35-75 percent, depending on batch size—speeding time to insights. This is achieved through a process called quantization, which reduces model size and memory requirements for LLMs and, in turn, can speed up processing to help lower costs and energy consumption.
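
IBM has not published the exact quantization recipe behind its optimized Mixtral, but the general idea can be sketched with a generic 8-bit load via Hugging Face Transformers and bitsandbytes. This is an illustration of quantization in general, not IBM's watsonx code, and the public Mixtral checkpoint name is an assumption.

```python
# Generic illustration of weight quantization at load time using
# Transformers + bitsandbytes; not IBM's watsonx optimization pipeline.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(load_in_8bit=True)  # 8-bit weights

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1",   # public Mixtral checkpoint (assumed)
    quantization_config=quant_config,
    device_map="auto",               # shard across available devices
    torch_dtype=torch.float16,       # dtype for non-quantized modules
)
# Smaller weights lower the memory footprint, which permits larger
# batches (higher throughput) and can cut latency, as described above.
```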

The addition of Mixtral-8x7B expands IBM's open, multi-model strategy to meet clients where they are and give them choice and flexibility to scale enterprise AI solutions across their businesses. Through decades-long AI research and development, open collaboration with Meta and Hugging Face, and partnerships with model leaders, IBM is expanding its watsonx.ai model catalog and bringing in new capabilities, languages, and modalities. IBM's enterprise-ready foundation model choices and its watsonx AI and data platform can empower clients to use generative AI to gain new insights and efficiencies, and create new business models based on principles of trust. IBM enables clients to select the right model for the right use cases and price-performance goals for targeted business domains like finance.

CNET Demoted to Untrusted Sources by Wikipedia Editors Due to AI-Generated Content

Once trusted as a staple of technology journalism, the website CNET has been publicly demoted to Untrusted Sources on Wikipedia. CNET has faced public criticism since late 2022 for publishing AI-generated articles without disclosing that humans did not write them. This practice has culminated in CNET being demoted from Trusted to Untrusted Sources on Wikipedia, following extensive debates between Wikipedia editors. CNET's reputation first declined in 2020 when it was acquired by publisher Red Ventures, which appeared to prioritize advertising and SEO traffic over editorial standards. However, the AI content scandal accelerated CNET's fall from grace. After discovering the AI-written articles, Wikipedia editors argued that CNET should be removed entirely as a reliable source, citing Red Ventures' pattern of misinformation.

One editor called for targeting Red Ventures as "a spam network." AI-generated content poses challenges familiar from spam bots: machine-created text that is frequently low quality or inaccurate. CNET, for its part, claims it has stopped publishing AI content. The controversy highlights rising concerns about AI-generated text online. Publishing AI-generated stories may be tempting because it cuts production time; however, such stories tend to rank poorly in Google search results, as the engine detects and penalizes content it identifies as AI-generated, possibly because Google's detection models draw on training data similar to that used by the text-generating models themselves. Lawsuits such as The New York Times v. OpenAI also allege that AI models were trained on vast amounts of text scraped without permission. As AI capabilities advance, maintaining information quality on the web will require increased diligence, but demoting once-reputable sites like CNET when they disregard ethics and quality control helps set a necessary precedent.

Google: CPUs are Leading AI Inference Workloads, Not GPUs

Today's AI infrastructure expansion is mostly fueled by GPU-accelerated servers. Yet Google, one of the world's largest hyperscalers, has noted, based on internal Google Cloud analysis, that CPUs still handle a leading share of AI/ML workloads. During the Tech Field Day event, a talk by Brandon Royal, product manager at Google Cloud, explained the position of CPUs in today's AI landscape. The AI lifecycle is divided into two parts: training and inference. During training, massive compute capacity is needed, along with enormous memory capacity, to fit ever-expanding AI models into memory. The latest models, like GPT-4 and Gemini, contain billions of parameters and require thousands of GPUs or other accelerators working in parallel to train efficiently.

Inference, on the other hand, requires less compute intensity but still benefits from acceleration. During inference, the pre-trained model is optimized and deployed to make predictions on new data. While less compute is needed than for training, latency and throughput are essential for real-time inference. Google found that, while GPUs are ideal for the training phase, models are often optimized to run inference on CPUs, and customers choose CPUs as their medium of AI inference for a wide variety of reasons.
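
As one hedged illustration of what CPU-bound inference deployment can look like, here is a minimal ONNX Runtime sketch; the model file name and tensor shapes are hypothetical placeholders for an exported, optimized model of the kind described above.

```python
# Minimal CPU-inference sketch with ONNX Runtime; model file and
# input shape are hypothetical.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",                        # hypothetical exported model
    providers=["CPUExecutionProvider"],  # pin inference to the CPU
)

input_name = session.get_inputs()[0].name
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)  # dummy image batch
outputs = session.run(None, {input_name: batch})
print(outputs[0].shape)
```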

Elon Musk Sues OpenAI and Sam Altman for Breach of Founding Contract

Elon Musk, in his individual capacity, has sued Sam Altman, Gregory Brockman, OpenAI, and its affiliate companies for breach of founding contract and a deviation from the company's founding goal: to be a non-profit tasked with developing AI for the benefit of humanity. The lawsuit comes in the wake of OpenAI's relationship with Microsoft, which Musk says compromises that founding contract. Musk alleges breach of contract, breach of fiduciary duty, and unfair business practices against OpenAI, and demands that the company revert to being open-source with all its technology and function as a non-profit.

Musk also requests an injunction to prevent OpenAI and the other defendants from profiting off OpenAI technology. In particular, Musk alleges that GPT-4 isn't open-source, claiming that only OpenAI and Microsoft know its inner workings, and that Microsoft stands to monetize GPT-4 "for a fortune." Microsoft, interestingly, was not named as a defendant in the lawsuit. Elon Musk sat on OpenAI's original board until his departure in 2018, and is said to have been a key sponsor of the AI acceleration hardware used in OpenAI's pioneering work.

Intel Sets 100 Million CPU Supply Goal for AI PCs by 2025

Intel has been hyping up its artificial intelligence-augmented processor products since late last year—the "AI Everywhere" marketing push started with the official launch of Intel Core Ultra mobile CPUs, AKA the much-delayed Meteor Lake processor family. CEO Pat Gelsinger stated (mid-December 2023): "AI innovation is poised to raise the digital economy's impact up to as much as one-third of global gross domestic product...Intel is developing the technologies and solutions that empower customers to seamlessly integrate and effectively run AI in all their applications—in the cloud and, increasingly, locally at the PC and edge, where data is generated and used." Team Blue's presence at this week's MWC Barcelona 2024 event introduced "AI Everywhere Across Network, Edge, Enterprise."

Nikkei Asia sat down with Intel's David Feng—Vice President of Client Computing Group and General Manager of Client Segments. The impressively job-titled executive discussed the "future of AI PCs" and set some lofty sales goals for his firm. According to the Nikkei report, Intel leadership expects to "deliver 40 million AI PCs" this year and a further 60 million units next year—representing "more than 20% of the projected total global PC market in 2025." Feng and his colleagues predict that mainstream customers will prefer local "on-device" AI solutions (equipped with NPUs) over remote cloud services. Significant edge AI improvements are expected to arrive with the next-generation Lunar Lake and Arrow Lake processor families, with the latter bringing Team Blue NPU technologies to desktop platforms—AMD's Ryzen 8000G series of AM5 APUs launched with XDNA engines last month.

Tiny Corp. Builds AI Platform with Six AMD Radeon RX 7900 XTX GPUs

Tiny Corp., a neural network framework specialist, has revealed intimate details about the ongoing development and building of its "tinybox" system: "I don't think there's much value in secrecy. We have the parts to build 12 boxes and a case that's pretty close to final. Beating back all the PCI-E AER errors was hard, as anyone knows who has tried to build a system like this. Our BOM cost is around $10k, and we are selling them for $15k. We've put a year of engineering into this, it's a lot harder than it first seemed. You are welcome to believe me or not, but unless you are building in huge quantity, you are getting a great deal for $15k." The startup has taken the unusual step of integrating Team Red's current flagship gaming GPU into its AI-crunching platform. Tiny Corp. founder George Hotz has documented his past rejections of NVIDIA AI hardware on social media, but TinyBox will not be running AMD's latest Instinct MI300X accelerators either. RDNA 3.0 is seemingly favored over CDNA 3.0—perhaps because enterprise-grade GPUs are in short supply amid soaring industry demand.

The rack-mounted 12U TinyBox build houses an AMD EPYC 7532 processor with 128 GB of system memory. Five 1 TB SN850X SSDs take care of storage duties (4 in RAID, 1 for boot), and an unoccupied 16x OCP 3.0 slot is designated for networking tasks. Two 1600 W PSUs provide the necessary electrical juice. The Tiny Corp. social media picture feed indicates that they have acquired a pile of XFX Speedster MERC310 RX 7900 XTX graphics cards—six units are hooked up inside each TinyBox system. Hotz's young startup has ambitious plans: "The system image shipping with the box will be Ubuntu 22.04. It will only include tinygrad out of the box, but PyTorch and JAX support on AMD have come a long way, and your hardware is your hardware. We make money either way, you are welcome to buy it for any purpose. The goal of the tiny corp is to commoditize the petaflop, and we believe tinygrad is the best way to do it. Solving problems in software is cheaper than in hardware. tinygrad will elucidate the deep structure of what neural networks are. We have 583 preorders, and next week we'll place an order for 100 sets of parts. This is $1M in outlay. We will also ship five of the 12 boxes we have to a few early people who I've communicated with. For everyone else, they start shipping in April. The production line started running yesterday."
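
Since tinygrad is the only framework shipping in the system image, a tiny forward/backward pass gives a flavor of it. This is a minimal sketch assuming the tinygrad API at the time of writing, which may drift; shapes and values are illustrative.

```python
# Toy forward/backward pass in tinygrad, the framework the
# tinybox ships with out of the box.
from tinygrad import Tensor

x = Tensor.randn(4, 8)                      # dummy input batch
w = Tensor.randn(8, 2, requires_grad=True)  # trainable weights

loss = x.matmul(w).relu().mean()  # forward pass down to a scalar loss
loss.backward()                   # autograd populates w.grad

print(loss.numpy(), w.grad.numpy().shape)
```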

Global Server Shipments Expected to Increase by 2.05% in 2024, with AI Servers Accounting For Around 12.1%

TrendForce underscores that the primary momentum for server shipments this year remains with American CSPs. However, due to persistently high inflation and elevated corporate financing costs curtailing capital expenditures, overall demand has not yet returned to pre-pandemic growth levels. Global server shipments are estimated to reach approximately 13.654 million units in 2024, an increase of about 2.05% YoY. Meanwhile, the market continues to focus on the deployment of AI servers, with their shipment share estimated at around 12.1%.

Foxconn is expected to see the highest growth rate, with an estimated annual increase of about 5-7%. This growth includes significant orders such as Dell's 16G platform, AWS Graviton 3 and 4, Google Genoa, and Microsoft Gen9. In terms of AI server orders, Foxconn has made notable inroads with Oracle and has also secured some AWS ASIC orders.

AAEON BOXER-8653AI & BOXER-8623AI Expand Vertical Market Potential in a More Compact Form

Leading provider of embedded PC solutions, AAEON, is delighted to announce the official launch of two new additions to its rich line of embedded AI systems, the BOXER-8653AI and BOXER-8623AI, which are powered by the NVIDIA Jetson Orin NX and Jetson Orin Nano, respectively. Measuring just 180 mm x 136 mm x 75 mm, both systems are compact and easily wall-mounted for discreet deployment, which AAEON indicates makes them ideal for use in both indoor and outdoor settings such as factories and parking lots. Adding to this is the systems' environmental resilience, with the BOXER-8653AI sporting a wide -15°C to 60°C temperature tolerance and the BOXER-8623AI able to operate between -15°C and 65°C, with both supporting a 12 V ~ 24 V power input range via a 2-pin terminal block.

The BOXER-8653AI benefits from the NVIDIA Jetson Orin NX module, offering up to 70 TOPS of AI inference performance for applications that require extremely fast analysis of vast quantities of data. Meanwhile, the BOXER-8623AI utilizes the more efficient, yet still powerful NVIDIA Jetson Orin Nano module, capable of up to 40 TOPS. Both systems consequently make use of the 1024-core NVIDIA Ampere architecture GPU with 32 Tensor Cores.

ServiceNow, Hugging Face & NVIDIA Release StarCoder2 - a New Open-Access LLM Family

ServiceNow, Hugging Face, and NVIDIA today announced the release of StarCoder2, a family of open-access large language models for code generation that sets new standards for performance, transparency, and cost-effectiveness. StarCoder2 was developed in partnership with the BigCode Community, managed by ServiceNow, the leading digital workflow company making the world work better for everyone, and Hugging Face, the most-used open-source platform, where the machine learning community collaborates on models, datasets, and applications. Trained on 619 programming languages, StarCoder2 can be further trained and embedded in enterprise applications to perform specialized tasks such as application source code generation, workflow generation, text summarization, and more. Developers can use its code completion, advanced code summarization, code snippets retrieval, and other capabilities to accelerate innovation and improve productivity.

StarCoder2 offers three model sizes: a 3-billion-parameter model trained by ServiceNow; a 7-billion-parameter model trained by Hugging Face; and a 15-billion-parameter model built by NVIDIA with NVIDIA NeMo and trained on NVIDIA accelerated infrastructure. The smaller variants provide powerful performance while saving on compute costs, as fewer parameters require less computing during inference. In fact, the new 3-billion-parameter model matches the performance of the original StarCoder 15-billion-parameter model. "StarCoder2 stands as a testament to the combined power of open scientific collaboration and responsible AI practices with an ethical data supply chain," emphasized Harm de Vries, lead of ServiceNow's StarCoder2 development team and co-lead of BigCode. "The state-of-the-art open-access model improves on prior generative AI performance to increase developer productivity and provides developers equal access to the benefits of code generation AI, which in turn enables organizations of any size to more easily meet their full business potential."
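
As a hedged sketch of how a developer might try code completion with the smallest variant via Hugging Face Transformers; the checkpoint name "bigcode/starcoder2-3b" and the generation settings are assumptions for illustration, not code from the announcement.

```python
# Sketch of code completion with the smallest StarCoder2 variant.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-3b"  # assumed Hugging Face model ID
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(outputs[0]))
```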

MiTAC Unleashes Revolutionary Server Solutions, Powering Ahead with 5th Gen Intel Xeon Scalable Processors Accelerated by Intel Data Center GPUs

MiTAC Computing Technology, a subsidiary of MiTAC Holdings Corp., proudly reveals its groundbreaking suite of server solutions that deliver unsurpassed capabilities with the 5th Gen Intel Xeon Scalable Processors. MiTAC introduces cutting-edge signature platforms that seamlessly integrate Intel Data Center GPUs, both the Intel Max Series and Intel Flex Series, unleashing an unparalleled leap in computing performance for HPC and AI applications.

MiTAC Announces its Full Array of Platforms Supporting the Latest 5th Gen Intel Xeon Scalable Processors
Last year, Intel transitioned the right to manufacture and sell products based on Intel Data Center Solution Group designs to MiTAC. MiTAC confidently announces a transformative upgrade to its product offerings, unveiling advanced platforms that epitomize the future of computing. Featuring up to 64 cores, expanded shared cache, increased UPI and DDR5 support, the latest 5th Gen Intel Xeon Scalable Processors deliver remarkable performance-per-watt gains across various workloads. MiTAC's Intel Server M50FCP Family and Intel Server D50DNP Family fully support the latest 5th Gen Intel Xeon Scalable Processors, made possible through a quick BIOS update and easy technical resource revisions, providing unsurpassed performance to diverse computing environments.

IBM Intros AI-enhanced Data Resilience Solution - a Cyberattack Countermeasure

Cyberattacks are an existential risk, with 89% of organizations ranking ransomware as one of the top five threats to their viability, according to a November 2023 report from TechTarget's Enterprise Strategy Group, a leading analyst firm. And this is just one of many risks to corporate data—insider threats, data exfiltration, hardware failures, and natural disasters also pose significant danger. Moreover, as the just-released 2024 IBM X-Force Threat Intelligence Index states, as the generative AI market becomes more established, it could trigger the maturity of AI as an attack surface, mobilizing even further investment in new tools from cybercriminals. The report notes that enterprises should also recognize that their existing underlying infrastructure is a gateway to their AI models that doesn't require novel tactics from attackers to target.

To help clients counter these threats with earlier and more accurate detection, we're announcing new AI-enhanced versions of the IBM FlashCore Module technology available inside new IBM Storage FlashSystem products and a new version of IBM Storage Defender software to help organizations improve their ability to detect and respond to ransomware and other cyberattacks that threaten their data. The newly available fourth generation of FlashCore Module (FCM) technology enables artificial intelligence capabilities within the IBM Storage FlashSystem family. FCM works with Storage Defender to provide end-to-end data resilience across primary and secondary workloads with AI-powered sensors designed for earlier notification of cyber threats to help enterprises recover faster.

Qualcomm AI Hub Introduced at MWC 2024

Qualcomm Technologies, Inc. unveiled its latest advancements in artificial intelligence (AI) at Mobile World Congress (MWC) Barcelona. From the new Qualcomm AI Hub, to cutting-edge research breakthroughs and a display of commercial AI-enabled devices, Qualcomm Technologies is empowering developers and revolutionizing user experiences across a wide range of devices powered by Snapdragon and Qualcomm platforms.

"With Snapdragon 8 Gen 3 for smartphones and Snapdragon X Elite for PCs, we sparked commercialization of on-device AI at scale. Now with the Qualcomm AI Hub, we will empower developers to fully harness the potential of these cutting-edge technologies and create captivating AI-enabled apps," said Durga Malladi, senior vice president and general manager, technology planning and edge solutions, Qualcomm Technologies, Inc. "The Qualcomm AI Hub provides developers with a comprehensive AI model library to quickly and easily integrate pre-optimized AI models into their applications, leading to faster, more reliable and private user experiences."

TSMC Customers Request Construction of Additional AI Chip Fabs

Morris Chang, TSMC's founder and semiconductor industry icon, was present at the opening ceremony of his company's new semiconductor fabrication plant in Kumamoto Prefecture, Japan. According to a Nikkei Asia article, Chang predicted that the nation will experience "a chip renaissance" during his February 24 commencement speech. The Japanese government also announced that it will supply an additional ¥732 billion ($4.86 billion) in subsidies for Taiwan Semiconductor Manufacturing Co. to expand semiconductor operations on the island of Kyūshū. Economy Minister Ken Saito stated: "TSMC is the most important partner for Japan in realizing digital transformation, and its Kumamoto factory is an important contributor for us to stably procure cutting-edge logic chips that is extremely essential for the future of industries in Japan."

Chang disclosed some interesting insights during last weekend's conference segment—according to Nikkei's report, he revealed that unnamed TSMC customers had made some outlandish requests: "They are not talking about tens of thousands of wafers. They are talking about fabs, (saying): 'We need so many fabs. We need three fabs, five fabs, 10 fabs.' Well, I can hardly believe that one." The Taiwanese chip manufacturing giant reportedly has the resources to create a new "Gigafab" within reasonable timeframes, but demands for (up to) ten new plants are extremely fanciful. Chang set expectations at a reasonable level—he predicted that demand for AI processors would lie somewhere in the middle ground: "between tens of thousands of wafers and tens of fabs." Past insider reports suggested that OpenAI has been discussing the formation of a proprietary fabrication network, with proposed investments of roughly $5 to $7 trillion. OpenAI CEO, Sam Altman, reportedly engaged in talks with notable contract chip manufacturers—The Wall Street Journal posited that TSMC would be an ideal partner.

JPR: Total PC GPU Shipments Increased by 6% From Last Quarter and 20% Year-to-Year

Jon Peddie Research reports that the global PC-based graphics processor unit (GPU) market reached 76.2 million units in Q4'23, while PC CPU shipments increased an astonishing 24% year over year, the biggest year-to-year increase in two and a half decades. Overall, GPUs are expected to see a compound annual growth rate of 3.6% during 2024-2026, reaching an installed base of almost 5 billion units at the end of the forecast period. Over the next five years, the penetration of discrete GPUs (dGPUs) in the PC market will be 30%.
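
The installed-base forecast is plain compound-growth arithmetic; the sketch below assumes a hypothetical starting base purely for illustration, not a figure from JPR's report.

```python
# Compound-growth arithmetic behind the forecast: an installed base
# growing at a 3.6% CAGR over 2024-2026.
CAGR = 0.036

def project(base_units: float, years: int, rate: float = CAGR) -> float:
    """Installed base after compounding growth for `years` years."""
    return base_units * (1 + rate) ** years

start = 4.5e9  # hypothetical 2023 installed base (illustrative only)
for year in range(1, 4):
    print(2023 + year, f"{project(start, year) / 1e9:.2f}B units")
```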

AMD's overall market share decreased by 1.4 percentage points from last quarter, Intel's market share increased by 2.8 percentage points, and NVIDIA's market share decreased by 1.36 percentage points.

LG and Meta Forge Collaboration to Accelerate XR Business

LG Electronics (LG) is ramping up its strategic collaboration with the global tech powerhouse, Meta Platforms, Inc. (Meta), aiming to expedite its extended reality (XR) ventures. The aim is to combine the strengths of both companies across products, content, services and platforms to drive innovation in customer experiences within the burgeoning virtual space.

Forging an XR Collaboration With Meta
On February 28, LG's top management, including CEO William Cho and Park Hyoung-sei, president of the Home Entertainment Company, met with Meta Founder and CEO Mark Zuckerberg at LG Twin Towers in Yeouido, Seoul. This meeting coincided with Zuckerberg's tour of Asia. The two-hour session saw discussions on business strategies and considerations for next-gen XR device development. CEO Cho, while experiencing the Meta Quest 3 headset and Ray-Ban Meta smart glasses, expressed a keen interest in Meta's advanced technology demonstrations, notably focusing on Meta's large language models and its potential for on-device AI integration.