News Posts matching #GPGPU


"Jaguar Shores" is Intel's Successor to "Falcon Shores" Accelerator for AI and HPC

Intel is preparing "Jaguar Shores," its "next-next" generation AI and HPC accelerator and the successor to the upcoming "Falcon Shores" GPU. The chip was revealed, apparently unintentionally, by Intel's Habana Labs division during a technical workshop at the SC2024 conference. The disclosure positions Jaguar Shores as the follow-up to Falcon Shores, which is scheduled to launch next year. While details about Jaguar Shores remain sparse, its designation suggests a general-purpose GPU (GPGPU) aimed at AI training, inferencing, and HPC workloads. The move also aligns with Intel's push to adopt advanced manufacturing nodes, such as the 18A process with RibbonFET transistors and backside power delivery, which promise significant efficiency gains, so upcoming AI accelerators can be expected to incorporate these technologies.

Intel's AI chip lineup has faced numerous challenges, including shifting plans for Falcon Shores, which has transitioned from a CPU-GPU hybrid to a standalone GPU, and the cancellation of Ponte Vecchio. Despite financial constraints and job cuts, Intel has maintained its focus on developing cutting-edge AI solutions. "We continuously evaluate our roadmap to ensure it aligns with the evolving needs of our customers. While we don't have any new updates to share, we are committed to providing superior enterprise AI solutions across our CPU and accelerator/GPU portfolio," an Intel spokesperson stated. The appearance of Jaguar Shores shows Intel's determination to remain competitive; however, the company faces steep competition. NVIDIA and AMD continue to set benchmarks with high-performance designs, while Intel has struggled to capture a significant share of the AI training market. The Gaudi lineup ends with its third generation, and Gaudi IP will be integrated into Falcon Shores.

MSI Launches AMD EPYC 9005 Series CPU-Based Server Solutions

MSI, a leading global provider of high-performance server solutions, today introduced its latest AMD EPYC 9005 Series CPU-based server boards and platforms, engineered to tackle the most demanding data center workloads with leadership performance and efficiency.

Featuring AMD EPYC 9005 Series processors with up to 192 cores and 384 threads, MSI's new server platforms deliver breakthrough compute power, unparalleled density, and exceptional energy efficiency, making them ideal for handling AI-enabled, cloud-native, and business-critical workloads in modern data centers.

JPR: Total PC GPU Shipments Increased by 6% From Last Quarter and 20% Year-to-Year

Jon Peddie Research reports that global PC-based graphics processor unit (GPU) shipments reached 76.2 million units in Q4'23, while PC CPU shipments increased an astonishing 24% year over year, the biggest year-to-year increase in two and a half decades. Overall, GPUs are expected to show a compound annual growth rate of 3.6% during 2024-2026 and reach an installed base of almost 5 billion units at the end of the forecast period. Over the next five years, the penetration of discrete GPUs (dGPUs) in the PC market will be 30%.

AMD's overall market share decreased by 1.4% from last quarter, Intel's market share increased by 2.8%, and NVIDIA's market share decreased by 1.36%.

TYAN Upgrades HPC, AI and Data Center Solutions with the Power of 5th Gen Intel Xeon Scalable Processors

TYAN, a leading server platform design manufacturer and a MiTAC Computing Technology Corporation subsidiary, today introduced upgraded server platforms and motherboards based on the brand-new 5th Gen Intel Xeon Scalable Processors, formerly codenamed Emerald Rapids.

The 5th Gen Intel Xeon processor scales up to 64 cores and features a larger shared cache, higher UPI and DDR5 memory speeds, as well as PCIe 5.0 with 80 lanes. Built for workload-optimized performance, 5th Gen Intel Xeon delivers more compute power and faster memory within the same power envelope as the previous generation. "5th Gen Intel Xeon is the second processor offering inside the 2023 Intel Xeon Scalable platform, offering improved performance and power efficiency to accelerate TCO and operational efficiency," said Eric Kuo, Vice President of the Server Infrastructure Business Unit, MiTAC Computing Technology Corporation. "By harnessing the capabilities of Intel's new Xeon CPUs, TYAN's 5th Gen Intel Xeon-supported solutions are designed to handle the intense demands of HPC, data centers, and AI workloads."

JPR: PC GPU Shipments Increased by 11.6% from Last Quarter and Decreased by 27% Year-to-Year

Jon Peddie Research reports that global PC-based graphics processor unit (GPU) shipments reached 61.6 million units in Q2'23, while PC CPU shipments decreased by 23% year over year. Overall, GPUs are expected to show a compound annual growth rate of 3.70% during 2022-2026 and reach an installed base of 2,998 million units at the end of the forecast period. Over the next five years, the penetration of discrete GPUs (dGPUs) in the PC market will grow to reach 32%.

Year to year, total GPU shipments, which include all platforms and all types of GPUs, decreased by 27%, desktop graphics decreased by 36%, and notebooks decreased by 23%.

Tyan Showcases Density With Updated AMD EPYC 2U Server Lineup

Tyan, a subsidiary of MiTAC, showed off its new range of AMD EPYC-based servers with a distinct focus on compute density. These included new introductions to the Transport lineup of configurable servers, which now host EPYC 9004 "Genoa" series processors with up to 96 cores each. The new additions come as 2U servers, each with a different specialty focus. First up is the Transport SX TN85-B8261, aimed squarely at HPC and AI/ML deployments, with support for up to dual 96-core EPYC "Genoa" processors, 3 TB of registered ECC DDR5-4800, dual 10GbE via an Intel X550-AT2 as well as 1GbE for IPMI, six PCI-E Gen 5 x16 slots with support for four GPGPUs for ML/HPC compute, and eight NVMe drives at the front of the chassis. An optional, storage-focused configuration forgoes the GPUs in favor of 24 NVMe SSDs at the front, soaking up the 96 lanes of PCI-E.

HBM Supply Leader SK Hynix's Market Share to Exceed 50% in 2023 Due to Demand for AI Servers

A strong growth in AI server shipments has driven demand for high bandwidth memory (HBM). TrendForce reports that the top three HBM suppliers in 2022 were SK hynix, Samsung, and Micron, with 50%, 40%, and 10% market share, respectively. Furthermore, the specifications of high-end AI GPUs designed for deep learning have led to HBM product iteration. To prepare for the launch of NVIDIA H100 and AMD MI300 in 2H23, all three major suppliers are planning for the mass production of HBM3 products. At present, SK hynix is the only supplier that mass produces HBM3 products, and as a result, is projected to increase its market share to 53% as more customers adopt HBM3. Samsung and Micron are expected to start mass production sometime towards the end of this year or early 2024, with HBM market shares of 38% and 9%, respectively.

AI server shipment volume expected to increase by 15.4% in 2023
NVIDIA's DL/ML AI servers are equipped with an average of four or eight high-end graphics cards and two mainstream x86 server CPUs. These servers are primarily used by top US cloud service providers such as Google, AWS, Meta, and Microsoft. TrendForce analysis indicates that shipments of servers with high-end GPGPUs are estimated to have increased by around 9% in 2022, with approximately 80% of these shipments concentrated among eight major cloud service providers in China and the US. Looking ahead to 2023, Microsoft, Meta, Baidu, and ByteDance will launch generative AI products and services, further boosting AI server shipments. It is estimated that AI server shipments will increase by 15.4% this year, and a 12.2% CAGR for AI server shipments is projected from 2023 to 2027.
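For a sense of scale, the projected 12.2% CAGR compounds over the four years from 2023 to 2027 to roughly a 58% increase in annual AI server shipments. A minimal sketch of that arithmetic (illustrative only, using the figures quoted above):

#include <math.h>
#include <stdio.h>

int main(void)
{
    const double cagr  = 0.122;   /* 12.2% per year, per TrendForce's projection */
    const int    years = 4;       /* 2023 through 2027 = four compounding steps  */
    const double total = pow(1.0 + cagr, years);

    printf("Cumulative shipment growth: +%.0f%%\n", (total - 1.0) * 100.0);  /* ~ +58% */
    return 0;
}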

Chinese GPU Maker Biren Technology Loses its Co-Founder, Only Months After Revealing New GPUs

Golf Jiao, a co-founder and general manager of Biren Technology, left the company late last month, according to insider sources in China. No official statement has been issued by the executive team at Biren Tech, and Jiao has not provided any details regarding his departure from the fabless semiconductor design company. The Shanghai-based firm is a relatively new startup - it was founded in 2019 by veterans of NVIDIA, Qualcomm, and Alibaba. Biren Tech received $726.6 million in funding for its debut range of general-purpose graphics processing units (GPGPUs), also described as high-performance computing graphics processing units (HPC GPUs).

The company revealed its ambitions to take on NVIDIA's Ampere A100 and Hopper H100 compute platforms, and last August it announced two HPC GPUs in the form of the BR100 and BR104. The specifications and performance charts demonstrated impressive figures, but Biren Tech had to roll back its numbers when it was hit by U.S. government-enforced sanctions in October 2022. The fabless company had contracted TSMC to produce its Biren range, and the new set of rules resulted in shipments from the Taiwanese foundry being halted. Biren Tech cut its workforce by a third soon after losing its supply chain with TSMC, and the engineering team had to reassess how the BR100 and BR104 would perform on a process node larger than the original 7 nm design. It was decided that a downgrade in transfer rates would appease the legal teams and get newly redesigned Biren silicon back onto the assembly line.

Shipments of AI Servers Will Climb at CAGR of 10.8% from 2022 to 2026

According to TrendForce's latest survey of the server market, many cloud service providers (CSPs) have begun large-scale investments in the kinds of equipment that support artificial intelligence (AI) technologies. This development is a response to the emergence of new applications such as self-driving cars, the artificial intelligence of things (AIoT), and edge computing since 2018. TrendForce estimates that in 2022, AI servers equipped with general-purpose GPUs (GPGPUs) accounted for almost 1% of annual global server shipments. Moving into 2023, shipments of AI servers are projected to grow by 8% YoY thanks to chatbots and similar applications generating demand across AI-related fields. Furthermore, shipments of AI servers are forecast to increase at a CAGR of 10.8% from 2022 to 2026.

congatec launches 10 new COM-HPC and COM Express Computer-on-Modules with 12th Gen Intel Core processors

congatec - a leading vendor of embedded and edge computing technology - introduces the 12th Generation Intel Core mobile and desktop processors (formerly code-named Alder Lake) on 10 new COM-HPC and COM Express Computer-on-Modules. Featuring the latest high-performance cores from Intel, the new modules in COM-HPC Size A and C as well as COM Express Type 6 form factors offer major performance gains for embedded and edge computing systems. Most impressive is the fact that engineers can now leverage Intel's performance hybrid architecture. Offering up to 14 cores/20 threads on BGA and 16 cores/24 threads on desktop variants (LGA mounted), 12th Gen Intel Core processors provide a quantum leap in multitasking and scalability. Next-gen IoT and edge applications benefit from up to 6 or 8 (BGA/LGA) optimized Performance-cores (P-cores) plus up to 8 low-power Efficient-cores (E-cores), along with DDR5 memory support, to accelerate multithreaded applications and execute background tasks more efficiently.

Tianshu Zhixin Big Island GPU is a 37 TeraFLOP FP32 Computing Monster

Tianshu Zhixin, a Chinese startup dedicated to designing advanced processors for accelerating various kinds of workloads, has officially entered production with its latest GPGPU design. Called the "Big Island" GPU, it is the company's entry into a GPU market currently dominated by AMD, NVIDIA, and soon Intel. So what makes Tianshu Zhixin's Big Island GPU special? Firstly, it represents China's push for independence from outside processor suppliers and the security that comes with a domestic design. Secondly, it is no small feat to enter a market controlled by big players and attempt to grab a piece of that cake. To be successful, the GPU needs to be a great design.

And great it is, at least on paper. The specifications indicate that Big Island is manufactured on TSMC's 7 nm node using CoWoS packaging technology, allowing the die to feature over 24 billion transistors. When it comes to performance, the company claims the GPU is capable of crunching 37 TeraFLOPs of single-precision FP32 data. At FP16/BF16 half precision, the chip can output 147 TeraFLOPs. For integer performance, it achieves 317, 147, and 295 TOPS in INT32, INT16, and INT8, respectively. There is no data on double-precision floating-point performance, suggesting the chip is optimized for single-precision workloads. There is also 32 GB of HBM2 memory on board with 1.2 TB/s of bandwidth. Compared to competing offerings like the NVIDIA A100 or AMD MI100, the new Big Island GPU outperforms both at the single-precision FP32 compute tasks for which it is designed.
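For rough context, a quick back-of-the-envelope comparison is sketched below; note that the A100 and MI100 figures are assumed from commonly quoted public peak FP32 (non-matrix) numbers, not taken from this article.

#include <stdio.h>

int main(void)
{
    const double big_island_fp32 = 37.0;  /* TFLOP/s, claimed by Tianshu Zhixin         */
    const double a100_fp32       = 19.5;  /* TFLOP/s, assumed public peak for the A100  */
    const double mi100_fp32      = 23.1;  /* TFLOP/s, assumed public peak for the MI100 */

    printf("Big Island vs. A100:  %.1fx\n", big_island_fp32 / a100_fp32);   /* ~1.9x */
    printf("Big Island vs. MI100: %.1fx\n", big_island_fp32 / mi100_fp32);  /* ~1.6x */
    return 0;
}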

Chinese Tianshu Zhixin Announces Big Island GPGPU on 7 nm, 24 billion Transistors

Chinese company Shanghai Tianshu Zhixin Semiconductor Co., Ltd., commonly known (at least in Asia) as Tianshu Zhixin, has announced the availability of its GPGPU, affectionately referred to as Big Island (BI). The BI chip is the first fully domestically designed solution for the market it caters to, and features close to the latest in semiconductor manufacturing, being built on a 7 nm process with 2.5D CoWoS (chip-on-wafer-on-substrate) packaging. The chip is aimed foremost at AI and HPC applications, with uses in other industries such as education, medicine, and security. The manufacturing and packaging processes seem eerily similar to those available from Taiwan's TSMC.

Tianshu Zhixin started work on the BI chip as early as 2018, and has announced that the chip supports most AI and HPC data processing formats, including FP32, FP16, BF16, INT32, INT16, and INT8 (this list is not exhaustive). The company says the chip offers twice the performance of existing mainstream products on the market, and emphasizes its price/performance ratio. The huge chip (it packs as many as 24 billion transistors) is being teased by the company as offering as much as 147 TFLOPs in FP16 workloads, compared to 77.97 TFLOPs for the NVIDIA A100 (54 billion transistors) and 184.6 TFLOPs for the AMD Radeon Instinct MI100 (estimated at 50 billion transistors).

Aetina Launches New Edge AI Computer Powered by the NVIDIA Jetson

Aetina Corp., a provider of high-performance GPGPU solutions, announced the new AN110-XNX edge AI computer leveraging the powerful capabilities of the NVIDIA Jetson Xavier NX, expanding its range of edge AI systems built on the Jetson platform for applications in smart transportation, factories, retail, healthcare, AIoT, robotics, and more.

The AN110-XNX combines the NVIDIA Jetson Xavier NX and the Aetina AN110 carrier board in a compact form factor of 87.4 x 68.2 x 52 mm (with fan). The AN110-XNX supports the MIPI CSI-2 interface for 1x 4K or 2x FHD cameras, allowing it to handle intensive AI workloads from ultra-high-resolution cameras for more accurate image analysis. It is as small as Aetina's AN110-NAO based on the NVIDIA Jetson Nano platform, but delivers more powerful AI computing via the new Jetson Xavier NX. With 384 CUDA cores, 48 Tensor Cores, and cloud-native capability, the Jetson Xavier NX delivers up to 21 TOPS and is an ideal platform to accelerate AI applications. Bundled with the latest NVIDIA JetPack 4.4 SDK, the energy-efficient module significantly expands the choices available to developers and customers looking for embedded edge-computing options that demand increased performance to support AI workloads but are constrained by size, weight, power budget, or cost.

Khronos Group Releases OpenCL 3.0

Today, The Khronos Group, an open consortium of industry-leading companies creating advanced interoperability standards, publicly releases the OpenCL 3.0 Provisional Specifications. OpenCL 3.0 realigns the OpenCL roadmap to enable developer-requested functionality to be broadly deployed by hardware vendors, and it significantly increases deployment flexibility by empowering conformant OpenCL implementations to focus on functionality relevant to their target markets. OpenCL 3.0 also integrates subgroup functionality into the core specification, ships with a new OpenCL C 3.0 language specification, uses a new unified specification format, and introduces extensions for asynchronous data copies to enable a new class of embedded processors. The provisional OpenCL 3.0 specifications enable the developer community to provide feedback on GitHub before the specifications and conformance tests are finalized.
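As a practical illustration, here is a minimal host-side sketch (not code from the Khronos release) of how an application would typically detect an OpenCL 3.0 implementation: enumerate the platforms and read each platform's version string, then query for whichever optional features the implementation actually exposes. It builds against any OpenCL SDK and links with -lOpenCL.

#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    cl_uint count = 0;
    clGetPlatformIDs(0, NULL, &count);      /* first ask how many platforms exist */
    if (count == 0) {
        printf("No OpenCL platforms found.\n");
        return 1;
    }
    if (count > 8) count = 8;               /* keep the example's buffer small */

    cl_platform_id platforms[8];
    clGetPlatformIDs(count, platforms, NULL);

    for (cl_uint i = 0; i < count; ++i) {
        char version[256];
        clGetPlatformInfo(platforms[i], CL_PLATFORM_VERSION,
                          sizeof(version), version, NULL);
        printf("Platform %u reports: %s\n", i, version);  /* e.g. "OpenCL 3.0 ..." */
    }
    return 0;
}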

AMD-made PlayStation 5 Semi-custom Chip Has Ray-tracing Hardware (not a software solution)

Sony's next-generation PlayStation 5 could land under many Christmas trees... in the year 2020, as the company plans a Holiday 2020 launch for the 4K-ready, 8K-capable entertainment system, which packs a semi-custom chip many times more powerful than the current generation to support its lofty design goals. Sony calculates that by late 2020 some form of ray-tracing could be a must-have for gaming, and is working with its chip designer AMD to add just that - hardware acceleration for ray-tracing, not just something that's pre-baked or emulated over GPGPU.

Mark Cerny, a system architect at Sony's US headquarters, got into the specifics of the hardware driving the company's big platform launch for the turn of the decade in an interview with Wired. "There is ray-tracing acceleration in the GPU hardware," he said, adding, "which I believe is the statement that people were looking for." Besides raw increases in processing power, Sony will focus on getting the memory and storage subsystems right. The two are interdependent, and with fast NAND flash-based storage, Sony can rework memory management to free up more processing resources. AMD has been rather tight-lipped about ray-tracing on its Radeon GPUs. CEO Lisa Su has been dismissive about the prominence of the tech, saying "it's one of the many technologies these days." The company's mid-2019 launch of the "Navi" family of GPUs skips ray-tracing hardware, and the GPU in the semi-custom chip at the heart of the PlayStation 5 was last reported to be based on the same RDNA architecture.

AMD Doesn't Believe in NVIDIA's DLSS, Stands for Open SMAA and TAA Solutions

A report via PCGamesN places AMD's stance on NVIDIA's DLSS as a rather decided one: the company stands for further development of SMAA (Enhanced Subpixel Morphological Anti-Aliasing) and TAA (Temporal Anti-Aliasing) solutions on current, open frameworks, which, according to AMD's director of marketing, Sasa Marinkovic, "(...) are going to be widely implemented in today's games, and that run exceptionally well on Radeon VII," instead of investing in yet another proprietary solution. While AMD pointed out that DLSS' market penetration is low, that's not the main point of contention. In fact, AMD goes head-on against NVIDIA's own technical presentations, comparing DLSS' image quality and performance benefits against a native-resolution, TAA-enhanced image - the company says that SMAA and TAA can work equally well without "the image artefacts caused by the upscaling and harsh sharpening of DLSS."

Of course, AMD may only be speaking from the point of view of a competitor without a competing solution. However, company representatives said that they could, in theory, develop something along the lines of DLSS via a GPGPU framework - a task for which AMD's architectures are usually extremely well suited. AMD isn't taking its eyes off DLSS-like upscaling entirely, either: Nish Neelalojanan, a Gaming division exec, talks about potential DLSS-like implementations across "Some of the other broader available frameworks, like WindowsML and DirectML," adding that these are "something we [AMD] are actively looking at optimizing… At some of the previous shows we've shown some of the upscaling, some of the filters available with WindowsML, running really well with some of our Radeon cards." So whether this is a genuine image-quality philosophy or just a competing technology's time-to-market (TTM) play, only AMD knows.

NVIDIA Does a TrueAudio: RT Cores Also Compute Sound Ray-tracing

Positional audio, like Socialism, follows a cycle of glamorization and investment every few years. Back in 2011-12, when AMD held a relatively stronger position in the discrete GPU market and enjoyed GPGPU superiority, it gave a lot of money to GenAudio and Tensilica to co-develop the TrueAudio technology, a positional-audio DSP integrated into its GPUs, which ended up with a whopping four game-title implementations, including and limited to "Thief," "Star Citizen," "Lichdom: Battlemage," and "Murdered: Soul Suspect." TrueAudio Next, which debuted with "Polaris," introduced GPU-accelerated "audio ray-casting" technology, which assumes that audio waves interact differently with different surfaces, much like light, and hence positional audio could be made more realistic. There were a grand total of zero takers for TrueAudio Next. Riding on the presumed success of its RTX technology, NVIDIA wants to develop audio ray-tracing further.

A very curious sentence caught our eye on NVIDIA's micro-site for Turing. The description of the RT cores reads that they are specialized components that "accelerate the computation of how light and sound travel in 3D environments at up to 10 Giga Rays per second." This is an ominous sign that NVIDIA is developing a full-blown positional audio programming model as part of RTX, with an implementation through GameWorks. Such a technology, like TrueAudio Next, could improve positional audio realism by treating sound waves like light and tracing their paths from their origin (think speech from an NPC in a game) to the listener as the sound bounces off the various surfaces in the 3D scene. Real-time ray-tracing(-ish) has captured the entirety of imagination at NVIDIA marketing, to the extent that it is allegedly willing to replace "GTX" with "RTX" in its GeForce GPU nomenclature. We don't mean to doomsay emerging technology, but 20 years of development in positional audio has shown that it's better left to game developers to create their own technology that sounds somewhat real, and that initiatives from makers of discrete sound cards (a device on the brink of extinction) and GPU makers have borne no fruit.

Tesla Motors Develops Semi-custom AI Chip with AMD

Tesla Motors, which arguably brought electric vehicles to the luxury mainstream, is investing big in self-driving cars. Despite its leader Elon Musk's fears and reservations about just how far one should allow artificial intelligence (AI) to develop, the company has realized that a true self-driving car cannot be built without giving the car a degree of machine learning and AI, so it can learn its surroundings in real time and maneuver itself with some agility. To that end, Tesla is designing its own AI processor. This SoC (system-on-chip) will be a semi-custom development, in collaboration with the reigning king of semi-custom chips, AMD.

AMD brings with it a clear GPGPU performance advantage over NVIDIA, despite the latter's heavy investments in deep learning. AMD is probably also banking on good pricing, greater freedom over the IP thanks to open standards, and a vast semi-custom track record, having developed semi-custom chips with technology giants such as Sony and Microsoft. Musk confirmed that the first car you can simply get into, fall asleep in, and wake up at your destination will roll out within two years, hinting at a 2019 rollout. This would mean the bulk of the chip's development is already done.

Raja Koduri On a Sabbatical from RTG till December, AMD CEO Takes Over

Raja Koduri, chief of AMD's Radeon Technologies Group (RTG), has reportedly taken an extended leave from the company, running up to December 2017. Ryan Shrout, editor of PC Perspective, stated that he received confirmation of this development from the company. Company CEO Lisa Su has taken direct control of RTG in the meantime.

Formed in 2015 after a major internal reorganization, RTG handles the bulk of AMD's graphics IP, developing and marketing products under the Radeon brand, including Radeon RX series consumer graphics chips, Radeon Pro series professional graphics chips, and the Radeon Instinct line of GPGPU accelerators. The move is of particular significance because Q4 tends to be the biggest revenue quarter, with sales rallying on account of the holiday shopping season.

Ethereum Mining Wipes Out Radeon Inventory, AMD Stock Rallies

AMD Radeon graphics cards have always been too good at GPGPU for their own good. The new Ethereum blockchain compute network, with its Ethereum crypto-currency, works really, really well with AMD's Graphics CoreNext architecture-based GPUs (that's every AMD GPU since the Radeon HD 7000 series). As a result, Ethereum prospectors have not only bought out nearly all inventory of AMD Radeon graphics cards from the market, but have also inflated the prices of used AMD Radeon graphics cards on online tech forums and used-goods stores on eBay and Amazon. Some of these used cards are priced higher than their launch prices.

Every $1,000 spent on AMD Radeon hardware towards Ethereum mining is recovered within two months, and then, as long as your hardware lasts and you're paying your power bills, you're swimming in crypto-currency that can be converted to Bitcoin or even US Dollars. One Ethereum (ETH) exchanges for USD $265 at the time of this writing. There's already $330 million worth of Ethereum being traded, and that number is only going to grow as people sell USD or BTC to buy ETH, pay for entry into the Ethereum network, and use ETH as a crypto-currency.
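A rough sanity check of that payback claim (a sketch only, ignoring electricity costs and pool fees): recovering a $1,000 hardware outlay within two months implies roughly $500 of mining revenue per month, or just under two ETH per month at the quoted $265 exchange rate.

#include <stdio.h>

int main(void)
{
    const double hardware_cost  = 1000.0;  /* USD spent on Radeon cards (article's example) */
    const double payback_months = 2.0;     /* payback period claimed in the article         */
    const double eth_price_usd  = 265.0;   /* ETH price quoted at the time of writing       */

    const double monthly_revenue = hardware_cost / payback_months;
    const double eth_per_month   = monthly_revenue / eth_price_usd;

    printf("Implied mining revenue: $%.0f/month (~%.2f ETH/month)\n",
           monthly_revenue, eth_per_month);
    return 0;
}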

AMD VEGA Cube is a Coffee Mug-sized Contraption with 100 TFLOP/s Compute Power

AMD VEGA Cube (working name) is an unannounced product that could see the light of day as a Radeon Instinct deep-learning GPGPU solution. This [grande] coffee mug-sized contraption consists of four GPU subunit boards making up four sides of a cube (well, cuboid), with the other two sides forming the air channel, likely with space for a compound heatsink or liquid-cooling block drawing heat from the GPUs lining the inner walls of the cube. The combined compute power of the VEGA Cube, hence, is 100 TFLOP/s at FP16 half precision, or 50 TFLOP/s at FP32 single precision.
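That headline figure is straightforward aggregation of the per-board numbers detailed in the next paragraph: four GPU boards at a claimed 25 TFLOP/s FP16 (12.5 TFLOP/s FP32) each. A quick sketch of the arithmetic:

#include <stdio.h>

int main(void)
{
    const int    boards         = 4;     /* four GPU boards line the cube's inner walls */
    const double fp16_per_board = 25.0;  /* TFLOP/s FP16 claimed per VEGA 10 board      */
    const double fp32_per_board = 12.5;  /* TFLOP/s FP32 claimed per VEGA 10 board      */

    printf("Aggregate FP16: %.0f TFLOP/s\n", boards * fp16_per_board);  /* 100 TFLOP/s */
    printf("Aggregate FP32: %.0f TFLOP/s\n", boards * fp32_per_board);  /*  50 TFLOP/s */
    return 0;
}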

Each GPU board is similar in function to NVIDIA's Tesla P100 NVLink board: it has the GPU, VRM, and a high-speed interconnect. The GPUs in question could be VEGA 10, a multi-chip module with a 25 TFLOP/s (FP16; 12.5 TFLOP/s FP32) GPU die and 8 GB of HBM2 memory. There are four such GPU boards facing each other. AMD could deploy its much-talked-about NVLink alternative, the GMI Coherent Data Fabric, which enables a 100 GB/s data path between neighboring GPUs. It remains to be seen whether AMD makes an actual Radeon Instinct product out of this, or if it will remain a really groovy proof of concept.

NVIDIA Tesla P100 Available on Google Cloud Platform

NVIDIA announced that its flagship GPGPU accelerator, the Tesla P100, will be available through the Google Cloud Platform, alongside the company's Tesla K80 accelerator. The Google Cloud Platform lets customers perform specific computing tasks at a fraction of the cost of buying the hardware or renting and hosting it on-site, by offloading those tasks to offsite data centers. IT professionals can build and deploy servers, HPC farms, or even supercomputers of all shapes and sizes within hours of placing an order online with Google.

The Tesla P100 is a GPGPU accelerator built around the most powerful GPU in existence - the NVIDIA GP100 "Pascal" - featuring 3,584 CUDA cores, up to 16 GB of HBM2 memory, and NVLink high-bandwidth interconnect support. The other high-end GPU accelerators on offer from Google are the Tesla K80, based on a pair of GK210 "Kepler" GPUs, and the AMD FirePro S9300 X2, based on a pair of "Fiji" GPUs.

AMD Corporate Fellow Phil Rogers Jumps Ship to NVIDIA

Phil Rogers, a senior Corporate Fellow at AMD and one of its longest-serving employees, has left the company for rival NVIDIA. He was with AMD/ATI for 21 years, having joined in 1994, and was promoted to Corporate Fellow back in 2007. One of his key contributions to the company was seeing the potential in integrating the CPU and GPU into one powerful chip under the Fusion initiative. In his new job at NVIDIA, Rogers will be Chief Software Architect for compute servers. This is a big deal.

NVIDIA is investing a lot of money in cloud computing (as in computing on the cloud), in which people will not only play games rendered on the cloud, but more serious applications based on deep learning and AI could also be cloud-driven. With the right cloud computing service subscription, even a tiny personal device like a smartwatch can crunch complex problems. As a chipmaker with high-TFLOP/s GPGPU chips, NVIDIA is eyeing a big slice of the emerging industry and wants the best hands on the job. NVIDIA CEO Jen-Hsun Huang takes a personal interest in compute servers, and he has entrusted one of the company's most important jobs to Rogers. This is the second recent departure of key engineering talent from AMD: CPU architect Jim Keller left the company after completing a stint designing its new "Zen" micro-architecture.

AMD Expands Embedded Graphics Lineup

AMD today announced multiple new discrete AMD Embedded Radeon graphics options suitable for multiple form factors. The suite of products is specifically designed to advance the visual and parallel processing capabilities of embedded applications. The graphics cards represent AMD's continued commitment to embedded market innovation, providing engineers with more choices to achieve their design goals, from leading performance to energy efficiency.

The new offerings cover a broad range of needs, from 192 GFLOPS to 3 TFLOPS of single-precision performance, and from 20 W to under 95 W of thermal design power. The products are offered as Multi-Chip Module (MCM), Mobile PCI Express Module (MXM), and PCIe options, with AMD offering the only MCM solutions. All of these products come with extended support and longevity. The new discrete graphics cards offer the right balance of performance, power, and graphics memory size to meet the needs of most customers.

"The demand for rich, vibrant graphics in embedded systems is greater than ever before, and that demand is growing," said Scott Aylor, corporate vice president and general manager, AMD Embedded Solutions. "Our latest additions to the embedded product lineup help designers build mesmerizing user experiences with 4K multi-screen installations and 3-D and interactive displays. In addition, the powerful capabilities of our GPUs can address the toughest parallel compute challenges."

AMD Details Exascale Heterogeneous Processor (EHP) for Supercomputers

AMD published a paper with the IEEE describing a new high-density computing device concept it calls the Exascale Heterogeneous Processor (EHP). The acronym may resemble APU (accelerated processing unit), and the two concepts are related, but the EHP differs in ways that make it suitable for high-density supercomputing nodes. The EHP is a chip that has quite a bit in common with the recently launched "Fiji" GPU, which drives the company's flagship Radeon R9 Fury X graphics card.

The EHP is a combination of a main die, housing a large number of CPU cores and a large GPGPU unit, and an interposer that connects the main die to 32 GB of on-package HBM2 memory, which serves as both main memory and memory for the integrated GPGPU unit, without memory partitioning, using hUMA (heterogeneous unified memory access). The CPU component consists of 32 cores, likely based on the "Zen" micro-architecture and arranged as eight "Zen" quad-core subunits. There's no word on the CU (compute unit) count of the GPGPU core. The EHP itself will be highly scalable. AMD hopes to get a working sample of the chip out by 2016-17.