News Posts matching #Tesla


NVIDIA Unveils Tesla V100s Compute Accelerator

NVIDIA updated its compute accelerator product stack with the new Tesla V100s. Available only in the PCIe add-in card (AIC) form-factor for now, the V100s is positioned above the V100 PCIe and is equipped with faster memory along with a few silicon-level changes (possibly higher clock speeds) that facilitate significant increases in throughput. To begin with, the V100s pairs 32 GB of HBM2 memory with a 4096-bit memory interface, running at a higher 553 MHz (1106 MHz effective) memory clock, compared to the 876 MHz memory clock of the V100. This yields a memory bandwidth of roughly 1,134 GB/s, up from 900 GB/s on the V100 PCIe.
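For readers who want to sanity-check those bandwidth figures, the usual back-of-the-envelope formula multiplies bus width (in bytes) by the effective per-pin transfer rate. A minimal sketch in Python, assuming effective data rates of roughly 2.21 GT/s for the V100s and 1.75 GT/s for the V100 (figures not stated above, inferred from the quoted bandwidths):

# Rough HBM2 bandwidth estimate: bus width in bytes times effective transfer rate
def hbm2_bandwidth_gbs(bus_width_bits: int, data_rate_gtps: float) -> float:
    return bus_width_bits / 8 * data_rate_gtps

print(hbm2_bandwidth_gbs(4096, 2.21))   # ~1131 GB/s, close to the quoted 1,134 GB/s (V100s)
print(hbm2_bandwidth_gbs(4096, 1.75))   # ~896 GB/s, close to the quoted 900 GB/s (V100 PCIe)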

NVIDIA did not detail changes to the GPU's core clock-speed, but mentioned the performance throughput numbers on offer: 8.2 TFLOP/s double-precision floating-point performance versus 7 TFLOP/s on the original V100 PCIe; 16.4 TFLOP/s single-precision compared to 14 TFLOP/s on the V100 PCIe; and 130 TFLOP/s deep-learning ops versus 112 TFLOP/s on the V100 PCIe. Company-rated power figures remain unchanged at 250 W typical board power. The company didn't reveal pricing.
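Taken together, the three uplifts are consistent with a clock-speed increase of roughly 16-17% on an otherwise unchanged core configuration; a quick check:

# Throughput uplift of the Tesla V100s over the V100 PCIe across precisions
pairs = {"FP64": (8.2, 7.0), "FP32": (16.4, 14.0), "Deep learning": (130.0, 112.0)}
for name, (v100s, v100) in pairs.items():
    print(f"{name}: +{(v100s / v100 - 1) * 100:.0f}%")
# FP64: +17%, FP32: +17%, Deep learning: +16% -- pointing to higher clocks rather than more cores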

NVIDIA Issues Warning to Upgrade Drivers Due to Security Patches

NVIDIA has found a total of five security vulnerabilities in its Windows drivers for the GeForce, Quadro and Tesla lineups of graphics cards. These security risks are rated as very dangerous and can lead to local code execution, denial of service, or escalation of privileges unless the system is updated. Users are advised to update their Windows drivers as soon as possible to stay secure, so be sure to check that your drivers are at the latest version. The vulnerabilities affect only Windows-based OSes, from Windows 7 through Windows 10.

However, one reassuring fact is that in order to exploit a system, an attacker must have local access to the machine running the NVIDIA GPU; remote exploitation is not possible. Below are the tables provided by NVIDIA that show the type of each exploit, the severity rating it carries, and which driver versions are affected. There are no mitigations for these vulnerabilities; a driver update is the only available way to secure the system.

NVIDIA Responds to Tesla's In-house Full Self-driving Hardware Development

Tesla held an investor panel in the USA yesterday (April 22), focused on autonomous vehicles, with the entire event also streamed on YouTube (replay here). Many things were promised in the course of the event, most of which are outside the scope of this website, but the announcement of Tesla's first full self-driving hardware module made the news in more ways than one, as reported right here on TechPowerUp. We had noted how Tesla had traditionally relied on NVIDIA (and then Intel) microcontroller units, as well as NVIDIA self-driving modules in the past, but with the new in-house built module Tesla stepped away from the green camp in favor of more control over the feature set.

NVIDIA was quick to respond, saying Tesla's comparison was incorrect: the NVIDIA Drive Xavier at 21 TOPS was not the right point of reference, and the comparison should instead have been against NVIDIA's own full self-driving hardware, the Drive AGX Pegasus, capable of 320 TOPS. Oh, and NVIDIA also claimed Tesla erroneously reported Drive Xavier's performance as 21 TOPS instead of 30 TOPS. It is interesting how quickly one company recognized itself as the unnamed competition, especially at a time when Intel, via its Mobileye division, has also been giving it a hard time. Perhaps this is a sign of things to come: self-driving cars, and AI computing in general, are becoming too big a market to be left to third-party suppliers, with larger companies opting for in-house hardware instead. This move does hurt NVIDIA's prospects in the field, and market speculation is ongoing that it may end up losing other customers following Tesla's departure.

Tesla Dumps NVIDIA, Designs and Deploys its Own Self-driving AI Chip

Tesla Motors announced the development of its own self-driving car AI processor that runs the company's Autopilot feature across its product line. The company had been relying on NVIDIA's Drive hardware for Autopilot. Called the Tesla FSD Chip (full self-driving), the processor has been deployed on the latest batches of Model S and Model X since March 2019, and the company looks to expand it to its popular Model 3. The Tesla FSD Chip packs 250 million gates, or 6 billion transistors, into a 260 mm² die built on a 14 nm FinFET process at a Samsung Electronics fab in Texas. The chip packs 32 MB of SRAM cache, a 96x96 mul/add array, and delivers a cumulative 72 TOPS per die at its rated clock-speed of 2.00 GHz.
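As a rough sanity check on the 72 TOPS figure, here is a sketch that assumes each multiply-accumulate counts as two operations and that the die carries two such 96x96 arrays (an assumption, since only one array is called out above):

# Estimate TOPS of a 96x96 multiply-accumulate array running at 2.0 GHz
macs_per_cycle = 96 * 96        # 9,216 MAC units
ops_per_mac = 2                 # one multiply plus one add per cycle
clock_ghz = 2.0
tops_per_array = macs_per_cycle * ops_per_mac * clock_ghz / 1000
print(tops_per_array)           # ~36.9 TOPS for a single array
print(2 * tops_per_array)       # ~73.7 TOPS with two arrays, in line with the quoted 72 TOPS per die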

A typical Autopilot logic board uses two of these chips. Tesla claims that the chip offers "21 times" the performance of the NVIDIA chip it's replacing. Elon Musk referred to the FSD Chip as "the best chip in the world," and not just on the basis of its huge performance uplift over the previous solution. "Any part of this could fail, and the car will keep driving. The probability of this computer failing is substantially lower than someone losing consciousness - at least an order of magnitude," he added.

beyerdynamic Drives Wireless Innovation with Xelento

beyerdynamic, one of the world's leading manufacturers of headphones, microphones and conferencing products, announces Xelento, a wireless, in-ear headphone combining outstanding quality and cutting-edge technology to create an exciting musical experience. The Xelento wireless features beyerdynamic's innovative, miniaturized Tesla drivers, aptX HD Bluetooth technology and MOSAYC sound personalization by Mimi Defined.

beyerdynamic's Xelento wireless bridges the gap between convenient on-the-go solutions and sophisticated luxury headphones, combining two worlds in a single product. Xelento's drivers and acoustic design provide uncompromising sound, with Bluetooth connectivity enabling comfortable listening in any environment. Xelento's breathtaking Tesla drivers are recognized among audio enthusiasts for their incredible sound, precise impulses, exceptional transparency and acoustic balance ranging from tight bass to detailed highs. Featuring a sleek design, the Xelento wireless is both an exquisite piece of jewelry and an audiophile listening device, setting a platinum standard for in-ear headphones.

NVIDIA Announces Tesla T4 Tensor Core GPU

Fueling the growth of AI services worldwide, NVIDIA today launched an AI data center platform that delivers the industry's most advanced inference acceleration for voice, video, image and recommendation services. The NVIDIA TensorRT Hyperscale Inference Platform features NVIDIA Tesla T4 GPUs based on the company's breakthrough NVIDIA Turing architecture and a comprehensive set of new inference software.

Delivering the fastest performance with lower latency for end-to-end applications, the platform enables hyperscale data centers to offer new services, such as enhanced natural language interactions and direct answers to search queries rather than a list of possible results. "Our customers are racing toward a future where every product and service will be touched and improved by AI," said Ian Buck, vice president and general manager of Accelerated Business at NVIDIA. "The NVIDIA TensorRT Hyperscale Platform has been built to bring this to reality - faster and more efficiently than had been previously thought possible."

Google Cloud Introduces NVIDIA Tesla P4 GPUs, for $430 per Month

Today, we are excited to announce a new addition to the Google Cloud Platform (GCP) GPU family that's optimized for graphics-intensive applications and machine learning inference: the NVIDIA Tesla P4 GPU.

We've come a long way since we introduced our first-generation compute accelerator, the K80 GPU, adding along the way P100 and V100 GPUs that are optimized for machine learning and HPC workloads. The new P4 accelerators, now in beta, provide a good balance of price/performance for remote display applications and real-time machine learning inference.

ASUS Introduces Full Lineup of PCI-E Servers Powered by NVIDIA Tesla GPUs

ASUS, the leading IT Company in server systems, server motherboards, workstations and workstation motherboards, today announced support for the latest NVIDIA AI solutions with NVIDIA Tesla V100 Tensor Core 32GB GPUs and Tesla P4 on its accelerated computing servers.
Artificial intelligence (AI) is translating data into meaningful insights, services and scientific breakthroughs. The size of the neural networks powering this AI revolution has grown tremendously. For instance, today's state-of-the-art neural network model for language translation, Google's MoE model, has 8 billion parameters, compared to the 100 million parameters of models from just two years ago.

To handle these massive models, NVIDIA Tesla V100 offers a 32GB memory configuration, which is double that of the previous generation. Providing 2X the memory improves deep learning training performance for next-generation AI models by up to 50 percent and improves developer productivity, allowing researchers to deliver more AI breakthroughs in less time. Increased memory allows HPC applications to run larger simulations more efficiently than ever before.

Acer Announces New Servers Powered by NVIDIA Tesla GPUs

Acer today announced the new Altos R880 F4 GPU server ahead of GTC Taiwan 2018. It can host up to eight NVIDIA Tesla V100 32GB SXM2 GPU accelerators, where every GPU pair includes one PCIe slot for high-speed interconnect.

The Acer Altos R880 F4 is a member of the HGX-T1 class of NVIDIA GPU-Accelerated Server Platforms. It can significantly enhance performance by using parallel computing power for various applications, including oil and gas, defense, financial services, research, manufacturing, media and entertainment, 3D rendering, deep learning, and mission-critical applications.

NVIDIA Introduces HGX-2, Fusing HPC and AI Computing into Unified Architecture

NVIDIA has introduced the HGX-2, the first unified computing platform for both artificial intelligence and high-performance computing. The HGX-2 cloud server platform, with multi-precision computing capabilities, provides unique flexibility to support the future of computing. It allows high-precision calculations using FP64 and FP32 for scientific computing and simulations, while also enabling FP16 and INT8 for AI training and inference. This unprecedented versatility meets the requirements of the growing number of applications that combine HPC with AI.

A number of leading computer makers today shared plans to bring to market systems based on the NVIDIA HGX-2 platform. "The world of computing has changed," said Jensen Huang, founder and chief executive officer of NVIDIA, speaking at the GPU Technology Conference Taiwan, which kicked off today. "CPU scaling has slowed at a time when computing demand is skyrocketing. NVIDIA's HGX-2 with Tensor Core GPUs gives the industry a powerful, versatile computing platform that fuses HPC and AI to solve the world's grand challenges."

GIGABYTE Announces Two New Powerful Deep Learning Engines

GIGABYTE, an industry leader in server hardware for high performance computing, has released two new powerful 4U GPU servers to bring massive parallel computing capabilities into your datacenter: the 8 x SXM2 GPU G481-S80, and the 10 x GPU G481-HA0. Both products offer some of the highest GPU density of this form factor available on the market.

As artificial intelligence is becoming more widespread in our daily lives, such as for image recognition, autonomous vehicles or medical research, more organizations need deep learning capabilities in their datacenter. Deep learning requires a powerful engine that can deal with the massive volumes of data processing required. GIGABYTE is proud to provide our customers with two new solutions for such an engine.

beyerdynamic Launches the Amiron Wireless Bluetooth headphones

Vibrant melodies, pulsating rhythms, driving basses: music and movement go together marvelously. That is why the new Amiron wireless combines the flawless sound experience of the legendary Tesla technology by beyerdynamic with the complete freedom of movement that only wireless Bluetooth headphones can provide. With Amiron wireless, music becomes a dynamic experience in every room, and even the finest nuances become audible like never before. The innovative sound personalization via the ground-breaking MIY app takes these closed over-ear headphones to a completely new level of perfection. This makes Amiron wireless the ideal headphones for anyone who wants to enjoy their music without limits and in every room.

INDIVIDUAL LISTENING EXPERIENCE VIA MIY APP
Can tonal perfection be enhanced further? beyerdynamic's confident answer to this question is "yes". The human ear differs from person to person and changes greatly over the course of a lifetime, as confirmed by the latest audiology research. That is why beyerdynamic worked with the German experts at Mimi Hearing Technologies in Berlin to design its wireless headphones. Their common goal: headphones that adapt perfectly to the individual hearing of the wearer like a custom-tailored suit - consistent with the motto of the new wireless line of headphones by beyerdynamic: MAKE IT YOURS.

NVIDIA Announces the DGX-2 System - 16x Tesla V100 GPUs, 30 TB NVMe Memory for $400K

NVIDIA's DGX-2 is likely the reason why NVIDIA seems to be slightly less enamored with the consumer graphics card market as of late. Let's be honest: just look at that price-tag, and imagine the rivers of money NVIDIA is making on each of these systems sold. The data center and deep learning markets have been pouring money into NVIDIA's coffers, and so, the company is focusing its efforts in this space. Case in point: the DGX-2, which sports performance of 1920 TFLOPs (Tensor processing); 480 TFLOPs of FP16; half that value, 240 TFLOPs, for FP32 workloads; and 120 TFLOPs on FP64.
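Dividing those system-level numbers by the 16 Tesla V100 GPUs inside recovers familiar per-GPU figures; a quick sketch:

# Per-GPU throughput implied by the DGX-2 system totals (16x Tesla V100)
system_tflops = {"Tensor": 1920, "FP16": 480, "FP32": 240, "FP64": 120}
gpu_count = 16
for precision, tflops in system_tflops.items():
    print(f"{precision}: {tflops / gpu_count} TFLOPs per GPU")
# Tensor: 120, FP16: 30, FP32: 15, FP64: 7.5 -- in line with a single V100's rated numbers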

NVIDIA's DGX-2 builds upon the original DGX-1 in every way thinkable. NVIDIA positions these as readily deployed processing powerhouses that include everything a prospective user with gargantuan processing needs can deploy in a single system. And the DGX-2 just runs laps around the DGX-1 (which originally sold for $150K) in all aspects: it features 16x 32 GB Tesla V100 GPUs (the DGX-1 featured 8x 16 GB Tesla GPUs); 1.5 TB of system RAM (the DGX-1 featured a paltry 0.5 TB); 30 TB of NVMe system storage (the DGX-1 sported 8 TB), and even includes a pair of Xeon Platinum CPUs (admittedly, the smallest performance increase in the whole system).

Italian Multinational Gas, Oil Company Fires Off HPC4 Supercomputer

Eni has launched its new HPC4 supercomputer, at its Green Data Center in Ferrera Erbognone, 60 km away from Milan. HPC4 quadruples the Company's computing power and makes it the world's most powerful industrial system. HPC4 has a peak performance of 18.6 Petaflops which, combined with the supercomputing system already in operation (HPC3), increases Eni's computational peak capacity to 22.4 Petaflops.

According to the latest official Top 500 supercomputers list published last November (the next list is due to be published in June 2018), Eni's HPC4 is the only non-governmental and non-institutional system ranking among the top ten most powerful systems in the world. Eni's Green Data Center has been designed as a single IT Infrastructure to host all of HPC's architecture and all the other Business applications.

Tesla Motors Develops Semi-custom AI Chip with AMD

Tesla Motors, which arguably brought electric vehicles to the luxury-mainstream, is investing big in self-driving cars. Despite its leader Elon Musk's fears and reservations on just how much one must allow artificial intelligence (AI) to develop, the company realized that a true self-driving car cannot be made without giving the car a degree of machine learning and AI, so it can learn its surroundings in real-time, and maneuver itself with some agility. To that extent, Tesla is designing its own AI processor. This SoC (system on chip) will be a semi-custom development, in collaboration with the reigning king of semi-custom chips, AMD.

AMD brings with it a clear GPGPU performance advantage over NVIDIA, despite the latter's heavy investments in deep learning. AMD is probably also banking on good pricing, greater freedom over the IP thanks to open standards, and a vast semi-custom track record, having developed semi-custom chips with technology giants such as Sony and Microsoft. Musk confirmed that the first car you can simply get into, fall asleep in, and wake up at your destination will roll out within two years, hinting at a 2019 rollout. This would mean the bulk of the chip's development is done.

NVIDIA, Microsoft Launch Industry-Standard Hyperscale GPU Accelerator

NVIDIA with Microsoft today unveiled blueprints for a new hyperscale GPU accelerator to drive AI cloud computing. Providing hyperscale data centers with a fast, flexible path for AI, the new HGX-1 hyperscale GPU accelerator is an open-source design released in conjunction with Microsoft's Project Olympus.

HGX-1 does for cloud-based AI workloads what ATX -- Advanced Technology eXtended -- did for PC motherboards when it was introduced more than two decades ago. It establishes an industry standard that can be rapidly and efficiently embraced to help meet surging market demand. The new architecture is designed to meet the exploding demand for AI computing in the cloud -- in fields such as autonomous driving, personalized healthcare, superhuman voice recognition, data and video analytics, and molecular simulations.

AMD's VEGA Alive and Well - Announced MI25 VEGA as Deep Learning Accelerator

The team at VideoCardz has published a story with some interesting slides regarding AMD's push towards the highly lucrative deep-learning market with its INSTINCT line-up of graphics cards - and VEGA being announced as a full-fledged solution means we are perhaps (hopefully) closer to seeing a VEGA-based product for the consumer market as well.

Alongside the VEGA-based MI25, AMD also announced the MI6 (5.7 TFLOPS in FP32 operations, with 224 GB/s of memory bandwidth and <150 W of board power), looking suspiciously like a Polaris 10 card in disguise; and the MI8 (which appropriately delivers 8.2 TFLOPS in FP32 computations, as well as 512 GB/s memory bandwidth and <175 W typical board power). The memory bandwidth numbers are the most telling, putting the MI8 closely in line with a Fiji architecture-based solution.
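The bandwidth-based inference is easy to verify with simple bus-width arithmetic; a sketch, assuming a 4096-bit HBM interface at 1 Gbps per pin (Fiji-like) and a 256-bit GDDR5 interface at 7 Gbps per pin (Polaris 10-like), both of which are assumptions rather than figures from the slides:

# Memory bandwidth in GB/s = bus width (bits) * per-pin data rate (Gbps) / 8
def bandwidth_gbs(bus_width_bits: int, gbps_per_pin: float) -> float:
    return bus_width_bits * gbps_per_pin / 8

print(bandwidth_gbs(4096, 1.0))   # 512 GB/s -- matches the MI8, i.e. Fiji-class HBM
print(bandwidth_gbs(256, 7.0))    # 224 GB/s -- matches the MI6, i.e. Polaris 10-class GDDR5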

NVIDIA Announces DGX SaturnV: The World's Most Efficient Supercomputer

This week NVIDIA announced its latest addition to the HPC landscape, the DGX SaturnV. Destined for the likes of universities and companies in need of deep learning capabilities, the DGX SaturnV sets a new benchmark for energy efficiency in high-performance computing. While it doesn't claim the title of fastest supercomputer this year, the SaturnV takes a respectable 28th place on the TOP500 list, promising much lower running costs for the performance on tap.

Capable of delivering 9.46 GFLOPS of computational speed per watt of energy consumed, it bests last year's best effort of 6.67 GFLOPS/W by 42%. The SaturnV comprises 125 DGX-1 deep learning systems, and each DGX-1 contains no fewer than eight Tesla P100 cards. Where a single GTX 1080 can churn out 138 GFLOPS of FP16 calculations, a single Tesla P100 can deliver a massive 21.2 TFLOPS. Individual DGX-1 units are already in the field, including being used by NVIDIA themselves.
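The quoted 42% falls straight out of the two efficiency figures; a minimal check:

# Year-over-year efficiency gain quoted for the DGX SaturnV
saturnv_gflops_per_watt = 9.46
previous_best_gflops_per_watt = 6.67
gain_percent = (saturnv_gflops_per_watt / previous_best_gflops_per_watt - 1) * 100
print(f"{gain_percent:.1f}%")   # ~41.8%, which rounds to the quoted 42%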

IBM and NVIDIA Team Up on World's Fastest Deep Learning Enterprise Solution

IBM and NVIDIA today announced collaboration on a new deep learning tool optimized for the latest IBM and NVIDIA technologies to help train computers to think and learn in more human-like ways at a faster pace. Deep learning is a fast growing machine learning method that extracts information by crunching through millions of pieces of data to detect and rank the most important aspects from the data. Publicly supported among leading consumer web and mobile application companies, deep learning is quickly being adopted by more traditional business enterprises.

Deep learning and other artificial intelligence capabilities are being used across a wide range of industry sectors; in banking to advance fraud detection through facial recognition; in automotive for self-driving automobiles and in retail for fully automated call centers with computers that can better understand speech and answer questions.

NVIDIA Launches Maxed-out GP102 Based Quadro P6000

Late last week, NVIDIA announced the TITAN X Pascal, its fastest consumer graphics offering targeted at gamers and PC enthusiasts. The reign of TITAN X Pascal being the fastest single-GPU graphics card could be short-lived, as NVIDIA announced a Quadro product based on the same "GP102" silicon, which maxes out its on-die resources. The new Quadro P6000, announced at SIGGRAPH alongside the GP104-based Quadro P5000, features all 3,840 CUDA cores physically present on the chip.

Besides 3,840 CUDA cores, the P6000 features a maximum FP32 (single-precision floating point) performance of up to 12 TFLOP/s. The card also features 24 GB of GDDR5X memory, across the chip's 384-bit wide memory interface. The Quadro P5000, on the other hand, features 2,560 CUDA cores, up to 8.9 TFLOP/s FP32 performance, and 16 GB of GDDR5X memory across a 256-bit wide memory interface. It's interesting to note that neither card features full FP64 (double-precision) machinery; that is cleverly relegated to NVIDIA's HPC product line, the Tesla P-series.
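The rated FP32 numbers also let one estimate the boost clocks NVIDIA is assuming, since peak FP32 throughput on these GPUs is CUDA cores x 2 operations per clock (fused multiply-add) x clock speed; a quick sketch:

# Implied boost clock from rated FP32 throughput: clock (GHz) = GFLOPs / (cores * 2)
def implied_clock_ghz(tflops: float, cuda_cores: int) -> float:
    return tflops * 1000 / (cuda_cores * 2)

print(implied_clock_ghz(12.0, 3840))   # ~1.56 GHz for the Quadro P6000
print(implied_clock_ghz(8.9, 2560))    # ~1.74 GHz for the Quadro P5000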

NVIDIA Announces a PCI-Express Variant of its Tesla P100 HPC Accelerator

NVIDIA announced a PCI-Express add-on card variant of its Tesla P100 HPC accelerator, at the 2016 International Supercomputing Conference, held in Frankfurt, Germany. The card is about 30 cm long, 2-slot thick, and of standard height, and is designed for PCIe multi-slot servers. The company had introduced the Tesla P100 earlier this year in April, with a dense mezzanine form-factor variant for servers with NVLink.

The PCIe variant of the P100 offers slightly lower performance than the NVLink variant because of lower clock speeds, although the core configuration of the GP100 silicon remains unchanged. It offers FP64 (double-precision floating-point) performance of 4.70 TFLOP/s, FP32 (single-precision) performance of 9.30 TFLOP/s, and FP16 performance of 18.7 TFLOP/s, compared to the NVLink variant's 5.3 TFLOP/s, 10.6 TFLOP/s, and 21 TFLOP/s, respectively. The card comes in two sub-variants based on memory: a 16 GB variant with 720 GB/s memory bandwidth and 4 MB of L2 cache, and a 12 GB variant with 548 GB/s and 3 MB of L2 cache. Both sub-variants feature 3,584 CUDA cores based on the "Pascal" architecture, and a core clock speed of 1300 MHz.
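The lower bandwidth of the 12 GB sub-variant is consistent with it using three of the four HBM2 stacks, i.e. a 3072-bit rather than 4096-bit interface (an assumption, not something stated above); a rough check:

# If the 12 GB sub-variant drops one of four HBM2 stacks, bandwidth should scale by roughly 3/4
full_config_gbs = 720                  # 16 GB variant, four stacks
estimated_12gb_gbs = full_config_gbs * 3 / 4
print(estimated_12gb_gbs)              # 540 GB/s, in the same ballpark as the quoted 548 GB/s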

NVIDIA Launches World's First Deep Learning Supercomputer

NVIDIA today unveiled the NVIDIA DGX-1, the world's first deep learning supercomputer to meet the unlimited computing demands of artificial intelligence. The NVIDIA DGX-1 is the first system designed specifically for deep learning -- it comes fully integrated with hardware, deep learning software and development tools for quick, easy deployment. It is a turnkey system that contains a new generation of GPU accelerators, delivering the equivalent throughput of 250 x86 servers.

The DGX-1 deep learning system enables researchers and data scientists to easily harness the power of GPU-accelerated computing to create a new class of intelligent machines that learn, see and perceive the world as humans do. It delivers unprecedented levels of computing power to drive next-generation AI applications, allowing researchers to dramatically reduce the time to train larger, more sophisticated deep neural networks.

NVIDIA Unveils the Tesla P100 HPC Board based on "Pascal" Architecture

NVIDIA unveiled the Tesla P100, the first product based on the company's "Pascal" GPU architecture. At its core is a swanky new multi-chip module, similar in its essential layout to AMD's "Fiji." A 15 billion-transistor GPU die sits on top of a silicon interposer, through which a 4096-bit wide HBM2 memory interface wires it to four 3D HBM2 stacks; the interposer in turn sits on a fiberglass substrate that is mounted onto the PCB over a ball-grid array. With the GPU die, interposer, and memory dies put together, the package has a cumulative transistor count of 150 billion. The GPU die is built on the 16 nm FinFET process, and is 600 mm² in area.

The P100 sits on a space-efficient PCB that looks less like a video card and more like a compact module that can be tucked away into ultra-high-density supercomputing cluster boxes, such as the new NVIDIA DGX-1. The P100 offers double-precision (FP64) compute performance of 5.3 TFLOP/s, FP32 performance of 10.6 TFLOP/s, and FP16 performance of a whopping 21.2 TFLOP/s. The chip has a register file as large as 14.2 MB, and an L2 cache of 4 MB. In addition to PCI-Express, each P100 chip is equipped with NVLink, an in-house developed high-bandwidth interconnect by NVIDIA, with bandwidths as high as 80 GB/s per direction, or 160 GB/s in both directions. This allows extremely high-bandwidth paths between GPUs, so they can share memory and behave more like a single GPU. The P100 is already in volume production, with target customers having bought up supply all the way through to its OEM channel availability some time in Q1 2017.

NVIDIA GP100 Silicon to Feature 4 TFLOPs DPFP Performance

NVIDIA's upcoming flagship GPU based on its next-generation "Pascal" architecture, codenamed GP100, is shaping up to be a number-crunching monster. According to a leaked slide by an NVIDIA research fellow, the company is designing the chip to serve up double-precision floating-point (DPFP) performance as high as 4 TFLOP/s, a 3-fold increase from the 1.31 TFLOP/s offered by the Tesla K20, based on the "Kepler" GK110 silicon.

The same slide also reveals single-precision floating-point (SPFP) performance to be as high as 12 TFLOP/s, four times that of the GK110, and nearly double that of the GM200. The slide also appears to settle the speculation on whether GP100 will use stacked HBM2 memory, or GDDR5X. Given the 1 TB/s memory bandwidth mentioned on the slide, we're inclined to hand it to stacked HBM2.
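The arithmetic behind that inclination is straightforward: even an aggressive GDDR5X configuration falls well short of 1 TB/s, while four HBM2 stacks reach it comfortably. A rough sketch, assuming a 384-bit GDDR5X bus at 12 Gbps per pin versus a 4096-bit HBM2 interface at 2 Gbps per pin (both assumptions for illustration):

# Memory bandwidth in GB/s = bus width (bits) * per-pin data rate (Gbps) / 8
def bandwidth_gbs(bus_width_bits: int, gbps_per_pin: float) -> float:
    return bus_width_bits * gbps_per_pin / 8

print(bandwidth_gbs(384, 12.0))    # 576 GB/s -- best-case GDDR5X on a 384-bit bus
print(bandwidth_gbs(4096, 2.0))    # 1024 GB/s -- four HBM2 stacks, matching the ~1 TB/s on the slide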