News Posts matching #Wafer Scale Engine

Cerebras Launches the World's Fastest AI Inference

Press Release by

Aug 28th, 2024 08:34 Discuss (5 Comments)

Today, Cerebras Systems, the pioneer in high performance AI compute, announced Cerebras Inference, the fastest AI inference solution in the world. Delivering 1,800 tokens per second for Llama3.1 8B and 450 tokens per second for Llama3.1 70B, Cerebras Inference is 20 times faster than NVIDIA GPU-based solutions in hyperscale clouds. Starting at just 10c per million tokens, Cerebras Inference is priced at a fraction of GPU solutions, providing 100x higher price-performance for AI workloads.

Unlike alternative approaches that compromise accuracy for performance, Cerebras offers the fastest performance while maintaining state of the art accuracy by staying in the 16-bit domain for the entire inference run. Cerebras Inference is priced at a fraction of GPU-based competitors, with pay-as-you-go pricing of 10 cents per million tokens for Llama 3.1 8B and 60 cents per million tokens for Llama 3.1 70B.

Read full story

Cerebras Systems Unveils World's Fastest AI Chip with 4 Trillion Transistors and 900,000 AI cores

Press Release by

AleksandarK

Mar 13th, 2024 08:07 Discuss (3 Comments)

Cerebras Systems, the pioneer in accelerating generative AI, has doubled down on its existing world record of fastest AI chip with the introduction of the Wafer Scale Engine 3. The WSE-3 delivers twice the performance of the previous record-holder, the Cerebras WSE-2, at the same power draw and for the same price. Purpose built for training the industry's largest AI models, the 5 nm-based, 4 trillion transistor WSE-3 powers the Cerebras CS-3 AI supercomputer, delivering 125 petaflops of peak AI performance through 900,000 AI optimized compute cores.

Read full story

Cerebras Unveils Andromeda, a 13.5 Million Core AI Supercomputer that Delivers Near-Perfect Linear Scaling for Large Language Models

Press Release by

AleksandarK

Nov 14th, 2022 11:32 Discuss (0 Comments)

Cerebras Systems, the pioneer in accelerating artificial intelligence (AI) compute, today unveiled Andromeda, a 13.5 million core AI supercomputer, now available and being used for commercial and academic work. Built with a cluster of 16 Cerebras CS-2 systems and leveraging Cerebras MemoryX and SwarmX technologies, Andromeda delivers more than 1 Exaflop of AI compute and 120 Petaflops of dense compute at 16-bit half precision. It is the only AI supercomputer to ever demonstrate near-perfect linear scaling on large language model workloads relying on simple data parallelism alone.

With more than 13.5 million AI-optimized compute cores and fed by 18,176 3rd Gen AMD EPYC processors, Andromeda features more cores than 1,953 Nvidia A100 GPUs and 1.6 times as many cores as the largest supercomputer in the world, Frontier, which has 8.7 million cores. Unlike any known GPU-based cluster, Andromeda delivers near-perfect scaling via simple data parallelism across GPT-class large language models, including GPT-3, GPT-J and GPT-NeoX.

Read full story

Cerebras Systems Sets Record for Largest AI Models Ever Trained on A Single Device

Press Release by

AleksandarK

Jun 23rd, 2022 03:43 Discuss (3 Comments)

Cerebras Systems, the pioneer in high performance artificial intelligence (AI) computing, today announced, for the first time ever, the ability to train models with up to 20 billion parameters on a single CS-2 system - a feat not possible on any other single device. By enabling a single CS-2 to train these models, Cerebras reduces the system engineering time necessary to run large natural language processing (NLP) models from months to minutes. It also eliminates one of the most painful aspects of NLP—namely the partitioning of the model across hundreds or thousands of small graphics processing units (GPU).

"In NLP, bigger models are shown to be more accurate. But traditionally, only a very select few companies had the resources and expertise necessary to do the painstaking work of breaking up these large models and spreading them across hundreds or thousands of graphics processing units," said Andrew Feldman, CEO and Co-Founder of Cerebras Systems. "As a result, only very few companies could train large NLP models - it was too expensive, time-consuming and inaccessible for the rest of the industry. Today we are proud to democratize access to GPT-3 1.3B, GPT-J 6B, GPT-3 13B and GPT-NeoX 20B, enabling the entire AI ecosystem to set up large models in minutes and train them on a single CS-2."

Read full story

Cerebras Introduces the CS-1 System, Home to World's First Trillion Transistor Processor

Raevenlord

Nov 19th, 2019 10:55 Discuss (24 Comments)

Cerebras has introduced the world to its CS-1 system, which will house the company's (and simultaneously, the industry's) most powerful monolithic accelerator.The CS-1 is an integrated solution the size of 15 industry-standard rack units, and packs everything from the Wafer Scale Engine to cooling systems. The CS-1 consumes 20 kW of power, with a full 4 kW dedicated solely to the cooling subsystem, like fans, pumps, and the heat exchanger, 15 kW dedicated to the chip, and 1 kW is totally lost to power supply inefficiencies. Obviously, power supply modules and other cooling subsystems are redundant, and hot-swappable if need be - you can imagine the computational value lost with each millisecond of downtime that were to occur in such a system.

Read full story

Cerebras Systems' Wafer Scale Engine is a Trillion Transistor Processor in a 12" Wafer

Raevenlord

Aug 27th, 2019 09:00 Discuss (20 Comments)

This news isn't properly today's, but it's relevant and interesting enough that I think warrants a news piece on our page. My reasoning is this: in an era where Multi-Chip Modules (MCM) and a chiplet approach to processor fabrication has become a de-facto standard for improving performance and yields, a trillion-transistor processor that eschews those modular design philosophies is interesting enough to give pause.

The Wafer Scale engine has been developed by Cerebras Systems to face the ongoing increase in demand for AI-training engines. However, in workloads where latency occur a very real impact in training times and a system's capability, Cerebras wanted to design a processor that avoided the need for a communication lane for all its cores to communicate - the system is only limited, basically, by transistors' switching times. Its 400,000 cores communicate seamlessly via interconnects, etched on 42,225 square millimeters of silicon (by comparison, NVIDIA's largest GPU is 56.7 times smaller at "just" 815 square millimeters).

Read full story

Return to Keyword Browsing

News Posts matching #Wafer Scale Engine

Cerebras Launches the World's Fastest AI Inference

Cerebras Systems Unveils World's Fastest AI Chip with 4 Trillion Transistors and 900,000 AI cores

Cerebras Unveils Andromeda, a 13.5 Million Core AI Supercomputer that Delivers Near-Perfect Linear Scaling for Large Language Models

Cerebras Systems Sets Record for Largest AI Models Ever Trained on A Single Device

Cerebras Introduces the CS-1 System, Home to World's First Trillion Transistor Processor

Cerebras Systems' Wafer Scale Engine is a Trillion Transistor Processor in a 12" Wafer

Latest GPU Drivers

New Forum Posts

Popular Reviews

TPU on YouTube

Controversial News Posts