Tuesday, January 2nd 2024
Neuchips to Showcase Industry-Leading Gen AI Inferencing Accelerators at CES 2024
Neuchips, a leading AI Application-Specific Integrated Circuit (ASIC) solutions provider, will demo its revolutionary Raptor Gen AI accelerator chip (previously named N3000) and Evo PCIe accelerator card LLM solutions at CES 2024. Raptor, the new chip solution, enables enterprises to deploy large language model (LLM) inference at a fraction of the cost of existing solutions.
"We are thrilled to unveil our Raptor chip and Evo card to the industry at CES 2024," said Ken Lau, CEO of Neuchips. "Neuchips' solutions represent a massive leap in price to performance for natural language processing. With Neuchips, any organisation can now access the power of LLMs for a wide range of AI applications."Democratising Access to LLMs
Together, Raptor and Evo provide an optimised stack that makes market-leading LLMs readily accessible for enterprises. Neuchips' AI solutions significantly reduce hardware costs compared to existing solutions. The high energy efficiency also minimizes electricity usage, further lowering the total cost of ownership.
At CES 2024, Neuchips will demo Raptor and Evo accelerating Whisper speech-to-text and a Llama chatbot in a Personal AI Assistant application. This solution highlights the power of LLM inferencing for real business needs.
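For readers unfamiliar with how such a demo is typically structured, the sketch below shows the general shape of a Whisper-to-Llama assistant pipeline. It uses the open-source Hugging Face transformers library purely as a stand-in; Neuchips has not published its runtime API, so the model names and call flow here are illustrative assumptions, not the company's implementation.

```python
# Hypothetical sketch of a Whisper -> Llama "Personal AI Assistant" pipeline.
# The Hugging Face transformers library is used as a generic stand-in; the
# Neuchips runtime is not public, so model names and devices are assumptions.
from transformers import pipeline

# Speech-to-text: transcribe the user's spoken request with a Whisper model.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
transcript = asr("user_request.wav")["text"]

# Text generation: answer the transcribed request with a Llama-family model.
chat = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf")
reply = chat(f"User: {transcript}\nAssistant:", max_new_tokens=128)

print(reply[0]["generated_text"])
```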
Enterprises interested in test-driving Neuchips' breakthrough performance can visit booth 62700 to enrol in a free trial program. Additional technical sessions will showcase how Raptor and Evo can slash deployment costs for speech-to-text applications.
Raptor Gen AI Accelerator Powers Breakthrough LLM Performance
The Raptor chip delivers up to 200 tera operations per second (TOPS) per chip. Its strong performance on core AI inferencing operations such as matrix multiply, vector operations, and embedding table lookup suits generative AI and transformer-based models. This throughput is achieved via Neuchips' patented compression and efficiency optimisations tailored to neural networks.
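To make concrete what those three operation classes look like in a transformer workload, here is a minimal NumPy sketch; the shapes and values are illustrative assumptions only, not Neuchips internals.

```python
# Minimal NumPy sketch of the operation classes named above: embedding table
# lookup, matrix multiply, and element-wise vector ops. Shapes are arbitrary.
import numpy as np

hidden, vocab = 4096, 32000

# Embedding table lookup: map token IDs to dense vectors.
embedding_table = np.random.randn(vocab, hidden).astype(np.float32)
token_ids = np.array([1, 42, 7, 99, 3, 15, 8, 2])
x = embedding_table[token_ids]                      # (seq, hidden)

# Matrix multiply: the projections that dominate transformer inference.
w = np.random.randn(hidden, hidden).astype(np.float32)
y = x @ w                                           # (seq, hidden)

# Vector (element-wise) ops: activations, normalisation, etc.
y = y * (1.0 / (1.0 + np.exp(-y)))                  # SiLU activation
```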
Evo Gen 5 PCIe Card Sets New Standard for Acceleration and Low Power Consumption
Complementing Raptor is Neuchips' ultra-low-power Evo acceleration card. Evo combines an eight-lane PCIe Gen 5 interface with 32 GB of LPDDR5 to achieve 64 GB/s of host I/O bandwidth and 1.6 Tb/s of memory bandwidth at just 55 watts per card.
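The 64 GB/s host I/O figure lines up with PCIe Gen 5 x8 signalling rates if it is read as the aggregate of both link directions; the quick calculation below makes that assumption explicit (the announcement does not state how the figure is counted).

```python
# Back-of-the-envelope check on the quoted 64 GB/s host I/O bandwidth,
# assuming it counts both directions of a PCIe Gen 5 x8 link (an assumption).
lanes = 8
gt_per_s = 32                    # PCIe 5.0 raw signalling rate per lane
encoding = 128 / 130             # 128b/130b line-code efficiency
gb_per_lane_per_dir = gt_per_s * encoding / 8   # ~3.94 GB/s

per_direction = lanes * gb_per_lane_per_dir     # ~31.5 GB/s
bidirectional = 2 * per_direction               # ~63 GB/s, i.e. the quoted ~64 GB/s

print(f"{per_direction:.1f} GB/s per direction, {bidirectional:.1f} GB/s total")
```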
As demonstrated with DLRM, Evo also features 100% scalability, allowing customers to linearly increase performance by adding more chips. This modular design ensures investment protection for future AI workloads.
An upcoming half-height, half-length (HHHL) form factor product, Viper, set to launch in the second half of 2024, will provide even greater deployment flexibility. The new series brings data centre-class AI acceleration to a compact design.
Source:
Neuchips
"We are thrilled to unveil our Raptor chip and Evo card to the industry at CES 2024," said Ken Lau, CEO of Neuchips. "Neuchips' solutions represent a massive leap in price to performance for natural language processing. With Neuchips, any organisation can now access the power of LLMs for a wide range of AI applications."Democratising Access to LLMs
Together, Raptor and Evo provide an optimised stack that makes market-leading LLMs readily accessible for enterprises. Neuchips' AI solutions significantly reduce hardware costs compared to existing solutions. The high energy efficiency also minimizes electricity usage, further lowering the total cost of ownership.
At CES 2024, Neuchips will demo Raptor and Evo, accelerating the Whisper and Llama AI chatbots on a Personal AI Assistant application. This solution highlights the power of LLM inferencing for real business needs.
Enterprises interested in test-driving Neuchips' breakthrough performance can visit booth 62700 to enrol in a free trial program. Additional technical sessions will showcase how Raptor and Evo can slash deployment costs for speech-to-text applications.
Raptor Gen AI Accelerator Powers Breakthrough LLM Performance
The Raptor chip delivers up to 200 tera operations per second (TOPS) per chip. Its outstanding performance for AI inferencing operations such as Matrix Multiply, Vector, and embedding table lookup suits Gen-AI and transformer-based AI models. This groundbreaking throughput is achieved via Neuchips' patented compression and efficiency optimisations tailored to neural networks.
Evo Gen 5 PCIe Card Sets New Standard for Acceleration and Low Power Consumption
Complementing Raptor is Neuchips' ultra-low powered Evo acceleration card. Evo combines PCIe Gen 5 with eight lanes and LPDDR5 32 GB to achieve 64 GB/s host I/O bandwidth and 1.6-Tbps per second of memory bandwidth at just 55 watts per card.
As demonstrated with DLRM, Evo also features 100% scalability, allowing customers to linearly increase performance by adding more chips. This modular design ensures investment protection for future AI workloads.
An upcoming half-height half-length (HHHL) form factor product, Viper, set to be launched by the second half of 2024, will provide even greater deployment flexibility. The new series brings data centre-class AI acceleration in a compact design.
13 Comments on Neuchips to Showcase Industry-Leading Gen AI Inferencing Accelerators at CES 2024
Let them make graphics cards and be content with only most of the money instead of all the money.
This is "inference" hardware, not "training" hardware. I doubt it even makes sense to develop an ASIC specifically for training, because training hardware is supposed to be flexible by design.
Nvidia doesn't even make that much of its money off edge AI devices and inference hardware (excluding GPUs). Jetson boards are niche dev kits, which Nvidia can't even produce in numbers. And all of their post-Mellanox stuff is even more of a niche-of-a-niche. Drive PX and CX aren't that hyped up anymore... and as far as I know Tesla dropped it a while ago, while Mercedes and Toyota either gave up on it or are waiting for Tesla to pave the road and hit all the bumps along the way for self-driving and assistive-driving regulation (or, as it usually goes with Musk, f it up completely).
I think THE biggest reason their ARM acquisition got blocked is to prevent any possibility of NV creating a monopoly in inference hardware. There are already quite a few AI ASICs on the market, under a bunch of different "catchy" names, like IPU (Inference Processing Unit), VPU (Visual Processing Unit), etc.
Heck, your shiny new flagship phones all have those in them. Even without ASICs and GPUs, you can do inference on other commodity hardware, a Raspberry Pi, or even a microcontroller, depending on the task and performance requirements.
But the really important stuff is the higher-level software that makes use of this, and with CUDA, Nvidia has a large head start.
It sits in a closet now, but I did a ton of object identification training on this thing.
I'd gladly take one off your hands for a worthy cause :toast: