At CES 2025, Qualcomm Technologies, Inc. today announced the Qualcomm AI On-Prem Appliance Solution, an on-premises desktop or wall-mounted hardware solution, and the Qualcomm AI Inference Suite, a set of software and services for AI inferencing spanning from near-edge to cloud. The combination of these new offerings allows small and medium businesses, enterprises, and industrial organizations to run custom and off-the-shelf AI applications on their premises, including generative workloads. Running AI inference on premises can deliver significant savings in operational costs and overall total cost of ownership (TCO) compared to the cost of renting third-party AI infrastructure.
Using the AI On-Prem Appliance Solution in concert with the AI Inference Suite, customers can now use generative AI leveraging their proprietary data, fine-tuned models, and technology infrastructure to automate human and machine processes and applications in virtually any end environment, such as retail stores, quick service restaurants, shopping outlets, dealerships, hospitals, factories and shop floors - where the workflow is well established, repeatable and ready for automation.
"Our new AI On-Prem Appliance Solution and AI Inference Suite change the TCO economics of AI deployment by moving generative AI workloads from cloud-only processing to local, on-premises deployment. For a wide variety of AI automation use cases such as in-store assistants, worker coaching, site-specific information, safety compliance, and sales and service enablement at stores, dealerships, or factory floors, our AI On-Prem Appliance Solution significantly reduces AI operational costs for enterprise and industrial needs," said Nakul Duggal, Group General Manager for Automotive, Industrial IoT and Cloud Computing, Qualcomm Technologies, Inc. "Enterprises can now accelerate deployment of generative AI applications leveraging their own models, with privacy, personalization and customization while remaining in full control, with confidence that their data will not leave their premises."
The Qualcomm AI On-Prem Appliance Solution is powered by the Qualcomm Cloud AI roadmap of accelerators. It combines the accessibility and performance of a datacenter inference server with the power efficiency, weight and form factor, data privacy, personalization and control of an on-premises AI solution. The AI On-Prem Appliance Solution supports wide-ranging capabilities:
- Scalability from a standalone desktop product to a wall-mounted appliance cluster, providing local AI services such as voice agents in a box; offload of small language models (SLMs), large language models (LLMs), and large multimodal models (LMMs); retrieval-augmented generation (RAG) for intelligent indexed search and summarization; and agentic AI, AI workflow automation, image generation, code generation, computer vision, and camera processing.
- Support for a wide range of generative AI, natural language processing, and computer vision models, both open-source and proprietary, to enable workflow automation for many enterprise applications such as intelligent multilingual search, custom AI assistants and agents, code generation, automated drafting and note-taking, and more.
- Camera AI with image, video, and streaming processing for computer vision applications focused on security, safety and site monitoring.
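The RAG function mentioned in the capabilities above can be illustrated with a minimal sketch: retrieve the most relevant snippet from a local document store, then prepend it to the model prompt so answers stay grounded in on-premises data. The corpus, similarity scoring, and prompt format below are illustrative placeholders, not the appliance's actual pipeline.

```python
from collections import Counter
import math

def cosine_sim(a: Counter, b: Counter) -> float:
    """Bag-of-words cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Return the k corpus snippets most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda d: cosine_sim(q, Counter(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Toy on-premises document store (placeholder content).
corpus = [
    "Store hours are 9am to 6pm on weekdays.",
    "The dealership service bay is on the north side.",
    "Safety goggles are required on the factory floor.",
]
print(build_prompt("what are the store hours", corpus))
```

In a real deployment, the keyword scoring above would be replaced by an embedding index, and the assembled prompt would be sent to an SLM or LLM hosted on the appliance.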
The Qualcomm AI Inference Suite for On-Prem provides enterprise customers and third-party developers with a comprehensive set of tools and libraries for developing or porting generative AI applications to the AI On-Prem Appliance Solution. It features a rich set of application programming interfaces (APIs) including user management and administration, chat, image generation, audio and video generative AI capabilities, OpenAI API compatibility, and RAG. The suite supports integration with popular generative AI models and frameworks, and deployment using Kubernetes or bare containers.
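Since the suite advertises OpenAI API compatibility, an existing client could presumably be retargeted at the appliance simply by changing the endpoint. The sketch below builds a standard chat-completions request using only the Python standard library; the host name and model identifier are placeholders, not values documented by Qualcomm.

```python
import json
import urllib.request

# Hypothetical on-prem endpoint; the real host/port would come from the
# appliance's own configuration.
BASE_URL = "http://appliance.local:8000/v1"

def build_chat_request(model: str, user_message: str) -> urllib.request.Request:
    """Build an OpenAI-style /chat/completions request without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("example-llm-70b", "Summarize today's safety checklist.")
print(req.full_url)
# To send: urllib.request.urlopen(req) -- omitted here because the endpoint is illustrative.
```

Because the request body follows the OpenAI chat-completions schema, off-the-shelf SDKs that allow a custom base URL should also work against such an endpoint without code changes.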
Industry leaders embrace the Qualcomm AI On-Prem Solution and AI Inference Suite
Honeywell is collaborating with Qualcomm Technologies on the design, evaluation and/or deployment of AI workflow automation use cases using the AI On-Prem Solution and AI Inference Suite.
Aetina is among the first OEMs to provide on-premises equipment for enterprise deployments based on the AI On-Prem Appliance Solution. These compact, wall-powered on-premises AI appliances utilize a mix of SLMs for natural language processing, along with fine-tuned LLMs and LMMs (up to 70B parameters) to run enterprise AI agents supporting real-time responses and AI workflow automation functions such as intelligent indexed search and content creation - all enabled by best-in-class power consumption and superior cost-of-ownership architecture.
IBM is collaborating to bring its watsonx data and AI platform and Granite family of AI models for deployment across On-Prem AI appliances, in addition to cloud, to support a range of enterprise and industrial use cases in automotive, manufacturing, retail, and telecommunications.
View at TechPowerUp Main Site | Source