Intel and Aible, an end-to-end serverless generative AI (GenAI) and augmented analytics enterprise solution, now offer solutions to shared customers to run advanced GenAI and retrieval-augmented generation (RAG) use cases on multiple generations of Intel Xeon CPUs. The collaboration, which includes engineering optimizations and a benchmarking program, enhances Aible's ability to deliver GenAI results at a low cost for enterprise customers and helps developers embed AI intelligence into applications. Together, the companies offer scalable and efficient AI solutions that draw on high-performing hardware to help customers solve challenges with AI and Intel.
"Customers are looking for efficient, enterprise-grade solutions to harness the power of AI. Our collaboration with Aible shows how we're closely working with the industry to deliver innovation in AI and lowering the barrier to entry for many customers to run the latest GenAI workloads using Intel Xeon processors," said Mishali Naik, Intel senior principal engineer, Data Center and AI Group.
Aible's solutions demonstrate how CPUs can significantly enhance performance across a range of the latest AI workloads, from running language models to RAG. Optimized for Intel processors, Aible's technology utilizes an efficient serverless end-to-end approach for AI, consuming resources only when there are active user requests. For example, the vector database activates for just a few seconds to retrieve information relevant to a user query, and the language model similarly powers up briefly to process and respond to the request. This on-demand operation helps reduce the total cost of ownership (TCO).
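The on-demand flow described above (retrieve relevant information, then hand it to a language model) can be sketched in a few lines of CPU-only Python. This is a minimal illustration, not Aible's implementation: the word-count "embedding", the `retrieve` and `handle_request` helpers, and the sample documents are all hypothetical stand-ins, and the language-model call is left out.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def handle_request(query: str, documents: list[str]) -> str:
    """One on-demand invocation: retrieve context, then build the prompt.

    A real system would pass this prompt to a language model; that call
    is omitted here.
    """
    context = retrieve(query, documents, k=1)
    return "Context: " + " ".join(context) + "\nQuestion: " + query


# Hypothetical sample corpus, for illustration only.
DOCS = [
    "Xeon processors support AVX-512 vector instructions.",
    "The summit takes place in Washington, D.C.",
    "Serverless functions consume resources only while handling a request.",
]

print(handle_request("Which instructions do Xeon processors support?", DOCS))
```

Nothing in this sketch stays resident between requests, which is the property the serverless approach relies on: compute is used only while `handle_request` runs.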
While RAG is often implemented using GPUs (graphics processing units) and accelerators to leverage their parallel processing capabilities, Aible's serverless technique, combined with Intel Xeon Scalable processors, allows RAG use cases to be powered entirely by CPUs. The performance data shows that multiple generations of Intel Xeon processors can run RAG workloads efficiently.
Aible enables customers to lower the operational costs of GenAI projects by running exclusively on CPUs in serverless form, so that the same underlying compute resources can be shared more securely across multiple customers. The savings are comparable to buying electricity only when it is used rather than renting an electricity generator. Moreover, as demand for generative AI grows, optimizing both performance and energy consumption becomes more crucial. Aible's CPU-based services offer customers a cost-effective and energy-efficient solution.
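The electricity analogy comes down to simple arithmetic: a dedicated server is billed for every hour it exists, while a serverless function is billed only for the seconds it runs. The sketch below works through that comparison. Every price and workload figure in it is a made-up placeholder chosen for illustration; none of them comes from Aible's or Intel's benchmark.

```python
def always_on_cost(hourly_rate: float, hours: float) -> float:
    """Cost of a dedicated server billed for every hour, busy or idle."""
    return hourly_rate * hours


def on_demand_cost(requests: int, seconds_per_request: float,
                   rate_per_second: float) -> float:
    """Cost of a serverless function billed only while it is running."""
    return requests * seconds_per_request * rate_per_second


# Hypothetical placeholder values, not benchmark data.
HOURS_PER_MONTH = 730
dedicated = always_on_cost(hourly_rate=1.00, hours=HOURS_PER_MONTH)
serverless = on_demand_cost(requests=10_000, seconds_per_request=5,
                            rate_per_second=0.0005)

print(f"dedicated:  ${dedicated:,.2f} per month")
print(f"serverless: ${serverless:,.2f} per month")
print(f"ratio:      {dedicated / serverless:.1f}x")
```

The ratio depends entirely on how busy the system is: with enough requests per month, the per-second charges add up and the always-on server becomes the cheaper option.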
According to Aible's benchmark analysis, customers can realize up to a 55x cost saving when running RAG models on Aible's CPU-based serverless solutions. This cost reduction is a testament to the effectiveness of Aible's CPU-exclusive approach, which sidesteps the need for more expensive GPU-based infrastructures with shared services or dedicated servers.
Intel - including Intel Labs - has worked with Aible to optimize AI workloads on Xeon processors. Notably, by optimizing Aible's code for AVX-512, Aible saw significant performance gains and improved its throughput on Xeon processors, highlighting the impact of strategic software optimizations on overall efficiency.
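Code optimized for AVX-512 only helps on hosts that actually expose those instructions. As a small illustration, the sketch below extracts AVX-512 feature flags from text in the format of Linux's `/proc/cpuinfo`. The `avx512_flags` helper and the sample text are hypothetical and say nothing about how Aible or Intel detect CPU features.

```python
from __future__ import annotations


def avx512_flags(cpuinfo_text: str) -> set[str]:
    """Return the AVX-512 feature flags listed in /proc/cpuinfo-style text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Each flag is a whitespace-separated token, e.g. "avx512f".
            flags.update(f for f in value.split() if f.startswith("avx512"))
    return flags


# Hypothetical excerpt in the x86 /proc/cpuinfo format.
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse2 avx avx2 avx512f avx512bw avx512vl\n"

print(sorted(avx512_flags(SAMPLE)))
```

On an x86 Linux host, the same function can be given the contents of `/proc/cpuinfo` directly; an empty result means the processor does not advertise AVX-512.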
The combination of RAG models with Intel Xeon processors, facilitated by platforms like Aible, can enable applications such as:
- Natural language processing (NLP)
- Recommendation systems
- Decision support systems
- Content generation
Intel and Aible will demonstrate their solutions at the Amazon Web Services Summit in Washington, D.C., on June 26 and 27. Aible's solutions run on AWS Lambda and are available in the AWS Marketplace.