
Intel Contributes AI Acceleration to PyTorch 2.0

btarunr

Editor & Senior Moderator
With the release of PyTorch 2.0, contributions from Intel, delivered through Intel Extension for PyTorch, the oneAPI Deep Neural Network Library (oneDNN) and additional support for Intel CPUs, enable developers to optimize inference and training performance for artificial intelligence (AI).
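As a rough illustration of how Intel Extension for PyTorch is typically applied to a CPU inference workload, the sketch below calls ipex.optimize() on a model; the ResNet-50 model, input shape and BF16 choice are illustrative assumptions, not details from Intel's announcement.

```python
# Minimal sketch: optimizing a model for CPU inference with
# Intel Extension for PyTorch (ipex). Model and input size are placeholders.
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex

model = models.resnet50(weights=None).eval()
example_input = torch.randn(1, 3, 224, 224)

# ipex.optimize applies operator fusion and an optional lower-precision dtype;
# bfloat16 only pays off on CPUs with native BF16 support.
optimized = ipex.optimize(model, dtype=torch.bfloat16)

with torch.no_grad(), torch.cpu.amp.autocast(dtype=torch.bfloat16):
    output = optimized(example_input)
```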

As part of the PyTorch 2.0 compilation stack, the TorchInductor CPU backend optimization by Intel Extension for PyTorch and PyTorch ATen CPU achieved up to 1.7 times faster FP32 inference performance when benchmarked with TorchBench, HuggingFace and timm. This update brings notable performance improvements to graph compilation over the PyTorch eager mode.
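For context, TorchInductor is the default backend behind PyTorch 2.0's torch.compile API. A minimal sketch of compiling a model for CPU inference might look like the following; the small MLP and input tensor are hypothetical placeholders.

```python
# Minimal sketch: compiling a model with PyTorch 2.0's TorchInductor backend.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).eval()

# torch.compile captures the model's graph and hands it to TorchInductor
# ("inductor" is the default backend), which generates optimized CPU code.
compiled = torch.compile(model, backend="inductor")

x = torch.randn(32, 128)
with torch.no_grad():
    y = compiled(x)  # first call triggers compilation; later calls reuse it
```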



Other optimizations include:
  • Improved message-passing between adjacent neural network nodes to support graph neural networks in PyTorch Geometric (PyG), for enhanced inference and training performance on Intel CPUs.
  • New x86 quantization backend - a combination of the FBGEMM (Facebook General Matrix-Matrix Multiplication) and oneDNN backends - replaces FBGEMM as the default quantization backend for x86 CPU platforms, enabling better end-to-end int8 inference performance (a usage sketch follows this list).
  • Extended use of oneDNN with the oneDNN Graph API to maximize efficient code generation on AI hardware by automatically identifying the graph partitions to be accelerated through fusion. Both Float32 and BFloat16 data types are supported, but only inference workloads can be optimized; BF16 is optimized only on machines with AVX512_BF16 ISA support (see the second sketch below).
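
Selecting the new x86 quantization backend uses the standard PyTorch 2.0 torch.ao.quantization FX workflow; the sketch below is a rough outline with a hypothetical model and calibration input, not code from the announcement.

```python
# Minimal sketch: post-training static quantization with the "x86" backend,
# which PyTorch 2.0 uses as the default quantization engine on x86 CPUs.
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

torch.backends.quantized.engine = "x86"  # explicit; already the default in 2.0

model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 8)).eval()
example_input = torch.randn(4, 64)

qconfig_mapping = get_default_qconfig_mapping("x86")
prepared = prepare_fx(model, qconfig_mapping, example_inputs=(example_input,))

# Calibrate with representative data so observers record activation ranges.
with torch.no_grad():
    prepared(example_input)

quantized = convert_fx(prepared)  # int8 model executed via the x86 backend
int8_out = quantized(example_input)
```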

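The oneDNN Graph fusion path is exposed through TorchScript. A minimal sketch might look like the following; the convolutional model and input shape are hypothetical, and the BF16 autocast step assumes an AVX512_BF16-capable CPU.

```python
# Minimal sketch: enabling oneDNN Graph fusion for TorchScript inference.
import torch
import torch.nn as nn

torch.jit.enable_onednn_fusion(True)  # turn on oneDNN Graph fusion for the JIT

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU()).eval()
x = torch.randn(1, 3, 224, 224)

with torch.no_grad(), torch.cpu.amp.autocast(dtype=torch.bfloat16):
    traced = torch.jit.trace(model, x)
    traced = torch.jit.freeze(traced)   # freezing lets the fusion passes run
    for _ in range(3):                  # warm-up runs trigger partitioning and fusion
        out = traced(x)
```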
View at TechPowerUp Main Site
 