
Intel Announces "Cooper Lake" 4P-8P Xeons, New Optane Memory, PCIe 4.0 SSDs, and FPGAs for AI

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,229 (7.55/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
Intel today introduced its 3rd Gen Intel Xeon Scalable processors and additions to its hardware and software AI portfolio, enabling customers to accelerate the development and use of AI and analytics workloads running in data center, network and intelligent-edge environments. As the industry's first mainstream server processor with built-in bfloat16 support, the new 3rd Gen Intel Xeon Scalable processor makes artificial intelligence (AI) inference and training more widely deployable on general-purpose CPUs for applications that include image classification, recommendation engines, speech recognition and language modeling.

"The ability to rapidly deploy AI and data analytics is essential for today's businesses. We remain committed to enhancing built-in AI acceleration and software optimizations within the processor that powers the world's data center and edge solutions, as well as delivering an unmatched silicon foundation to unleash insight from data," said Lisa Spelman, Intel corporate vice president and general manager, Xeon and Memory Group.



AI and analytics open new opportunities for customers across a broad range of industries, including finance, healthcare, industrial, telecom and transportation. IDC predicts that by 2021, 75% of commercial enterprise apps will use AI. And by 2025, IDC estimates that roughly a quarter of all data generated will be created in real time, with various internet of things (IoT) devices creating 95% of that volume growth.

Unequaled Portfolio Breadth and Ecosystem Support for AI and Analytics
Intel's new data platforms, coupled with a thriving ecosystem of partners using Intel AI technologies, are optimized for businesses to monetize their data through the deployment of intelligent AI and analytics services.
  • New 3rd Gen Intel Xeon Scalable Processors: Intel is further extending its investment in built-in AI acceleration in the new 3rd Gen Intel Xeon Scalable processors through the integration of bfloat16 support into the processor's unique Intel DL Boost technology. bfloat16 is a compact numeric format that uses half the bits of today's FP32 format yet achieves comparable model accuracy with minimal (if any) software changes required (a short illustrative sketch of the format follows this list). The addition of bfloat16 support accelerates both AI training and inference performance on the CPU. Intel-optimized distributions of leading deep learning frameworks (including TensorFlow and PyTorch) support bfloat16 and are available through the Intel AI Analytics toolkit. Intel also delivers bfloat16 optimizations in its OpenVINO toolkit and the ONNX Runtime environment to ease inference deployments.
  • The 3rd Gen Intel Xeon Scalable processors (codenamed "Cooper Lake") evolve Intel's 4- and 8-socket processor offering. The processor is designed for deep learning, virtual machine (VM) density, in-memory database, mission-critical applications and analytics-intensive workloads. Customers refreshing aging infrastructure can expect an average estimated gain of 1.9x on popular workloads and up to 2.2x more VMs compared with 5-year-old, 4-socket platform equivalents.
  • New Intel Optane Persistent Memory: As part of the 3rd Gen Intel Xeon Scalable platform, the company also announced the Intel Optane persistent memory 200 series, providing customers up to 4.5 TB of memory per socket to manage data-intensive workloads, such as in-memory databases, dense virtualization, analytics and high-performance computing.
  • New Intel 3D NAND SSDs: For systems that store data in all-flash arrays, Intel announced the availability of its next-generation high-capacity Intel 3D NAND SSDs, the Intel SSD D7-P5500 and P5600. These 3D NAND SSDs are built with Intel's latest triple-level cell (TLC) 3D NAND technology and an all-new low-latency PCIe controller to meet the intense I/O requirements of AI and analytics workloads, and include advanced features to improve IT efficiency and data security.
  • First Intel AI-Optimized FPGA: Intel disclosed its upcoming Intel Stratix 10 NX FPGAs, Intel's first AI-optimized FPGAs targeted for high-bandwidth, low-latency AI acceleration. These FPGAs will offer customers customizable, reconfigurable and scalable AI acceleration for compute-demanding applications such as natural language processing and fraud detection. Intel Stratix 10 NX FPGAs include integrated high-bandwidth memory (HBM), high-performance networking capabilities and new AI-optimized arithmetic blocks called AI Tensor Blocks, which contain dense arrays of lower-precision multipliers typically used for AI model arithmetic.
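For readers unfamiliar with the format mentioned above, here is a minimal sketch (plain NumPy, not Intel's DL Boost instruction path) of how bfloat16 relates to FP32: it keeps FP32's sign bit and 8-bit exponent but only 7 mantissa bits, i.e. the top 16 bits of the FP32 encoding. The helper names are illustrative only, and real hardware typically rounds rather than truncates.

```python
import numpy as np

def to_bfloat16_bits(x):
    """Truncate float32 values to their upper 16 bits (1 sign, 8 exponent, 7 mantissa bits)."""
    return (np.asarray(x, dtype=np.float32).view(np.uint32) >> 16).astype(np.uint16)

def from_bfloat16_bits(b):
    """Zero-fill the lower 16 bits to get back an ordinary float32."""
    return (np.asarray(b, dtype=np.uint16).astype(np.uint32) << 16).astype(np.uint32).view(np.float32)

x = np.array([3.14159265, 0.1, 65504.0], dtype=np.float32)
print(x)                                        # [3.1415927  0.1  65504.0]
print(from_bfloat16_bits(to_bfloat16_bits(x)))  # ~[3.140625  0.0996  65280.0]
```

The round trip shows the trade-off the press release alludes to: bfloat16 keeps FP32's full exponent range (so very large and very small values survive), but only about three significant decimal digits.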
oneAPI Cross-Architecture Development for Ongoing AI Innovation: As Intel expands its advanced AI product portfolio to meet diverse customer needs, it is also paving the way to simplify heterogeneous programming for developers with its oneAPI cross-architecture tools portfolio, which accelerates performance and increases productivity. With these tools, developers can accelerate AI workloads across Intel CPUs, GPUs, and FPGAs, and future-proof their code for current and future generations of Intel processors and accelerators.

Enhanced Intel Select Solutions Portfolio Addresses IT's Top Requirements: Intel has enhanced its Select Solutions portfolio to accelerate deployment against IT's most urgent requirements, highlighting the value of pre-verified solution delivery in today's rapidly evolving business climate. Announced today are three new and five enhanced Intel Select Solutions focused on analytics, AI and hyper-converged infrastructure. The enhanced Intel Select Solution for Genomics Analytics is being used around the world in the search for a COVID-19 vaccine, and the new Intel Select Solution for VMware Horizon VDI on vSAN is being used to enhance remote learning.

The 3rd Gen Intel Xeon Scalable processors and Intel Optane persistent memory 200 series are shipping to customers today. In May, Facebook announced that 3rd Gen Intel Xeon Scalable processors are the foundation of its newest Open Compute Platform (OCP) servers, and other leading cloud service providers, including Alibaba, Baidu and Tencent, have announced they are adopting the next-generation processors. General OEM system availability is expected in 2H 2020. The Intel SSD D7-P5500 and P5600 3D NAND SSDs are available today, and the Intel Stratix 10 NX FPGA is expected to be available in 2H 2020.



Complete Slide Deck


 
Joined
Jul 25, 2017
Messages
59 (0.02/day)
So compared to a 5-year-old system, you will get 2x the performance?

That is innovation...

And funny how they haven't got any charts comparing them to Epyc/Rome.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,167 (2.81/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
To quote someone from the Phoronix forums:
Hear ye, hear ye! Announcing our NewLake processors! They are the exact same uarch as OldLake, same process node as OldLake, same gfx as OldLake, but the clock is 100 MHz faster! Innovation!! Upgrade now for only $499!!
 
Joined
Jun 19, 2018
Messages
848 (0.36/day)
System Name Batman's CaseLabs Mercury S8 Work Computer
Processor 8086K 5.3Ghz binned delidded by Siliconlottery.com 5.5Ghz 6c12t 5.6Ghz 6c6t on ambient air
Motherboard EVGA Z390 DARK
Cooling Noctua C14S for all overclocking so far Noctua Industrial PWM fan 2000rpm rated (700rpm inaudible)
Memory Gskill Trident Z Royal Silver F4-4600C18D-16GTRS running at 4500Mhz 17-17-17-37 (new mem OC) : )
Video Card(s) AMD WX 4100 Workstation Card (AMD W5400 7nm workstation card coming soon)
Storage Intel Optane 900P 280GB PCIe card as Primary OS drive / (4) Samsung 860Pro 256GB SATA internal
Display(s) Planar 27in 2560x1440 Glossy LG panel with glass bonded to panel for increased clarity
Case CaseLabs Mercury S8 open bench chassis two-tone black front cover with gunmetal frame
Audio Device(s) Creative $25 2.1 speakers lol
Power Supply Seasonic Prime Titanium 700watt fanless
Mouse Logitech MX Master 3 graphite / Glorious Model D matte black / Razer Invicta mousing mat gunmetal
Keyboard HHKB Hybrid Type-S black printed keycaps
Software Work Apps text and statistical
Benchmark Scores Single Thread scores at 5.6Ghz: Cinebench R15 ST - 249 CPU-Z ST - 676 PassMark CPU ST - 3389
When can we purchase Optane DDR5 memory modules for client builds?

And new Optane PCIe 4.0 SSDs with the gen 2 controllers?

Those are the more pertinent questions for simple builders like us. :)
 
Joined
Feb 18, 2005
Messages
5,847 (0.81/day)
Location
Ikenai borderline!
System Name Firelance.
Processor Threadripper 3960X
Motherboard ROG Strix TRX40-E Gaming
Cooling IceGem 360 + 6x Arctic Cooling P12
Memory 8x 16GB Patriot Viper DDR4-3200 CL16
Video Card(s) MSI GeForce RTX 4060 Ti Ventus 2X OC
Storage 2TB WD SN850X (boot), 4TB Crucial P3 (data)
Display(s) 3x AOC Q32E2N (32" 2560x1440 75Hz)
Case Enthoo Pro II Server Edition (Closed Panel) + 6 fans
Power Supply Fractal Design Ion+ 2 Platinum 760W
Mouse Logitech G602
Keyboard Razer Pro Type Ultra
Software Windows 10 Professional x64
When can we purchase Optane DDR5 memory modules for client builds?

Considering DDR5 is still in the prototype stage... a while.
 
Joined
Feb 27, 2007
Messages
51 (0.01/day)
Location
Huntington, NY
System Name Home PC
Processor AMD Ryzen 7 1700
Motherboard ASRock Fatal1ty X370 Gaming K4 AM4
Cooling AMD Wraith Spire
Memory 16 GB Corsair Vengeance PC3000 DDR4
Video Card(s) PowerColor RED DRAGON Radeon RX Vega 56
Storage Samsung 850 Evo 1TB, Crucial MX300 500GB
Display(s) Dell S2719DGF 1440p
Case Phanteks Enthoo Pro Series PH-ES614P
Audio Device(s) Onboard
Power Supply SeaSonic M12II 620 Bronze
Mouse Logitech G9X
Keyboard Dell
Software Windows 10 Pro
New PCIe 4.0 SSDs, and no Intel platform to take advantage of them!
 
Joined
Jan 14, 2019
Messages
12,337 (5.77/day)
Location
Midlands, UK
System Name Nebulon B
Processor AMD Ryzen 7 7800X3D
Motherboard MSi PRO B650M-A WiFi
Cooling be quiet! Dark Rock 4
Memory 2x 24 GB Corsair Vengeance DDR5-4800
Video Card(s) AMD Radeon RX 6750 XT 12 GB
Storage 2 TB Corsair MP600 GS, 2 TB Corsair MP600 R2
Display(s) Dell S3422DWG, 7" Waveshare touchscreen
Case Kolink Citadel Mesh black
Audio Device(s) Logitech Z333 2.1 speakers, AKG Y50 headphones
Power Supply Seasonic Prime GX-750
Mouse Logitech MX Master 2S
Keyboard Logitech G413 SE
Software Bazzite (Fedora Linux) KDE
Hey, a new "Lake"! Just what everyone has been waiting for! :laugh:
 
Joined
Nov 25, 2019
Messages
141 (0.08/day)
"Copper Lake"?

Actually, it has two O's and one P, not the other way around.
 
Joined
Aug 30, 2006
Messages
7,221 (1.08/day)
System Name ICE-QUAD // ICE-CRUNCH
Processor Q6600 // 2x Xeon 5472
Memory 2GB DDR // 8GB FB-DIMM
Video Card(s) HD3850-AGP // FireGL 3400
Display(s) 2 x Samsung 204Ts = 3200x1200
Audio Device(s) Audigy 2
Software Windows Server 2003 R2 as a Workstation now migrated to W10 with regrets.
Bfloat16 is a disgusting number format. It is going to cause all kinds of awful "glitches" in the future. It has one purpose: to shoehorn in additional performance using tricks and shortcuts to save money. It uses half the memory of FP32 and requires less silicon, allowing more parallel BF16 calculations on the same die area, and it is a little quicker to calculate. The cost? Inaccuracy. That doesn't matter if you are Google and using it to process data to target advertising. Who cares if you get the wrong ad, an odd YouTube recommendation, or some video or photo loses a little quality? For Google, processing on less silicon and less memory probably saves money. So what's the problem?

Libraries and obfuscation of calculation.

BF16 is often not used singularly and exclusively, but is mixed with FP32 in an ad hoc fashion to obtain speed gains. Look, says the coder, I gained 30% throughput by optimising the algorithm using BF16 on the multipliers. Great, let's bake that into production.

Years later, with libraries layered on libraries and APIs linked over networks or the internet, some important application will use one of these now-standard libraries and deliver accurate results most of the time, but occasionally inaccurate results with potentially major consequences, and the developer or user will be none the wiser when things go wrong. Security? Financial markets? Rocket trajectories? https://www.bbc.com/future/article/20150505-the-numbers-that-lead-to-disaster

Einstein said that everything should be made as simple as possible, but not simpler. There is a huge risk that BF16 will be tomorrow's Year 2000 problem, or the GPS rollover problem, or other such cases. We will regret it if it gets out of its special-case box.
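As a rough illustration of the kind of drift being described, here is a naive sketch that simulates bf16 by truncating float32 values to their top 16 bits. It is only an approximation: real hardware rounds rather than truncates, and frameworks normally keep FP32 accumulators precisely to avoid this failure mode.

```python
import numpy as np

def to_bf16(a):
    """Simulate bfloat16 storage by truncating float32 values to their top 16 bits."""
    a = np.asarray(a, dtype=np.float32)
    return ((a.view(np.uint32) >> 16) << 16).astype(np.uint32).view(np.float32)

# Accumulate 100,000 small increments, rounding the running sum to bf16 each step.
n, step = 100_000, 0.001
acc = np.float32(0.0)
for _ in range(n):
    acc = to_bf16(np.float32(acc) + np.float32(step))

print("bf16-rounded running sum:", float(acc))  # stalls around 0.25 with this truncation model
print("exact answer:            ", n * step)    # 100.0
```

The naive running sum stalls once a single step drops below one bf16 ulp, which is exactly why mixed-precision recipes keep master weights and accumulators in FP32.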
 
Joined
Jul 29, 2019
Messages
77 (0.04/day)
Bfloat16 is a disgusting number format. It is going to cause all kinds of awful "glitches" in the future. It has one purpose: to shoehorn in additional performance using tricks and shortcuts to save money. It uses half the memory of FP32 and requires less silicon, allowing more parallel BF16 calculations on the same die area, and it is a little quicker to calculate. The cost? Inaccuracy. That doesn't matter if you are Google and using it to process data to target advertising. Who cares if you get the wrong ad, an odd YouTube recommendation, or some video or photo loses a little quality? For Google, processing on less silicon and less memory probably saves money. So what's the problem?

Libraries and obfuscation of calculation.

BF16 is often not used singularly and exclusively, but is mixed with FP32 in an ad hoc fashion to obtain speed gains. Look, says the coder, I gained 30% throughput by optimising the algorithm using BF16 on the multipliers. Great, let's bake that into production.

Years later, with libraries layered on libraries and APIs linked over networks or the internet, some important application will use one of these now-standard libraries and deliver accurate results most of the time, but occasionally inaccurate results with potentially major consequences, and the developer or user will be none the wiser when things go wrong. Security? Financial markets? Rocket trajectories? https://www.bbc.com/future/article/20150505-the-numbers-that-lead-to-disaster

Einstein said that everything should be made as simple as possible, but not simpler. There is a huge risk that BF16 will be tomorrow's Year 2000 problem, or the GPS rollover problem, or other such cases. We will regret it if it gets out of its special-case box.
Interesting observation, thanks for the link!
 
Joined
Jan 8, 2017
Messages
9,428 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
Bfloat16 is a disgusting number format. It is going to cause all kinds of awful "glitches" in the future. It has one purpose: to shoehorn in additional performance using tricks and shortcuts to save money. It uses half the memory of FP32 and requires less silicon, allowing more parallel BF16 calculations on the same die area, and it is a little quicker to calculate. The cost? Inaccuracy. That doesn't matter if you are Google and using it to process data to target advertising. Who cares if you get the wrong ad, an odd YouTube recommendation, or some video or photo loses a little quality? For Google, processing on less silicon and less memory probably saves money. So what's the problem?

Libraries and obfuscation of calculation.

BF16 is often not used singularly and exclusively, but is mixed with FP32 in an ad hoc fashion to obtain speed gains. Look, says the coder, I gained 30% throughput by optimising the algorithm using BF16 on the multipliers. Great, let's bake that into production.

Years later, with libraries layered on libraries and APIs linked over networks or the internet, some important application will use one of these now-standard libraries and deliver accurate results most of the time, but occasionally inaccurate results with potentially major consequences, and the developer or user will be none the wiser when things go wrong. Security? Financial markets? Rocket trajectories? https://www.bbc.com/future/article/20150505-the-numbers-that-lead-to-disaster

Einstein said that everything should be made as simple as possible, but not simpler. There is a huge risk that BF16 will be tomorrow's Year 2000 problem, or the GPS rollover problem, or other such cases. We will regret it if it gets out of its special-case box.

Half-precision floats are obviously only used when the accuracy is good enough. Some networks tolerate very low precision; some classification models need as little as 8-bit integers, or even 4-bit (see the sketch below).

Neural networks are never "accurate" no matter the floating-point precision; they can't be, by definition, since they are statistical models.
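As a rough illustration of how coarse that can get, below is a minimal NumPy sketch of generic symmetric per-tensor int8 quantization. This is a textbook scheme, not any particular framework's implementation, and the helper names are made up for the example.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: map the largest-magnitude weight to +/-127."""
    scale = float(np.max(np.abs(w))) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)   # stand-in for one layer's weights
q, scale = quantize_int8(w)
print("worst-case rounding error:", np.max(np.abs(w - dequantize(q, scale))))  # about scale / 2
```

The worst-case error per weight is roughly half the quantization step, which many classification networks tolerate with little or no accuracy loss.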
 
Joined
Jan 6, 2013
Messages
350 (0.08/day)
So compared to a 5-year-old system, you will get 2x the performance?

That is innovation...

And funny how they haven't got any charts comparing them to Epyc/Rome.
What do you expect? They just doubled the core count from 28 to 56... same silicon.
 