Thursday, March 21st 2019

Intel "Ice Lake" GPU Docs Reveal Unganged Memory Mode

While reading through Intel's Gen11 GT2 whitepaper, which describes the company's upcoming integrated graphics architecture, we may have found a significant piece of information about the memory architecture of computers running 10 nm "Ice Lake" processors. The whitepaper says the chip features a 4x32-bit LPDDR4/DDR4 interface, as opposed to the 2x64-bit LPDDR4/DDR4 interface of current-generation chips such as "Coffee Lake." This is strong evidence that Intel's new architecture will have unganged dual-channel memory controllers (2x 64-bit), as opposed to the monolithic 128-bit IMC found on current-generation chips.
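
As a quick sanity check on those figures, the Python sketch below works out the total bus width and the theoretical peak bandwidth for each layout. The 3200 MT/s transfer rate is an assumption picked for illustration and does not come from the whitepaper; the function name is hypothetical as well.

```python
def peak_bandwidth_mb_s(channels: int, width_bits: int, mega_transfers_per_s: int) -> int:
    """Theoretical peak in MB/s: bytes moved per transfer times transfers per second."""
    bytes_per_transfer = channels * width_bits // 8
    return bytes_per_transfer * mega_transfers_per_s


ASSUMED_MT_S = 3200  # assumption for illustration only, not taken from the whitepaper

# (number of channels, width of each channel in bits)
layouts = {
    "4x32-bit": (4, 32),
    "2x64-bit": (2, 64),
    "1x128-bit": (1, 128),
}

for name, (channels, width_bits) in layouts.items():
    total_bits = channels * width_bits
    bandwidth = peak_bandwidth_mb_s(channels, width_bits, ASSUMED_MT_S)
    print(f"{name}: {total_bits} bits total, {bandwidth} MB/s peak")
```

All three layouts come out at the same total width and the same theoretical peak, so any difference between them lies in how independently the channels can be scheduled, not in raw bandwidth.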

An unganged dual-channel memory interface consists of two independent memory controllers, each handling a 64-bit wide memory channel. This lets the processor execute two memory operations in parallel, provided the accesses go to distinct memory banks. It also becomes possible to read and write at the same time, something that can't be done in 128-bit memory mode. From the processor's perspective DRAM is very slow, and most of the time (the latency) is spent opening the memory and preparing the read/write operation; the actual data transfer is fairly quick.
With two independent memory controllers, unganged mode can mitigate these latencies in several ways. While single-threaded workloads, or workloads that operate on a relatively small problem set, benefit more from ganged mode, unganged mode can shine when multiple (or multi-threaded) applications work with vast amounts of memory, which increases the likelihood that two independent banks of memory get accessed. Unganged-aware software, such as OS-level memory management, could perhaps help make the most of unganged mode by spreading processes evenly throughout physical memory, so that independent memory accesses can be executed as often as possible.
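
The trade-off can be illustrated with a toy timing model in Python. This is only a sketch under loud assumptions: every access pays a full row-open cost, the latency numbers are made up, and the unganged mapping simply gives each controller one half of an 8 GiB address space. Real controllers are far more elaborate, and all names here are hypothetical.

```python
# Toy timing model. All numbers are made-up illustration values, not DRAM specs.
ROW_OPEN_NS = 30        # hypothetical cost of opening a row before each access
TRANSFER_64BIT_NS = 4   # hypothetical transfer time over one 64-bit channel
HALF = 4 * 2**30        # assumed split point: each controller owns half of 8 GiB


def channel_of(addr: int) -> int:
    """Assumed unganged mapping: lower half of memory -> controller 0, upper half -> 1."""
    return 0 if addr < HALF else 1


def ganged_time_ns(addresses: list) -> int:
    """One 128-bit controller: accesses are serialized, each transfer takes half as long."""
    return len(addresses) * (ROW_OPEN_NS + TRANSFER_64BIT_NS // 2)


def unganged_time_ns(addresses: list) -> int:
    """Two 64-bit controllers run in parallel; the busier one sets the total time."""
    busy = [0, 0]
    for addr in addresses:
        busy[channel_of(addr)] += ROW_OPEN_NS + TRANSFER_64BIT_NS
    return max(busy)


# Eight accesses confined to a small region (all land on controller 0).
small_set = [i * 4096 for i in range(8)]
# Eight accesses alternating between the lower and upper halves of memory.
spread_set = [(i % 2) * HALF + i * 4096 for i in range(8)]

for label, accesses in (("small set", small_set), ("spread set", spread_set)):
    print(label, "ganged:", ganged_time_ns(accesses), "ns,",
          "unganged:", unganged_time_ns(accesses), "ns")
```

In this model the small working set is slightly faster in ganged mode, because the wider transfer is the only thing that differs, while the spread-out accesses finish in roughly half the time in unganged mode because the two row-open latencies overlap.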

For integrated graphics, though, unganged mode is a real killer application. The iGPU reserves a chunk of system memory for geometry, textures and framebuffer. This memory range is typically placed at the end of the physical memory space, whereas the Windows OS and applications are usually located near the start of physical memory. This effectively gives the iGPU its own dedicated memory controller, which also reduces memory latency, because one controller can hold the iGPU's memory pages open almost all the time, while the other controller takes care of the OS and application memory requests.
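
A minimal Python sketch of that address layout follows. It assumes 8 GiB of installed memory, a 512 MiB iGPU reservation at the top of physical memory, an OS and application footprint in the lowest 2 GiB, and an unganged mapping that splits the address space in half between the two controllers. None of these values come from Intel's documentation, and the function names are hypothetical.

```python
GIB = 2**30
TOTAL_BYTES = 8 * GIB                   # assumed amount of installed memory
IGPU_BYTES = GIB // 2                   # assumed size of the iGPU's reserved region
SPLIT = TOTAL_BYTES // 2                # assumed boundary between the two controllers
IGPU_START = TOTAL_BYTES - IGPU_BYTES   # reserved range sits at the end of memory


def controller_for(addr: int) -> int:
    """Assumed unganged mapping: addresses below SPLIT go to controller 0, the rest to 1."""
    return 0 if addr < SPLIT else 1


def controllers_touched(start: int, length: int, step: int = 2**20) -> set:
    """Return the controllers serving a physical address range, sampled once per MiB."""
    return {controller_for(addr) for addr in range(start, start + length, step)}


igpu = controllers_touched(IGPU_START, IGPU_BYTES)
os_and_apps = controllers_touched(0, 2 * GIB)   # assumed low-memory footprint

print("iGPU region served by controller(s):", sorted(igpu))
print("OS/app region served by controller(s):", sorted(os_and_apps))
```

Under these assumptions the two regions never share a controller. Once the OS and applications grow past the split point, the separation no longer holds and both kinds of traffic compete for the second controller.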

AMD has supported unganged dual-channel memory interfaces for over a decade now. The company's first Phenom processors introduced unganged memory, along with a BIOS option called ganged mode that forces the CPU to interleave all data. Both the consensus in the tech community over the past ten years and the evolution of modern processors toward more parallelism favor unganged mode. With CPU core counts heading north of 8 for mainstream desktop processors, and integrated GPUs becoming the norm, it was natural for Intel to add support for an unganged memory interface.

Image Courtesy: ilsistemista.net

15 Comments on Intel "Ice Lake" GPU Docs Reveal Unganged Memory Mode

#1
laszlo
so amd approach was better and they do the same..

seems they try to improve the perf. this way also; if i read between lines .... they're aware of having perf. issues vs amd.... "Houston, we have a problem"
#2
Flyordie
Intel got complacent. Now they are paying for it. MASSIVELY.
#3
pjl321
That is some huge spec increases.
#4
Steevo
If the IGP is really worth a damn they found the same issue AMD faces, how to feed the high efficiency parallel shader cores fast enough to make them work while not starving your CPU cores.
#5
Imsochobo
laszlo said:
so amd approach was better and they do the same..

seems they try to improve the perf. this way also; if i read between lines .... they're aware of having perf. issues vs amd.... "Houston, we have a problem"
it may explain how vega is doing so well in the apu's.
#6
diatribe
Flyordie said:
Intel got complacent. Now they are paying for it. MASSIVELY.
They're really not though:



AMD for reference:

#7
Vayra86
Come on, we all know Intel is straight up copy/pasting technology to quickly get in the higher end of GPUs. This can not be a surprise. Great minds think alike; or look in each others' garden.
#8
dj-electric
Vayra86 said:
Come on, we all know Intel is straight up copy/pasting technology to quickly get in the higher end of GPUs. This can not be a surprise. Great minds think alike; or look in each others' garden.
They might even try trickier stuff in the future. I don't trust this Raja dude and the Keller whats-his-face. They look like they might copy other people's design like Vega or Ryzen or something. Don't trust those, they look snakey
#9
moproblems99
dj-electric said:
They might even try trickier stuff in the future. I don't trust this Raja dude and the Keller whats-his-face. They look like they might copy other people's design like Vega or Ryzen or something. Don't trust those, they look snakey
I would hope they would copy someone else's GPU.
#10
R0H1T
moproblems99 said:
I would hope they would copy someone else's GPU.
I'm sure they got some great stuff with the Nvidia licensing agreement previously, maybe they've made the perfect love child of the two with their upcoming dGPU :cool:
#11
eidairaman1
The Exiled Airman
diatribe said:
They're really not though:



AMD for reference:

Considering AMD has had ganged and unganged mode along with ECC definitely on AM3 and I believe even since AM2, yes Intel has been very complacent.
#12
SoNic67
Cache memory solves that problem. Level one at CPU core, level two at cluster level...
Only cache misses have to be read from or written to memory.
I don't see as being a huge performance factor.
#13
Vya Domus
SoNic67 said:
Cache memory solves that problem. Level one at CPU core, level two at cluster level...
Only cache misses have to be read from or written to memory.
I don't see as being a huge performance factor.
Caches are unfortunately not very useful for GPU architectures, they need a lot of instructions/data delivered all at once as opposed to a few instructions/data delivered very quickly as is the case with a CPU (that's a very primitive description but it's good enough).

They need a lot of bandwidth which is rather scarce on the current DDR4 platform, AMD faces the same problem.
#14
Caring1
Vayra86 said:
Come on, we all know Intel is straight up copy/pasting technology to quickly get in the higher end of GPUs. This can not be a surprise. Great minds think alike; or look in each others' garden.
Reverse engineering is common, yet when the Chinese do it people crack a sad and spit the dummy over lost jobs and revenues.
#15
Patriot
Caring1 said:
Reverse engineering is common, yet when the Chinese do it people crack a sad and spit the dummy over lost jobs and revenues.
Going to assume you are joking, One is innovation, the other is espionage.
Anyone can steal someones entire IP and manufacture the design.
To figure out how it works, iterate on the design and compete... that is innovation.
Vya Domus said:
Caches are unfortunately not very useful for GPU architectures, they need a lot of instructions/data delivered all at once as opposed to a few instructions/data delivered very quickly as is the case with a CPU (that's a very primitive description but it's good enough).

They need a lot of bandwidth which is rather scarce on the current DDR4 platform, AMD faces the same problem.
Yeah, the 2200g can keep pace with the 2400g when clocked the same despite the ~40% increase in sp, definitely memory starved.