Thursday, January 21st 2021

"Nehalem" Lead Architect Rejoins Intel to Work on New High-Performance Architecture

The original "Nehalem" CPU microarchitecture from 2008 was pivotal to Intel, as it laid the foundation for Intel's mainline server and client x86 processors for the following 12-odd years. Glenn Hinton, the lead architect behind "Nehalem," announced that he is rejoining Intel after 3 years of retirement, to work on a new high-performance CPU project. Hinton states that his decision to rejoin Intel out of his retirement was influenced by Pat Gelsinger joining the company as its new CEO. Jim Keller, a CPU architecture lead behind several commercially-successful architectures, recently left Intel after a brief stint leading an undisclosed CPU core project. Keller later took up the mantle of CEO at hardware start-up Tenstorrent.

Pat Gelsinger leading Intel is expected to have a big impact on its return to technological leadership in its core businesses, as highlighted in Gelsinger's recent comments on the need for Intel to be better than Apple (which he referred to as "that lifestyle company") at making CPUs, in reference to Apple's new M1 chip taking the ultraportable notebook industry by storm. Intel also faces stiff competition from AMD, which has achieved IPC parity with Intel and is beating it on energy efficiency by taking advantage of the 7 nm silicon fabrication process.
Sources: bizude (Reddit), Dylan Martin (Twitter)

46 Comments on "Nehalem" Lead Architect Rejoins Intel to Work on New High-Performance Architecture

#26
THU31
The Nehalem i3's (Clarkdale) were pretty good because they were on 32 nm, and you could overclock them like crazy on any board. They did not have an integrated memory controller (it was a separate die on package), but performance was still good thanks to HyperThreading, which gave up to 50% more performance.
Posted on Reply
#27
TheinsanegamerN
Upgrayedd: Please, for the love of humanity, design something for mainstream without an iGPU. RKL could've been 10-core without the iGPU.
Why do you people get so hung up on the iGPU? FFS the KF line showed they don't OC any better, nor are they any cheaper. Rocket Lake slams into thermal and power constraints already, you REALLY think that removing the iGPU would magically allow two extra cores to work properly in there?

Just buy HEDT and be quiet. For 10 years people like you have been mewling about the integrated graphics like it's some massive boat anchor.
Posted on Reply
#28
BArms
tabascosauz: Has nothing to do with the iGPU...

Getting rid of the GPU doesn't magically free up power budget or space for more cores.

The 10900 and 10900K literally stretched the ringbus to its limits and the core to core latency was already showing the strain.
You're almost right, it's not magical, but it does in fact free up power budget and space for more cores. Literally.
TheinsanegamerN: Why do you people get so hung up on the iGPU? FFS the KF line showed they don't OC any better, nor are they any cheaper. Rocket Lake slams into thermal and power constraints already, you REALLY think that removing the iGPU would magically allow two extra cores to work properly in there?

Just buy HEDT and be quiet. For 10 years people like you have been mewling about the integrated graphics like it's some massive boat anchor.
The KF aren't without an iGPU, they're just standard K's with the iGPU disabled. It's still on die taking up space that could be used for wider or more cores. Nobody is saying that Intel should ditch iGPUs, just that they're completely unnecessary on their higher end SKUs meant for gamers, as AMD's Zen 3 has shown. At the very least, they should make a highest end "KF" variant that doesn't have an iGPU in silicon at all.
Posted on Reply
#29
_UV_
TheinsanegamerN: Why do you people get so hung up on the iGPU? FFS the KF line showed they don't OC any better, nor are they any cheaper. Rocket Lake slams into thermal and power constraints already, you REALLY think that removing the iGPU would magically allow two extra cores to work properly in there?

Just buy HEDT and be quiet. For 10 years people like you have been mewling about the integrated graphics like it's some massive boat anchor.
Well, if you compare a same-generation iGPU with the lowest-end consumer GPU, it will be 2-4x slower, lack features, steal RAM bandwidth, and cost you almost the same amount of money. The only benefits are a few watts less power consumption and that you can use it in very generic boards without extra slots.

PS
Also, I almost forgot: you may run into driver support trouble much earlier with an iGPU.
Posted on Reply
#30
HansRapad
The iGPU is the main selling point of Intel chips for SIs.

Reject it all you want, but the DIY market is a very small segment of Intel's CPU business. In fact, AMD is the one that is unable to integrate an iGPU on non-G-series chips, which is why they are having a hard time getting into SIs despite having performance advantages. AMD is only able to put 8 cores with Vega at maximum, while losing PCIe 4.0, despite having a node advantage, while Intel is able to put 10 cores + iGPU, which is a big selling point for SIs.
Posted on Reply
#31
tabascosauz
BArms: You're almost right, it's not magical, but it does in fact free up power budget and space for more cores. Literally.

The KF aren't without an iGPU, they're just standard K's with the iGPU disabled. It's still on die taking up space that could be used for wider or more cores. Nobody is saying that Intel should ditch iGPUs, just that they're completely unnecessary on their higher end SKUs meant for gamers and professionals, as AMD's Zen 3 has shown. At the very least, they should make a highest end "KF" variant that doesn't have an iGPU in silicon at all.
A modern Intel iGPU consumes nearly 0W all the time (if not that) when not actively being used for 3D purposes (which is highly unlikely for any user that cares about this subject) or Quicksync. The PL1 of these processors is somewhere in the 200-300W range for recent 8- and 10-cores. Power budget for the cores is not affected in any way by the iGPU lol.

As for the other "stealing RAM bandwidth" comment when it's not connected to a video output...lmao

As to space, it does take up a quantifiable amount of space, but not any more than the SA does on desktop. Look at Tiger Lake if you want to see what "takes up space" looks like. As to the why, like I said, the ringbus on the 10900K vs the 9900K speaks for itself.
Upgrayedd: it doesn't HAVE to be 5GHz and I didn't really know about the core latency on Intel. How does it compare to a 5900x/5950x using Infinity fabric?
Ringbus is considerably faster and lower latency than IF, but it doesn't scale nearly as well, which is why Intel uses two different interconnects for its mainstream and server/HEDT. IF is much more similar to the mesh bus that Intel has on its HCC/XCC parts. IF suffers on latency, so AMD [rather successfully] compensates with massive L3 on its desktop CPUs.

Mesh and IF would be even closer in performance if AMD didn't rely on chiplets.
Posted on Reply
#32
biffzinker
BArms: The KF aren't without an iGPU, they're just standard K's with the iGPU disabled.
That’s because of the binning process. There’s no point in throwing out a die when there is a defect in the iGPU.
Posted on Reply
#33
Turmania
This means Intel will be uncompetitive for at least 3 more years.
Posted on Reply
#34
biffzinker
Turmania: This means Intel will be uncompetitive for at least 3 more years.
Up to 5 years for anything groundbreaking that will be revealed.
Posted on Reply
#35
efikkan
Harry Lloyd: Effectively, Intel is still using Nehalem.
Effectively, Intel is still using P6, in a way… :rolleyes:
Harry Lloyd: All they did was make some teeny tiny IPC improvements, increase clock speeds and core counts, update the memory and PCI-E controllers and add new instructions (usable only in very specific applications).
What about the following "tiny" changes:
- A massive overhaul of the vector engine
- One additional execution port
- Significantly larger OoO window, better branch prediction and uop cache.
- Significantly higher int MUL/DIV performance (Sunny Cove)
- Significantly faster memory address calculation (Sunny Cove)
I certainly think Intel could have brought more, but there have been major changes since Nehalem. Sandy Bridge and Ice Lake (Sunny Cove) are the biggest improvements relatively speaking.
Harry Lloyd: They have not even touched the cache subsystem at all (just increased L3 size for higher core counts). They are finally doing that with Rocket Lake, but it is still just an increase in L1 and L2 sizes.
Both Skylake-X and Ice Lake featured major cache overhauls.
Harry Lloyd: Right now they are stuck not only on the same architecture, but also the same manufacturing process, which is why there has been zero progress made since 2015 Skylake.
Nearly all of Intel's current problems are tied to their manufacturing problems. Ice Lake has been ready for years, and Sapphire Rapids is complete with just the final touches left. Both of these are good architectural improvements, and we could have gotten them two years earlier if their 10nm project hadn't derailed this badly.
Harry Lloyd: Hopefully all these personnel changes will help. They really need some big changes. Thank God for AMD, they finally made Intel wake up.
Their manufacturing department and the corresponding management must wake up.
Their architectural department is not the problem.

Honestly I doubt this personnel change is going to change anything, and if it does, it will take at least five years before we see the results. This is probably just about finding a good person to lead until they find a successor, don't read too much into these things. Good engineering makes good products, not "rockstars".
Posted on Reply
#36
_UV_
biffzinker: Up to 5 years for anything groundbreaking that will be revealed.
They already revealed the LGA 1700 mess of mixed cores; expect a strong shitstorm when actual buyers taste this "product".

@efikkan

Every year we see new roadmaps, and it's either server or mobile CPUs, or something abandoned for a next-gen solution.
Posted on Reply
#37
BArms
tabascosauz: A modern Intel iGPU consumes nearly 0W all the time (if not that) when not actively being used for 3D purposes (which is highly unlikely for any user that cares about this subject) or Quicksync. The PL1 of these processors is somewhere in the 200-300W range for recent 8- and 10-cores. Power budget for the cores is not affected in any way by the iGPU lol.
The problem is that the iGPU takes up a huge amount of die space. The latest Intel CPU I can find an annotated die shot for is the 9900K, which has something like 30% taken up by the iGPU. That's probably at least 4 more cores they could have added, or they could have used the space (and the socket pins) for more cache or lower prices. My point is that for most gamers (granted not all), a 10900K iGPU is completely worthless and actually taking up precious space and pins. I think one of the biggest reasons AMD has been doing so well is because 30% of their Zen 3 design isn't reserved by a feature that virtually nobody wants in a desktop CPU where a Radeon or GeForce card is present, and Intel could be more competitive again simply by not giving gamers/professionals what they don't want.
Posted on Reply
#38
TumbleGeorge
There is one question:
How much time will it take to bring to market the better products designed by mister "Nehalem"? 5-6 years looks like the minimum to implement. That means 2026-2027, on 2 nm or below, with the very last lithography ever. So much work and money spent for one last series of traditional CPUs? There is no experience outside of labs with other kinds of manufacturing, for organic computers or for computers controlled at the single-atom level.
Posted on Reply
#39
TheoneandonlyMrK
I don't know what this guy is capable of doing at Intel, but I can't help but think that he also worked on some of the most insecure silicon yet made, regardless.
Good luck to him and hopefully he'll do what most hope for Intel's future.
Posted on Reply
#41
tabascosauz
BArms: The problem is that the iGPU takes up a huge amount of die space. The latest Intel CPU I can find an annotated die shot for is the 9900K, which has something like 30% taken up by the iGPU. That's probably at least 4 more cores they could have added, or they could have used the space (and the socket pins) for more cache or lower prices. My point is that for most gamers (granted not all), a 10900K iGPU is completely worthless and actually taking up precious space and pins. I think one of the biggest reasons AMD has been doing so well is because 30% of their Zen 3 design isn't reserved by a feature that virtually nobody wants in a desktop CPU where a Radeon or GeForce card is present, and Intel could be more competitive again simply by not giving gamers/professionals what they don't want.
It's not that I don't understand the sentiment - I just don't understand why you're so irked about the existence of an iGPU that doesn't actually detract from the experience.

I'm not sure what you think a Coffee Lake or Comet Lake die actually looks like. It occupies less than 1/3 the space under the heatspreader and it's anything but square, but that doesn't mean you can just add shit to the silicon. You can't just tack on more L3 or L4 cache - that's not how layout works, not on Intel and not on AMD.

And obviously, the power draw and the interconnect are pretty good reasons already why it won't/can't be done just because it looks possible. Comet Lake weighed down the bus to the point where Zen 3 is actually lower core-core latency.

There are only two dies in Comet Lake. A 6-core die that's more or less a carryover, and a 10-core die. Making a new die that won't ever sell outside of a small subset of "gamers" is in no way profitable, especially since that 10C die already probably is less useful to Intel's business than the 6C.
Posted on Reply
#42
biffzinker
tabascosauz: You can't just tack on more L3 or L4 cache - that's not how layout works, not on Intel and not on AMD.
You missed the part about the different type of transistor being used for the SRAM cache for more on die in a smaller space compared to the transistors used in the L1/L2 cache.
tabascosauz: It's not that I don't understand the sentiment - I just don't understand why you're so irked about the existence of an iGPU that doesn't actually detract from the experience.
Not to mention it's on hand for those in need of a GPU when the dedicated card goes kaput. Later on, after an upgrade, if you choose not to sell it second hand, it can be passed down without needing a dedicated card for a repurposed PC.
Posted on Reply
#43
R0H1T
efikkan: Nearly all of Intel's current problems are tied to their manufacturing problems.
I wouldn't say "nearly all of them" ~ Intel chose not to offer 4+ cores on mainstream desktop for over a decade. If they did that even a couple of years prior to Zen's launch it would've been DoA, a lot of Intel's problems stem from insatiable greed & IMO lack of foresight. Not for the first time mind you, as Apple's shown them with their Ax lineup! I'm not even getting to their ridiculous market segmentation with locked processors, chipsets, heck (extended) memory support even for Xeons :shadedshu:
www.anandtech.com/show/15405/intel-cuts-xeon-prices
Posted on Reply
#44
efikkan
R0H1T: I wouldn't say "nearly all of them" ~ Intel chose not to offer 4+ cores on mainstream desktop for over a decade. If they did that even a couple of years prior to Zen's launch Zen would've been DoA, a lot of Intel's problems stem from insatiable greed & IMO lack of foresight. Not for the first time mind you, as Apple's shown them with their Ax lineup!
FYI, the cancelled Cannon Lake (shrink of Skylake) was designed as an 8-core, even before the launch of Zen.
There were also some 6-core plans for Kaby Lake back in the days, but those were probably scrapped early on.
Posted on Reply
#45
THU31
Die size was never the problem. The problem was greed.

Lynnfield was a quad core 45 nm CPU with a die size of 296 mm2 (no iGPU). Skylake was a quad core 14 nm CPU with a die size of 122 mm2 (with an iGPU). There was basically no price difference between those CPUs. We were paying the same amount of money for quad cores for about 8 years (and way more if we wanted to overclock them).

An octa core Coffee Lake CPU had a die size of 174 mm2, but the cheapest model was way over 300 $. Rocket Lake should be very similar to this. 14 nm is probably the most mature process in history, the yields they are getting on these small dies must be crazy high. But the only thing that matters is performance relative to the competition. That is the main factor on which prices are based.
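
To put the die sizes quoted above side by side, here is a minimal Python sketch. The figures are the ones given in this post and are not independently verified; the dictionary labels and the `relative_area` helper are made up for illustration.

```python
# Die sizes in mm^2 exactly as quoted in the post above (not independently verified).
die_mm2 = {
    "Lynnfield (45 nm, 4C, no iGPU)": 296,
    "Skylake (14 nm, 4C + iGPU)": 122,
    "Coffee Lake (14 nm, 8C)": 174,
}

BASELINE = "Lynnfield (45 nm, 4C, no iGPU)"


def relative_area(name: str) -> float:
    """Return a die's area as a fraction of the Lynnfield die area."""
    return die_mm2[name] / die_mm2[BASELINE]


for name in die_mm2:
    print(f"{name}: {die_mm2[name]} mm^2, {relative_area(name):.0%} of Lynnfield")
```

On those numbers the quad-core Skylake die comes out at roughly 41% of the Lynnfield area, and even the octa-core Coffee Lake die at roughly 59%.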

AMD did a similar thing with Zen 3. They are the fastest CPUs on the market, which is why they are much more expensive than Zen 2 (even almost twice as much, considering the price reduction of older models).

EDIT.

The i7-8700 (6C/12T) was 303 $. The i5-10600 is basically the exact same CPU, two years later at 213 $. And the i5-10400F also uses the same die, at just 157 $. Same process, same die size, half the price.
The die size might even be higher, if it uses an 8- or 10-core die with some cores disabled, I cannot find any info on that.
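
As a quick check on the "half the price" claim, a minimal Python sketch using only the prices quoted in this post (they are not verified list prices, and the `price_ratio` helper is just for illustration):

```python
# Prices in USD exactly as quoted in the post above (not independently verified).
prices = {
    "8700": 303,
    "10600": 213,
    "10400F": 157,
}


def price_ratio(newer: str, older: str) -> float:
    """Return the newer part's price as a fraction of the older part's price."""
    return prices[newer] / prices[older]


for model in ("10600", "10400F"):
    print(f"{model}: {price_ratio(model, '8700'):.0%} of the 8700 price")
```

That works out to about 70% for the 10600 and about 52% for the 10400F, so "half the price" holds for the F part.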

The i5-10400F is an insanely good CPU for 60 Hz gaming (and not terrible at 120 Hz either). One of the best ever. Pair it with a cheap Z board and DDR4-3200 and you are golden.
Posted on Reply
#46
mechtech
If I recall correctly, the big boost from Nehalem was doing like AMD did and integrating the memory controller??
Posted on Reply