Intel "Alder Lake-S" Confirmed to Introduce LGA1700 Socket, Technical Docs Out for Partners

MxPhenom 216 · Jun 30, 2020

InVasMani said:
What I think Intel should do is connect the two chips with traces through the substrate itself and call it hyper tunneling. Basically, convert hyper-threading into actual physical cores on another package, with a chip that matches the base clock performance and activates when the turbo boost performance heat-throttles. Going further, because voltages naturally rise and fall in peaks and dips, they could make each physical core have 3 threads, then sync them to match and put a physical core on each dip representing base clock performance, with the peak representing the turbo boost performance. That way, when the turbo boost performance throttles, the two physical cores on each rising and falling signal take over, allowing the turbo performance to cool down and kick back in sooner. Squeeze more turbo cores onto a single package and supplement that performance with more base clock cores from another package in the form of hyper-threading, with the turbo performance sandwiched in between.

The cool thing is the two CPU packages could ping-pong the power throttling off and on between inactivity and activity, so when one package gets engaged the other can disengage to reduce heat and energy. If they can do that and sync it well, it could be quite effective, much like the fan profiles on GPUs, which, at least when set up and working right, are quite nice: from the 0 dB fan profiles, to just when they trigger higher fan RPMs and how long they run to cool things down, to winding the fan RPMs back down after they've lowered the GPU temps.

I'm not so sure that'll work as well as you think it might. Plus, it'll get expensive from a price-per-package standpoint.

I think clock skew between the two would be hell and a half to compensate for and manage.

InVasMani · Jun 30, 2020

Yeah, IDK what makes the most sense, given the scheduler isn't perfect in the first place at adapting flexibly to the best-case user scenarios. They need to come up with some kind of practical perk to make use of a big LITTLE design, if that's what they are aiming at. If they could improve hyper-threading via a 2nd package, maybe that's an option, but whether it's practical or possible I'm not certain, and I'm certainly not a technical design engineer on the matter. I mean, if they took some instruction sets off one package and placed them on the other, and used that die area to strengthen the remaining things it already does well, I can see that being a possibility, perhaps. In a scenario like that, say you have 4 CPU die packages with some different instructions between them: some instruction sets might be universal and shared by all of them, while others might be specific to one package. For example, perhaps SSE 4.1/4.2/AVX2/FMA3 only go on one package while the other lacks those but makes up for it in other ways. Perhaps Intel puts a new FML instruction set on one package that covers security flaws in the chip designs. Who knows, it's Intel; the rabbit hole's the limit.

efikkan · Jun 30, 2020

InVasMani said:
Yeah, IDK what makes the most sense, given the scheduler isn't perfect in the first place at adapting flexibly to the best-case user scenarios. They need to come up with some kind of practical perk to make use of a big LITTLE design, if that's what they are aiming at. If they could improve hyper-threading via a 2nd package, maybe that's an option, but whether it's practical or possible I'm not certain, and I'm certainly not a technical design engineer on the matter.

My issue with big-little designs is the state of OS schedulers (the ancient Windows scheduler in particular), and how far we should expect OS schedulers to be optimized for specific microarchitectures.

Just balancing HT is bad enough. Hopefully, if Intel chooses a big-little design on some or all CPUs, they will drop HT; the combination of the two would be a scheduling nightmare. If anything, big-little might be easier to balance than HT, if done properly. HT also has complex security considerations, as we've come to learn over the past couple of years, and HT sometimes causes latency issues and cache pollution, which negatively impact some tasks.
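To make the balancing problem a bit more concrete, here is a minimal sketch of a toy placement policy for big and little cores. It is not how the Windows or Linux scheduler works; the names `Core`, `Task` and `place_tasks`, the relative speed numbers, and the `latency_sensitive` hint are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Core:
    name: str
    speed: float        # relative throughput (hypothetical units); >= 1.0 counts as "big"
    load: float = 0.0   # total run time already assigned to this core


@dataclass
class Task:
    name: str
    work: float               # abstract amount of work
    latency_sensitive: bool   # hint a real scheduler would have to infer somehow


def place_tasks(tasks, cores):
    """Greedy placement: latency-sensitive tasks prefer big cores,
    background tasks prefer little cores; within the preferred pool,
    pick the core on which the task would finish earliest."""
    big = [c for c in cores if c.speed >= 1.0]
    little = [c for c in cores if c.speed < 1.0]
    placement = {}
    # Place the largest tasks first so they get the emptiest cores.
    for task in sorted(tasks, key=lambda t: -t.work):
        if task.latency_sensitive:
            pool = big or cores      # fall back to any core if no big core exists
        else:
            pool = little or cores   # fall back to any core if no little core exists
        best = min(pool, key=lambda c: c.load + task.work / c.speed)
        best.load += task.work / best.speed
        placement[task.name] = best.name
    return placement


if __name__ == "__main__":
    cores = [Core("big0", 1.0), Core("little0", 0.5)]
    tasks = [Task("game", 4.0, True), Task("indexer", 2.0, False)]
    print(place_tasks(tasks, cores))
```

Even this toy version depends on knowing which tasks are latency-sensitive, which is exactly the information a general-purpose OS scheduler does not get for free.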

InVasMani said:
I mean, if they took some instruction sets off one package and placed them on the other, and used that die area to strengthen the remaining things it already does well, I can see that being a possibility, perhaps. In a scenario like that, say you have 4 CPU die packages with some different instructions between them: some instruction sets might be universal and shared by all of them, while others might be specific to one package. For example, perhaps SSE 4.1/4.2/AVX2/FMA3 only go on one package while the other lacks those but makes up for it in other ways. Perhaps Intel puts a new FML instruction set on one package that covers security flaws in the chip designs. Who knows, it's Intel; the rabbit hole's the limit.

I'm very skeptical about having different instruction sets on different cores. I don't know if executables have all ISA features flagged in their header, but this would be a requirement.
An alternative would be to implement slower FPUs for the little cores, which use fewer transistors and more clocks, but retain ISA compatibility.
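A rough sketch of why the header flags would matter, assuming (hypothetically) that each program declared the ISA extensions it needs and that the scheduler could see them. The core names, the feature strings, and the functions `common_isa` and `eligible_cores` are made up for illustration and do not correspond to any real OS or CPUID interface.

```python
def common_isa(cores):
    """Features supported by every core; code restricted to these
    can be migrated to any core without faulting."""
    feature_sets = [set(feats) for feats in cores.values()]
    return set.intersection(*feature_sets) if feature_sets else set()


def eligible_cores(required, cores):
    """Names of the cores that support every feature a program declares.
    Without such a declaration the scheduler cannot know this set."""
    required = set(required)
    return sorted(name for name, feats in cores.items() if required <= set(feats))


if __name__ == "__main__":
    # Hypothetical mixed-ISA chip: only the big core has AVX2/FMA3.
    cores = {
        "big0": {"sse4.2", "avx2", "fma3"},
        "little0": {"sse4.2"},
    }
    print(common_isa(cores))
    print(eligible_cores({"avx2"}, cores))
```

In this model a program that uses AVX2 is pinned to the big core only, while a program that checks for AVX2 once at startup and is later migrated to `little0` would hit an illegal instruction. Keeping the ISA identical and making the little cores' FPUs slower avoids the problem entirely.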

InVasMani · Jun 30, 2020

efikkan said:
My issue with big-little designs is the state of OS schedulers (the ancient Windows scheduler in particular), and how far we should expect OS schedulers to be optimized for specific microarchitectures.

Just balancing HT is bad enough. Hopefully, if Intel chooses a big-little design on some or all CPUs, they will drop HT; the combination of the two would be a scheduling nightmare. If anything, big-little might be easier to balance than HT, if done properly. HT also has complex security considerations, as we've come to learn over the past couple of years, and HT sometimes causes latency issues and cache pollution, which negatively impact some tasks.

I'm very skeptical about having different instruction sets on different cores. I don't know if executables have all ISA features flagged in their header, but this would be a requirement.
An alternative would be to implement slower FPUs for the little cores, which use fewer transistors and more clocks, but retain ISA compatibility.

To that I'll argue that we should certainly expect OS schedulers to improve, in particular the ancient Windows one. I think HT is likely on its way to being phased back out in favor of more physical cores, which do what HT was a stop-gap solution for in the first place; HT is a convoluted scheduling mess, especially on an OS like Windows that's poorly optimized in that area. I see HT as adding a layer of complexity that doesn't even achieve what it sets out to in the first place. When it works it's fine, but when it doesn't it's a mess. HT takes up some die space as well, I'm sure, that might be better used for more legitimate resources. I think the bigger issue with the Windows scheduler is scaling: moving forward, it clearly looks to be at a bit of an impasse at the very high end for some of these extremely multi-core AMD chips. Basically, AMD has pushed the core count much higher than Microsoft seemingly anticipated, and Microsoft has been caught with its pants down. It's to the point where HT on the AMD chips is a real bottleneck and you're better off outright disabling it to avoid all the thread contention, or that was my takeaway from some of Linus's benchmarks on one of those AMD Uber FX chips.

I think, with all the thread contention in mind, getting rid of HT entirely could make more sense going forward, especially as we're able to utilize more legitimate physical cores today anyway. It's my belief that it'll lead to more consistent and reliable performance as a whole. There are of course middle-ground solutions, like taking a single HT thread and sharing it between two adjacent CPU cores, to be utilized in a round-robin fashion on an as-needed basis. By doing it that way, AMD/Intel could diminish the overall scheduler contention issue in extreme core count scenarios until, or if, Microsoft is able to better resolve those concerns.

I think the big thing is that different options need to be put on the table and considered; the CPU has to evolve if it is to improve. I think big LITTLE certainly presents itself as an option to be inserted somewhere in the grand scheme of things going forward, but where it fits is hard to say, and the first design of something radically different always has the steepest learning curve.

efikkan · Jun 30, 2020

InVasMani said:
I think HT is likely on its way to being phased back out in favor of more physical cores, which do what HT was a stop-gap solution for in the first place; HT is a convoluted scheduling mess, especially on an OS like Windows that's poorly optimized in that area. I see HT as adding a layer of complexity that doesn't even achieve what it sets out to in the first place. When it works it's fine, but when it doesn't it's a mess. HT takes up some die space as well, I'm sure, that might be better used for more legitimate resources.

At the time, adding HT only cost a few percent extra transistors, and allowed the core to use some of the stalled clock cycles for other threads. As CPUs have grown more efficient, this waste has been reduced, so there are fewer and fewer free cycles to use. Additionally, CPUs are only growing more reliant on cache and prefetching, so having two threads share these can certainly hurt performance. Thirdly, the ever-advancing CPU front-ends result in more and more complexity to handle HT/SMT safely (which they failed to do). I believe we're at the point where it should be cut, as it makes less and less sense for non-server workloads.
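As a back-of-the-envelope illustration of the "fewer free cycles" point, here is a deliberately crude model. It assumes each hardware thread is stalled for a fixed fraction of cycles and that a sibling thread can fill those cycles perfectly, ignoring cache contention and every other real-world effect. The function names and the stall fractions used in the example are hypothetical, not measured data.

```python
def smt_utilization(stall_fraction, threads=2):
    """Upper bound on the share of cycles doing useful work when
    `threads` hardware threads share one core and each is stalled
    `stall_fraction` of the time (idealized, perfect interleaving)."""
    if not 0.0 <= stall_fraction <= 1.0:
        raise ValueError("stall_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    busy = 1.0 - stall_fraction
    return min(1.0, threads * busy)


def smt_gain(stall_fraction, threads=2):
    """Ideal throughput of the shared core relative to one thread alone."""
    single = smt_utilization(stall_fraction, 1)
    if single == 0.0:
        raise ValueError("a thread that is always stalled has no baseline")
    return smt_utilization(stall_fraction, threads) / single


if __name__ == "__main__":
    for stall in (0.5, 0.2, 0.05):
        print(stall, smt_gain(stall))
```

In this toy model the best-case gain from a second thread shrinks as the stall fraction shrinks, and a real core would see less than this bound because the two threads also compete for cache and prefetch resources.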

One interesting thing is the rumors of AMD moving to 4-way SMT. I do sincerely hope this is either untrue or limited to server CPUs. This is the wrong move.

Raevenlord · Jan 12, 2021

I think Big-Little makes a lot of sense, especially considering the work Apple did with the M1, which smokes Intel's previous offerings on the platform and runs circles around most, if not all, solutions currently on the market when running native apps. A non-symmetrical core design seems the way to go to improve both power efficiency and performance. And if Apple could do it and implement it in iOS, I don't see why Microsoft couldn't.

yotano211 · Jan 12, 2021

Raevenlord said:
I think Big-Little makes a lot of sense, especially considering the work Apple did with the M1, which smokes Intel's previous offerings on the platform and runs circles around most, if not all, solutions currently on the market when running native apps. A non-symmetrical core design seems the way to go to improve both power efficiency and performance. And if Apple could do it and implement it in iOS, I don't see why Microsoft couldn't.

I guess you don't know anything about Apple, because they control everything from the hardware to the software.
I guess you don't know anything about Microsoft either; they only control the software. Kinda hard for Microsoft and/or Intel to do something like Apple did with the M1: all the companies would have to sit down and agree to a joint multi-company agreement. Good luck with that.
On your next post, I would advise teaching yourself more about tech companies.

thesmokingman · Jan 12, 2021

yotano211 said:
I guess you don't know anything about Apple, because they control everything from the hardware to the software.
I guess you don't know anything about Microsoft either; they only control the software. Kinda hard for Microsoft and/or Intel to do something like Apple did with the M1: all the companies would have to sit down and agree to a joint multi-company agreement. Good luck with that.
On your next post, I would advise teaching yourself more about tech companies.

Yeap. AMD wrote about this as the main problem with big/little... it's useless on Windows due to the scheduler not knowing how to manage or make use of it.

yotano211 · Jan 12, 2021

thesmokingman said:
Yeap. AMD wrote about this as the main problem with big/little... it's useless on Windows due to the scheduler not knowing how to manage or make use of it.

It's funny how a TPU "news editor" would write that and think it would be easy.

thesmokingman · Jan 12, 2021

yotano211 said:
It's funny how a TPU "news editor" would write that and think it would be easy.

Well, it does make sense, but it's not practical given this is MS we are talking about and their scheduler. And for AMD's part, they were talking about developing a way of doing it in hardware since... well, MSFT. lol
