Friday, December 8th 2023

Intel "Sierra Forest" Xeon System Surfaces, Fails in Comparison to AMD Bergamo

Dec 8th, 2023 01:19

Intel's upcoming Sierra Forest Xeon server chip has debuted on Geekbench 6, showcasing its potential in multi-core performance. Slated for release in the first half of 2024, Sierra Forest is equipped with up to 288 Efficiency cores, positioning it to compete with AMD's Zen 4c Bergamo server CPUs and Arm-based server chips such as those from Ampere for the favor of cloud service providers (CSPs). In the Geekbench 6 benchmark, a dual-socket configuration featuring two 144-core Sierra Forest CPUs was tested. The benchmark revealed a multi-core score of 7,770, surpassing most dual-socket systems powered by Intel's high-end Xeon Platinum 8480+, which typically score between 6,500 and 7,500. However, Sierra Forest's single-core score of 855 points was considerably lower, less than half of the 1,897 points the 8480+ manages.

The difference in single-core performance is a matter of design choice, as Sierra Forest uses Crestmont-derived Sierra Glen E-cores, which are more power- and area-efficient than the Golden Cove P-cores in the Sapphire Rapids-based 8480+. This choice is particularly advantageous for server environments where high core counts are crucial, as CSPs usually partition their instances by the number of CPU cores. However, compared to AMD's Bergamo CPUs, which use Zen 4c cores, Sierra Forest lacks raw computing performance, especially in multi-core. Sierra Forest lacks Hyper-Threading, while Bergamo offers SMT with 256 threads on the 128-core SKU. Compared with the Geekbench 6 scores of AMD's Bergamo-based EPYC 9754, the Sierra Forest results look a lot less impressive. Bergamo scored 1,597 points in single-core, almost double Sierra Forest's 855, and 16,455 points in multi-core, more than double Sierra Forest's 7,770. This is a significant advantage of the Zen 4c core, which cuts down on caches instead of being an entirely different core, as is the case with Intel's P- and E-cores. However, these are just preliminary numbers; we must wait for real-world benchmarks to see the actual performance.
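As a quick sanity check on the comparisons above, the ratios can be recomputed from the scores quoted in the article. This is only a minimal sketch: the variable and function names are illustrative, and the inputs are the preliminary leaked Geekbench 6 figures, not independently measured results.

```python
# Geekbench 6 scores as quoted in the article (preliminary, leaked figures).
SIERRA_FOREST_SINGLE = 855     # dual-socket, 2x 144-core Sierra Forest
SIERRA_FOREST_MULTI = 7770
XEON_8480_SINGLE = 1897        # Xeon Platinum 8480+
BERGAMO_SINGLE = 1597          # EPYC 9754
BERGAMO_MULTI = 16455


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator rounded to two decimals."""
    return round(numerator / denominator, 2)


# Sierra Forest single-core as a fraction of the 8480+ ("less than half").
srf_vs_8480_single = ratio(SIERRA_FOREST_SINGLE, XEON_8480_SINGLE)

# Bergamo relative to Sierra Forest ("almost double" / "more than double").
bergamo_vs_srf_single = ratio(BERGAMO_SINGLE, SIERRA_FOREST_SINGLE)
bergamo_vs_srf_multi = ratio(BERGAMO_MULTI, SIERRA_FOREST_MULTI)

print(f"Sierra Forest vs 8480+ (single-core): {srf_vs_8480_single}x")
print(f"Bergamo vs Sierra Forest (single-core): {bergamo_vs_srf_single}x")
print(f"Bergamo vs Sierra Forest (multi-core): {bergamo_vs_srf_multi}x")
```

With the quoted scores this gives roughly 0.45x, 1.87x and 2.12x, which matches the wording "less than half", "almost double" and "more than double".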

Sources: BenchLeaks, Tom's Hardware

76 Comments on Intel "Sierra Forest" Xeon System Surfaces, Fails in Comparison to AMD Bergamo

#51

Redwoodz

Intel's foundry services have failed. They have to use TSMC going forward. Any denial of this fact will just keep the bleeding going. Intel is in big trouble, they are just trying to hide it. I bet this sku won't even be released.

#52

watzupken

My opinion is Intel’s strategy of trying to sell e-core is very risky. When it comes to efficiency, they can’t beat ARM or RISCV. And you can spam a lot of ARM cores, similar to what Intel is trying to do here. What I don’t like here is Intel is charging top dollars for cheaper e-cores. So instead of selling cutting edge products, consumers and businesses are getting lower cost cores.

#53

Squared

The title is Intel "Sierra Forest" Xeon System Surfaces, Fails in Comparison to AMD Bergamo, which is a pretty misleading title considering that the judgement comes from a single benchmark tool that doesn't simulate the type of work Sierra Forest will be expected to do. And the benchmark was run on a two socket system with only as many cores as one-socket Sierra Forest is capable of.

#54

user556

You can't expect much more than rumours and click bait given the product doesn't launch till next year.

#55

Chaitanya

Redwoodz wrote: Intel's foundry services have failed. They have to use TSMC going forward. Any denial of this fact will just keep the bleeding going. Intel is in big trouble, they are just trying to hide it. I bet this sku won't even be released.

At least their CEO is going around begging for handouts from US governments.

On a serious note, here is an old article on the current state of Intel:
www.techspot.com/news/97578-intel-bad-place-they-need-admit.html

#56

Daven

Chaitanya wrote: At least their CEO is going around begging for handouts from US governments.

On a serious note, here is an old article on the current state of Intel:
www.techspot.com/news/97578-intel-bad-place-they-need-admit.html

That is a good article. Intel is indeed at a crossroads and they will need to seriously pivot their business before it’s too late. The split design/IFS approach is the one I’m advocating for. Intel has 17 fabs with two more in construction. The world needs these fabs. The world needs Intel chips less and less.

#57

remixedcat

this whole e core thing screams "something not right with the pudding" at intel... is the shit binning that bad they had to come up with a new name for even worse defects?

bug wrote: That's why I pointed out some server-specific benchmarks are in order.
E cores are built to be slower. Yet servers run hundreds of threads (way more than any single CPU can offer), so it's all a matter of spreading the workload and getting the job done promptly, but without burning through too much power. I mean, if the E cores are half as fast as Zen4c cores, but there are twice as many of them, they would get the job done just as fast. So the CPU that burns through less power would win. But the numbers just aren't there to tell either way.

At a certain point your cores fight over resources, and you don't get a return on investment from more cores because they are fighting over the resources too much.

Also this introduces latency with schedulers. Lower speeds introduce scaling issues as well if it's scaled laterally. I know a lot about kernel scheduling and latency as an audio producer. You need low latency and fast CPU scaling. The lower the per-core clock is, the higher the latency, and you get what are called xruns and kernel hangs that create audio stuttering and cracking. Having nothing but high-latency E cores is not going to cut it for anyone serious about anything. This is why I have to upgrade my Xeon 8c16t at only 2 GHz to something more than that right now. Even a 6c12t at 3 GHz would be much better... Higher core clock speed with fewer cores is best for audio production. Specifically, heavy synth patches that use a lot of layers typically get processed on just one thread, not spread evenly over several. This is due to plugin containerization within the Digital Audio Workstation.

I also chose my laptop (Dell Inspiron 15 3525) because it's got all power in all cores! (Ryzen 7 5700U, 1.9 GHz, 8c16t, up to 4.3 GHz) vs getting a more durable Dell Latitude that has only a few performance cores and a buncha useless E cores! My price range was only 500 so I made do with what I could... Bitwig performance was the most important thing! I can change my CPU governor to performance and run at full speed while producing and then go back to schedutil for day-to-day stuff.

many server applications are also structured like this code wise and suffer the same. You are not going to get good database performance from e cores, neither any GIS processing or anything intense at all...

you can't read and write to the same section of ram at the same time.

#58

Tek-Check

fancucker wrote: "Fails" is such a reckless and irresponsible statement from a journalist of TPU's calibre. These products are designed for specific applications and Intel's tertiary services and easier integration make them a more compelling option than Bergamo.

Don't let the word hurt feelings. You can always be creative with language, add 'r' and read as 'frail'.

Onasi wrote: I was genuinely sure that the biggest WX is a 4c part. Huh, guess not. Not sure why I was so convinced that it was. Mandela effect, I guess.

Products with small Zen cores:
1. 2023 - Z4 Bergamo and Siena for server (telecom and other sectors) - small cores (codename Dionysus); 8 core CCX/16 core CCD (codename Vindhya)
2. 2023 - Z4 Phoenix2 for entry mobility devices - hybrid monolithic design
3. 2024 - Z5 Turin Dense and Sorano for server 192 and 64 cores - small cores (codename Prometheus); 16 core CCX/CCD
4. 2024 - Z5 Strix Point for top mobility devices 12 cores - hybrid design
5. 2025 - Z5 Turin with AI chiplets - unknown number of cores; AI chiplets expected ~1500 TOPS

DavidC1 wrote: I wouldn't be surprised if the top 144 core version closes even the low-thread gap over Bergamo using higher frequencies than Bergamo.

These are Crestmont-derived E-cores. We can't expect miracles from higher frequency alone, without SMT. Also, it will come one year later, and AMD is already preparing Turin Dense on 3 nm to compete with Sierra Forest.

Assimilator wrote: Yeah, this is a new low for TPU. Even though I'm 100% certain that Sierra Forest CPUs are going to be comprehensively beaten by anything AMD has to offer, to use Geekbench of all things as "evidence" of that is just plain stupid... there's really no other way to put it. Geekbench is designed for consumer smartphone CPU workloads, which are about as far from server chip workloads as it's possible to be.

It's more about news effect than anything more technical, in absence of other preliminary metrics. I would not worry. At the end of the day, the news inspires us to exchange thoughts, with or without cursed Geekbench.

watzupken wrote: What I don’t like here is Intel is charging top dollars for cheaper e-cores. So instead of selling cutting edge products, consumers and businesses are getting lower cost cores.

Nobody is forced to buy those products. There are plenty of alternatives.

Squared wrote: The title is Intel "Sierra Forest" Xeon System Surfaces, Fails in Comparison to AMD Bergamo, which is a pretty misleading title considering that the judgement comes from a single benchmark tool that doesn't simulate the type of work Sierra Forest will be expected to do. And the benchmark was run on a two socket system with only as many cores as one-socket Sierra Forest is capable of.

Relax. There are no other benchmarks available. It's just a talking point. Treat it like gossip from a royal household.

#59

bug

remixedcat wrote: this whole e core thing screams "something not right with the pudding" at intel... is the shit binning that bad they had to come up with a new name for even worse defects?

At a certain point your cores fight over resources, and you don't get a return on investment from more cores because they are fighting over the resources too much.

Also this introduces latency with schedulers. Lower speeds introduce scaling issues as well if it's scaled laterally. I know a lot about kernel scheduling and latency as an audio producer. You need low latency and fast CPU scaling. The lower the per-core clock is, the higher the latency, and you get what are called xruns and kernel hangs that create audio stuttering and cracking. Having nothing but high-latency E cores is not going to cut it for anyone serious about anything. This is why I have to upgrade my Xeon 8c16t at only 2 GHz to something more than that right now. Even a 6c12t at 3 GHz would be much better... Higher core clock speed with fewer cores is best for audio production. Specifically, heavy synth patches that use a lot of layers typically get processed on just one thread, not spread evenly over several. This is due to plugin containerization within the Digital Audio Workstation.

I also chose my laptop (Dell Inspiron 15 3525) because it's got all power in all cores! (Ryzen 7 5700U, 1.9 GHz, 8c16t, up to 4.3 GHz) vs getting a more durable Dell Latitude that has only a few performance cores and a buncha useless E cores! My price range was only 500 so I made do with what I could... Bitwig performance was the most important thing! I can change my CPU governor to performance and run at full speed while producing and then go back to schedutil for day-to-day stuff.

many server applications are also structured like this code wise and suffer the same. You are not going to get good database performance from e cores, neither any GIS processing or anything intense at all...

you can't read and write to the same section of ram at the same time.

I've been pointing out throughout the thread that server workloads are nothing like what you see on your typical desktop. Yet here you are :wtf:

#60

remixedcat

bug wrote: I've been pointing out throughout the thread that server workloads are nothing like what you see on your typical desktop. Yet here you are :wtf:

That applies to any applications that require low latency!!! AI workloads are most likely similar too, as well as mapping/spatial, web applications, SQL DBs, forum software like this one, etc.

When I worked for Invision forum services we would be hitting max CPU loads a lot on some of our biggest clients; we had to tell them to get better CPUs on the servers all the time. Some things were run on single threads.

#61

bug

remixedcat wrote: That applies to any applications that require low latency!!! AI workloads are most likely similar too, as well as mapping/spatial, web applications, SQL DBs, forum software like this one, etc.

You are very, very confused. But this isn't the place for a tutorial on server software architectures, so I'll just shut up.

#62

remixedcat

bug wrote: You are very, very confused. But this isn't the place for a tutorial on server software architectures, so I'll just shut up.

Do you have any server administrator experience?? I have!! Been in the web hosting game for a long time since 2003... some web apps require good single thread performance or they crash!!

#63

unwind-protect

Strange. I worked on server backends all my life and always needed high core speed. And would run into diminishing returns from "too many" cores.

#64

bug

unwind-protect wrote: Strange. I worked on server backends all my life and always needed high core speed. And would run into diminishing returns from "too many" cores.

Old school servers, yes. But take a brief look at the admin interface of a cloud provider. You get to choose from a dozen CPUs, depending on your workload. Even when a particular application doesn't multithread that well, you just fire up additional instances per tenant or even per user and you still get to scale it horizontally. It's the age of Docker and Kubernetes, it's not the age of JBoss anymore.

#65

unwind-protect

bug wrote: Old school servers, yes. But take a brief look at the admin interface of a cloud provider. You get to choose from a dozen CPUs, depending on your workload. Even when a particular application doesn't multithread that well, you just fire up additional instances per tenant or even per user and you still get to scale it horizontally. It's the age of Docker and Kubernetes, it's not the age of JBoss anymore.

How does firing up more slow CPU cores help the problem of not having fast enough cores? (except helping the cloud bill)

#66

trparky

unwind-protect wrote: How does firing up more slow CPU cores help the problem of not having fast enough cores? (except helping the cloud bill)

I'd have to agree with you on that one. I'd rather take fewer cores with a higher clock frequency than a whole lot of cores running at drastically lower clock speeds.

#67

Crackong

unwind-protect wrote: How does firing up more slow CPU cores help the problem of not having fast enough cores? (except helping the cloud bill)

True True.

Many applications still rely on single-thread performance (and frequency).
Most of the developers out there just don't have the resources to do multi-core optimization
and just rely on the hosting software (e.g. Tomcat, WildFly, etc.) to do basic multi-threading management.
And with lackluster coding, the software just bricks when things aren't catching up in a single loop.

The result is a bunch of in-house, poorly maintained software out there that only runs well on a faster single core.

#68

Assimilator

remixedcat wrote: AI workloads are most likely similar too

So similar that NVIDIA has built custom hardware to perform them. :rolleyes:

remixedcat wrote: When I worked for invision forum services we would be hitting max cpu loads a lot on some of our biggest clients

That's because Invision is written in PHP, and PHP is shit.

unwind-protect wrote: How does firing up more slow CPU cores help the problem of not having fast enough cores? (except helping the cloud bill)

You don't get to pick frequency in the cloud, you get to pick a relative amount of virtualised performance that your application requires. How that performance is delivered is intentionally opaque; it could be via Intel CPUs, AMD CPUs, Arm CPUs, or starving children. Nobody except the cloud provider knows or cares.

trparky wrote: I'd have to agree with you on that one. I'd rather take fewer cores with a higher clock frequency than a whole lot of cores running at drastically lower clock speeds.

Again, that's not how it works in the cloud, where most of these processors will be deployed.

#69

bug

unwind-protect wrote: How does firing up more slow CPU cores help the problem of not having fast enough cores? (except helping the cloud bill)

I thought that was obvious, but apparently not: you use more costly instances having more powerful CPUs for your workloads that actually need high single-thread performance and you use the cheaper instances for everything else.

#70

Dr. Dro

Daven wrote: That is a good article. Intel is indeed at a crossroads and they will need to seriously pivot their business before it’s too late. The split design/IFS approach is the one I’m advocating for. Intel has 17 fabs with two more in construction. The world needs these fabs. The world needs Intel chips less and less.

Fair assessment, but the world's need for x86 CPUs is relatively waning. It is due to this phenomenon that Intel weakens, as it has always centered itself on this very specific business, and that's where AMD being fabless turned out to be awfully convenient. It doesn't help that despite their foundry services being up to par, their CPU design team has run into roadblocks; the "14th gen" stunt is cold, hard proof of that. That 14900K processor never had any business existing, let alone being called that, and this is coming from someone who owns a CPU as frivolous and luxurious as the i9-13900KS.

With the server market being highly specialized and thus welcoming of weird designs that go far beyond mere benchmarks to achieve real world results, and ARM dominating the mobile market and finally making inroads into traditional mobile computing, one can't help but wonder what will happen with the x86 architecture and what the future has in store for it. I'm sure it's not really going anywhere, but I doubt that it'll be able to remain as Intel's darling and aggressively protected patent for too long, at least not without architectural innovation pushing forward regardless of it.

#71

Squared

In a way, I think Intel and AMD's refusal to license x86 to other companies is now hurting Intel. Apple, Nvidia, Qualcomm, and others have all wanted to build high-performance processors to compete with Intel, but they weren't allowed to build x86 processors so they turned to ARM. Now all Apple laptops, many servers, and even some Windows computers use ARM instead of x86, and there's a real risk that in the near future so much software will be built for ARM that using the x86 ISA will be a disadvantage in the mind of consumers.

#72

trparky

Squared wrote: Now all Apple laptops, many servers, and even some Windows computers use ARM instead of x86, and there's a real risk that in the near future so much software will be built for ARM that using the x86 ISA will be a disadvantage in the mind of consumers.

I agree, but it's going to be a slow uphill climb until that fully happens. Apple can do it because they're willing to throw the figurative baby out with the bathwater whereas with Windows, Microsoft can't do that since they have three decades of legacy software and APIs to support which effectively is a boat anchor around their necks.

#73

Assimilator

trparky wrote: I agree, but it's going to be a slow uphill climb until that fully happens. Apple can do it because they're willing to throw the figurative baby out with the bathwater whereas with Windows, Microsoft can't do that since they have three decades of legacy software and APIs to support which effectively is a boat anchor around their necks.

It's not a boat anchor, it's the reason that Microsoft is so successful.

#74

DavidC1

Tek-Check wrote: These are Crestmont-derived E-cores. We can't expect miracles from higher frequency alone, without SMT. Also, it will come one year later, and AMD is already preparing Turin Dense on 3 nm to compete with Sierra Forest.

The integer gap between Gracemont and Golden Cove is only 25%. Yes, clock speed will make all the difference. Also, in the workloads Bergamo and SRF are supposed to compete in, SMT doesn't really matter. That's why ARM servers are competitive despite the lack of it.

#75

Tek-Check

DavidC1 wrote: The integer gap between Gracemont and Golden Cove is only 25%. Yes, clock speed will make all the difference. Also, in the workloads Bergamo and SRF are supposed to compete in, SMT doesn't really matter. That's why ARM servers are competitive despite the lack of it.

No. SMT does really matter in many workloads and provides significant performance uplift where relevant. See those workloads below.
Growing popularity of ARM servers has nothing to do with usefulness of SMT on Bergamo SKUs which sell now like hot cakes.
www.phoronix.com/review/amd-epyc-9754-smt/8
