Thursday, March 9th 2023

South Korean Company Morumi is Developing a CPU with Infinite Parallel Processing Scaling

Mar 9th, 2023 12:59 Discuss (29 Comments)

One of the biggest drawbacks of modern CPUs is that adding more cores doesn't increase performance linearly. Parallelism in CPUs offers limited scaling for most applications and none at all for some. A South Korean company called Morumi is now taking a stab at solving this problem and wants to develop a CPU that can offer more or less infinite processing scaling as more cores are added. The company has been around since 2018 and has focused on various telecommunications chips, but has now started development of what it calls every one period parallel processor (EOPPP) technology.

EOPPP is said to distribute data to each of the cores in a CPU before the data is processed, over a type of mesh network inside the CPU. This is said to allow an almost unlimited number of instructions to be handled at once, if the CPU has enough cores. Morumi already has an early 32-core prototype running on an FPGA, and in certain tasks the company has seen a tenfold performance increase. It should be noted that this requires software specifically compiled for EOPPP, and Morumi is set to release version 1.0 of its compiler later this year. It's still early days, and it'll be interesting to see how this technology develops. If it's successfully developed, there's also a high chance of Morumi being acquired by someone much bigger that wants to integrate the technology into its own products.

Sources: The Elec, Morumi

29 Comments on South Korean Company Morumi is Developing a CPU with Infinite Parallel Processing Scaling

sLowEnd

They're literally claiming to break Amdahl's Law in that chart there. lol
en.wikipedia.org/wiki/Amdahl%27s_law

I'll believe it when I see it, but I strongly suspect the capabilities of this EOPPP thing are either grossly exaggerated, or an outright scam entirely.
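
For reference, Amdahl's law in its usual textbook form. The symbols are assumptions introduced here, not taken from the article or the chart: $p$ is the fraction of the runtime that can be parallelized and $N$ is the number of cores.

```latex
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}
```

So unless $p = 1$ (no sequential part at all), the speedup is bounded no matter how many cores are added.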

TheoneandonlyMrK

Around since 2018! And in 2022 they're aiming for infinite scaling!

Sighhh

I quote: "is said to distribute data to each of the cores in a CPU before the data is being processed, which is said to be done over a type of mesh network inside the CPU. This is said to allow for an almost unlimited amount of instructions to be handled at once,"

Hmmm, is anyone else thinking WTAF, or is it just me?

I thought that's how CPUs work: distribute data, work on data. Does an EOPPP use magical stuff where Intel uses silicon?

What gives?

And have I mentioned that I am inventing a raycasting chip that can do infinite rays? It takes work in first, then does the work, and through this simple change I WILL BEAT Nvidia. Wait, what?

TumbleGeorge

sLowEnd said: They're literally claiming to break Amdahl's Law in that chart there. lol
en.wikipedia.org/wiki/Amdahl%27s_law

I'll believe it when I see it, but I strongly suspect the capabilities of this EOPPP thing are either grossly exaggerated, or an outright scam entirely.

What you are misunderstanding is this:

TheLostSwede said: and in certain tasks

TheLostSwede

News Editor

TheoneandonlyMrK said: Around since 2018! And in 2022 they're aiming for infinite scaling!

I assume that the difference here is that the data is divided up into smaller chunks, so each processor core works on a chunk of data and the chunks are put back together somewhere at the end. To be honest, it's not entirely clear how it works and only so much info is available.

From the source link. Maybe I misunderstood something.

The pre-saved data are processed at once and the processed data are moved and saved in parallel on a mesh network. Using this saved result in the next period allows the sequential processing of this parallel data.
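
The "divide into chunks, process, put back together" reading can be sketched in ordinary software. This is only an illustration of that guess, not Morumi's design: the function names, the sum-of-squares workload, and the worker count are all hypothetical, and in standard CPython the threads will not actually execute the arithmetic in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, n_chunks):
    """Split `data` into at most `n_chunks` contiguous pieces."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_chunk(chunk):
    """Hypothetical per-core work: sum of squares of one chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    chunks = split_into_chunks(data, workers)       # distribute the data first
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))  # independent work
    return sum(partials)                            # sequential merge at the end


print(parallel_sum_of_squares(list(range(10)), workers=3))
```

Note that the final merge is a sequential step, which is exactly the kind of part that limits scaling as more workers are added.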

TumbleGeorge

Basically, computers are "stupid": they constantly and unnecessarily repeat the same calculations for different parts of a task, when it would be easier to apply the result of a single calculation everywhere the formula is the same. Instead of calculating 2+1=3 a trillion times in different queues, a single calculation is enough, and the resulting value is embedded wherever it is needed.
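
In software, reusing a result instead of recomputing it is usually called memoization. A minimal sketch of the 2+1 example follows, assuming a pure function and using Python's standard `functools.lru_cache`; the `add` function and the call counter are made up for illustration and say nothing about how EOPPP works.

```python
from functools import lru_cache

call_count = 0  # counts how often the body of add() really executes


@lru_cache(maxsize=None)
def add(a, b):
    """Pure function whose results are cached per argument pair."""
    global call_count
    call_count += 1
    return a + b


# Ask for 2 + 1 many times; only the first request computes anything.
results = [add(2, 1) for _ in range(100_000)]
print(call_count)
```

The cache lookup still costs something on every call, and this only works when the same inputs always give the same output.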

TheoneandonlyMrK

TheLostSwede said: I assume that the difference here is that the data is divided up into smaller chunks, so each processor core works on a chunk of data and the chunks are put back together somewhere at the end. To be honest, it's not entirely clear how it works and only so much info is available.

From the source link. Maybe I misunderstood something.

Oh right, sounds a bit mental even if vague. Again, that's what, two changes? The work is split into chunks at the start, worked on, and put back together at the end.

We have two versions of this in modern PCs already. This is exactly what a GPU does: unified processing across cores, with memory constraints since SRAM has stopped scaling and in general eats space. Obviously a CPU does this on a limited, small scale?

But EOPPP needs to be specifically written for or compiled for, and by the sound of it conceptually written for. OK, I mean academia and enterprise might have a use, but I think it's limited, especially since we have massively parallel symptoms we already struggle to make work on general tasks, and not enough tasks to warrant the financial input.

We'll see, but Cerberus would also be saying yo, what now.

PS: massively parallel systems :p made me laugh, it's staying. Now where are those glasses? :):D

80-watt Hamster

TheoneandonlyMrK said: But EOPPP needs to be specifically written for or compiled for, and by the sound of it conceptually written for. OK, I mean academia and enterprise might have a use, but I think it's limited, especially since we have massively parallel symptoms we already struggle to make work on general tasks, and not enough tasks to warrant the financial input.

A technology doesn't need to have consumer-oriented use cases to be interesting, IMO.

ExcuseMeWtf

TumbleGeorge said: What you are misunderstanding is this:

in certain tasks the company has seen a tenfold performance increase

Amdahl's law doesn't prevent a 10x performance increase, or really an increase by any arbitrary factor, as long as the sequential part is small enough.
It's the "no performance limit" claim that is BS if there previously was a limit, as that would require the sequential part to be nonexistent, i.e., the program code being redesigned and not just run on another CPU.
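
A quick numeric check of this point, as a sketch using the standard Amdahl formula. The 95% parallel fraction is a made-up workload, not a figure from Morumi.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when `parallel_fraction` of the runtime scales with cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def max_speedup(parallel_fraction: float) -> float:
    """Upper bound on the speedup as the core count grows without limit."""
    return 1.0 / (1.0 - parallel_fraction)


# Hypothetical workload: 95% of the runtime parallelizes, 5% is sequential.
p = 0.95
for cores in (1, 32, 128, 1024):
    print(f"{cores:>5} cores -> {amdahl_speedup(p, cores):.2f}x")
print(f"limit -> {max_speedup(p):.2f}x")
```

With these assumed numbers a 32-core part already exceeds 10x, yet no core count gets past 20x.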

TumbleGeorge

ExcuseMeWtf said: Amdahl's law

So it is not a law, since exceptions can exist; it is a theorem with a limited range of conditions under which it is valid.

#10

TheoneandonlyMrK

80-watt Hamster said: A technology doesn't need to have consumer-oriented use cases to be interesting, IMO.

I agree, I am intrigued, but the vague in this one is STRONG, and what is said is so... ANDDD?!?!

#11

ExcuseMeWtf

TumbleGeorge said: So it is not a law, since exceptions can exist; it is a theorem with a limited range of conditions under which it is valid.

No. It's valid for ANY code and any processor; you just seem to be misunderstanding where it applies.

Let's say you have a piece of code.

1) You run it on CPU A, say, a 128-core Xeon, but with 127 of those cores disabled. You get some performance number.

2) Now you enable all cores and run it again. You get a 10x speedup.

3) Now have the same code run on a different CPU B with only 1 core enabled and 127 disabled. You get some other performance number.

4) Rerun that code on CPU B with all 128 cores. What would the speedup vs scenario 3) be? 10x too.

What would the difference between CPU A and CPU B be when run single-to-single or multi-to-multi? That has nothing to do with Amdahl's law, but with how the A and B architectures are optimized for this kind of task.

So what Amdahl's law states is that the speedup between scenarios 1 vs 2 and 3 vs 4 is the same, because you keep the same architecture but add more cores. This is the actual scope of Amdahl's law.

Scenarios 1 vs 3 and 2 vs 4 are not in the scope of Amdahl's law.

Changing the architecture is a different scenario. It can make CPU B 10/100/1000x faster than CPU A core-to-core, but it cannot change the fact that the speedup from adding more cores will plateau in the same proportion. That max speedup is an inherent property of the specific code and not something to work around in CPU architecture.

The only way to make it scale without limit is to rewrite the code so that there is no sequential part and all threads run independently of each other.

Then you get infinite scaling with more cores on CPU A, but also on CPU B, and on any other CPU that can run this code.

IOW, there is nothing magical about the described CPU that would make the same code scale infinitely if it didn't already.
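
The four scenarios can be put into a toy model. This is a sketch under idealized assumptions: the serial fraction is the same on both CPUs, the parallel part splits evenly, and the per-core speeds and the 9.3% serial fraction are made-up values chosen to land near 10x on 128 cores.

```python
def runtime(serial_fraction: float, cores: int, per_core_speed: float,
            work: float = 1.0) -> float:
    """Idealized runtime: the serial part uses one core, the rest splits evenly."""
    serial = work * serial_fraction
    parallel = work * (1.0 - serial_fraction)
    return (serial + parallel / cores) / per_core_speed


def core_scaling(serial_fraction: float, cores: int, per_core_speed: float) -> float:
    """Speedup of `cores` cores over a single core of the same CPU."""
    return (runtime(serial_fraction, 1, per_core_speed)
            / runtime(serial_fraction, cores, per_core_speed))


SERIAL = 0.093       # assumed serial fraction of the code
CPU_A_SPEED = 1.0    # hypothetical per-core speed of CPU A
CPU_B_SPEED = 100.0  # hypothetical CPU B, 100x faster core-to-core

scaling_a = core_scaling(SERIAL, 128, CPU_A_SPEED)  # scenario 1 vs 2
scaling_b = core_scaling(SERIAL, 128, CPU_B_SPEED)  # scenario 3 vs 4
print(f"CPU A: {scaling_a:.2f}x, CPU B: {scaling_b:.2f}x from 1 to 128 cores")
```

In this model the per-core speed cancels out of the ratio, so the faster architecture shifts every runtime but leaves the scaling curve unchanged.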

#12

TumbleGeorge

ExcuseMeWtf said: No. It's valid for ANY code and any processor; you just seem to be misunderstanding where it applies.

Mathematical logic is not always correct. At one time we described geocentrism mathematically with the "correct" formulas; then we looked and saw that the Earth is not the center around which everything else revolves.

#13

ExcuseMeWtf

Then please pinpoint exactly which observation contradicts Amdahl's law here.

A 10x speedup from more cores is not it, as that law predicts it to be entirely possible.

Making the same code scale infinitely with more cores when it didn't on other processors? That's not an observation. That's a claim, and not one validated by anything provided. Until it actually gets validated, Amdahl's law stands. And I have the temerity to strongly doubt it would ever get validated. Somewhere in the range of doubting that a perpetuum mobile exists.

For the record, "10x speedup" is a claim too for all we know now, but an easily believable one, since:

1) It does not violate said law
2) Processor design companies have been optimizing architectures for specific tasks for decades

#14

TumbleGeorge

"Infinite" is used to attract attention and investment. It is more than obvious that this is PR wording. I don't know why you're even trying to rub in that part.

#15

ExcuseMeWtf

That's justifying clickbait headlines that are greatly exaggerated or, in this case, outright false.

I guess you won't be complaining about clickbait in headlines for the sake of being consistent then.

#16

TumbleGeorge

ExcuseMeWtf said: That's justifying clickbait headlines that are greatly exaggerated or, in this case, outright false.

I guess you won't be complaining about clickbait in headlines for the sake of being consistent then.

I wouldn't object to a title correction, but only if the article was written by the OP here, who would accordingly have the right to change the title. If it is only a translation and the article is owned by another author, hardly anything can be done about it without his consent.

#17

ExcuseMeWtf

The way TheLostSwede published it here is not the issue.

The company's claims are.

#18

TumbleGeorge

It is too early to express such an opinion. What if they succeed? Will you turn your opinion 180°?

#19

ExcuseMeWtf

Succeed in what exactly?

Creating a highly performing/efficient architecture for specific tasks? I hope they do lol.

Overturning Amdahl's law by making programs suddenly scale perfectly with more cores when they didn't on other processors? I have a bridge to sell you, if you honestly believe that.

#20

TumbleGeorge

Oh no. I go by the principle of touching something first to be sure of it, which cannot possibly happen at such an early stage. But I'm not in a hurry to dismiss things as impossible. Whoever does so should try harder. Arguing by citing a limit theorem does not work.

#21

ExcuseMeWtf

And guess what? People have been more than just "touching" this for DECADES at this point, and yet this law holds. See the wiki article posted above to find out when this law was first presented.

Same as, say, evolution. Oh, it's "just" a theory, right? Yet we have so much evidence for it, that we basically accept it at this point. Unless one's tinfoil hat is slipping that is ;)

#22

mechtech

I'm guessing not on Windows OS ;)

#23

chrcoluk

How would this help on single threaded apps/games?

What you described just sounds like enhanced hyperthreading?

#24

Parallelking

US 11,526,432
Parallel Processing Apparatus Capable of Consecutive Parallelism

#25

TumbleGeorge

Parallelking said: US 11,526,432
Parallel Processing Apparatus Capable of Consecutive Parallelism

US5287465A
??
