AMD TRX40 Chipset Not Compatible with 1st and 2nd Gen Threadrippers

Camm · Oct 10, 2019

AMD acting like Intel? It's pretty obvious power delivery is a huge factor here, or are people forgetting how gimped the 32-core Threadripper 2 parts were just to fit into the TR4 socket?

InVasMani · Oct 10, 2019

Metroid said:
There we go, amd starting to behave like intel.

They've got a long way to go to match Intel's overall shenanigans, however. Honestly, I get it with the HEDT platform and the memory channel changes. I'm sure this HEDT board replacing X399 will probably last an additional generation or two on top of that. It would have been good if things were different, though I can understand it a lot more than the socket 1151 situation. If Z270/Z370 had been quad channel, they would've been an easier pill to swallow and justify. At least in the case of TRX40 there's a more justifiable reason behind it: AMD is essentially transforming it into a cheaper entry point equivalent to the prior generation's Epyc, with probably another generation or two beyond. To me it's a different situation. It's still far better than what Intel was offering for its workstation boards, and even more so than what Intel was doing in the mainstream with socket 1151. As I said, AMD has a long way to go to match Intel's shenanigans, like abusing its power in a monopolistic, anticompetitive manner, for example.

HTC · Oct 10, 2019

windwhirl said:
Isn't Epyc a full SoC? I never heard or read that Epyc mobos had chipsets...

My bad: somehow I missed the word "chipset" in the topic and was referring to the socket instead.

Blunder, on my part: oooooops ...

notb · Oct 10, 2019

theoneandonlymrk said:
Fair enough, and important to note. Just as important, though, is the fact that not everyone is doing the same things as you.
Bioinformatics is a niche of computer use, not the sole use, or AMD would be in trouble.

Bioinformatics may be, but how is that even relevant? Underneath it's just math. And math is fairly mainstream in computing. :-D

Intel provides Intel MKL (the library that does the dirty work) and they make sure it uses AVX-512 as much as possible.
They've managed to use AVX-512 e.g. for a lot of common algebra stuff (in BLAS, LINPACK/LAPACK) and for FFT. You can google these terms if they seem mysterious.

Programming is mostly high-level. You don't have to intentionally write a program to use AVX-512. You don't have to know what AVX is. You don't have to have any idea of how CPUs work.

Let's say you want to solve a very simple problem - a system of linear equations. You know:
Ax=B
It's a single line of code in Python with NumPy:
x = linalg.inv(A).dot(B)
And it'll use AVX-512.
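
A self-contained version of the one-liner above, as a minimal sketch. The matrix values are made up for illustration. Whether the underlying LAPACK call actually executes AVX-512 code depends on which BLAS/LAPACK library NumPy was built against (MKL, OpenBLAS, ...) and on the CPU; none of that is visible in the Python code, which is rather the point. `numpy.linalg.solve` is shown alongside because it solves the system without forming the explicit inverse.

```python
import numpy as np

# Example system A x = B (values are illustrative only).
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
B = np.array([9.0, 8.0])

# The one-liner from the post: invert A, then multiply by B.
x_inv = np.linalg.inv(A).dot(B)

# Usually preferable: solve directly without forming the inverse.
x_solve = np.linalg.solve(A, B)

# Both approaches agree up to floating-point rounding,
# and the solution satisfies the original system.
assert np.allclose(x_inv, x_solve)
assert np.allclose(A @ x_solve, B)
```

Either way, the calling code contains nothing CPU-specific; the choice of instructions is made inside the library.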

voltage · Oct 10, 2019

notb · Oct 11, 2019

Chrispy_ said:
Yeah, we're not in the bioinformatics business. The CPU farm tends to get used for raytracing using a couple of different renderers at the moment. Neither use AVX-512.

Are you sure? ;-)
Many renderers use Intel Embree kernel.

Anyway, your current rendering engine may not benefit from AVX-512, but you could always switch to a different one - if it made more sense paired with Intel CPUs. Right?

Choice of those benefiting from AVX is quite significant (from some Intel presentation https://www.embree.org/data/embree-siggraph-2018-final.pdf):

xkm1948 · Oct 11, 2019

notb said:
Bioinformatics may be, but how is that even relevant? Underneath it's just math. And math is fairly mainstream in computing. :-D

Intel provides Intel MKL (the library that does the dirty work) and they make sure it uses AVX-512 as much as possible.
They've managed to use AVX-512 e.g. for a lot of common algebra stuff (in BLAS, LINPACK/LAPACK) and for FFT. You can google these terms if they seem mysterious.

Programming is mostly high-level. You don't have to intentionally write a program to use AVX-512. You don't have to know what AVX is. You don't have to have any idea of how CPUs work.

Let's say you want to solve a very simple problem - a system of linear equations. You know:
Ax=B
It's a single line of code in Python with NumPy:
x = linalg.inv(A).dot(B)
And it'll use AVX-512.

Dude, that explains quite a lot! I'm nowhere near the math foundation of a pure computer science specialist. I was wondering how the heck a lot of my tools suddenly ran faster on Intel systems with AVX-512. I guess all those Linux updates and patches made improvements at the close-to-metal level?

efikkan · Oct 11, 2019

notb said:
Bioinformatics may be, but how is that even relevant? Underneath it's just math. And math is fairly mainstream in computing. :-D

Intel provides Intel MKL (the library that does the dirty work) and they make sure it uses AVX-512 as much as possible.
They've managed to use AVX-512 e.g. for a lot of common algebra stuff (in BLAS, LINPACK/LAPACK) and for FFT. You can google these terms if they seem mysterious.

Programming is mostly high-level. You don't have to intentionally write a program to use AVX-512. You don't have to know what AVX is. You don't have to have any idea of how CPUs work.

Let's say you want to solve a very simple problem - a system of linear equations. You know:
Ax=B
It's a single line of code in Python with NumPy:
x = linalg.inv(A).dot(B)
And it'll use AVX-512.

Libraries like MKL work by your code passing data to them and having them perform a larger mathematical computation, like multiplication of large matrices etc., not by taking over every calculation in a program. There is of course overhead involved. MKL may even use multiple threads and different algorithms depending on the parameters you pass to it, so each computation needs to be of a certain size before it becomes beneficial.
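
The size threshold can be probed with a rough timing sketch like the one below. It assumes NumPy as the library in front of whatever BLAS it was built with (not necessarily MKL); `dot_python` and `compare` are hypothetical helper names, and the sizes and repeat counts are arbitrary. The numbers depend entirely on the machine, so no expected output is given.

```python
import timeit

import numpy as np


def dot_python(a, b):
    """Plain-Python dot product: no library call, no SIMD."""
    return sum(x * y for x, y in zip(a, b))


def compare(n, repeats=100):
    """Time a length-n dot product in plain Python and via NumPy.

    Returns (seconds_python, seconds_numpy) for `repeats` calls each.
    """
    a_list = [float(i) for i in range(n)]
    b_list = [float(i) for i in range(n)]
    a_arr = np.array(a_list)
    b_arr = np.array(b_list)
    t_py = timeit.timeit(lambda: dot_python(a_list, b_list), number=repeats)
    t_np = timeit.timeit(lambda: np.dot(a_arr, b_arr), number=repeats)
    return t_py, t_np


# Tiny input: fixed call overhead dominates. Large input: the library wins.
for n in (4, 10_000):
    t_py, t_np = compare(n)
    print(f"n={n:>6}: python {t_py:.6f}s, numpy {t_np:.6f}s")
```

For very small inputs the fixed cost of crossing into the library tends to swamp any gain from vector instructions; the advantage only shows up once each call has enough work to do.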

While all code is fundamentally math, most code is more logic than singular "big" equations. So libraries like this are mostly relevant for research and other specific use cases, not all code in general.

If you want your native C/C++ code to leverage SIMD, it's usually done through intrinsics, and this is required to actually implement algorithms on a lower level, and will of course yield much greater performance benefits.

There is huge performance potential in using AVX in applications, libraries and the OS itself. Intel has its experimental "Clear Linux" distro, which features some standard libraries and applications with AVX optimizations. If you take a standard Linux distro or Windows, most of it is compiled for the ISA level of old Athlon64 (so x86-64 with SSE2/3). Optimizing standard libraries will of course help all applications which use them, but not as much as optimizing individual applications. Still, I wish this was a feature you could enable in OSes, because increased performance will also ultimately give you higher power efficiency, if that's what you want.
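
To see how far a given CPU goes beyond that baseline, a small sketch like this can list the SIMD extensions it reports. It assumes a Linux x86 system, where `/proc/cpuinfo` has a `flags` line; the function name `simd_flags` and the selection of flags are choices made for this example.

```python
from pathlib import Path

# A few SIMD-related flag names as they appear in /proc/cpuinfo on x86 Linux.
SIMD_FLAGS = ("sse2", "ssse3", "sse4_2", "avx", "avx2", "avx512f")


def simd_flags(cpuinfo_text):
    """Return the SIMD flags from SIMD_FLAGS found in cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            present = set(line.split(":", 1)[1].split())
            return [flag for flag in SIMD_FLAGS if flag in present]
    return []  # no flags line (e.g. non-x86 system or empty input)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        print(simd_flags(path.read_text()))
    else:
        print("no /proc/cpuinfo on this system")
```

Software compiled for the plain x86-64 baseline uses none of the AVX entries in that list, regardless of what the CPU reports.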

thesmokingman · Oct 11, 2019

X399 boards kinda sucked anyways...

Flyordie · Oct 11, 2019

Camm said:
AMD acting like Intel? It's pretty obvious power delivery is a huge factor here, or are people forgetting how gimped the 32-core Threadripper 2 parts were just to fit into the TR4 socket?

My VRMs are perfectly fine running their full intended output.

That VRM isn't a weakling and can easily handle a 250W part. It's also watercooled, since I use a monoblock, and it never goes above 40C.

kapone32 · Oct 11, 2019

thesmokingman said:
X399 boards kinda sucked anyways...

Can you please expand on that sentiment?

Chrispy_ · Oct 11, 2019

notb said:
Are you sure? ;-)
Many renderers use Intel Embree kernel.

Anyway, your current rendering engine may not benefit from AVX-512, but you could always switch to a different one - if it made more sense paired with Intel CPUs. Right?

Choice of those benefiting from AVX is quite significant (from some Intel presentation https://www.embree.org/data/embree-siggraph-2018-final.pdf):

View attachment 133819

Yep, I'm sure.

I suspect Intel paid Chaos Group to add Embree features just to get their foot in the door and have another industry name they could add to the marketing image you presented.
Sadly, it's worthless in the current generation and will need far more work to ever be a viable option for production renders.

Support is limited to specific functions and the displacement mod. Memory requirements double, and precision drops from double to single. The result is artifacts everywhere :\
You can also use Embree for motion blur, but it doesn't support multiple geometry samples, so it's noisy and ugly. It's also actually slower than the default V-Ray motion blur, LOL.

I honestly don't care about what code or hardware renders run on. I get a budget and the brief is to generate error-free frames as fast as possible with that budget. RTX or Embree fail the error-free requirement, see, and Intel typically fails the budget requirement.

It wasn't always this way. The renderfarm's getting pretty big now, with about 70 AMD machines in the CPU pool, but three years ago it was all-Intel. Performance/$ and performance/watt are things Intel has always been really bad at, and now Ryzen is soundly beating them on both fronts, often even if you compare AMD with AVX2 against Intel with AVX-512.

thesmokingman · Oct 12, 2019

kapone32 said:
Can you please expand on that sentiment?

There were a lot of issues with the boards in the beginning. Ask anyone who ran TR when they came out, it was crazy. I went thru a handful of boards myself.

notb · Oct 12, 2019

efikkan said:
Libraries like MKL work by your code passing data to them and having them perform a larger mathematical computation, like multiplication of large matrices etc., not by taking over every calculation in a program. There is of course overhead involved. MKL may even use multiple threads and different algorithms depending on the parameters you pass to it, so each computation needs to be of a certain size before it becomes beneficial.

Well, of course. AVX-512 (just like any other optimization feature) will be used when it's expected to make the program faster, not slower.

I hope this is obvious and acceptable for everyone.
No one said that suddenly all calculations are done using AVX-512.

The main idea I wanted to pass is that you don't have to, consciously, call Intel MKL. That's the whole point of high-level programming after all.
Some people seem to think utilizing AVX-512 needs rewriting of programs. Or that the code will only work on Intel, because it's full of "multiplyUsingAvx512" or whatever.

And as more and more libraries use AVX-512 (via Intel MKL or otherwise), more and more problems run faster on modern Intel CPUs. That's the phenomenon @xkm1948 noticed.
That's the phenomenon we notice all the time as drivers and libraries evolve.
The only difference with AVX-512 is that it offers a sudden, large boost.
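
Because the math backend is chosen when the library is built, not in the calling code, it can only be inspected after the fact. A minimal sketch, assuming NumPy is installed: `numpy.show_config()` prints the build configuration, including the BLAS/LAPACK libraries it was linked against. The helper name `numpy_build_info` and the "mkl" substring check are illustrative, and the substring check is a crude heuristic, not a reliable detector.

```python
import contextlib
import io

import numpy as np


def numpy_build_info():
    """Capture the text that numpy.show_config() prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


info = numpy_build_info()
print(info)
# Crude heuristic: does the build configuration mention MKL at all?
print("mentions MKL:", "mkl" in info.lower())
```

The same script runs unchanged on an MKL build and an OpenBLAS build; only the printed configuration differs.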

efikkan said:
If you take a standard Linux distro or Windows, most of it is compiled for the ISA level of old Athlon64 (so x86-64 with SSE2/3).

The OS - yes. The software and libraries? Not really.

Most industry standard software/environments will utilize Intel MKL by default on Intel systems. That includes things like Matlab, Autodesk solvers, Anaconda, many rendering engines (as noted earlier). In other words: one doesn't have to care about it (and he shouldn't, because these products are made for analysts/engineers).

On Windows everything is usually supplied with software, so one doesn't really have to care.
On Linux programs tend to use OS-wide libraries / interpreters / compilers, so sometimes using Intel MKL requires a bit of work. But it shouldn't be an issue for people who already tolerate Linux.
E.g. compiling NumPy with Intel MKL: "Numpy/Scipy with Intel® MKL and Intel® Compilers" (software.intel.com), a guide intended to help current NumPy/SciPy users take advantage of Intel® Math Kernel Library (Intel® MKL).

But honestly, there's really no good reason not to use compilers or distributions provided by Intel (like C++, Python), since they usually work better. And you've already paid for them in the "Intel tax" or whatever you want to call it.

Of course all of this is only relevant for admins and home users. In a pro situation an analyst/engineer is provided with an optimized environment.

Chrispy_ said:
Yep, I'm sure.

Which rendering engine?

Chrispy_ said:
I suspect Intel paid Chaos Group to add Embree features just to get their foot in the door and have another industry name they could add to the marketing image you presented.

No offense, but your life must be quite sad if you think Intel has to pay anyone to convince them to add optimizations for Xeons (~95% servers).
Software companies just do it. They don't compete with Intel. They compete with other software companies, who may add these optimizations as well.
That's how computing works.

1d10t · Oct 12, 2019

Bummer, this explains why AMD has been mostly silent about their upcoming Threadripper.
I got kinda suspicious when they changed the motherboard naming convention from an X prefix to TRX, and there's been no leaked BIOS from any motherboard manufacturer.
Oh well, guess I'll wait for reviews comparing the 3950X with the lowest Threadripper, and their total platform cost, before deciding to jump.

efikkan · Oct 12, 2019

notb said:
The main idea I wanted to pass is that you don't have to, consciously, call Intel MKL. That's the whole point of high-level programming after all.
Some people seem to think utilizing AVX-512 needs rewriting of programs. Or that the code will only work on Intel, because it's full of "multiplyUsingAvx512" or whatever.

AVX(2) has proven to scale very well on AMD too, sometimes even better, relatively speaking, vs. no AVX. That probably has to do with the preparations needed in order to use any kind of SIMD, which actually help to lighten the load on the front-end of the CPU. Too bad AMD is still not implementing AVX-512. Once client applications start to utilize it properly, people will not look back.

As to optimizing programs in general: in most code bases only a fraction of the code is actually performance critical. Even for those rare cases where applications contain assembly optimizations, it's usually just a couple of tiny spots at the choke point of an algorithm, in a tight loop somewhere, where using assembly results in a 50% gain or something. Most such cases are ideal for SIMD, which means you would use AVX intrinsics instead of assembly instructions (even though these are just thin wrappers that map to AVX assembly instructions), and now you may get a 10-50x gain instead.

Optimizing applications usually focuses on optimizing algorithms and some of the core engine surrounding them. Most performance critical paths are fairly limited in terms of the code involved, and throughout the rest of the application code your high-level abstractions will not make any significant impact on performance at all. But in that critical path, all bloat like abstractions, function calls, non-linear memory accesses and branching will come at a cost. So the first step of optimizing this is removal of abstractions and branching (to the extent possible), then cache optimizing it with linear memory layouts (this step may require rewriting code surrounding the algorithm). By this point you should have tight loops without function calls, memory allocations etc. inside them. Only then are you ready to use SIMD, but you will also get huge gains from doing so. This kind of optimization is generally not possible in languages higher-level than C++. You may still be able to interface with libraries containing low-level optimizations and get some gains there, but that would be it.

One piece of good news about writing code for SIMD is that the preparations are the same, so upgrading existing code from SSE to AVX, or to a newer version of AVX, is simple: just change a few lines of code and tweak some loops. The hard work is the preparation for SIMD.
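
Python cannot express intrinsics, so the sketch below only illustrates the preparation step and the "interface with a library" fallback described above: moving from one object per element to one contiguous array per field, then handing the whole loop to the library. `Particle`, `step_objects` and `step_arrays` are hypothetical names, and whether NumPy actually uses SIMD for the update depends on its build and the CPU.

```python
import numpy as np


class Particle:
    """Array-of-structures style: one Python object per element."""

    def __init__(self, x, v):
        self.x = x
        self.v = v


def step_objects(particles, dt):
    """Per-object loop: attribute lookups and scattered memory every iteration."""
    for p in particles:
        p.x += p.v * dt


def step_arrays(x, v, dt):
    """Structure-of-arrays style: one contiguous array per field,
    updated in place by the library with no Python-level loop."""
    x += v * dt


n = 1000
particles = [Particle(float(i), 2.0) for i in range(n)]
x = np.arange(n, dtype=np.float64)
v = np.full(n, 2.0)

step_objects(particles, 0.5)
step_arrays(x, v, 0.5)

# Same result either way; only the memory layout and the loop location differ.
assert np.allclose([p.x for p in particles], x)
assert x.flags["C_CONTIGUOUS"]
```

The restructuring is the part that carries over: once the data sits in linear arrays, swapping the routine that processes it is a local change.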

notb said:
The OS - yes. The software and libraries? Not really.

That's where you're wrong.
I mentioned Intel Clear Linux, one of the main features is the optimizations to libc, the runtimes for the C standard library. Almost every native application uses this library. The gains are usually in the 5-30% range, so nothing fantastic, but that's free performance gains for most of your applications, and who doesn't want that? The only reason why I don't use it is that this Linux distro is an experimental rolling release distro, and I need a stable machine for work, and don't have time to spend all day troubleshooting bugs. If it was properly supported by e.g. Ubuntu, I would switch to it.

The counterpart of libc for MS Visual Studio is msvcrt.dll and msvcpxx.dll, which most of your heavier applications and games rely on. There are of course some optimizations in MS's standard library, and they even open sourced some of it recently, but from what I can see it's mostly older SSE. If these were updated to utilize AVX2 or even AVX-512, I'm sure many Windows users would appreciate it. The problem is compatibility; so they would either have to ship two versions or drop hardware support.

notb said:
On Windows everything is usually supplied with software, so one doesn't really have to care.
On Linux programs tend to use OS-wide libraries / interpreters / compilers, so sometimes using Intel MKL requires a bit of work. But it shouldn't be an issue for people who already tolerate Linux.

Another misconception.
Most Windows software relies on MS Visual Studio's runtime libraries, including everything bundled with Windows itself, which is why most Windows applications don't have to "care". The only time they do is when they are compiled with a more recent Visual Studio version, as Visual Studio chooses to duplicate the library for every new version, which I assume is their approach to compatibility.
This is no different from Linux, except it uses libc instead, which comes bundled with every Linux distro. Most Linux software is also POSIX compliant, which makes it easy to port to BSD, macOS, and even Windows (which has partial compliance).

The differences which may have confused you concern GUI libraries etc., since there is not one "GUI API" like in Windows. Linux applications may rely on GTK, Qt, wxWidgets and many others. When such applications are ported to Windows, they usually need runtimes for those libraries. Examples of such applications which you may be familiar with include Firefox, VLC, LibreOffice, GIMP, Handbrake, VirtualBox, Teamviewer etc.

Chrispy_ · Oct 14, 2019

notb said:
Which rendering engine?

V-Ray 3.6, distributed bucket renders - I'm pretty sure the team is not using RT for the CPU farm since that is for models/scenes that use features not supported in RT.
The RT farm is outside the scope of this thread anyway, since that's GPU-dependent, and NEXT is a little too early to call production ready, simply because of the massive VRAM/RAM overheads incurred with hybrid rendering.

notb said:
No offense, but your life must be quite sad if you think Intel has to pay anyone to convince them to add optimizations for Xeons (~95% servers).
Software companies just do it. They don't compete with Intel. They compete with other software companies, who may add these optimizations as well.
That's how computing works.

My life probably is pretty sad, I flew all the way to Siggraph this summer to listen to Vlado Koylazov (Chaos Group lead developer) on stage. Maybe I misinterpreted the segment on sponsorship and support, but both Nvidia and Intel are providing 'incentives'. That means a combination of financial support and developers on either side to get things working - both of which I interpret as "costs money". There's also the PR/marketing/promotional side of it which is arguably cost-free, but I'd imagine that can't happen without the financial incentives greasing the wheels first.
