I'm leveraging my experience as a software dev, and as someone with a degree in Comp Sci, on this matter because it's highly applicable and the record needs to be set straight before you start putting weird ideas in people's heads. I'll explain why.
This is easier said than done. You can't simply make all workloads parallel because most code written is serial: there are dependencies between instructions, so the work would need to share state and memory. The issue here is that the overhead of making it parallel can cost more performance than it gains if enough coordination has to occur, because most applications build up data incrementally rather than building it in parallel; each instruction depends on the changes from the last (see serializability). It is the developer's job to write code in a way that can leverage hardware when it's available, and to know when to apply it, because not everything can be done on multiple cores by virtue of the task being done.

The state problem is huge, and locking will destroy multi-threaded performance. As a result, many successful multi-threaded systems have parts of the application decoupled, with queues placed in between tasks that must occur serially (see the sketch after this paragraph). This all sounds fine and dandy, but now you're just passing state through what is essentially a series of queues, and any work has to wait if any part of the pipeline slows down, so you're still limited by the least parallel part of your application. So while you inherently get parallelism by doing this, you add the side effect of increasing latency and not knowing the state of something at any level other than the one it's operating at. In the case of games, which need to react quickly, that added latency means either a lower frame rate or input lag.
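To make that concrete, here's a minimal sketch of the queues-between-stages pattern in Java (the stage logic and class name are made up for illustration), using a bounded BlockingQueue between a producer stage and a consumer stage:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PipelineSketch {
    public static void main(String[] args) throws InterruptedException {
        // Bounded queue decouples the two stages; if stage 2 is slow,
        // stage 1 blocks on put() -- the pipeline runs at the speed of
        // its slowest (least parallel) stage.
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(16);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) {
                    int value = i * i;   // stand-in for "building up data" serially
                    queue.put(value);    // blocks when the queue is full
                }
                queue.put(-1);           // poison pill: signals end of stream
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    int value = queue.take();   // blocks when the queue is empty
                    if (value == -1) break;     // end of stream
                    System.out.println("processed " + value);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}
```

The bounded queue is the whole point: when the consumer stage falls behind, put() blocks and the producer stalls, so throughput is capped by the slowest stage, and every item now pays the extra latency of sitting in a queue.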
So while your comment makes sense when looking at it from a high level, it makes absolutely no sense at a low level, because that isn't how computers work under the hood. You can't simply break apart an application and make it multi-threaded; it simply doesn't work that way.
I would argue that developers need better tools for easily implementing games that can utilize multiple cores, but languages (like Java, Clojure, C#, etc.) already have great constructs for multi-threaded programming; see the sketch below. The question isn't whether you can do it, the question is what the best way to do it is, and no one really knows.
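For example, here's a minimal sketch of one of those constructs, Java's ExecutorService (the work split is invented for illustration). Note it only looks this easy because the chunks have no dependencies on each other:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConstructsSketch {
    static long sumRange(long from, long to) {
        long sum = 0;
        for (long i = from; i < to; i++) sum += i;
        return sum;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Long>> results = new ArrayList<>();

        // Split the range into four independent chunks -- this only works
        // because no chunk depends on another chunk's result.
        long chunk = 25_000_000L;
        for (int i = 0; i < 4; i++) {
            final long from = i * chunk;
            Callable<Long> task = () -> sumRange(from, from + chunk);
            results.add(pool.submit(task));
        }

        long total = 0;
        for (Future<Long> f : results) {
            total += f.get();   // blocks until that chunk is done
        }
        System.out.println("total = " + total);
        pool.shutdown();
    }
}
```

The moment one chunk needs another chunk's result, you're back to the coordination, flags, and queues described above.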
I tell people this all the time: "If it were that easy, it would have been done already!"
All 8 of the SATA ports in my tower are used. I could go with one M.2 and be done with it, but it's really not designed for mass storage. SATA will always have a place in my book until something better rolls around that doesn't require the device to be attached to the motherboard. (Imagine mounting a spinning drive to a motherboard; that makes for some pretty funny images.)
Side note: I had more issues with IDE cables than SATA, so I'm not complaining.
Having done some programming, I agree entirely. For multithreaded workloads with dependencies you have to have a parent thread start up and dispatch child processes or threads to perform the work while the parent thread synchronizes the results, and that cannot be done in hardware, as the hardware has no idea of the actual work being done; it only sees a string of binary. So say we have a parent thread and at least one child thread, and let's break it down into human language:
The parent thread launches on core 0 and copies in a large dataset for comparison. It then launches a child thread that starts searching at the top of the dataset while the parent starts at the bottom, and the parent has to keep checking a flag set by the child thread to see if it found a match, which reduces its performance. Try to find customer 123456789 in a dataset that is alphanumeric and where customer numbers are randomized: either you sort, you search the whole table, or you check using a smarter algorithm. Depending on the size of the dataset and the processor speed, it might be as fast to sort on a single core as it is to search and compare on multiple cores.
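A minimal sketch of that exact scenario in Java (the dataset and customer numbers are invented for illustration), using an AtomicBoolean as the shared "found it" flag:

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class TwoEndedSearch {
    public static void main(String[] args) throws InterruptedException {
        // Hypothetical dataset: randomized customer numbers (unsorted).
        int[] customers = ThreadLocalRandom.current()
                .ints(10_000_000, 0, Integer.MAX_VALUE)
                .toArray();
        int target = customers[123_456];   // pick a value we know exists

        AtomicBoolean found = new AtomicBoolean(false);
        AtomicInteger foundIndex = new AtomicInteger(-1);

        // Child thread scans from the top of the array...
        Thread child = new Thread(() -> {
            for (int i = 0; i < customers.length && !found.get(); i++) {
                if (customers[i] == target) {
                    found.set(true);        // flag checked by both threads
                    foundIndex.set(i);
                }
            }
        });
        child.start();

        // ...while the parent scans from the bottom. Every pass through
        // the loop it also checks the shared flag, which costs it some
        // throughput compared to a lone single-threaded scan.
        for (int i = customers.length - 1; i >= 0 && !found.get(); i--) {
            if (customers[i] == target) {
                found.set(true);
                foundIndex.set(i);
            }
        }

        child.join();   // parent synchronizes the result
        System.out.println("found at index " + foundIndex.get());
    }
}
```

The `!found.get()` in both loop conditions is exactly the overhead described above: each thread pays a per-iteration flag check as the price of being able to stop early once the other thread finds the match.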