Schedulers are fancy learning algorithms, where "learning" is the old-school 1980s definition, not the modern "Deep Learning Neural Net" definition. I'm no expert, but read up on how the Completely Fair Scheduler from the Linux 2.6.xx days works.
https://en.wikipedia.org/wiki/Completely_Fair_Scheduler
I'm sure Linux has been upgraded since then, but that's what was taught in my college years, so it's the only scheduler I'm really familiar with. The Wikipedia link has a decent description:
Runnable tasks sit in a red-black tree keyed by virtual runtime, and the leftmost node is the task that has received the least CPU time so far, so it runs next. Priorities change based on dynamic scheduling: that is, Linux keeps adding to and subtracting from the numbers in an attempt to maximize responsiveness, throughput, and other statistics. It's pretty dumb all things considered, but these algorithms work quite well when all cores are similar.
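To make that concrete, here's a minimal sketch of the "pick the leftmost task" idea. It's not the kernel's code: the real CFS uses an actual red-black tree so the lookup is O(log n), while this just scans a small array, and the task names, weights, and tick length are all made up for illustration.

```c
/* Toy sketch of the "fair scheduler" idea: always run the task that has
 * accumulated the least weighted runtime so far.  The real CFS keeps
 * tasks in a red-black tree keyed by vruntime so the "leftmost" lookup
 * is cheap; a flat array + linear scan is enough to show the idea. */
#include <stdio.h>

struct task {
    const char *name;
    int weight;              /* higher weight == higher priority */
    unsigned long vruntime;  /* weighted runtime consumed so far */
};

int main(void)
{
    struct task tasks[] = {
        { "editor",   3, 0 },   /* interactive, higher weight */
        { "compiler", 1, 0 },   /* batch job, lower weight    */
        { "indexer",  1, 0 },
    };
    const int ntasks = sizeof tasks / sizeof tasks[0];
    const unsigned long timeslice = 10;  /* arbitrary "milliseconds" */

    for (int tick = 0; tick < 12; tick++) {
        /* Pick the task with the smallest vruntime -- the "leftmost node". */
        struct task *next = &tasks[0];
        for (int i = 1; i < ntasks; i++)
            if (tasks[i].vruntime < next->vruntime)
                next = &tasks[i];

        /* "Run" it, then charge it vruntime inversely to its weight, so
         * heavier (higher-priority) tasks drift right more slowly and
         * get picked again sooner. */
        next->vruntime += timeslice / next->weight;
        printf("tick %2d: ran %-8s (vruntime now %lu)\n",
               tick, next->name, next->vruntime);
    }
    return 0;
}
```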
Modern schedulers also account for "hot" cores, where the L1 / L2 / L3 caches are already primed with a task's data (aka thread affinity), and for NUMA (the distance data has to travel to reach RAM). Then there are issues like "priority inversion": Task A is a high-priority task for some reason, Task Excel-Spreadsheet is low priority, but for some reason Task A is waiting on Task Excel-Spreadsheet. The scheduler needs to detect this situation and temporarily boost Excel-Spreadsheet's priority so that Task A can resume sooner.
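The usual fix for that is priority inheritance, and you can even opt into it from userspace. Here's a sketch using the POSIX mutex protocol for it; support varies by platform, and the actual high/low-priority threads are omitted to keep it short.

```c
/* Priority inversion is usually handled at the lock level: if a
 * high-priority task blocks on a mutex held by a low-priority task,
 * the holder temporarily "inherits" the waiter's priority.
 * Build with:  cc -pthread prio.c */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock;

int main(void)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    /* Whoever holds this mutex gets boosted to the priority of the
     * highest-priority thread waiting on it, for as long as it holds it. */
    if (pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT) != 0) {
        fprintf(stderr, "priority inheritance not supported here\n");
        return 1;
    }
    pthread_mutex_init(&lock, &attr);
    pthread_mutexattr_destroy(&attr);

    /* ...spawn the high- and low-priority threads and have them share
     * `lock`; the priority boost happens automatically in the kernel... */

    pthread_mutex_destroy(&lock);
    return 0;
}
```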
------------
I guess you can say that "schedulers" are adaptive like branch predictors and L1 caches. They follow a set of dumb rules that works in practice, allowing for basic levels of adaptation. But there's no AI here, it's just a really good set of dumb rules that's been tweaked over the past 40 years to get good results on modern processors.
Scheduling is provably NP-complete. There's no known way to find the optimal schedule short of trying essentially every combination of choices. Alas: if you did that, you'd spend more time scheduling than running the underlying programs!!! A scheduler needs to make its decision in something like 10 microseconds to be effective; any slower, and the scheduling overhead starts to dominate the programs it's supposed to be serving.
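To give a feel for why brute force is hopeless: even the stripped-down version of the problem (N independent tasks, K identical cores, minimize the finish time) has K^N possible assignments. The sketch below enumerates all of them for 8 made-up tasks on 2 cores, which is already 256 combinations; 40 tasks on 8 cores would be roughly 10^36.

```c
/* Brute-force "optimal" scheduling: try every assignment of tasks to
 * cores and keep the one with the smallest makespan (finish time).
 * Task costs and core count are made up for illustration. */
#include <stdio.h>

#define NTASKS 8
#define NCORES 2

static const int cost[NTASKS] = { 7, 3, 9, 2, 5, 8, 4, 6 };

int main(void)
{
    int best = 1 << 30;
    long combos = 1;
    for (int i = 0; i < NTASKS; i++)
        combos *= NCORES;            /* K^N assignments to check */

    /* Each value of `mask` encodes one assignment of tasks to cores
     * (NCORES == 2 here, so one bit per task). */
    for (long mask = 0; mask < combos; mask++) {
        int load[NCORES] = { 0 };
        for (int t = 0; t < NTASKS; t++)
            load[(mask >> t) & 1] += cost[t];

        int makespan = load[0] > load[1] ? load[0] : load[1];
        if (makespan < best)
            best = makespan;
    }
    printf("checked %ld assignments, best makespan = %d\n", combos, best);
    return 0;
}
```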
----------------
Honestly? I think the main solution is to just have a programmer flag. Just like thread affinity / NUMA affinity, you can use heuristics to provide a "sane default" that won't really work in all cases. Any programmer who knows about modern big.LITTLE architecture can just say "allocate a little-thread" (a thread whose affinity is set to a LITTLE core) explicitly, because said programmer knows that their thread works best on LITTLE for some reason.
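On Linux the closest existing knob is plain CPU affinity. Which core IDs are the LITTLE ones is board-specific, so the 0-3 range below is just an assumption for the sketch; check sysfs or the SoC documentation on real hardware.

```c
/* "Allocate little-thread" done by hand: pin the calling thread to the
 * cores you believe are the LITTLE cluster.  Cores 0-3 are an assumption
 * here -- the actual big/LITTLE layout depends on the SoC. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    cpu_set_t little;
    CPU_ZERO(&little);
    for (int cpu = 0; cpu <= 3; cpu++)   /* assumed LITTLE cluster */
        CPU_SET(cpu, &little);

    /* Same mechanism already used for NUMA / core affinity: the scheduler
     * stops guessing and only runs this thread on the listed cores. */
    int err = pthread_setaffinity_np(pthread_self(), sizeof little, &little);
    if (err != 0) {
        fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(err));
        return 1;
    }
    puts("pinned to the (assumed) LITTLE cores");
    return 0;
}
```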
That's how the problem is "solved" for NUMA and core affinity already. Might as well keep that solution. Then, have Windows developers go through all of the system processes, test individually which ones work better on LITTLE vs. big cores, and manually tweak Windows's configuration until it's optimal.
If you can't solve the problem in code, solve it with human effort. There may be thousands of Windows processes, but you only have to do the categorization step once. Give a few good testers / developers 6 months on the problem, and you'll probably get adequate results that will improve over the next 2 years.