Those are just vector ops from the perspective of the assembly language.
What I'm talking about is in the compute units themselves. See page 12:
https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf
View attachment 172588
The sALU processes scalar instructions (loops, branching, booleans). The sGPRs primarily hold booleans, but also function pointers, the call stack, and things of that nature.
vALUs process vector instructions, which include those "packed" instructions. If we wanted to get more specific, there are also LDS, load/store, and DPP instructions going to different units. But by and large, the instructions that make up the bulk of AMD GPU code are classified as either vector or scalar.
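To make the scalar/vector split concrete, here is a toy Python model of one 64-lane wavefront. It is only a sketch: the helper names `v_cmp_gt` and `v_add_masked` are hypothetical and merely styled after ISA mnemonics, and the model ignores everything about real timing and encoding. What it shows is why sGPRs end up holding "booleans": a vector compare produces one bit per lane, and that 64-bit mask lives on the scalar side, where it can drive branching or mask later vector work.

```python
WAVE_SIZE = 64  # lanes per wavefront on GCN/Vega

def v_cmp_gt(vgpr_a, vgpr_b):
    """Toy vector compare: one result bit per lane, packed into a single mask."""
    mask = 0
    for lane in range(WAVE_SIZE):
        if vgpr_a[lane] > vgpr_b[lane]:
            mask |= 1 << lane
    return mask

def v_add_masked(vgpr_a, vgpr_b, exec_mask):
    """Toy vector add: only lanes whose bit is set in exec_mask are updated."""
    return [a + b if (exec_mask >> lane) & 1 else a
            for lane, (a, b) in enumerate(zip(vgpr_a, vgpr_b))]

v0 = list(range(WAVE_SIZE))   # a different value in every lane: 0..63
v1 = [31] * WAVE_SIZE         # the same value in every lane
mask = v_cmp_gt(v0, v1)       # scalar-side result: lanes 32..63 are true
any_lane_active = mask != 0   # the kind of test a scalar branch would make
v2 = v_add_masked(v0, v1, mask)
```

In this model the per-lane data never leaves the "vector" lists, while the control decision (`any_lane_active`) is a single value for the whole wavefront.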
You're right that the fixed-function pipeline (not shown in the above diagram), in particular rasterization ("ROPs"), constitutes a significant portion of the modern GPU. But you can see that the command processor is very far away from the vALUs / sALUs inside the compute units.
AMD's command processors are poorly documented. I can't find anything that describes their operation very well. (Well... I could read the ROCm source code, but I'm not THAT curious...)
But from my understanding: the command processor simply launches wavefronts. That is: it sets up the initial sGPRs for a workgroup (x, y, and z coordinate of the block), as well as VGPR0, VGPR1, and VGPR2 (for the x, y, and z coordinate of the thread). Additional parameters go into sGPRs (shared between all threads). Then, it issues a command that makes the compute unit jump to (or call a function at) a location in memory. AMD command processors also have a significant amount of hardware scheduling logic for events and ordering of wavefronts: priorities and the like.
But the shader has already been converted into machine code by the OpenCL, Vulkan, or DirectX driver, and loaded somewhere. The command processor only has to set up the parameters and issue a jump command to get a compute unit to that code (once all synchronization mechanisms, such as OpenCL Events, have established that this particular wavefront is ready to run).
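As a rough illustration of that launch step, here is a toy Python sketch of what "setting up the registers and jumping" could look like. This reflects only my understanding as described above, not documented command-processor behavior: the `launch` function, the dict layout, and the `0x1000` address are all made up for the example, and event/priority scheduling is left out entirely.

```python
from itertools import product

WAVE_SIZE = 64  # lanes per wavefront on GCN/Vega

def launch(grid, group, kernel_addr, kernel_args):
    """Yield the initial register state of each wavefront (toy model only)."""
    gx, gy, gz = grid    # number of workgroups in x, y, z
    lx, ly, lz = group   # threads per workgroup in x, y, z
    # Thread coordinates inside one workgroup, with x varying fastest.
    threads = [(x, y, z) for z, y, x in product(range(lz), range(ly), range(lx))]
    for wz, wy, wx in product(range(gz), range(gy), range(gx)):
        for start in range(0, len(threads), WAVE_SIZE):
            lanes = threads[start:start + WAVE_SIZE]
            yield {
                "pc": kernel_addr,                    # where the CU starts executing
                "sgpr": {"group_id": (wx, wy, wz),    # shared by every lane
                         "args": kernel_args},
                "vgpr0": [t[0] for t in lanes],       # per-lane thread x
                "vgpr1": [t[1] for t in lanes],       # per-lane thread y
                "vgpr2": [t[2] for t in lanes],       # per-lane thread z
            }

# 2 workgroups of 16x8 threads: 128 threads each, so 2 wavefronts per workgroup.
waves = list(launch(grid=(2, 1, 1), group=(16, 8, 1),
                    kernel_addr=0x1000, kernel_args=(42,)))
```

Note that nothing in `launch` looks at the shader itself: every wavefront gets the same `pc` and the same arguments, and only the coordinates differ.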