Unified memory is only necessary because LLM weights are too large to fit in a GPU's dedicated VRAM. That lets marketing departments make deceitful, misleading slides that fanboys of the respective companies (and this isn't an AMD or NVIDIA problem, it applies to pretty much both of them) will often parrot without question, such as this little gem right here:
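To put numbers on the "too large for VRAM" point, here's a rough back-of-envelope sketch (the 70B parameter count and quantization widths are illustrative assumptions, not figures from the slide):

```python
# Rough LLM weight footprint: bytes = parameter count * bytes per weight.
def model_size_gb(params_billions: float, bytes_per_weight: float) -> float:
    """Approximate weight storage in decimal GB (ignores KV cache/activations)."""
    return params_billions * 1e9 * bytes_per_weight / 1e9

# Hypothetical 70B-parameter model:
fp16 = model_size_gb(70, 2.0)   # 16-bit weights
q4 = model_size_gb(70, 0.5)     # 4-bit quantized weights
print(f"FP16: {fp16:.0f} GB, Q4: {q4:.0f} GB")  # prints "FP16: 140 GB, Q4: 35 GB"
```

Even 4-bit quantized, 70B weights blow past a 24 GB consumer card, which is exactly why a big unified memory pool is the selling point.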
View attachment 385052
Of course it's "up to 2.2x faster" when you can actually load the model into memory (provided you have at least 96 or 128 GB of RAM in this case) and you're not compute bottlenecked, which is exactly the problem facing quite literally any GPU short of NVIDIA's many-thousand-dollar, 80 GB+ HBM AI accelerators right now.

Needless to say, the person who posted this slide to me as a rebuttal on X (where I told them that if they believed this product was faster than a 4090 at anything, I had a bridge to sell 'em) summarily blocked me right after posting it and calling me a "smug f**k". Go figure. For context: the Ryzen AI Max+ 395 is rated at 126 AI TOPS; an RTX 4090 is, even on a worst-case basis, roughly 10x faster.
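The "roughly 10x" claim is simple arithmetic. The 126 figure is from the post; the ~1321 AI TOPS number is NVIDIA's commonly cited sparse Tensor TOPS rating for the RTX 4090, which I'm assuming here:

```python
# Back-of-envelope TOPS comparison.
ryzen_ai_tops = 126     # Ryzen AI Max+ 395 rating (from the post)
rtx_4090_tops = 1321    # NVIDIA's marketed sparse Tensor TOPS (assumed)
print(f"{rtx_4090_tops / ryzen_ai_tops:.1f}x")  # prints "10.5x"
```

TOPS is itself a marketing metric and real inference throughput depends on memory bandwidth too, but even the raw-compute gap makes the slide's comparison look silly.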