Ideally, everything needed for a frame is already stored in VRAM (read > done). If not, the GPU has to wait until all the data for the frame is ready (read > wait > read > done). The wait step, where VRAM has to fetch new data from system RAM, is negligible at PCIe 3.0 speeds; the delay is mostly caused by the additional read step.
However, with HBM this read step takes 9 times less time than with GDDR5 at the same clock. In the Fury X vs. Titan X case the clocks differ, so the advantage shrinks: the read on the Fury X takes about 0.6 times as long as on the Titan X. Some simple math from here shows that, with a good scheduler in the driver, the 4GB capacity is not that big an issue for frames in the 4GB-8GB zone. In the 8GB-12GB zone the difference becomes clearer, but the GPU itself also struggles there, which makes the memory delay less significant. In short, 4GB of HBM on the Fury X can keep up with 12GB on the Titan X in the 4GB-8GB zone, and is superior below 4GB.
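
The "simple math" could be sketched roughly like this. This is only a toy model, not a measurement: it assumes read time scales linearly with the amount of data, that every time VRAM runs out it is refilled once from system RAM, and that each refill costs a fixed wait. The function name `frame_read_time` and the `wait_per_refill` parameter are made up for illustration; the 0.6 is the ratio quoted above, with the Titan X's read cost per GB taken as 1.0.

```python
import math

def frame_read_time(frame_gb, vram_gb, read_time_per_gb, wait_per_refill=0.0):
    """Relative time to read one frame's data under the assumed toy model."""
    # Each time the frame outgrows VRAM, VRAM is refilled from system RAM once.
    refills = max(0, math.ceil(frame_gb / vram_gb) - 1)
    # Every GB is read from VRAM once; each refill adds a fixed wait (assumption).
    return frame_gb * read_time_per_gb + refills * wait_per_refill

# Read cost per GB relative to the Titan X; 0.6 is the ratio from the post.
FURY_X = {"vram_gb": 4, "read_time_per_gb": 0.6}
TITAN_X = {"vram_gb": 12, "read_time_per_gb": 1.0}

for frame_gb in (3, 6, 10):
    fury = frame_read_time(frame_gb, **FURY_X)
    titan = frame_read_time(frame_gb, **TITAN_X)
    print(f"{frame_gb} GB frame: Fury X {fury:.1f}, Titan X {titan:.1f}")
```

With `wait_per_refill` left at 0 (i.e. treating the PCIe wait as negligible), the Fury X stays ahead in every zone in this model. Setting it above zero shows how much refill overhead the 4GB card could absorb before it falls behind.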