According to this, it's 4.0 GHz, but I haven't verified it.
i7-7700K is 4.5 GHz (1 core), 4.4 GHz (2-4 cores).
To both:
It's easy to get blinded by specs. Heck, even the old 80486 supported at least 512 kB of L2 cache (off-die). L1 and L2 are closely tied to the microarchitecture, which is probably why Intel and AMD tweak the configuration more or less every generation. Heat is not the primary concern, but the area the cache takes up on the die certainly is, since it needs to be placed and connected in the ideal spot. Moving it even slightly might cause higher latency, and with higher clock speeds this is more sensitive than ever.
So back to the subject you both were mentioning: why is Skylake-S more efficient with half the L2 cache of Zen? It comes down to how the cache is used. The front-end/prefetcher operates on an instruction window, does OoOE, predicts branches etc. While Skylake has a slightly larger instruction window than Zen (224 vs. 192 entries), Zen has other advantages like a larger micro-op cache (2048 vs. 1536 entries), more L2 cache and more execution ports. So on paper Zen looks fairly strong, but that still doesn't answer the question.
When it comes to prefetching, you might think more is better, right? Wrong. Each cache line you write to L2 kicks something else out, so if you cache "useless" stuff, it might kick out more useful stuff, and you'll end up hurting performance. So the most important thing of all is the algorithm used to predict what to fetch, and while it may not be visible in the tech specs, it matters more than the size of the L2 etc. This is one of the reasons why I keep saying good benchmarks are what matter, rather than buying CPUs or GPUs based on a "limited" understanding of what the specs even mean.
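To make the "kicking out useful stuff" point concrete, here is a toy sketch in Python. It is not how any real Intel or AMD prefetcher works; the fully associative LRU cache, the 8-line capacity, the access patterns and the two prefetchers are all made up for illustration.

```python
from collections import OrderedDict


def hit_rate(accesses, capacity, prefetch=None):
    """Simulate a tiny fully associative LRU cache; return the demand hit rate.

    `prefetch` is an optional (hypothetical) function that maps the line just
    accessed to a list of extra lines to pull into the cache.
    """
    cache = OrderedDict()
    hits = 0

    def touch(line):
        # Returns True on a hit; on a miss, inserts the line and evicts the
        # least recently used line if the cache is over capacity.
        if line in cache:
            cache.move_to_end(line)
            return True
        cache[line] = True
        if len(cache) > capacity:
            cache.popitem(last=False)
        return False

    for line in accesses:
        if touch(line):
            hits += 1
        if prefetch is not None:
            for extra in prefetch(line):
                touch(extra)  # prefetched lines do not count as demand hits
    return hits / len(accesses)


# Workload A: loop over a working set that fits in the cache exactly.
looping = list(range(8)) * 100
loop_plain = hit_rate(looping, capacity=8)
# "Useless" prefetcher: pulls in two lines that are never actually used.
loop_junk = hit_rate(looping, capacity=8,
                     prefetch=lambda line: [1000 + line, 2000 + line])

# Workload B: stream through 800 lines once, never revisiting any of them.
streaming = list(range(800))
stream_plain = hit_rate(streaming, capacity=8)
# Next-line prefetcher: pulls in the line right after the one just accessed.
stream_next = hit_rate(streaming, capacity=8, prefetch=lambda line: [line + 1])

print("looping, no prefetch:    ", loop_plain)
print("looping, junk prefetch:  ", loop_junk)
print("streaming, no prefetch:  ", stream_plain)
print("streaming, next-line:    ", stream_next)
```

The same cache size gives very different results depending only on what the prefetcher decides to bring in: the junk prefetcher evicts the working set before it is reused, while the next-line prefetcher turns a stream of misses into hits.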
It also comes down to the type of workload.
Many server workloads are typically async, which means they will scale nearly perfectly across any core count, so all that really matters then is the balance between performance and total efficiency.
Synchronized multithreaded workloads, on the other hand, tend to get diminishing returns with increasing core count, so balancing core count and core speed is more important for typical end-user workloads.