Nope, that's the case. There are only 160 pipelines, so you can have 160 threads feeding those 800 "cores" as long as the program can pack the operations together into a VLIW instruction. But it's not exactly the same thing, and it requires a lot of anticipation, which is not always possible. In fact, it's almost never possible.
I'm not comparing the SPs to x86 cores in any way; I don't know how you came to that conclusion.
Because of the VLIW nature of the SPs, you could potentially make an engine that only works with 5-wide VLIW instructions, and then you could potentially fill all the "cores". But that engine would not work on Nvidia cards or pre-R600 ATI cards, not to mention it would not be profitable to do so, and DirectX has no such functionality, so you would have to write your engine entirely in HLSL. Even then, filling the 5 ALUs with something relevant to do would be very, very difficult.
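To show why filling the 5 slots is so hard, here is a rough Python sketch. It is my own toy model, not how the real ATI shader compiler schedules anything: `pack_vliw`, `utilization` and the op names are all made up for illustration. The only rule it assumes is that an operation can go into a bundle only if everything it depends on was issued in an earlier bundle.

```python
def pack_vliw(ops, deps, width=5):
    """Greedily pack ops into bundles of at most `width` slots.

    ops:  list of op names in program order (hypothetical names).
    deps: dict mapping an op to the set of ops it depends on.
    An op is eligible only once all its dependencies were issued
    in earlier bundles.
    """
    done = set()
    remaining = list(ops)
    bundles = []
    while remaining:
        # Take up to `width` ops whose dependencies are already satisfied.
        bundle = [op for op in remaining if deps.get(op, set()) <= done][:width]
        if not bundle:
            raise ValueError("cyclic dependency between ops")
        bundles.append(bundle)
        done.update(bundle)
        remaining = [op for op in remaining if op not in bundle]
    return bundles


def utilization(bundles, width=5):
    """Fraction of ALU slots that actually hold an op."""
    return sum(len(b) for b in bundles) / (width * len(bundles))


ops = ["a", "b", "c", "d", "e"]

# Serial chain: each op needs the result of the previous one.
chain_deps = {"b": {"a"}, "c": {"b"}, "d": {"c"}, "e": {"d"}}
chain = pack_vliw(ops, chain_deps)

# Five fully independent ops.
independent = pack_vliw(ops, {})

print(len(chain), utilization(chain))
print(len(independent), utilization(independent))
```

In this toy model the dependent chain needs five bundles with one slot used in each, while the independent ops fit into a single full bundle. General code looks a lot more like the first case than the second.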
http://perspectives.mvdirona.com/2009/03/18/HeterogeneousComputingUsingGPGPUsAMDATIRV770.aspx
In general-purpose computing you will not see that typical usage of 4.2 of the 5 ALUs; it will be closer to 1 more often than not, and hence the real GFLOPS on the ATI cards with this design are 1/5th or 2/5ths of the peak throughput.
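The arithmetic behind those fractions, as a quick Python sketch. The function name `effective_fraction` is mine, and it assumes throughput scales linearly with the average number of ALU slots filled per instruction, which is a simplification.

```python
PIPELINES = 160          # VLIW pipelines, from the figures above
ALUS_PER_PIPELINE = 5    # 5-wide VLIW
TOTAL_ALUS = PIPELINES * ALUS_PER_PIPELINE  # the 800 "cores"


def effective_fraction(avg_slots_filled, width=ALUS_PER_PIPELINE):
    """Fraction of peak throughput reached when, on average,
    `avg_slots_filled` of the `width` ALU slots do useful work.
    Assumes linear scaling (a simplification)."""
    return avg_slots_filled / width


for slots in (4.2, 2, 1):
    print(slots, "slots filled:", round(effective_fraction(slots) * 100), "% of peak")
```

Multiply the fraction by whatever peak GFLOPS figure you use for the card to get the effective number.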
Also, when a special function must be calculated, you lose one of those ALUs (the fat one) for many clocks (probably you lose the entire SP), whereas the Nvidia card can do both the SF and the ALU operation. This is not the famous dual-issue; it can always be done as long as the SF function and the thread being executed in the ALUs were issued in different clocks.