Have you ever read any game developer's blog? They write specific code not only for each brand, but for almost every card architecture. Under TWIMTBP they may optimize the Nvidia-specific code paths better, because Nvidia gives them extensive support. In that respect the code for Nvidia hardware may be better optimized, but each card still gets its own code.
Long before TWIMTBP, many first-tier game developers said that the way 3DMark did things WAS NOT how they were going to do things in the future. AFAIK this never changed, so the same applies to 3DMark 06. Saying that Ati's architecture is any better based on this kind of benchmark assumes that those benchmarks are doing things right, which they aren't.
This is something that has always bugged me. People say that game developers are writing code "Nvidia's way", but they never stop to think that MAYBE Nvidia is designing its hardware the "game developers' way", while Ati may not be. TWIMTBP is a two-way relationship. Over time Ati has become better (comparatively) in benchmarks and worse in games. Everybody blames TWIMTBP for this, without taking each company's design decisions into account.

For a simple example, Ati's superscalar, VLIW and SIMD shader processors are A LOT better suited to the HOMOGENEOUS CODE of a static benchmark than to the ever-changing code of a game. The same goes for R600 and one of its biggest flaws, its TMUs: benchmarks are a lot more favorable than games, since you can "guess" which texture comes next and you don't have to care about the textures that have already been used. In a game you don't know where the camera will head next, so you don't know which textures you can discard. In reality neither architecture can "guess" the next texture, but G80/G92, with its greater texturing power, can react better to texture changes, while in benchmarks R600/RV670 can mitigate the problem with streaming.
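To make that last point concrete, here's a minimal sketch of why a scripted camera path helps with streaming. It's purely illustrative and not taken from any real engine or benchmark; the names (CameraKeyframe, prefetchAlongScriptedPath, the lookahead parameter) are all hypothetical. The idea is just that when the whole fly-by is known in advance, the textures needed a few frames from now can be queued for upload before they are ever sampled.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct TextureId { int id; };

// Hypothetical keyframe of a pre-recorded benchmark fly-by: the set of
// textures visible at that point is known before the run even starts.
struct CameraKeyframe {
    std::vector<TextureId> visibleTextures;
};

// Queue for upload every texture the scripted path will need within the next
// `lookahead` keyframes. A benchmark can do this because `path` is fixed in
// advance; a game with a free camera has no such array beyond the current frame.
void prefetchAlongScriptedPath(const std::vector<CameraKeyframe>& path,
                               std::size_t currentFrame,
                               std::size_t lookahead,
                               std::vector<TextureId>& streamQueue)
{
    const std::size_t end = std::min(path.size(), currentFrame + lookahead);
    for (std::size_t f = currentFrame; f < end; ++f) {
        for (TextureId tex : path[f].visibleTextures) {
            streamQueue.push_back(tex); // schedule the upload before it's sampled
        }
    }
}
```

In a game the renderer only finds out what the camera needs when it needs it, so texture misses have to be absorbed by raw TMU throughput rather than hidden by streaming, which is exactly where G80/G92's extra texturing power pays off.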