ATI Radeon HD 4800 Series Video Cards Specs Leaked

Joined
Apr 7, 2008
Messages
633 (0.10/day)
Location
Australia
System Name _Speedforce_ (Successor to Strike-X, 4LI3NBR33D-H, Core-iH7 & Nemesis-H)
Processor Intel Core i9 7980XE (Lapped) @ 5.2Ghz With XSPC Raystorm (Lapped)
Motherboard Asus Rampage VI Extreme (XSPC Watercooled) - Custom Heatsinks (Lapped)
Cooling XSPC Custom Water Cooling + Custom Air Cooling (From Delta 220's TFB1212GHE to Spal 30101504&5)
Memory 8x 8Gb G.Skill Trident Z RGB 4266MHz @ 4667Mhz (2x F4-4266C17Q-32GTZR)
Video Card(s) 3x Asus GTX1080 Ti (Lapped) With Customised EK Waterblock (Lapped) + Custom heatsinks (Lapped)
Storage 1x Samsung 970 EVO 2TB - 2280 (Hyper M.2 x16 Card), 7x Samsung 860 Pro 4Tb
Display(s) 6x Asus ROG Swift PG348Q
Case Aerocool Strike X (Modified)
Audio Device(s) Creative Sound BlasterX AE-5 & Aurvana XFi Headphones
Power Supply 2x Corsair AX1500i With Custom Sheilding, Custom Switching Unit. Braided Cables.
Mouse Razer Copperhead + R.A.T 9
Keyboard Ideazon Zboard + Optimus Maximus. Logitech G13.
Software w10 Pro x64.
Benchmark Scores pppft, gotta see it to believe it. . .
I agree, stuff the power consumption, give me a raw beast!
 

Morgoth

Fueled by Sapphire
Joined
Aug 4, 2007
Messages
4,237 (0.67/day)
Location
Netherlands
System Name Wopr "War Operation Plan Response"
Processor 5900x ryzen 9 12 cores 24 threads
Motherboard aorus x570 pro
Cooling air (GPU Liquid graphene) rad outside case mounted 120mm 68mm thick
Memory kingston 32gb ddr4 3200mhz ecc 2x16gb
Video Card(s) sapphire RX 6950 xt Nitro+ 16gb
Storage 300gb hdd OS backup. Crucial 500gb ssd OS. 6tb raid 1 hdd. 1.8tb pci-e nytro warp drive LSI
Display(s) AOC display 1080p
Case SilverStone SST-CS380 V2
Audio Device(s) Onboard
Power Supply Corsair 850MX watt
Mouse corsair gaming mouse
Keyboard Microsoft brand
Software Windows 10 pro 64bit, Luxion Keyshot 7, fusion 360, steam
Benchmark Scores timespy 19 104
rwar, more power consumption = more heat = less overclock = nuclear meltdown when overclocked
 

imperialreign

New Member
Joined
Jul 19, 2007
Messages
7,043 (1.11/day)
Location
Sector ZZ₉ Plural Z Alpha
System Name УльтраФиолет
Processor Intel Kentsfield Q9650 @ 3.8GHz (4.2GHz highest achieved)
Motherboard ASUS P5E3 Deluxe/WiFi; X38 NSB, ICH9R SSB
Cooling Delta V3 block, XPSC res, 120x3 rad, ST 1/2" pump - 10 fans, SYSTRIN HDD cooler, Antec HDD cooler
Memory Dual channel 8GB OCZ Platinum DDR3 @ 1800MHz @ 7-7-7-20 1T
Video Card(s) Quadfire: (2) Sapphire HD5970
Storage (2) WD VelociRaptor 300GB SATA-300; WD 320GB SATA-300; WD 200GB UATA + WD 160GB UATA
Display(s) Samsung Syncmaster T240 24" (16:10)
Case Cooler Master Stacker 830
Audio Device(s) Creative X-Fi Titanium Fatal1ty Pro PCI-E x1
Power Supply Kingwin Mach1 1200W modular
Software Windows XP Home SP3; Vista Ultimate x64 SP2
Benchmark Scores 3m06: 20270 here: http://hwbot.org/user.do?userId=12313
yep - Intel has proven this (except for the less OC bit) with the Prescott lineup :p
 

HAL7000

New Member
Joined
Jul 28, 2007
Messages
263 (0.04/day)
Location
Nashville TN
Processor AMD Athlon 64 X2 6000+ Windsor 3.0GHz
Motherboard ECS KA3 MVP
Cooling stock
Memory Mushkin eXtreme Performance 2GB (2 x 1GB)
Video Card(s) X1900GT
Storage SAMSUNG SpinPoint P Series SP2004C 200GB
Display(s) Acer AL2051W 20" 8ms DVI Widescreen LCD Monitor
Case IN WIN IW-F430.RL Red
Audio Device(s) Creative Audigy gamers edition
Power Supply PC Power & Cooling Silencer 750 Quad - Copper
Software XP Pro w/SP3
Understood, assuming of course that you plan on overclocking, and depending on how you cool your system. The energy I was referring to would not even come near what the Prescott consumed.
My point was simple: I don't care about saving energy to power my system up.
I would just like AMD to release something worth building (for myself). I hope the 4870 X2 and whatever else they decide to release is not just a play on words. I've come really close to joining the dark side many times, but I will hold off this one last time.
 
Joined
Aug 30, 2006
Messages
7,221 (1.08/day)
System Name ICE-QUAD // ICE-CRUNCH
Processor Q6600 // 2x Xeon 5472
Memory 2GB DDR // 8GB FB-DIMM
Video Card(s) HD3850-AGP // FireGL 3400
Display(s) 2 x Samsung 204Ts = 3200x1200
Audio Device(s) Audigy 2
Software Windows Server 2003 R2 as a Workstation now migrated to W10 with regrets.
your link said:
In terms of performance, we heard some interesting claims. A 4870 should perform on par with or better than a dual-chip 3870 X2. Our sources explained to us that using a PCIe Gen1 controller 3870 X2 was a mistake, since the board was hungry for data and didn't sync well with this interface
I'd be delighted if the 4870 really was as fast as 2x 3870 in CrossFire. (A 3870X2 is actually clocked as a 3850 and should really be called 3850X2.)

But I don't think the reasons they give will result in such a performance gain:

#1. 480 vs. 320 shaders = 50% improvement in the BEST POSSIBLE situation, i.e. purely shader limited.

#2. 32 vs. 16 TMUs = 100% improvement... now I actually think THIS is going to have a bigger impact.

#3. 16 ROPs vs. 16 ROPs = no change here or to the architecture.

#4. PCIe v1.0 controller? Well, check my benchies... my AGP card is as fast as a PCIe x16 card, given a similar processor and processor speed. No. The interface is irrelevant UNLESS the graphics assets are in system memory and not on the card.

#5. As I have always said, there will be increases associated with increased clocks, but points 1-4 refer to clock-for-clock gains.

Net net? 50%-100% improvement IN THE BEST situation (clock for clock), depending on where the limit was, i.e. shader limit or resolution limit.

On average? Less than 50%.

In practice? For the average person, FPS at, say, 1280x1024 will not improve by more than 20-30%. But you WILL BE ABLE to dial up much higher FSAA and AA without a performance penalty. (And PLEASE read that as "much performance penalty". It's a relative comment, not supposed to mean exactly 100% the same performance :rolleyes:)
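
For what it's worth, here is a rough way to put numbers on that reasoning. It is only a sketch: it treats frame time as a sum of shader-, texture- and ROP-bound portions (a serial, Amdahl-style model that real GPUs only loosely follow), and the time split below is made up for illustration, not measured. Only the unit counts come from the rumoured specs above.

```python
def overall_speedup(time_fractions, unit_speedups):
    """Amdahl-style estimate: each slice of frame time shrinks by its own factor."""
    assert abs(sum(time_fractions.values()) - 1.0) < 1e-9, "fractions must sum to 1"
    new_time = sum(frac / unit_speedups[name] for name, frac in time_fractions.items())
    return 1.0 / new_time


# Clock-for-clock unit ratios from the rumoured specs quoted in the post.
UNIT_SPEEDUPS = {"shader": 480 / 320, "texture": 32 / 16, "rop": 16 / 16}

# ASSUMPTION: hypothetical frame-time split for 1280x1024 with no AA.
low_res = {"shader": 0.3, "texture": 0.2, "rop": 0.5}

# Best case for point #1: a purely shader-limited workload.
shader_bound = {"shader": 1.0, "texture": 0.0, "rop": 0.0}

print(round(overall_speedup(low_res, UNIT_SPEEDUPS), 2))       # 1.25
print(round(overall_speedup(shader_bound, UNIT_SPEEDUPS), 2))  # 1.5
```

With that assumed split the gain lands at about 25%, inside the 20-30% range guessed above, but a different split gives a different answer, which is the whole point of the argument.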
 
Joined
Mar 1, 2008
Messages
284 (0.05/day)
Location
Antwerp, Belgium
lemonadesoda, if you don't mind, I have to correct you.

First off, the 3870X2 GPUs are clocked at 825MHz. The 3870 is clocked at 777MHz. So I don't know why you compare it with a 3850 (670MHz, btw).
Check this out for reference: http://techreport.com/articles.x/14284/5
So a 3870X2 is only 4% slower than 3870 CF. The reason it's slower is that only one CF bridge is connected onboard and one is still free for CF-X. Normal CF uses two bridges.
All things aside, 4% is nothing. So if they say it's as fast as or faster than a 3870X2, that means around 70% faster than a 3870. That's what that sentence means. Nothing more, nothing less.

A couple of things I need to rectify (again):
You shouldn't compare the number of shaders but the GFlops they can compute.
RV670 = 497 GFlop
RV770 = 1008 GFlop
So shader power is increased by more than 100%.

Also, I don't care what a GPU can do at 1280x1024. That resolution is mostly CPU bound.
If you want to compare GPUs, you need to go over 1600x1200. That's just the way it is.
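
For anyone wondering where a number like 497 GFlop comes from, here is a minimal sketch of the usual peak-throughput arithmetic. It assumes each stream processor retires one multiply-add (2 flops) per shader clock, which is the common convention for these chips; the function and variable names are just for illustration.

```python
def peak_gflops(stream_processors, shader_clock_ghz, flops_per_clock=2):
    """Theoretical peak: units x flops per clock x clock (GHz) = GFLOPS."""
    return stream_processors * flops_per_clock * shader_clock_ghz


# RV670 (HD 3870): 320 stream processors at the 777 MHz quoted in the post.
rv670 = peak_gflops(320, 0.777)
print(round(rv670, 1))  # 497.3

# Relative increase using the two figures quoted in the post (497 and 1008).
increase_pct = (1008 / 497 - 1) * 100
print(round(increase_pct, 1))  # 102.8
```

These are theoretical peaks, so they say how much shader math the chip could do, not how many frames it will actually deliver.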
 
Joined
May 9, 2006
Messages
2,116 (0.31/day)
System Name Not named
Processor Intel 8700k @ 5Ghz
Motherboard Asus ROG STRIX Z370-E Gaming
Cooling DeepCool Assassin II
Memory 16GB DDR4 Corsair LPX 3000mhz CL15
Video Card(s) Zotac 1080 Ti AMP EXTREME
Storage Samsung 960 PRO 512GB
Display(s) 24" Dell IPS 1920x1200
Case Fractal Design R5
Power Supply Corsair AX760 Watt Fully Modular
Children, no fighting until the cards are released, please. Then we can find out what they're actually capable of.
 

DarkMatter

New Member
Joined
Oct 5, 2007
Messages
1,714 (0.27/day)
Processor Intel C2Q Q6600 @ Stock (for now)
Motherboard Asus P5Q-E
Cooling Proc: Scythe Mine, Graphics: Zalman VF900 Cu
Memory 4 GB (2x2GB) DDR2 Corsair Dominator 1066Mhz 5-5-5-15
Video Card(s) GigaByte 8800GT Stock Clocks: 700Mhz Core, 1700 Shader, 1940 Memory
Storage 74 GB WD Raptor 10000rpm, 2x250 GB Seagate Raid 0
Display(s) HP p1130, 21" Trinitron
Case Antec p180
Audio Device(s) Creative X-Fi PLatinum
Power Supply 700W FSP Group 85% Efficiency
Software Windows XP
Every time you post about this, you demonstrate your lack of knowledge on the matter. The X2 with HD3850 clocks? My God. Whatever, I don't want to fight again; I will only try to explain why they could say PCIe 1 wasn't enough and why it is so important.

The interface between the card and the system is one thing; the one inside the card, the PCIe bridge, is a completely different one. The one they are referring to is the bridge chip between the two RV670 cores. They are using PCIe just as they could have used HyperTransport or another interconnect. They used it for driver compatibility, I'm 99.99% sure. I guess they are using it to communicate between the cores (obviously), but most importantly to get some kind of cache coherency between them. Why is this coherency important? Because that way one core can use the info calculated by the other. AFAIK normal CrossFire (and SLI) does little of this: each card renders odd frames or lines or something (pixel quads, clusters, whatever, let's call them pixel arrays), while the other renders the even parts. If the array in core 1 takes 10 times longer to render than the one in core 2, you lose a lot of time waiting. You need some kind of communication between them to let core 2 continue the work of core 1 without making a mess. PCIe bandwidth, while more than enough for texture and general data transfers between main memory and the card, is pretty slow for that kind of work. For comparison, PCIe 1.1 has a maximum bandwidth of 4 GB/s (x16, in each direction), while typical CPU caches are around 30-50 GB/s. PCIe 2.0 increases that to 8 GB/s, which is still far off but definitely better. Only ATI knows why they used PCIe 1 in the first place, knowing this, but they learnt and moved ahead. Let's hope it turns out better this time around.
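
The 4 GB/s and 8 GB/s figures can be checked with a few lines. This is just a sketch of the standard link arithmetic for PCIe 1.x and 2.0 (2.5 and 5 GT/s per lane with 8b/10b encoding); the function name is made up for illustration, and the result is the theoretical rate in one direction, before protocol overhead.

```python
# Raw signalling rate per lane, in gigatransfers per second.
TRANSFER_RATE_GT_S = {1: 2.5, 2: 5.0}


def pcie_bandwidth_gb_s(generation, lanes=16):
    """Theoretical one-direction bandwidth of a PCIe 1.x/2.0 link in GB/s."""
    line_rate = TRANSFER_RATE_GT_S[generation]
    data_rate_gbit = line_rate * 8 / 10  # 8b/10b: 8 data bits per 10 bits on the wire
    return data_rate_gbit / 8 * lanes    # bits -> bytes, then scale by lane count


for gen in (1, 2):
    print(f"PCIe {gen}.x x16: {pcie_bandwidth_gb_s(gen):.1f} GB/s per direction")
```

Either way, both numbers sit well below the 30-50 GB/s quoted for CPU caches, which is the comparison the post is making.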
 
Joined
Aug 30, 2006
Messages
7,221 (1.08/day)
System Name ICE-QUAD // ICE-CRUNCH
Processor Q6600 // 2x Xeon 5472
Memory 2GB DDR // 8GB FB-DIMM
Video Card(s) HD3850-AGP // FireGL 3400
Display(s) 2 x Samsung 204Ts = 3200x1200
Audio Device(s) Audigy 2
Software Windows Server 2003 R2 as a Workstation now migrated to W10 with regrets.
@milli,

My bad, I read elsewhere that the 3870X2 "is actually 3850X2 but marketed as 3870X2". Yep, I should be more careful about what info I pick up and pass on. Thanks for the correction.

RV670 = 497 GFlop
RV770 = 1008 GFlop
So shader power is increased by more than 100%.

If that is true, then GREAT! But hasn't RV770 been advertised as having no architectural change? For a 50% increase in shaders to give a 100% increase in power *must* require quite a different architectural approach. If that's true, then the 4870 will be a winner, baby!

@darkmatter,

Every time, you prove your lack of diplomacy. Man, they must have been tough on you at school.
 

DarkMatter

New Member
Joined
Oct 5, 2007
Messages
1,714 (0.27/day)
Processor Intel C2Q Q6600 @ Stock (for now)
Motherboard Asus P5Q-E
Cooling Proc: Scythe Mine, Graphics: Zalman VF900 Cu
Memory 4 GB (2x2GB) DDR2 Corsair Dominator 1066Mhz 5-5-5-15
Video Card(s) GigaByte 8800GT Stock Clocks: 700Mhz Core, 1700 Shader, 1940 Memory
Storage 74 GB WD Raptor 10000rpm, 2x250 GB Seagate Raid 0
Display(s) HP p1130, 21" Trinitron
Case Antec p180
Audio Device(s) Creative X-Fi PLatinum
Power Supply 700W FSP Group 85% Efficiency
Software Windows XP
:shadedshu I explained why it has double the shader power in the very first post that sparked our discussion. Maybe I lack diplomacy, but at least I listen to (read) others and learn. I don't talk at length while knowing zero about the matter, or argue against others' opinions with arbitrary numbers pulled out of thin air.
 
Joined
Aug 30, 2006
Messages
7,221 (1.08/day)
System Name ICE-QUAD // ICE-CRUNCH
Processor Q6600 // 2x Xeon 5472
Memory 2GB DDR // 8GB FB-DIMM
Video Card(s) HD3850-AGP // FireGL 3400
Display(s) 2 x Samsung 204Ts = 3200x1200
Audio Device(s) Audigy 2
Software Windows Server 2003 R2 as a Workstation now migrated to W10 with regrets.
Refer to point #5. You are taking my comments out of context. I'm talking about clock-for-clock performance increases. Until the boards are out in the channels, we don't know the clocks, so we can only make assessments based on KNOWN architectural changes, while the clock effects are guesswork until we know what they are. My comments have always been very clearly stated as changes at the same clocks... so please go back and "read" before getting so hot under the collar!

3870X2 = 3850X2 with overclock. Fact. Both use GDDR3. Put the 3870X2 at the same core clocks as 2x 3870 in CrossFire (on GDDR4) and which will win?
 

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,244 (7.54/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
The 'different' architecture comes in the form of the shaders having their own clock generator: the shaders are clocked well above 1 GHz while the geometry domain stays below 800 MHz.
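
As a sanity check on the "well above 1 GHz" part, the shader clock implied by the figures quoted earlier in the thread (1008 GFlop from 480 shaders, 497 GFlop from 320) can be backed out directly. This is a sketch that assumes 2 flops per shader per clock; the names are illustrative only, and the inputs are the rumoured numbers, not confirmed specs.

```python
def implied_shader_clock_ghz(gflops, stream_processors, flops_per_clock=2):
    """Back out the shader clock from a peak GFLOPS figure."""
    return gflops / (stream_processors * flops_per_clock)


rv770_clock = implied_shader_clock_ghz(1008, 480)  # from the rumoured RV770 figure
rv670_clock = implied_shader_clock_ghz(497, 320)   # from the RV670 figure

unit_ratio = 480 / 320                   # 1.5x from more shaders
clock_ratio = rv770_clock / rv670_clock  # the rest comes from the faster shader clock
total_ratio = unit_ratio * clock_ratio

print(round(rv770_clock, 3))  # 1.05
print(round(clock_ratio, 2))  # 1.35
print(round(total_ratio, 2))  # 2.03
```

So the doubling splits into roughly 1.5x from the extra shaders and 1.35x from the separate shader clock, with no extra magic needed.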
 

DarkMatter

New Member
Joined
Oct 5, 2007
Messages
1,714 (0.27/day)
Processor Intel C2Q Q6600 @ Stock (for now)
Motherboard Asus P5Q-E
Cooling Proc: Scythe Mine, Graphics: Zalman VF900 Cu
Memory 4 GB (2x2GB) DDR2 Corsair Dominator 1066Mhz 5-5-5-15
Video Card(s) GigaByte 8800GT Stock Clocks: 700Mhz Core, 1700 Shader, 1940 Memory
Storage 74 GB WD Raptor 10000rpm, 2x250 GB Seagate Raid 0
Display(s) HP p1130, 21" Trinitron
Case Antec p180
Audio Device(s) Creative X-Fi PLatinum
Power Supply 700W FSP Group 85% Efficiency
Software Windows XP
The point is that you can't use clock-for-clock comparisons because RV770 will run faster, and that's in fact one of the advancements of the new chips. Minor changes in internal units can affect how high they clock, and improvements in the fab process (within the same process) can help in obtaining higher stable clocks. It's like saying that the HD3850 is as fast as the HD3870 with the argument that if run at the same speeds they will be equally fast. Wait, you did. Well, it's essentially true, but the HD3850 can only dream of clocking as high as the HD3870, so it's pointless to compare them clock for clock and claim no difference in performance.

Given the same architecture, higher clocks, and more shaders, I think these are the performance implications:

1./ Broadly similar performance at standard resolutions, e.g. 1280x1024, with no AA/FSAA effects, since there are no architectural changes
2./ General improvement in line with clock-for-clock increases: 10-20%
3./ The increase to 32 TMUs will mean that the cards won't CHOKE at higher resolutions. They will be able to handle 1920x1200 without hitting the wall
4./ Currently you can dial up 4x AA without any performance hit. With the extra shaders you can now do the same at 1920x1200
5./ With the extra shaders, you will be able to dial up 8x or 16x at 1280x1024 without a significant hit.
6./ The GPU will run hotter and require more power
7./ Compensated by using GDDR5 memory that will require less power and run a bit cooler

Net net... get the GDDR5 model.

Will there be a "jump" in performance like we saw between the x19xx series and hd38xx? No.

Tell me where you stated you were talking about clock for clock. You didn't up until now; in fact the post above makes me think you had taken clocks into account, since they're the only thing you say will improve performance in the HD4000 series. Nor can we read anything about a clock-for-clock comparison in the next posts, until post #256. Even then you overlook the fact that the shaders are running a lot faster and say 50% is THE BEST POSSIBLE improvement in this area. But that's not the worst part. The worst part is that after saying there's a 50% improvement in shaders and a 100% improvement in textures, you come to the conclusion that the performance gain will be LESS than 50%, in fact around 20-30%! :eek:

How can that be? Well, since GDDR5 would make memory bandwidth double that of the HD3000 series, there's only raster power left. You could have argued about the weight of ROPs in the final performance and said that my thoughts about them were wrong, which could be true AT HIGHER RESOLUTIONS, and not at 1280x as you are saying. If shader and texture power is double that of RV670, there is no reason to say it won't be 2x faster, especially at lower resolutions where ROPs don't count as much. Everything you said is inconsistent, and that's why I say you don't know about this. That, and the fact that, judging by your posts, you act as if functional units (ROP, SP, TMU) and clocks were independent and had nothing to do with each other, or something of the like. As if shaders only did AA, as if extra TMUs only worked when bigger textures are loaded and were idle otherwise, etc. An example of this is your reply when MrMilli said RV770 has double the GFlops:

If that is true, then GREAT! But hasn't RV770 been advertised as having no architectural change? For a 50% increase in shaders to give a 100% increase in power *must* require quite a different architectural approach. If that's true, then the 4870 will be a winner, baby!

Maybe I'm misunderstanding this, but it seems as if you took that as magic. As if there *must* be something underlying it, something shady. It demonstrates your lack of knowledge, IMO.
 
Joined
Aug 30, 2006
Messages
7,221 (1.08/day)
System Name ICE-QUAD // ICE-CRUNCH
Processor Q6600 // 2x Xeon 5472
Memory 2GB DDR // 8GB FB-DIMM
Video Card(s) HD3850-AGP // FireGL 3400
Display(s) 2 x Samsung 204Ts = 3200x1200
Audio Device(s) Audigy 2
Software Windows Server 2003 R2 as a Workstation now migrated to W10 with regrets.
It's a basic analytic approach: separate the independent factors to understand where the gains are coming from. An analogy is with CPUs.

Comparing P4 to Core 2, you can just go Chip A vs. Chip B. Oh look, chip B is faster. Or you can break it down and analyse the performance on things that you can set independently. Much better to compare A and B at the same clock first, to see the architectural gains, then observe the additional gain/loss from different clock speeds. Likewise (in the CPU world) with the amount of cache or the number of cores.

With the RV770, you can break it down to:

1./ Increase in shaders ---> impact, and in what situations
2./ Increase in TMU ---> impact, and in what situations
3./ Increase in ROP ---> impact, and in what situations
4./ Change in memory type ---> impact
5./ Increase in clocks ---> impact (unknown at the start of the thread, although there is strong speculation now about what they will be; until the cards are in retail channels, we really don't know what is "consumer stable" from the product manufacturers).

With the RV770, what is ATi trying to address? The shader and texture "wall" at high resolutions, for greater FPS. For regular resolutions? The benefit is being able to dial up higher AA and FSAA. I still hold the view that at a regular 1280x1024 with no (or low) AA/FSAA, the performance gains will be relatively small. At high resolutions like 1920x1200, or at 8x and 16x FSAA/AA, that's where the gain will be.

It's quite clear from the benchmarks that the RV770 *will* be very fast (leaked HD 4870 3DMark06 score: 21,223) :D.

I'm very happy to listen to any argument except the lame "you demonstrate lack of knowledge", "you're not very versed, are you?". I find it insulting, and your continued use of it demonstrates a major lack of politeness and a bellicose attitude.

Refer to post #218. Please do not try to rekindle old flames. This was dealt with. Turn off your microphone. The jury's out until the cards are in.
 

Mussels

Freshwater Moderator
Joined
Oct 6, 2004
Messages
58,413 (7.94/day)
Location
Oystralia
System Name Rainbow Sparkles (Power efficient, <350W gaming load)
Processor Ryzen R7 5800x3D (Undervolted, 4.45GHz all core)
Motherboard Asus x570-F (BIOS Modded)
Cooling Alphacool Apex UV - Alphacool Eisblock XPX Aurora + EK Quantum ARGB 3090 w/ active backplate
Memory 2x32GB DDR4 3600 Corsair Vengeance RGB @3866 C18-22-22-22-42 TRFC704 (1.4V Hynix MJR - SoC 1.15V)
Video Card(s) Galax RTX 3090 SG 24GB: Underclocked to 1700Mhz 0.750v (375W down to 250W))
Storage 2TB WD SN850 NVME + 1TB Sasmsung 970 Pro NVME + 1TB Intel 6000P NVME USB 3.2
Display(s) Phillips 32 32M1N5800A (4k144), LG 32" (4K60) | Gigabyte G32QC (2k165) | Phillips 328m6fjrmb (2K144)
Case Fractal Design R6
Audio Device(s) Logitech G560 | Corsair Void pro RGB |Blue Yeti mic
Power Supply Fractal Ion+ 2 860W (Platinum) (This thing is God-tier. Silent and TINY)
Mouse Logitech G Pro wireless + Steelseries Prisma XL
Keyboard Razer Huntsman TE ( Sexy white keycaps)
VR HMD Oculus Rift S + Quest 2
Software Windows 11 pro x64 (Yes, it's genuinely a good OS) OpenRGB - ditch the branded bloatware!
Benchmark Scores Nyooom.
Hey guys... we can stop. You're both offering advice/insight here, and conflicting. Why don't we just take bets until the first reviews come out, and see if it's 20-30% faster, or 50%+?

Either way, you can buy me cards and I'll do an independent review for you
 

DarkMatter

New Member
Joined
Oct 5, 2007
Messages
1,714 (0.27/day)
Processor Intel C2Q Q6600 @ Stock (for now)
Motherboard Asus P5Q-E
Cooling Proc: Scythe Mine, Graphics: Zalman VF900 Cu
Memory 4 GB (2x2GB) DDR2 Corsair Dominator 1066Mhz 5-5-5-15
Video Card(s) GigaByte 8800GT Stock Clocks: 700Mhz Core, 1700 Shader, 1940 Memory
Storage 74 GB WD Raptor 10000rpm, 2x250 GB Seagate Raid 0
Display(s) HP p1130, 21" Trinitron
Case Antec p180
Audio Device(s) Creative X-Fi PLatinum
Power Supply 700W FSP Group 85% Efficiency
Software Windows XP
Man, that's all well and good, but you then forget that if you analyse it and have an increase (2x in fact) in EVERY STAGE, then performance will increase in every (or most) situations!! Until you understand this, I feel I have to continue. Here's an example:

You have three guys making cheeseburgers: one does the meat (A), the second (B) does the cheese, and the third (C) takes those plus the bread and puts it all together.

Analytically:

- If we put two guys in place of A, we won't get any benefit, except in those situations where you need more than one guy, i.e. if you want to put two meat patties in each burger.

- If we use two B guys, the same happens, unless you want more cheese in the mix.

- Same with C, with the difference that there is evidence that C is, indeed, more than able to put together more burgers than A and B can provide.

A is the SPs, B is the TMUs and C is the ROPs. We could add a D guy who would supply the ingredients as well as carry away the finished burgers: the memory subsystem and platform, including chipset and CPU. So if we double A, B or C independently we won't get any benefit, but if we improve both A and B, and C can truly handle the new inflow of products (and again, there is evidence it could be that way), we will be able to provide either more burgers or the same number of burgers with more meat/cheese in each. Likewise, with the graphics card we will be able to provide either a more complex image (1920x1200, 4xAA, 16xAF) at the same speed or more frames of lesser complexity. BOTH!

EDIT: And following on with the example: you say we won't see as much of an improvement at low resolutions and AA/AF levels, and that's right, but not for the reasons you give. It's not because of guys A, B or C; it's because D is not able to supply the resources and carry away the large number of finished burgers that the others are generating! You have been told so already: it's because of the CPUs that you won't see such an improvement at those settings...
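
The burger line above can be written down as a tiny throughput model: the output rate of a pipeline is set by its slowest stage. This is only a sketch of the analogy, not of how a real GPU schedules work, and every rate below is a hypothetical number picked for illustration.

```python
def burgers_per_minute(meat, cheese, assembly, logistics):
    """Pipeline throughput is limited by the slowest stage.

    meat = A (SPs), cheese = B (TMUs), assembly = C (ROPs),
    logistics = D (memory subsystem, chipset and CPU).
    """
    return min(meat, cheese, assembly, logistics)


# ASSUMPTION: made-up stage rates; C and D start with headroom, as the post argues.
base = dict(meat=10, cheese=10, assembly=25, logistics=30)

double_a = dict(base, meat=20)                 # only A doubled
double_ab = dict(base, meat=20, cheese=20)     # A and B doubled
cpu_limited = dict(double_ab, logistics=15)    # D becomes the bottleneck (low res)

for name, line in [("base", base), ("2x A", double_a),
                   ("2x A+B", double_ab), ("2x A+B, slow D", cpu_limited)]:
    print(name, burgers_per_minute(**line))
```

With these numbers, doubling A alone changes nothing, doubling A and B doubles the output, and a slow D caps the gain at 1.5x, which is the CPU-bound case from the EDIT.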
 
Last edited:
Joined
Aug 30, 2006
Messages
7,221 (1.08/day)
System Name ICE-QUAD // ICE-CRUNCH
Processor Q6600 // 2x Xeon 5472
Memory 2GB DDR // 8GB FB-DIMM
Video Card(s) HD3850-AGP // FireGL 3400
Display(s) 2 x Samsung 204Ts = 3200x1200
Audio Device(s) Audigy 2
Software Windows Server 2003 R2 as a Workstation now migrated to W10 with regrets.
Oh man, you are the King!


"Have it your way!"
"Can you taste the fire?"

:roll:

Yes, its all down to where the bottleneck is. I guess we have different positions on where the bottleneck is... AND... we are looking at different points of the spectrum where gains (or roadblocks) will be.

  • When the GPU is shader constrained... RV770 fixes this
  • When the GPU is texture fill constrained... RV770 fixes this
  • When the GPU is memory constrained... RV770 fixes this in the XT version with GDDR5, but not in the RV770 Pro. And although GDDR5 has higher clocks and lower power consumption (important given the GPU core will need more power), it also has higher latency. We need to see benchmarks for the net effect.
  • When the GPU is vertex, polygon, z-plane or ROP constrained... RV770 does not fix this, except through the core "overclock", which is, in fact, pretty much the same as a regular overclocked 3870.
For each resolution, the impact of the above will be different.
For each FSAA, AA setting, the impact of the above will be different.

In some situations the improvement will be very small, in others up to 100%. But the 100% improvement will only exist if current performance is limited by THAT specific bottleneck.

It's going to be mixed results. 1920x1600 will be the REAL winner. But if you are on 1280x1024, I still maintain it won't be worth the upgrade UNLESS you are trying to get to 16xAA 16xFSAA. At 0x, 2x or 4x, I'm not convinced that, at 1280x1024, the performance improvement will be that great. Why? Because at that resolution and those FSAA/AA settings the GPU *is not* shader or TMU constrained. Anyway, I await with interest the first benchmarks that come out.
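
A minimal sketch of that argument, assuming a frame takes as long as its slowest stage and that the new chip doubles shader and texture capacity while leaving ROP capacity alone. All workloads and capacities below are invented for illustration, not taken from benchmarks.

```python
def frame_time(work, capacity):
    """Frame time is set by the slowest stage (work divided by capacity)."""
    return max(work[stage] / capacity[stage] for stage in work)


def speedup(work, old, new):
    """How many times faster the new part renders the same frame."""
    return frame_time(work, old) / frame_time(work, new)


# Hypothetical relative capacities: shader and texture doubled, ROPs unchanged.
old_gpu = {"shader": 1.0, "texture": 1.0, "rop": 1.0}
new_gpu = {"shader": 2.0, "texture": 2.0, "rop": 1.0}

# Hypothetical per-frame workloads.
low_settings = {"shader": 0.5, "texture": 0.5, "rop": 1.0}    # ROP limited
high_settings = {"shader": 2.0, "texture": 1.5, "rop": 1.0}   # shader limited

print(speedup(low_settings, old_gpu, new_gpu))   # 1.0, no gain
print(speedup(high_settings, old_gpu, new_gpu))  # 2.0, the full 100%
```

The full 100% only appears in the workload that was limited by a stage the new chip actually improves.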
 

DarkMatter

New Member
Joined
Oct 5, 2007
Messages
1,714 (0.27/day)
Processor Intel C2Q Q6600 @ Stock (for now)
Motherboard Asus P5Q-E
Cooling Proc: Scythe Mine, Graphics: Zalman VF900 Cu
Memory 4 GB (2x2GB) DDR2 Corsair Dominator 1066Mhz 5-5-5-15
Video Card(s) GigaByte 8800GT Stock Clocks: 700Mhz Core, 1700 Shader, 1940 Memory
Storage 74 GB WD Raptor 10000rpm, 2x250 GB Seagate Raid 0
Display(s) HP p1130, 21" Trinitron
Case Antec p180
Audio Device(s) Creative X-Fi PLatinum
Power Supply 700W FSP Group 85% Efficiency
Software Windows XP
We are getting somewhere in the end. :toast:
But first of all, we are not discussing the impact these cards will have on the PCs or games of today (i.e. whether someone who wants to upgrade will see a big improvement), but the actual power of the card. We have already said why you won't see a big improvement now, but you will when new CPUs/chipsets launch, a jump you won't get with the HD3870 because it's not as platform bottlenecked as the HD4000 will be.
And second, vertex and poly data are processed in the SPs, not in the ROPs, and I would also assume that since AA is done in the shaders, z-depth, or at least z-culling, is done in the SPs in the Radeons as well, so they don't have to repeat the work. Anyway, since geometry complexity has not been going up, according to the recent trend, that won't be a problem. If there's going to be any increase in geometry complexity, it will basically come from geometry shaders and tessellation. Shaders once again.

EDIT: Just to be clear, my point is that there's nothing to fix in the ROP arena. The reasons are given in previous posts, but basically:

1- Nvidia cards have less raster power: 16 ROPs @ 600 MHz vs. 16 ROPs @ 800 MHz. And even though they have to perform AA in the ROPs, they are still almost 50% faster (9800GTX).

2- You just don't increase everything, and I mean everything, only to leave that one part as a bottleneck...
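
For reference, the raster comparison in point 1 works out as follows, assuming the usual theoretical peak of one pixel per ROP per clock. The helper name `peak_fill_rate_mpix` is made up for this sketch; the ROP counts and clocks are the ones quoted above.

```python
def peak_fill_rate_mpix(rops, clock_mhz):
    """Theoretical peak pixel fill rate in Mpixels/s (one pixel per ROP per clock)."""
    return rops * clock_mhz


nvidia = peak_fill_rate_mpix(16, 600)   # 16 ROPs @ 600 MHz
radeon = peak_fill_rate_mpix(16, 800)   # 16 ROPs @ 800 MHz

print(nvidia, radeon, nvidia / radeon)  # 9600 12800 0.75
```

So on paper the 600 MHz part has only three quarters of the peak fill rate of the 800 MHz part, which is the point being made: raw ROP throughput is not what separates the two.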
 
Joined
Aug 30, 2006
Messages
7,221 (1.08/day)
System Name ICE-QUAD // ICE-CRUNCH
Processor Q6600 // 2x Xeon 5472
Memory 2GB DDR // 8GB FB-DIMM
Video Card(s) HD3850-AGP // FireGL 3400
Display(s) 2 x Samsung 204Ts = 3200x1200
Audio Device(s) Audigy 2
Software Windows Server 2003 R2 as a Workstation now migrated to W10 with regrets.
If "geometry" = "bump mapping" (in its broadest sense, and including the auto-tessellation concept first introduced by ATi as "TruForm"... yes, I owned a Radeon 8500) then yes, shaders can do this, and = great for games.

If "geometry" = "more complex objects" then no, shaders won't help, and = not so great for CAD.

TBH, I don't know how to interpret the Stream Processor (SP) comment in the RV770 architecture. How has the SP changed from R600 to R700? I really don't know. Given the comments about "no architectural change" with RV770, I assumed the SP was the same. I could well be wrong on this one.
 

DarkMatter

New Member
Joined
Oct 5, 2007
Messages
1,714 (0.27/day)
Processor Intel C2Q Q6600 @ Stock (for now)
Motherboard Asus P5Q-E
Cooling Proc: Scythe Mine, Graphics: Zalman VF900 Cu
Memory 4 GB (2x2GB) DDR2 Corsair Dominator 1066Mhz 5-5-5-15
Video Card(s) GigaByte 8800GT Stock Clocks: 700Mhz Core, 1700 Shader, 1940 Memory
Storage 74 GB WD Raptor 10000rpm, 2x250 GB Seagate Raid 0
Display(s) HP p1130, 21" Trinitron
Case Antec p180
Audio Device(s) Creative X-Fi PLatinum
Power Supply 700W FSP Group 85% Efficiency
Software Windows XP
Vertex (geometry) data has ALWAYS been processed in vertex shaders. Since R500 (Xenos), G80 and R600, with their unified shaders, this is done in the shader or stream processor, which packs vertex shaders, pixel shaders and geometry shaders into the same unit, so to speak.
More complex objects require more SPs, not more ROPs, in any way. You do need more ROPs for Z calculations, unless this is done in the SPs as I suggested. But AGAIN, vertex data is handled in the SPs, not the ROPs.

Also, tessellation means taking a simple model and making it more complex, in the sense of more vertices and polygons. It has nothing to do with bump mapping, except that it may use bump maps to have some sort of control over what that NEW vertex data looks like, instead of just doing the same as TurboSmooth does in 3ds Max, for example.
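
To make "more vertices and polygons" concrete, here is a minimal sketch of the simplest kind of subdivision: splitting every triangle into four at its edge midpoints. This is plain midpoint subdivision chosen for illustration, not what TurboSmooth or any GPU tessellator actually does, and the function names are hypothetical.

```python
def midpoint(a, b):
    """Point halfway between two vertices."""
    return tuple((x + y) / 2 for x, y in zip(a, b))


def subdivide(triangles):
    """Split every triangle into four smaller ones at its edge midpoints."""
    out = []
    for a, b, c in triangles:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        out += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return out


def unique_vertices(triangles):
    """Set of distinct vertices used by a list of triangles."""
    return {v for tri in triangles for v in tri}


mesh = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))]
once = subdivide(mesh)
twice = subdivide(once)

print(len(mesh), len(once), len(twice))   # triangle counts: 1 4 16
print(len(unique_vertices(mesh)), len(unique_vertices(once)),
      len(unique_vertices(twice)))        # vertex counts: 3 6 15
```

The new vertices here all sit on the original flat triangle; a displacement or bump map is one way to decide where they should move so the extra geometry actually adds detail.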
 
Joined
Mar 1, 2008
Messages
284 (0.05/day)
Location
Antwerp, Belgium
lemonadesoda, I firmly believe you must be pulling our leg.
If not, then please (I'm asking nicely), stop. Just stop, because everything you say is wrong.
For the sake of all of us, and to spare yourself the embarrassment, please stop.

<strike>Bumb</strike>(lol) Bump mapping (MAPPING: the word says it already) has nothing to do with geometry.
You are still connecting geometry to a T&L unit, which doesn't exist anymore in modern GPUs. It's emulated on the shaders.

About the shaders on the RV770: they run at 1050 MHz. That's why the GFLOPS figure increases so much.

That's the last time I'm going to correct you, and I'm not coming back to this thread. You ruined it.
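
A rough sketch of the clock argument, assuming the usual peak figure of 2 FLOPs (one multiply-add) per SP per clock. To isolate the effect of the clock, the SP count is held at the HD 3870's 320 in both cases; that is an assumption for illustration, not the RV770's unit count. The 1050 MHz figure is the one quoted above.

```python
def peak_gflops(stream_processors, clock_mhz, flops_per_clock=2):
    """Theoretical peak GFLOPS, assuming one multiply-add (2 FLOPs) per SP per clock."""
    return stream_processors * flops_per_clock * clock_mhz / 1000


hd3870 = peak_gflops(320, 775)           # HD 3870: 320 SPs at 775 MHz
same_sps_1050 = peak_gflops(320, 1050)   # same SP count at the quoted shader clock

print(hd3870, same_sps_1050)             # 496.0 672.0
print(round(same_sps_1050 / hd3870, 2))  # 1.35
```

Under these assumptions the clock alone is worth roughly a 35% higher peak figure; any increase in SP count would multiply on top of that.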
 

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,244 (7.54/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
lemonadesoda, I firmly believe you must be pulling our leg.
If not, then please (I'm asking nicely), stop. Just stop, because everything you say is wrong.
For the sake of all of us, and to spare yourself the embarrassment, please stop.

Bumb mapping (MAPPING: the word says it already) has nothing to do with geometry.
You are still connecting geometry to a T&L unit, which doesn't exist anymore in modern GPUs. It's emulated on the shaders.

About the shaders on the RV770: they run at 1050 MHz. That's why the GFLOPS figure increases so much.

That's the last time I'm going to correct you, and I'm not coming back to this thread. You ruined it.

Ehm, that's bump-mapping. Bumbs are the heavy things we all carry, there's not much to map, really, except occasional goose-pimples, hair and a deep gorge in the middle.
 
Joined
Aug 30, 2006
Messages
7,221 (1.08/day)
System Name ICE-QUAD // ICE-CRUNCH
Processor Q6600 // 2x Xeon 5472
Memory 2GB DDR // 8GB FB-DIMM
Video Card(s) HD3850-AGP // FireGL 3400
Display(s) 2 x Samsung 204Ts = 3200x1200
Audio Device(s) Audigy 2
Software Windows Server 2003 R2 as a Workstation now migrated to W10 with regrets.
[Two diagrams: "Traditional" pipeline and "Unified Shader" pipeline]

If you had a "screen render" that fitted into the existing "4 cycle" pipeline, with a single pass through each cycle of the rendering stage, as shown in the diagram, then increasing the number of shaders doesn't change anything. The spare capacity doesn't help. A low-FSAA, low-AA 1280x1024 scene can "fit in" the "4 cycle" path, with a single pass for each stage.

If you have a scene that is 1920x1200 with 16x, 16x, then a screen render will require more than one pass through each stage.

In instance A (the first case), clock speed will get you faster FPS. More shaders don't help much.

In instance B (the second case), increasing the shaders means more can be done in each pass, meaning fewer passes, ultimately getting down to just one single pass through each stage. Here, the gains come from the increased shaders in addition to the increased clocks.

That's how I've always understood it. If there is a fallacy in the logic... let me know.
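
The model as described can be sketched like this. It only restates the reasoning above under made-up numbers (work units, one unit per shader per pass, 4 cycles per pass) and is not a claim about how real hardware schedules work; the function names are hypothetical.

```python
import math


def passes_needed(work_units, shaders, units_per_shader=1):
    """Passes required when each shader handles a fixed amount of work per pass."""
    return math.ceil(work_units / (shaders * units_per_shader))


def frame_time_us(work_units, shaders, clock_mhz, cycles_per_pass=4):
    """Time per frame in microseconds: passes x cycles per pass / clock."""
    return passes_needed(work_units, shaders) * cycles_per_pass / clock_mhz


# Instance A: the scene already fits in a single pass (hypothetical 300 work units).
print(passes_needed(300, 320), passes_needed(300, 640))       # 1 1
print(frame_time_us(300, 320, 775) > frame_time_us(300, 320, 1050))  # True: only clock helps

# Instance B: a heavier scene (hypothetical 1280 work units).
print(passes_needed(1280, 320), passes_needed(1280, 640),
      passes_needed(1280, 1280))                              # 4 2 1
```

In this model, extra shaders only pay off while the pass count is above one, which is exactly the distinction between instance A and instance B.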
 