Monday, October 25, 2021

RTX 3090 failing in New World notes/thoughts

Based on experiences from repairing a Gigabyte RTX 3090 Vision.


Vcore VRM is controlled by a UP9511R

OCP seems to be set somewhere between 700A and 1200A.

VRM is made up of 10 AOZ5312UQI 60A DrMOS components rated for 10ms 80A peaks and 10us 120A peaks.

OCL is set to ~80A for most phases, ~130A for some and ~160A for the 2 teamed phases.

The failed powerstage was in one of the teamed phases.
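To put the OCL numbers next to the DrMOS ratings, here is a rough Python sketch. The ratings and thresholds are the approximate figures from the notes above. The number of powerstages sharing each threshold is my assumption: one stage per normal phase, and two stages splitting current evenly in a teamed phase. The notes don't confirm that, and the names are made up for illustration.

```python
# Ratings quoted above for a single AOZ5312UQI powerstage, in amps.
RATING_CONTINUOUS_A = 60
RATING_PEAK_10MS_A = 80
RATING_PEAK_10US_A = 120

# Approximate OCL thresholds from the notes.
# The third field (stages sharing the threshold) is an ASSUMPTION:
# 1 stage per normal phase, 2 stages sharing evenly in a teamed phase.
OCL_THRESHOLDS = [
    ("most phases", 80, 1),
    ("some phases", 130, 1),
    ("teamed phases", 160, 2),
]


def per_stage_current(threshold_a, stages):
    """Current each powerstage carries when the phase sits right at its OCL."""
    return threshold_a / stages


for label, threshold_a, stages in OCL_THRESHOLDS:
    amps = per_stage_current(threshold_a, stages)
    print(
        f"{label}: {amps:.0f}A per stage at OCL, "
        f"{amps / RATING_CONTINUOUS_A:.2f}x the continuous rating, "
        f"within 10ms peak rating: {amps <= RATING_PEAK_10MS_A}, "
        f"within 10us peak rating: {amps <= RATING_PEAK_10US_A}"
    )
```

Under those assumptions every threshold sits above the 60A continuous rating, so the OCL only steps in once a stage is already running in its short-peak range.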

 

Power limiting on Nvidia cards is based on average power draw as measured by the US5650/NCP45491 shunt resistor monitoring chips. These chips do not communicate with the VRM directly and are located before the input filtering components.

Furmark with Post FX turned on manages to max out the card's TDP at around 1200MHz core and 0.718V core. The initial start of Furmark slightly violates the card's TDP limit (100-103% as measured by GPU-Z at a 0.1 second refresh rate), after which it settles to around 95-97%. GPU-Z reports 97% GPU load.

I didn't test New World. I don't feel like buying it and setting it up. I also don't feel like replacing any more powerstages on the 3090 because it's a pain to do.

None of the 3DMark tests I've run have exceeded the card's TDP.

The card makes insane amounts of coil whine from 200FPS up to 1500FPS (3DMark Cloud Gate GT2). None of these high FPS loads violate TDP, and they run at higher clocks and voltage than the higher resolution, lower FPS loads.

Unigine Superposition at 1440p doesn't violate TDP. At 4K it very slightly exceeds TDP. At 8K it can peak as much as 12% above TDP.


I think that RTX 3090s fail in New World because some part of the rendering process manages to achieve very high core utilization. This almost certainly happens every single time a frame is rendered. Unlike Furmark, this high core utilization comes as a short burst, so it is more like the behaviour seen in 8K Superposition but likely even more intense. Since Nvidia's power management is primarily focused on maintaining a certain level of average power draw while mostly ignoring momentary peaks, New World doesn't cause as much downclocking as it probably should during the high intensity burst.

If the burst exceeds what the VRM can handle, the VRM probably isn't going to shut down, because OCL and OCP are set so high. Eventually the excessive current bursts lead to MOSFET failure. Cards with fuses end up unresponsive. Cards without fuses will either trip PSU SCP/OCP or start smoking, or both, during startup.
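Here is a toy Python sketch of why an average-based power limit can miss a short per-frame burst. Every number in it is made up for illustration (the TDP value, the 300W/600W levels, the 10-sample frame), and it is not a model of Nvidia's actual power management, just the averaging effect in isolation.

```python
def rolling_average(samples, window):
    """Trailing mean over the most recent `window` samples."""
    averages = []
    total = 0.0
    for i, sample in enumerate(samples):
        total += sample
        if i >= window:
            total -= samples[i - window]  # drop the sample that left the window
        averages.append(total / min(i + 1, window))
    return averages


# All values below are HYPOTHETICAL, picked only to show the effect.
TDP_W = 350.0                      # assumed power limit
FRAME = [300.0] * 9 + [600.0]      # one frame: 9 light samples, 1 heavy burst
power = FRAME * 20                 # 20 identical frames back to back

averages = rolling_average(power, window=len(FRAME))
steady_avg = averages[-1]
peak = max(power)

print(f"average: {steady_avg:.0f}W ({steady_avg / TDP_W:.0%} of TDP)")
print(f"peak:    {peak:.0f}W ({peak / TDP_W:.0%} of TDP)")
```

With these made-up numbers the average stays under the limit the whole time, while every frame contains a burst far above it. A limiter that only looks at the average never has a reason to downclock.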

I would also guess failures will get more common as monitor resolutions increase and games get more optimized for cards like the RTX 3090. Also the cards will just get older.