As a person who is constantly reviewing electronic designs and performing root-cause investigations, I come across many an electronic designer and reliability engineer. And while their focus and backgrounds are often different, there is one thing both parties seem to agree upon: the magical/mystical properties of derating.
Now, don’t get me wrong. Derating can be a very useful design for reliability (DfR) tool. It really, to some extent, was the very first DfR tool and is probably still the one most commonly used across the electronics industry (and, by comparison, far superior to the second most common DfR tool, reliability prediction using a handbook).
But, derating is not the answer to everything (NASA and DoD, are you listening?). As the case study below demonstrates, you can run the risk of treating every component problem as a nail if all you have is a derating hammer in your toolbox.
A customer came to DfR Solutions asking for assistance in identifying the root cause of failures of discrete field effect transistors (FETs) in a power supply. The design of the power supply, which is typical for high-power supplies, consisted of a power factor correction (PFC) boost stage with a zero voltage switching (ZVS) full bridge. This specific failure analysis came with two interesting twists.
The first twist was that while FETs were present on both circuits and failed on both circuits, they never failed in both topologies at the same time. The second was that the design and reliability teams had decided that they had the problem already solved. The answer? Derating! You see, the bus voltage was 390V and the FETs were rated to 500V. The design and reliability teams speculated that there must be some kind of ethereal voltage spike killing the FETs. They switched to 600V rated components and guess what? The fault disappeared! (Excuse me for using two exclamation points in one paragraph.) Problem solved, right? Wrong.
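The arithmetic behind the teams' reasoning can be sketched in a few lines of Python. This is only an illustration of the voltage stress ratio (applied voltage divided by rated voltage) using the 390V bus and the 500V and 600V ratings quoted above; the function name `stress_ratio` is a hypothetical helper, not something from the original investigation.

```python
def stress_ratio(applied_v: float, rated_v: float) -> float:
    """Return the applied voltage as a fraction of the rated voltage."""
    if rated_v <= 0:
        raise ValueError("rated voltage must be positive")
    return applied_v / rated_v


BUS_V = 390.0  # bus voltage quoted in the case study

# Compare the original 500V part with the 600V replacement.
for rated_v in (500.0, 600.0):
    ratio = stress_ratio(BUS_V, rated_v)
    headroom_v = rated_v - BUS_V
    print(f"{rated_v:.0f}V part: {ratio:.0%} of rating, {headroom_v:.0f}V headroom")
```

On paper, the 600V part simply looks like more headroom, which is why the fix appeared to confirm the overvoltage theory.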
First, voltage waveforms were measured on both the PFC and Bridge FETs. (See Figure 1 for the Bridge FET.) No voltage spikes above 500V were noted in either waveform. (Actually, there were no voltage spikes at all for the Bridge FET.) Several FETs, including both unused FETs and unfailed FETs from returned power supplies, were subjected to voltage breakdown testing. Voltage breakdown was defined as leakage current in excess of 125 µA. All the FETs measured had breakdown voltages in excess of 560V (this +10 to 15% margin over the rating is common in power devices).
To be absolutely sure excessive voltage was not playing a role in the field failures, several FETs were connected in parallel with the gate and source grounded and the drain connected to high voltage (520V ±1%). The duration of the test was 125 hours, after which the breakdown voltage of the FETs was measured. No failures were experienced and all FETs were within specifications after testing.
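As a quick sanity check on the numbers in the two paragraphs above, the following Python sketch computes the measured breakdown margin over the 500V rating and the range of the stress-test voltage. It assumes the "520 ±1%" figure is in volts and treats 560V as a lower bound on measured breakdown, as stated above; the helper names are hypothetical.

```python
RATED_V = 500.0          # FET voltage rating
MIN_BREAKDOWN_V = 560.0  # all measured parts broke down above this value
TEST_V_NOMINAL = 520.0   # stress-test voltage (assumed to be in volts)
TEST_TOLERANCE = 0.01    # +/-1% tolerance on the test voltage


def margin_over_rating(measured_v: float, rated_v: float) -> float:
    """Fractional margin of a measured breakdown voltage above the rating."""
    return (measured_v - rated_v) / rated_v


def tolerance_band(nominal: float, tolerance: float) -> tuple[float, float]:
    """Lowest and highest values allowed by a symmetric fractional tolerance."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


margin = margin_over_rating(MIN_BREAKDOWN_V, RATED_V)
test_low_v, test_high_v = tolerance_band(TEST_V_NOMINAL, TEST_TOLERANCE)

print(f"Breakdown margin over rating: at least {margin:.0%}")
print(f"Stress-test voltage range: {test_low_v:.1f}V to {test_high_v:.1f}V")
print(f"Test stays below measured breakdown: {test_high_v < MIN_BREAKDOWN_V}")
```

The point of the check is that the parts were held above their rating, yet still below their measured breakdown voltage, for the full 125 hours.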
To put the final nail in the coffin, failed units were decapsulated and the die surface was examined. As seen in Figure 2, the damaged area was far more extensive than would be expected for an overvoltage event, especially one driven by an instantaneous spike. The amount of energy required to induce that extent of damage suggested an overcurrent event. In fact, one of the unfailed FETs was subjected to an overcurrent event, and the failure morphology was similar to that of the failed FETs from the field.
But wait a minute! This doesn’t make sense. Derating solved the problem. Going to a higher voltage (and more expensive) part eliminated the failures. How do we explain this success if the applied voltage is always well below the breakdown voltage?
The first thing to realize is that FETs, and power components in general, are far more complicated than we often give them credit for. An example of this complexity can be seen in the image below, which tracks the voltage drop across the body diode in the FETs as a function of temperature. The 600V rated FET had a higher diode voltage drop than the 500V rated FET.
Now you might ask, “What does this have to do with the price of tea in China?” (Pardon the American idiom.) Everything.
The low forward voltage drop found in the 500V FET is typical of a slow reverse recovery diode. In FETs, these body diodes are normally slow, but they can be modified by doping the junction with gold or irradiating the device to damage the lattice, which reduces the minority carrier lifetime and stored charge. This in turn reduces the reverse recovery time of the diode so that it can be used in faster circuit applications, like the PFC boost in the power supply being investigated.
Recovery times are critical to successful operation, as a slow response can induce a parasitic transistor to turn on and allow high current to be drawn from drain to source when the FET is off. This results in overheating of the device similar to what was observed in the failed devices. The die and the source bond connection showed evidence of high current, yet there was no damage to the gate, which would have suggested overvoltage.
As a general rule, there are two mechanisms that can cause the parasitic transistor to turn on. One occurs when the FET turns on in the forward direction while the body diode is conducting, during which the FET has limited dV/dt capability. This explains the failures in the ZVS full bridge. The other is the dV/dt when the FET turns off, which must be limited because the parasitic transistor can turn on. In the boost PFC, when the FET turns off, the inductor dumps energy into the FET, causing the FET voltage to rise very quickly until the PFC diode conducts and the inductor charges the output capacitor. This rate of rise (dV/dt) on the drain-source of the FET is determined by the capacitance of the device and the available inductor energy. In high-current continuous mode PFC designs such as this one, the energy stored in the inductor is relatively high, and the dV/dt on the FETs will be high as well.
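To get a feel for the magnitudes involved at turn-off, here is a minimal first-order sketch in Python. It assumes the inductor current flows entirely into a fixed drain-source capacitance, so that dV/dt is roughly I / C. The 10 A inductor current and 200 pF capacitance are purely illustrative values and are not measurements from this power supply; real output capacitance also varies strongly with voltage, so treat the result as an order-of-magnitude estimate only.

```python
def dv_dt(inductor_current_a: float, capacitance_f: float) -> float:
    """First-order drain-source slew rate in V/s, assuming dV/dt = I / C."""
    if capacitance_f <= 0:
        raise ValueError("capacitance must be positive")
    return inductor_current_a / capacitance_f


# Illustrative values only (assumptions, not data from the case study).
INDUCTOR_CURRENT_A = 10.0  # inductor current at the moment of turn-off
CAPACITANCE_F = 200e-12    # effective drain-source capacitance, held constant
BUS_V = 390.0              # bus voltage quoted in the article

slew_v_per_s = dv_dt(INDUCTOR_CURRENT_A, CAPACITANCE_F)
rise_time_s = BUS_V / slew_v_per_s  # time to slew from 0V to the bus voltage

print(f"Estimated dV/dt: {slew_v_per_s / 1e9:.1f} V/ns")
print(f"Estimated rise time to {BUS_V:.0f}V: {rise_time_s * 1e9:.1f} ns")
```

Doubling the inductor current in this model doubles the slew rate, which is consistent with the observation that high-current continuous mode designs put more dV/dt stress on the FET.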
Now why did the higher voltage part have a faster recovery time? That is the million-dollar question that is not easy to answer because a component’s rating is not a science-based parameter. Ratings are more driven by marketing than engineering and this component manufacturer may have determined, through marketing, that the market for 600V rated FETs needed fast recovery and the market for 500V rated FETs did not.
Regardless, the truism still holds: never assume. Derating, whether for temperature or voltage, is a useful tool, but it can often mask the true problem and cost a lot more money when just a little more science and engineering can point to the real solution.
Craig Hillman is CEO and Managing Member of DfR Solutions. Dr. Hillman’s specialties include best practices in Design for Reliability (DfR), strategies for transitioning to Pb-free, supplier qualification (commodity and engineered products), passive component technology (capacitors, resistors, etc.), and printed board failure mechanisms. Dr. Hillman has over 40 publications and has presented on a wide variety of reliability issues to over 250 companies and organizations.