The use of the term ‘Highly Accelerated Life Testing,’ better known as HALT, originated in the late 1980s within two different sectors of the electronics industry.
Munikoti and Dhar of Nortel originally used the term HALT in 1987 to describe a solution to the accelerated life tests being performed at that time on ceramic capacitors. However, the far more popular use of the term began in 1988 (based on various marketing literature) with Gregg Hobbs of Hobbs Engineering. Though the mindset and the process go back several decades, Hobbs used the term to describe a sequential process of applying a variety of stresses to a product, either individually or in combination, to identify the weak links in the design.
Over time, HALT has evolved to describe a specific set of tests (hot step stress, cold step stress, vibration step stress, thermal cycling, combined) performed in a specific environmental test chamber (thermal cycling chamber with adjustable nozzles over a repetitive shock (RS) vibration table) at a specific point in the product development process (right after first prototype).
HALT can play a crucial role in ensuring a robust design early in the design process and in establishing constraints before certain aspects of the design are fixed. However, the success of HALT can be very dependent upon a complementary activity, root-cause analysis (RCA), because you cannot pass or fail HALT. A true HALT is the process of learning about the weak points of your design and determining what steps are necessary and sufficient to improve those limitations. Without the knowledge provided by root-cause investigation of HALT failures, that is, what failed and why, the value of HALT is greatly reduced.
A great example of the value of RCA and HALT can be seen in a case study of a HALT performed on an industrial power supply. The industrial power supply in question had an operating temperature range of 5˚C to 50˚C and a storage temperature range of -40˚C to 65˚C.
Cold step stress testing
During cold step stress testing the units experienced a variety of failure modes
The failure modes described above are well known for LCDs and are due to material transitions within the liquid crystal. The LCD manufacturer specified an operational range of -20˚C to 70˚C, and the displays operated as intended within that temperature range, which extends approximately 20˚C beyond the operating range of the power supply. The bigger concern is the permanent failure at -50˚C. This offers only a 10˚C margin beyond the -40˚C storage temperature limit and is not an expected failure mode for LCDs. The loss of illumination suggests an issue with the LED backlight. A likely failure mode is a hardening of the silicone potting around the LED, causing separation of one or more wire bond connections.
During hot step stress testing, the unit shut down at 110˚C and did not recover. Post-test failure analysis of Unit 2 identified the failure site as a DC/DC converter. The operating range of the DC/DC converter is given as -25˚C to 90˚C. The permanent failure was surprising, as recoverable failures typically occur at least 20˚C below the temperature of permanent failure. This would indicate two possible scenarios. The first is that the DC/DC converter experienced some type of recoverable failure that was not detectable by monitoring during HALT. This scenario is concerning, as it suggests that there is no margin beyond the specification on this DC/DC converter. The alternative scenario is that this DC/DC converter only experienced permanent failure.
In both scenarios, the failure temperature, whether around 90˚C or at 110˚C, is at least 40˚C above the operating specification of 50˚C. Considering the temperature difference between the operating specification and the failure temperature, there are two options. The first is to do nothing because sufficient margin was present. The second is to consider the DC/DC converter the weakest component and to determine if this may indicate a potential future risk.
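The margin arithmetic used in this case study can be written out as a minimal sketch. The temperatures are taken directly from the text; the function names `upper_margin` and `lower_margin` are hypothetical helpers introduced only for illustration, and the 90˚C figure represents the first scenario, in which an undetected recoverable failure occurred near the top of the converter's rated range.

```python
def upper_margin(failure_temp_c, spec_max_c):
    """Margin (in degrees C) between a hot-side failure and the upper spec limit."""
    return failure_temp_c - spec_max_c


def lower_margin(failure_temp_c, spec_min_c):
    """Margin (in degrees C) between a cold-side failure and the lower spec limit."""
    return spec_min_c - failure_temp_c


# Specification limits of the industrial power supply (from the case study)
OPERATING_MAX_C = 50.0
STORAGE_MIN_C = -40.0

# Hot step stress: permanent failure at 110 C; in the first scenario an
# undetected recoverable failure is assumed near the converter's 90 C rating.
hot_margin_permanent = upper_margin(110.0, OPERATING_MAX_C)
hot_margin_scenario_one = upper_margin(90.0, OPERATING_MAX_C)

# Cold step stress: permanent display failure at -50 C versus storage limit.
cold_margin_storage = lower_margin(-50.0, STORAGE_MIN_C)

print(hot_margin_permanent, hot_margin_scenario_one, cold_margin_storage)
```

A positive result means the failure occurred outside the specification; the small cold-side storage margin is the one that stands out against the much larger hot-side margins.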
During rapid thermal cycling, the power supply experienced a sticky relay. Intermittent failures should be taken very seriously whenever detected during product qualification. “Sticky” relays are often an indication of micro-welding, potentially due to timing issues or excessive current. In some manner, rapid thermal transitions may have aggravated the component or the circuit sufficiently to trigger this event, potentially indicating insufficient margin or robustness.
Several failures occurred during vibration step stress testing, including:
One of the most important questions in assessing the results of a HALT is determining their relevance. Since the operating environment of the industrial power supply is not expected to see vibration, the vibration step stress test is to some extent assessing the robustness of the design during shipping and transportation. Therefore, an appropriate root-cause evaluation must be based upon an understanding of the actual loads seen during shipping and transportation.
Three available Power Spectral Density (PSD) profiles for shipping are shown in Figure 1, Table 1 and Figure 2. [Note: Although this function is called “power” spectral density, power is not its unit of measurement. The term is used because the square of a fluctuating quantity very frequently enters into the expression for power (the Joule effect, for example). It would be preferable to speak of “acceleration spectral density” or even “acceleration density.”] We can see that the applied vibration loads during HALT are higher than the loading during transportation, but the duration is shorter. An equation provided in older versions of MIL-STD-810 relates the test times T1 and T2 to the corresponding PSD levels PSD1 and PSD2 through an exponent d, which is approximately equal to 6.4 for electronic equipment, and can be used to calculate an equivalency factor. This very rough rule of thumb shows that 10 minutes under HALT could be equivalent to hundreds of hours under transportation. This much time under transportation is unlikely for any electronic equipment. So, is the HALT failure not relevant?
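To make the rule of thumb concrete, the following is a minimal sketch of the time compression calculation. The equation itself is not reproduced in the text, so the form used here is an assumption: the commonly cited fatigue scaling T_field / T_test = (PSD_test / PSD_field)^(d/2), which is equivalent to raising the ratio of Grms levels to the power d. The function name `equivalent_field_time` and the PSD ratio of 10 in the usage example are hypothetical and are not values from the case study.

```python
def equivalent_field_time(t_test, psd_test, psd_field, d=6.4):
    """Estimate the field exposure time equivalent to a shorter, harsher test.

    ASSUMED form (not taken from the article):
        T_field / T_test = (PSD_test / PSD_field) ** (d / 2)
    where d is the fatigue exponent (about 6.4 for electronic equipment).
    The result is in the same time units as t_test.
    """
    if t_test < 0:
        raise ValueError("t_test must be non-negative")
    if psd_test <= 0 or psd_field <= 0:
        raise ValueError("PSD levels must be positive")
    return t_test * (psd_test / psd_field) ** (d / 2.0)


# Hypothetical illustration: a 10 minute HALT exposure whose PSD level is
# ten times the transportation PSD level (ratio chosen only for illustration).
minutes = equivalent_field_time(t_test=10.0, psd_test=10.0, psd_field=1.0)
hours = minutes / 60.0
print(f"Equivalent transportation time: {hours:.0f} hours")
```

Under these assumptions, a tenfold difference in PSD level turns 10 minutes of HALT into roughly 264 hours of transportation, which is the order of magnitude described above. Because the exponent is large, the result is extremely sensitive to the PSD ratio, which is one reason this should be treated only as a rough rule of thumb.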
It is important to realize that this time compression equation is based on metal fatigue type failures and may not be relevant to the backing out of a ground screw. Loosening of attachments during shipping is a common problem, and no test-to-field correlation is available. Therefore, at a minimum, the 15 Grms failure should be considered relevant and corrective actions should be initiated. There are a number of screw locking and retaining options available. These include the use of a split ring washer, a lock washer, or a threadlocking adhesive, such as Loctite.
The failure of the display unit at 40 Grms is believed to be at the material limit of LCDs and is not considered to be a concern. The dislodging of the bleeder resistors is because of their high standoffs, large mass, and close proximity to each other (they likely failed due to repeated impact). Solutions include resistors with more rigid leads, use of staking compound, or moving the bleeder resistors farther apart.
Failures during combined vibration and temperature cycling testing were the same as those observed during the earlier stages of the HALT process.
In conclusion, HALT can be a very powerful tool in the right hands. But those hands cannot simply check a box; they have to belong to a person who is interested in learning about their design.
Craig Hillman is CEO and Managing Member for DfR Solutions. Dr. Hillman’s specialties include best practices in Design for Reliability (DfR), strategies for transitioning to Pb-free, supplier qualification (commodity and engineered products), passive component technology (capacitors, resistors, etc.), and printed board failure mechanisms. Dr. Hillman has over 40 publications and has presented on a wide variety of reliability issues to over 250 companies and organizations.
Highly Accelerated Life Testing (HALT) is an excellent and low cost approach for assessing the robustness of an electronic product; however, the potential variations in testing temperatures, vibration loads and shock prevent direct extrapolation of results to life data, precluding HALT from being a reliability predictor.