As power densities continue to increase and coolant-based solutions are shunned because of cost, size, and energy concerns, the electronics industry has become increasingly reliant on fans and the air flow they provide. This reliance is challenging because fans are a known reliability risk. A recent report released by Carnegie Mellon¹ showed that fans were consistently among the top ten failure sites for enterprise applications.
The electronics industry has attempted to resolve this risk through standard test requirements. These are typically constant load tests performed at elevated temperatures (often at 70C) for long periods of time². These tests are primarily designed to assess fan failures caused by loss or deterioration of the lubricant, which leads to wear in the bearings. Lubricant viscosity depends on the operating temperature, and the lubricant can deteriorate rapidly at higher temperatures.
The response from the fan industry has been as one would expect. Fan manufacturers design products that pass the test. And some fans are very good at passing these tests (L10 greater than 100,000 hours at 40C). So, what’s the problem? Fans keep failing.
Fans keep failing because of limitations in the test itself and in the industry-standard test setup. Constant load at constant temperature is not a common environment in most applications. Fan speed can vary with thermal load, and fans can even turn off in applications that do not run 24/7. Temperature can also change, which is increasingly likely even in data centers and central offices due to a drive to reduce energy costs (see Randy Schueller’s “Free Air Cooling” White Paper).
Not only can thermal cycling be more stressful on bearings than constant temperature, but temperature variations can also induce other failure modes, such as degradation of mechanical interconnects and electronic controls. Steady-state tests also tend to be poor indicators of infant mortality and defect issues.
The fan life test is typically performed within a standard environmental chamber. This setup can result in chaotic air flows and difficulties in maintaining uniform temperatures across the chamber. These chambers are also unable to induce a pressure differential through diffusers or baffles that would place a realistic load on the fans during operation. These limitations have motivated DfR to develop a new accelerated testing methodology for fans.
DfR’s new accelerated testing methodology involves subjecting fans to conditions that realistically simulate real world environments. This requirement has driven DfR to design and build a patent-pending air-circulating thermal chamber for elevated temperature power cycling with the option for fan loading through pressure differentials. Subjecting fans to these conditions has an impressive ability to differentiate fan suppliers and quickly identify problems and issues that may lead to early fan failures.
This custom thermal chamber, seen below, has the ability to simultaneously test over 200 standard fans (1U) and up to 32 large fans (5" x 5"). The chamber is designed to regulate the air to a set temperature from room temperature (~25C) up to 90C.
Fans can be placed in the circulating thermal chamber in two banks, one in the upper chamber and one in the lower chamber. The banks of fans face opposite directions so that the air circulates in the chamber and provides even and consistent air resistance when the fans are powered. Baffles are sometimes added to induce a pressure differential and additional loading on the fans.
Failure is identified by one or more of three techniques. For the first technique, fans are wired to the power supplies with current sense resistors in series with the return line of each fan. The voltage drop across each sense resistor is monitored by a data logger and recorded at 30-second intervals. The second approach involves acoustic sensors monitoring fan noise. The third approach involves measuring rotational velocity through a tach sensor.
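The first technique amounts to applying Ohm's law to each sense resistor. The following is a minimal Python sketch of that conversion, not DfR's actual data-logging code. The 0.1 ohm resistor value and the logged voltage readings are hypothetical values chosen only for illustration.

```python
def fan_current_amps(sense_voltage_v, sense_resistance_ohm):
    """Infer fan current from the voltage drop across its series sense resistor (I = V / R)."""
    if sense_resistance_ohm <= 0:
        raise ValueError("sense resistance must be positive")
    return sense_voltage_v / sense_resistance_ohm


# Hypothetical values for illustration only (not taken from the test setup).
SENSE_RESISTANCE_OHM = 0.1
logged_drops_v = [0.020, 0.021, 0.019]  # one logged reading per 30-second sample

# Convert each logged voltage drop into the current drawn by the fan.
currents_a = [fan_current_amps(v, SENSE_RESISTANCE_OHM) for v in logged_drops_v]
```

Because the resistor sits in series with the fan's return line, the same current flows through both, so the logged voltage drop is directly proportional to the fan's current draw.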
DfR’s accelerated test methodology for fans involves an initial preconditioning soak in an elevated temperature and humidity environment. After this initial exposure, the fans are placed in the unique circulating thermal chamber. Depending upon the use environment, baffles or other obstructions may be installed to replicate the pressure drops expected in the application. The temperature of the chamber is then increased to 70.5C ± 2.2C. The fans are power cycled with a duty cycle of 80% (4 minutes on, 1 minute off). This is done to induce maximum torque on the rotor/impeller interface and to limit self-heating by the fans. The recommended test duration is 1001 hours.
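The power-cycling figures above imply a number of start-up events and a total powered time. The short Python sketch below works through that arithmetic using only the stated values (4 minutes on, 1 minute off, 1001 hours). The function name and dictionary keys are illustrative, and the calculation assumes the cycling runs uninterrupted for the full test duration.

```python
ON_MINUTES = 4
OFF_MINUTES = 1
TEST_DURATION_HOURS = 1001


def cycling_summary(on_min, off_min, duration_h):
    """Summarize a fixed on/off power-cycling schedule over a test duration."""
    cycle_min = on_min + off_min                 # length of one full on/off cycle
    duty_cycle = on_min / cycle_min              # fraction of time the fan is powered
    total_cycles = duration_h * 60 / cycle_min   # number of start-up events
    powered_hours = duration_h * duty_cycle      # cumulative powered time
    return {
        "duty_cycle": duty_cycle,
        "total_cycles": total_cycles,
        "powered_hours": powered_hours,
    }


summary = cycling_summary(ON_MINUTES, OFF_MINUTES, TEST_DURATION_HOURS)
```

Under these assumptions, each fan sees about 12,012 start-ups and roughly 800.8 powered hours over the recommended test duration.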
During the test, a fan is identified as failed when its current draw deviates from the nominal value by more than ±10%.
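One way to apply the ±10% criterion to logged data is sketched below in Python. It is a simplified illustration with three assumptions that do not come from the methodology itself: the nominal current is a fixed, known value, only samples taken while the fan is powered are checked, and a single out-of-band sample counts as a failure. The sample currents are invented.

```python
def first_failure_index(currents_a, nominal_a, tolerance=0.10):
    """Return the index of the first sample outside nominal * (1 +/- tolerance), or None."""
    lower = nominal_a * (1 - tolerance)
    upper = nominal_a * (1 + tolerance)
    for index, current in enumerate(currents_a):
        if not (lower <= current <= upper):
            return index
    return None


# Hypothetical on-phase current samples in amps, for illustration only.
NOMINAL_A = 0.20
healthy_fan = [0.20, 0.21, 0.19, 0.205]
degrading_fan = [0.20, 0.21, 0.215, 0.23, 0.26]

healthy_result = first_failure_index(healthy_fan, NOMINAL_A)      # stays in band
degrading_result = first_failure_index(degrading_fan, NOMINAL_A)  # leaves band at 0.23 A
```

A practical implementation would probably require several consecutive out-of-band samples before declaring a failure, so that a single noisy reading does not end the test for that fan.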
An example of the current draw output from this type of test is shown below.
The new DfR accelerated test methodology is a single solution for electronics OEMs attempting either to proactively select fan suppliers or to retroactively determine the root cause of field issues. By subjecting fans to conditions more representative of field applications, their true capability and performance can be assessed in a fraction of the time.
Thermal chamber: 8 ft x 4 ft x 4 ft
Number of fans: Up to 200
Data recorded: Voltage drop, acoustic monitoring, tach sensing
Recording period: Every 30 seconds
Power Supply: 24 volt 6.5 amp
Temperature range: 25C to 90C
Temperature accuracy: ±2.2C
Power capability: 1500 W
This white paper may include results obtained through analysis performed by DfR Solutions’ Sherlock software. This comprehensive tool is capable of identifying design flaws and predicting product performance. For more information, please contact Sales@dfrsolutions.com.
1 Carnegie Mellon University's "Disk Failures in the Real World"
2 Test conditions are typically derived from ABMA 9-1990
DfR represents that a reasonable effort has been made to ensure the accuracy and reliability of the information within this white paper. However, DfR Solutions makes no warranty, both express and implied, concerning the content of this report, including, but not limited to the existence of any latent or patent defects, merchantability, and/or fitness for a particular use. DfR will not be liable for loss of use, revenue, profit, or any special, incidental, or consequential damages arising out of, connected with, or resulting from, the information presented within this white paper.
Ensuring that a product will perform and function as promised is the key to any manufacturer’s success in the marketplace. As systems become more complex and intricately designed, and as manufacturers rush to be first-to-market, the challenge of performing accurate and timely reliability testing increases, as was demonstrated in recent newsworthy technology failures.
The U.S. Department of Defense (DoD) has initiated multiple efforts to revitalize reliability in defense systems acquisition and development [1, 2]. One of these projects involves a series of revisions to MIL-HDBK-217 Rev F, the often-imitated and frequently criticized reliability prediction bible for electronics equipment. The MIL-HDBK-217 revision team has proposed eventually migrating to Computer Aided Engineering (CAE) tools with science-based Physics of Failure (PoF) reliability modeling, simulations, and probabilistic mechanics techniques to expand beyond the current limitations of actuarial reliability prediction methods.