Key Words: Technology Obsolescence, Life Cycle, Cost of Ownership, Component Obsolescence, Logistic Downtime, Aircraft, Aging Aircraft, DMSMS
This paper discusses the insights obtained from managing obsolescence mitigation strategies within the Department of Defense (DoD) during the last two decades. The numerous solutions investigated, from alternative or substitute components to complete system redesign, can affect reliability and maintainability (R&M) as well as cost. However, with careful implementation of design practices, many of the effects can be mitigated. In addition to presenting the common practices that mitigate the effect and risk of DMSMS, this paper also addresses the nonrecurring engineering (NRE) cost impacts associated with ensuring that solutions meet the original product’s system performance and R&M requirements.
Rapid changes in item or material technology, uneconomical production requirements, and extended product life cycles are major causes of component obsolescence. Component obsolescence has affected the Department of Defense (DoD), the airline community, and many commercial industries. The DoD term for component obsolescence is Diminishing Manufacturing Sources and Material Shortages (DMSMS). This paper will use the term DMSMS. DMSMS can affect any phase of the product life cycle, from development to post-production. DMSMS affects both cost and operations. If left unchecked, DMSMS could cause increased downtime due to part shortages and decreased reliability if alternative solutions are not thoroughly qualified.
According to data collected by the Government Industry Data Exchange Program, 84% of discontinuance notices received by them are for electronic components, specifically active devices. The remaining 16% are for mechanical and passive devices. In addition to components, materials, chemical processes, and software could also become obsolete. The primary focus of this paper is on component obsolescence of active devices.
Significant work has been completed in the area of component obsolescence and its solutions; however, no papers presented at reliability and maintainability symposiums have discussed the potential reliability issues associated with obsolescence.
BRC – Baseline Repair Capability
COTS – Commercial Off The Shelf
DMEA – Defense MicroElectronics Activity
DMSMS – Diminishing Manufacturing Sources and Material Shortages
RCIA™ – Repair Capability Impact Analysis
RRC – Reduced Repair Capability
Attrition – products can no longer be repaired due to normal wearout caused by previous repairs
Component – the term component in this paper applies to devices such as resistors, semiconductors, capacitors, and diodes.
Part – the term part is used interchangeably with component
The rapid change in electronics and microcircuit technology continues to advance in accordance with Moore’s Law, which states that the density of semiconductors (achieved by shrinking the fabrication process minimum feature size, measured in micrometers) doubles about every 18 months. The electronic and microcircuit industries are increasingly more responsive to rapidly improving consumer products than to the DoD and other industries (such as the airlines) that have to retain their equipment for much longer than 18 months due to economic considerations. As a result, the DoD continues to see the availability of military grade microcircuits decreasing and the number of DMSMS situations increasing. To relate obsolescence to non-DoD products, consider the rapid technology advancements in consumer products such as personal computers, cellular phones, and personal digital assistants (PDAs) over the last decade, and how often they need to be upgraded or replaced. Technology is also at the maturity point for these types of products where level of repair analysis models will likely indicate that the most cost effective solution is to discard (throw away) and replace them when they fail, if they fail at all during their useful life. But what if a computer purchased 10 years ago had to operate for another 20 years because there was insufficient funding to acquire, integrate (install), and fully qualify a replacement with current technology? This is often the case with many DoD, airline, and NASA systems that rely on mission essential equipment and products whose replacements must undergo extensive qualification testing. The DoD expects to operate its current fleet of tanks for another 30 years, the KC-135 aircraft for at least another 35 years, and the B-52 aircraft, which has flown for over 50 years, for at least 40 more years!
For the past 20 years, the electronic digital systems used in the DoD have traditionally been designed using 5-Volt (5-V) technology. With continual technology advances, digital designers can now obtain higher densities and faster devices using 0.35 μm (3.3-V) or smaller 0.18 μm (2.5-V) process technology. The benefits of the lower voltage include efficiencies in operating temperatures, speed, power consumption, size, and densities. These benefits are ideal to allow the high-volume, low-power consumer products to decrease in size, but according to experts, this trend will contribute to an increase in obsolete components for the DoD and airlines. The question that remains is how will a computer or other legacy electronic system be maintained if either a failed component is no longer manufactured or software is no longer supported?
Microelectronic and semiconductor obsolescence have been identified within the DoD since 1978 (Ref. 1). One of the first plans for managing obsolescence was developed for the U.S. Navy P-3 aircraft in 1984 (Ref. 2). With over two decades of coping with DMSMS problems, the DoD has developed many resolution options and mitigation techniques that minimize their recurrence in the future. While each of the potential resolutions has cost implications, many may also affect reliability, maintainability, and supportability (RM&S).
Each of the services within the DoD has published a resolution guide identifying not only suggested resolutions but also policies and procedures:
The DMSMS resolutions contained in these documents are well known and usually are applied to existing or newly arising problems. The guides also provide information about coordinating actions with activities such as the Defense Logistics Agency (DLA), Defense Supply Center Columbus (DSCC), and the Defense MicroElectronics Activity (DMEA). The most commonly implemented resolution techniques for electronic components are described in the following sections along with their most prevalent RM&S considerations.
The use of obsolete, surplus components currently owned by the product manufacturer or currently in-stock at a warehousing activity such as DLA. RM&S considerations include:
The use of an item found in equipment beyond economical repair at depots or surplus items from the Defense Reutilization and Marketing Service (DRMS). Reclamation can occur at the system, module, circuit board, or component level. Because of potential reliability problems, DMEA does not recommend reclamation of components from circuit boards. Even though removal processes have improved recently, reclamation of components should only be implemented as a last resort or temporary measure until other resolutions can be determined. RM&S considerations include:
The use of a part that is equal to or better than the part specified on a parts list. Such parts may be (1) listed in a specification or standard as superseding parts; (2) upgraded or better than original parts (such as JANTX in place of JAN, Standard Microcircuit Drawing parts in place of vendor unscreened parts, military temperature range parts in place of commercial temperature range parts); or (3) equivalent or interchangeable parts that are functionally the same, mechanically the same, and of the same quality as the specified parts, such as parts from a different vendor. RM&S considerations include:
A part whose performance may be less capable than the part specified on a parts list for one or more reasons (e.g., quality or reliability level, tolerance, parametrics, temperature range). RM&S considerations include:
Aftermarket manufacturers such as Rochester Electronics or Lansdale Semiconductor are authorized by the original equipment manufacturer (OEM) to provide custom assembly of obsolete integrated circuits using existing wafer and die. In some cases, a manufacturer has acquired the photo mask sets or the entire assembly process from the OEM. A photo mask holds the pattern for each layer of an integrated circuit design. The manufacturer is then authorized to produce wafer, cut die, and package the cut die. In some cases, finished goods provided in a catalog may be considered as a substitute part. The RM&S considerations for aftermarket are the same as for substitute except that in-circuit evaluations are always needed due to variations in manufacturing processes.
The purchase of a sufficient quantity of an obsolete item to meet the projected demands of the supported equipment for its expected operational lifetime. Unfortunately, life-of-type (LOT) buys seldom obtain the correct quantity of parts; either too many or too few components are procured, especially if the expected operational lifetimes are extended, as has occurred with the B-52 aircraft. RM&S considerations are the same as those for the existing stock resolution technique. Preferable to a LOT buy is a bridge buy, a purchase of a limited quantity of components to support near-term requirements until a longer-term solution can be achieved.
A manufacturing process that produces a substitute form, fit, function, and interface (F3I) item for the unobtainable item. Through microcircuit emulation, inventory reduction can be achieved because obsolete items can be replaced with devices that emulate the original and can be manufactured and supplied on demand. RM&S considerations for emulation are the same as for aftermarket.
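The sensitivity of a LOT buy to the assumed operational lifetime can be illustrated with a minimal sketch. It assumes a constant component failure rate and sizes the buy at the expected number of failures, with no safety margin. The function name and every input value below are hypothetical and chosen only for illustration; they are not drawn from any program data.

```python
import math


def lot_buy_quantity(failure_rate_per_hour, hours_per_year,
                     parts_per_system, systems, years):
    """Expected replacement parts needed over the support period.

    Assumes a constant failure rate and no safety margin (a simplification).
    """
    expected_failures = (failure_rate_per_hour * hours_per_year
                         * parts_per_system * systems * years)
    return math.ceil(expected_failures)


# Hypothetical inputs: 3 failures per million hours, 450 operating hours
# per year, 4 parts per system, 100 fielded systems.
planned = lot_buy_quantity(3e-6, 450, 4, 100, years=20)
extended = lot_buy_quantity(3e-6, 450, 4, 100, years=40)

# Shortfall if the buy was sized for 20 years but the system serves 40.
shortfall = extended - planned
print(planned, extended, shortfall)
```

Under these assumptions the required quantity scales linearly with the support period, so a buy sized for the original lifetime falls short by the same proportion as the life extension.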
This paper classifies circuit board redesign into three categories. A Type 1 redesign typically involves adding jumper (white) wires or additional components such as resistors to adjust timing. A Type 2 redesign, also known as a minor redesign, typically involves a board relayout to replace no more than one or two components. A Type 3 redesign, often called a major redesign, involves designing a DMSMS item out of the system and typically replaces the entire board with an F3I replacement. Type 3 redesigns are usually used as a last resort, but provide the opportunity to enhance system performance and improve R&M. Because substantial NRE and recurring logistics costs will accrue for Type 3 redesigns, they are most appropriate when a sufficient quantity of obsolete components is involved for the same circuit board (rules of thumb have been discussed as greater than 5 to 10 components). The decision to implement a Type 3 redesign is made by conducting a cost trade-off analysis unique to each circuit board. An exception is when safety or performance measures are not meeting the requirements. RM&S considerations include:
A decision support system, often based on performance measures and total cost of ownership, is typically used to select the best resolution(s) defined in Sections 3.1 – 3.8. For each resolution, recurring procurement costs and NRE costs are needed to determine which one is the most cost effective to implement. Recurring costs are specific to each system, component, and resolution. The typical NRE costs associated with each resolution, however, have been collected (Ref. 3) and are presented in Table 4-1. These values are provided with the understanding that they can be used as estimated default values when actual documented data do not exist. If documented, verifiable actual data are available, then the actual data should be used.
Table 4-1 provides the average values along with the lower and upper bounds of the 90% s-confidence interval. Cost data were not collected for Type 1 redesign. In addition, cost metrics for LOT Buy are not included since they are based on the specific component failure rate, unit cost, and system quantities. The basis for each of the cost values is provided in Ref. 4. Under certain circumstances, the resolutions identified may require any of the following additional actions: qualifying new sources, conducting radiation-hardening tests, conducting special tests for plastic-encapsulated microcircuits (PEMs), or program specific tests such as flight testing. The following should be considered:
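The cost side of such a decision can be sketched as a comparison of NRE cost plus recurring procurement cost for each candidate resolution. This is a minimal illustration only: the `Resolution` class, the helper functions, and all dollar figures are hypothetical placeholders and are not the Table 4-1 values. A real decision support system would also weigh performance measures and RM&S considerations.

```python
from dataclasses import dataclass


@dataclass
class Resolution:
    name: str
    nre_cost: float   # one-time nonrecurring engineering cost
    unit_cost: float  # recurring procurement cost per component


def total_cost(resolution, quantity):
    """NRE cost plus recurring cost for the required quantity."""
    return resolution.nre_cost + resolution.unit_cost * quantity


def cheapest(resolutions, quantity):
    """Return the resolution with the lowest total cost at this quantity."""
    return min(resolutions, key=lambda r: total_cost(r, quantity))


# Placeholder figures for illustration only; NOT taken from Table 4-1.
candidates = [
    Resolution("substitute", nre_cost=20_000, unit_cost=50),
    Resolution("emulation", nre_cost=70_000, unit_cost=20),
    Resolution("type 3 redesign", nre_cost=400_000, unit_cost=5),
]

for quantity in (100, 5_000, 50_000):
    best = cheapest(candidates, quantity)
    print(quantity, best.name, total_cost(best, quantity))
```

With these placeholder numbers the preferred resolution shifts from the low-NRE option to the high-NRE option as the required quantity grows, which is why the trade-off has to be evaluated for each system rather than settled by NRE cost alone.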
The NAVAIR Aging Aircraft Integrated Product Team (AAIPT) is developing cost models that incorporate program-specific test costs. The models include hardware and software laboratory tests, ground tests, and flight tests. Including these additional test costs gives the decision-maker a more refined estimate, and with them it is not unusual for Type 3 redesigns to reach $2M or more. Adding to this cost impact is the frequency at which components are becoming obsolete. Ref. 5 indicates that, every year, 20% of the digital integrated circuits and 10% of the analog integrated circuits available to the DoD are going obsolete. Mitigating the cost impact and recurrence of obsolescence, and planning ahead so that the lower cost resolutions can be implemented, can reduce the total cost of ownership for a system. Common practices that mitigate the impact and design strategies that reduce the future recurrence are available for incorporation into a DMSMS Management Plan.
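To get a feel for what these annual rates imply over several years, the sketch below compounds them. It assumes, as a simplification not stated in Ref. 5, that the annual rate is constant and applies each year to the parts that are still available, and it ignores newly introduced parts. The function name is hypothetical.

```python
def fraction_still_available(annual_obsolescence_rate, years):
    """Fraction of an initial part population still in production.

    Assumes a constant annual rate applied to the surviving population.
    """
    return (1.0 - annual_obsolescence_rate) ** years


# Annual rates quoted in the text: 20% digital, 10% analog.
for label, rate in (("digital", 0.20), ("analog", 0.10)):
    remaining = fraction_still_available(rate, years=5)
    print(f"{label}: {remaining:.1%} of today's parts remain after 5 years")
```

Under this assumption, roughly a third of today's digital parts and about three fifths of today's analog parts would still be available after five years, a period far shorter than the service lives discussed earlier.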
Minimizing the impact of component (parts) obsolescence and technical obsolescence risk is the heart of the DMSMS concern. Risk management techniques have been addressed by AFMC (Ref. 6), DMEA (Ref. 7), and the Electronics Industries Alliance (Ref. 8).
Some of the management and design approaches from Ref. 6 that relate to RM&S considerations are:
Reference 7 provides three intensity levels of common practices that include activities that could be implemented to mitigate the risks of DMSMS:
Selecting a practice is influenced by the resources available to manage DMSMS. The practices associated with these levels form the basis of a program that can be implemented to mitigate the impact of DMSMS. Although an expense is associated with the implementation of a DMSMS program, cost avoidance and cost of ownership reductions can be realized from such a program. A list of the practices for each level is presented in Table 5-1. An event usually occurs that convinces the program manager that one or more practices need to be implemented. These events are called triggers. An example of a trigger would be higher management awareness of increased supportability problems.
Business case analyses have shown that the implementation of these practices can result in lowering the cost of resolving obsolescence problems and reducing total ownership cost (TOC). It is important to note that as more practices are selected, the potential for reduction of TOC increases. The relative implementation cost versus potential for TOC reduction, along with a summary of the possible triggers, is shown in Figure 5-1.
Reference 8 describes methods that can be applied during system design to minimize the impact of future component obsolescence issues. These methods include:
Discontinuance notices originate when a part manufacturer announces that a part or a production line will be discontinued. Early notification of this event can help with the life-cycle planning needed to ensure that operational readiness levels can be maintained. Completing Level 1 practices such as parts list screening and monitoring, and the Level 2 DMSMS prediction practice presented in Section 5, is just the first basic step in obsolescence management. Analysis of the current or potential problem parts identified by this first step is the key to cost effectively managing and resolving obsolescence problems.
As described in Section 4, NRE costs associated with each resolution could be significant. Planning ahead will not only help reduce the costs but also reduce operational impacts.
An example of how obsolescence can impact a product’s repair capability can now be investigated. Figure 6-1, ARINC’s Repair Capability Impact Analysis (RCIA), will be used for this investigation. Notional data are provided; however, the overall analysis is based on current experiences in trying to keep aging aircraft flying. Let’s begin by determining the baseline repair capability. The baseline repair capability (BRC) quantifies how many expected failures there will be over the product’s useful life. The product could follow any reliability behavior such as a constant failure rate, wearout with age, or improvement with age. For this example, we assumed that the product has a constant failure rate over its expected useful life, will no longer be produced once the production spares are purchased, and is a repairable system. Taking into account expected failures, stock safety levels (to ensure adequate spares to fill the repair cycle pipeline), and planned attrition rates, a set number of spares is determined, creating the BRC.
The point where, due to attrition, the cumulative failure line crosses the BRC is known as the planned critical point. This point is the planned end of the product’s useful life. But it may have to operate well beyond that time, as is the case for the B-52. As shown in Figure 6-1, failures after the planned critical point may not be supportable. With early planning and monitoring, steps such as incorporating a reliability improvement and upgrade program could extend the repair capability. Likewise, if the failures are monitored, unexpected wearout could also be identified early to avoid spare part shortages. DMSMS impacts are illustrated by the reduced repair capability (RRC) curve. The RRC curve illustrates the effect, over and above the normal attrition rate, that is caused when a product that fails cannot be repaired because the failed component is no longer manufactured and no resolutions are available!
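The relationship between the BRC, attrition, and DMSMS can be sketched numerically. The model below is a deliberately simplified reading of the concept and is not ARINC's RCIA method: it assumes a constant number of failures per year, treats a fixed fraction of failures as unrepairable due to attrition, and adds a further fixed fraction that cannot be repaired because of obsolete components. The function name and all values are hypothetical.

```python
def critical_point_years(spares, failures_per_year,
                         attrition_fraction, dmsms_fraction=0.0):
    """Years until unrepairable failures consume the spares pool.

    attrition_fraction: share of failures lost to normal attrition.
    dmsms_fraction: additional share lost because a failed component
    is obsolete and no resolution is available.
    """
    unrepairable_per_year = failures_per_year * (attrition_fraction
                                                 + dmsms_fraction)
    if unrepairable_per_year <= 0:
        return float("inf")
    return spares / unrepairable_per_year


# Hypothetical inputs: 30 spares, 50 failures per year, 2% attrition.
planned = critical_point_years(30, 50, attrition_fraction=0.02)

# Same product when a further 4% of failures cannot be repaired (DMSMS).
reduced = critical_point_years(30, 50, attrition_fraction=0.02,
                               dmsms_fraction=0.04)
print(planned, reduced)
```

In this sketch even a small unrepairable fraction due to DMSMS moves the critical point substantially earlier, which corresponds to the gap between the baseline and reduced repair capability curves.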
When obsolescence management programs are implemented, planners could have procured sufficient quantities of alternate or substitute components or even initiated and possibly completed the engineering needed to incorporate a redesign. However, those resolutions can take from a few weeks, for alternate and substitute, to over a year for redesigns to be completed. Without early planning, maintenance and logistics downtime will increase as the product sits on the repair bench waiting for a part or part resolution. Left unchecked, operational readiness will be affected. When identifying solutions, take into account not only cost but also the RM&S considerations such as induced or dormant failures, revised maintenance procedures, and spares impact.
This is just the beginning in documenting the effects obsolescence has on RM&S. Future work in this area will continue to develop additional resolutions, capture the RM&S considerations and costs associated with the resolutions, and most importantly identify new design strategies that will mitigate the impact of obsolescence. It is also expected that additional analyses and further refinement of models similar to the RCIA will continue.
I would like to thank the Defense MicroElectronics Activity, specifically Mr. Ron Shimazu, Mr. Wesley Trunnell, and Mr. Vance Anderson, for the support, guidance, and encouragement they provided to me during the development of this paper.
Walter Tomczykowski is the DMSMS Program Director at ARINC. He received an MS in Reliability Engineering from the University of Maryland. For the last 20 years he has been performing product support work in the areas of reliability, maintainability, and obsolescence management for a variety of Air Force, Navy and Army programs. Currently he provides DMSMS support to DMEA, the NAVAIR Aging Aircraft IPT, and the JTIDS program office. He is the author of the DMEA DMSMS Cost Factors, the DMSMS Program Managers Handbook, and the DMSMS Acquisition Guidelines. Previously his work in reliability has been published in the "Wiley Encyclopedia of Electrical and Electronics Engineering" and the "Product Reliability, Maintainability, and Supportability Handbook".