The continued evolution of semiconductor fabrication, and the expectations that have come with it, has resulted in a revolution in chip-scale packaging (CSP). The tremendous cost of, and struggles in, producing the next-generation integrated circuit, including $10 billion for 450mm wafer fabs and TSMC’s well-documented challenges with yield at 28nm, have driven significant interest and resources into new materials and new architectures for next-generation semiconductor packaging. These challenges have not only allowed chip-scale packaging to continue its steady improvement in miniaturization and density, but have also expanded the very concept of a chip-scale package. This includes the recent introduction of silicon interposers, through-silicon vias, and embedded die designs.
These changes and improvements create great opportunity for improvements in form, fit and function. The ability of stacked die to provide 16GB of flash memory in today’s smart phones is well documented, as is the thin form factor of the iPhone’s package-on-package (PoP) microprocessor. However, every revolution in packaging, from the introduction of the ball grid array (BGA) to the quad flat pack no-lead (QFN) to the PoP, has typically gone through a two-stage lifecycle. In the first stage, the high-volume OEMs work with the largest component manufacturers and the outsourced assembly and test (OSAT) vendors to ensure that the new packaging concept is highly compatible with the supply chain’s manufacturing process and qualification procedures. Areas of concern are identified based on initial studies, information is exchanged among the parties involved, and modifications are made to either the packaging or the manufacturing process. After a successful launch, the package lifecycle enters the second stage, where other component manufacturers incorporate the new packaging concept into their technology and then attempt to convince the broader market that the new package is appropriate for their design, manufacturing environment, and end-use application. Unfortunately, as one could guess, the second stage never goes as well as the first.
What is often missing from the second stage is an understanding from all parties on how to identify what they don’t know they don’t know. The high-volume supply chain is rarely representative of the challenges and idiosyncrasies faced by tier 2 and tier 3 component vendors and OEMs and is therefore not necessarily an appropriate staging area to shake out potential issues with the packaging technology. If issues are identified, resolutions can be hidden by nondisclosure agreements and aggressive IP posturing. The critical question that must be asked, and often is not, by this second wave of users and consumers is how the changes in the packaging technology could induce failure in ways that are not relevant and may not even be detected by the classic qualification activities prescribed by either JEDEC or the Automotive Electronics Council (AEC). And that requires an understanding of reliability physics. Reliability physics, also known as physics of failure (PoF), is the process of using modeling and simulation based on the fundamentals of physical science (physics, chemistry, mechanics, etc.) to predict reliability and prevent failures. With semiconductor packaging, what this really means is being able to accurately characterize material movement. The majority of the key failure mechanisms of concern (fatigue, creep, electromigration, etc.) relate back to material movement and how it changes as a function of stress. Success in the domain of reliability physics requires an understanding of the stress, the magnitude of the stress, the rate at which this stress is driving material movement, and when this material movement could induce failure.
An example of this structure can be seen in creep. Creep is the tendency of a solid to permanently deform when subject to a fixed load. As seen in Figure 1, the presence and type of creep in solder varies depending upon the stress, normalized by modulus, and the temperature, normalized by melt temperature. Having this basic knowledge, one can then start to look at the packaging architecture and ask, ‘Is creep a risk in my use environment?’ If the answer is yes, other questions start to naturally fall into place: 1) at the given temperature, which creep mechanism will dominate; 2) what are the potential sources of stress that will drive creep? The ability to sufficiently answer these questions will then point towards the right combination of characterization, simulation/modeling, and testing necessary to have a robust qualification plan.
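As a rough illustration of the temperature normalization described above, the sketch below computes the homologous temperature (absolute temperature divided by absolute melting point) at a few environment temperatures. The 217°C melting point is an assumed value for a SAC-type Pb-free solder, and the function name is hypothetical; substitute the value for the alloy actually used.

```python
# Homologous temperature: absolute temperature divided by absolute melting point.
KELVIN_OFFSET = 273.15

# Assumed melting point for a SAC-type Pb-free solder (illustrative only).
SAC_MELT_C = 217.0


def homologous_temperature(temp_c, melt_c=SAC_MELT_C):
    """Return T/Tm with both temperatures converted to kelvin."""
    return (temp_c + KELVIN_OFFSET) / (melt_c + KELVIN_OFFSET)


if __name__ == "__main__":
    # A few representative use and test temperatures in degrees C.
    for temp_c in (-40.0, 25.0, 100.0, 125.0):
        ratio = homologous_temperature(temp_c)
        print(f"{temp_c:7.1f} C -> T/Tm = {ratio:.2f}")
```

Under this assumption, room temperature already sits at roughly 0.6 of the melting point, which is why creep is rarely negligible for solder in ordinary use environments.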
The process of using reliability physics to help design and qualify CSP can quickly become convoluted and academic if every stress, mechanism, and algorithm is presented and explained. It has been our experience that a more insightful approach for the practicing engineer, either at the component manufacturer or the OEM, is to provide case studies relevant to current generation technology. With this in mind, the state-of-the-art packaging of the Virtex 7 2000T was selected as an appropriate guinea pig.
Before moving forward, it is important to note that neither DfR Solutions nor the authors have had any prior engagement with the design, manufacturing, marketing, or assembly of this package and any evaluation of the package does not imply or deny any activity by the manufacturers. Instead, the revolutionary aspects of this package make it an excellent example of the recent changes in CSP beyond the classic memory device or fine-pitch BGA. The 2000T pushes the boundaries of CSP in terms of size and mass. It is a significant package at over 63mm corner-to-corner. And while the 2000T does not fall into the traditional classification of a CSP as per IPC/JEDEC J-STD-012 Implementation of Flip-Chip and Chip-Scale Technology (package area of no more than 1.2X the original die area), the total volume of silicon, including active die and interposer, may make JEDEC rethink its definition. Most importantly, it is the first known off-the-shelf component to use silicon interposer technology with through-silicon vias (TSV). The silicon interposer, fabricated at 65nm, connects four identical field programmable gate array (FPGA) die fabricated at 28nm.
Before starting the case study, it is important to note that a critical aspect of physics-based design and qualification is linking the knowledge of failure mechanisms and failure drivers back to the manufacturing process. One of the classic limitations of the standard JEDEC qualification process is that sample selection is based on a nominal manufacturing process with the hope that three lots of a sufficient size will somehow magically introduce sufficient variation into the sample population. Reliability physics requires asking what aspect of the manufacturing and assembly processes could be a strong influence on a failure mechanism and what is the realistic worst-case setting that can be reasonably incorporated into the design of experiments.
The case study of the 2000T will focus on three important mechanisms: electromigration, creep and fatigue. Electromigration is the diffusion of material within an interconnect, such as a trace or via or solder ball, due to the flow of electrons. It is dependent upon the current density, temperature, and material/geometry of the interconnect. Black’s equation is the relation typically used to describe this behavior in terms of the median time to failure (MTTF):

$$\mathrm{MTTF} = A\, j^{-n} \exp\!\left(\frac{E_a}{kT}\right)$$

where A is a constant, j is the current density, n is an exponent (typically around 2), Ea is the activation energy (typically around 0.8eV for Pb-free solder bumps), k is Boltzmann’s constant, and T is the temperature of the solder bump (NOT the temperature of the package).
Traditionally, electromigration has been the province of die designers, but increasing current densities and smaller packaging geometries have driven the concern to first-level interconnects. One of the first things we noticed about the 2000T is the size of the connections between the four FPGA die and the silicon interposer. Described as micro-bumps, their 25µm diameters are a significant reduction from the traditional 75-125µm diameter of solder bumps and even smaller than the familiar 45µm copper columns more recently introduced. Looking at Black’s equation, we can see that this new geometry, for the same current load through the interconnects, raises the current density as the cross-sectional area shrinks and would potentially decrease lifetimes by an order of magnitude.
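The order-of-magnitude estimate can be checked with a few lines of arithmetic. The sketch below isolates the geometry effect only: it assumes a round cross section, the same current and bump temperature before and after, and the typical exponent n = 2. The function name is hypothetical, and current crowding and material differences are deliberately ignored.

```python
def lifetime_ratio_from_diameter(d_old_um, d_new_um, n=2.0):
    """Return MTTF_new / MTTF_old for the same current and temperature.

    For a round cross section the current density j scales as 1/diameter^2,
    and Black's equation gives MTTF proportional to j**(-n).
    """
    j_ratio = (d_old_um / d_new_um) ** 2  # j_new / j_old
    return j_ratio ** (-n)


if __name__ == "__main__":
    # Compare a 25 um micro-bump against the larger diameters cited in the text.
    for d_old_um in (125.0, 75.0, 45.0):
        ratio = lifetime_ratio_from_diameter(d_old_um, 25.0)
        print(f"{d_old_um:5.0f} um -> 25 um: MTTF multiplied by {ratio:.4f}")
```

Under these assumptions, shrinking from a 45µm column to a 25µm micro-bump raises current density by about 3.2X and cuts predicted lifetime by roughly 10X; the penalty relative to 75µm or larger bumps is considerably steeper.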
However, recent experiments also tell us that copper columns, by reducing current crowding and potentially driving SnCu intermetallic formation, can increase lifetimes by one to two orders of magnitude over Pb-free solder bumps (which, in turn, were about an order of magnitude worse than the high Pb bumps they replaced). There is also the possibility that current loads through the power connections and bump temperature could be higher compared to previous generations (a typical trend with silicon fabrication).
While a clear answer to electromigration in this particular packaging cannot be provided in this article, the key point about this relatively quick exercise is to demonstrate the ability of reliability physics to bring questions and discussions to the forefront of the design and qualification process. It can also bring out questions regarding the manufacturing process. For example, how sensitive is electromigration through the microbump and TSV structures to the manufacturing process? If the TSV is 90% filled, especially near the traces on the surface, how much does that reduce lifetime due to electromigration?
If information obtained through modeling and simulation demonstrates a sufficient level of design robustness, reliability physics can then be used to design an appropriate qualification test plan. While testing for electromigration is typically a full or partial factorial, if confidence in the constants in Black’s equation is high, then only one condition need be selected. In either situation, the conditions must be carefully selected. Again, reliability physics provides clear guidance on this decision. At elevated temperatures (100°C and above), the little tin (Sn) available under the copper column quickly reacts to form SnCu intermetallics that are very resistant to electromigration. However, at lower temperatures, the diffusion behavior of tin changes substantially from bulk-dominated to grain-boundary-dominated (this has been well documented in tin whisker experiments). This could result in non-relevant extrapolations from standard test conditions.
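To make the extrapolation step concrete, the sketch below computes the acceleration factor between a test condition and a use condition directly from Black’s equation. The function name and the example conditions are hypothetical, and the defaults (n = 2, Ea = 0.8eV) are simply the typical values quoted earlier. The calculation is only meaningful if the same mechanism, and therefore the same activation energy, governs at both temperatures, which is exactly the assumption the change in tin diffusion behavior calls into question.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann's constant in eV/K


def black_acceleration_factor(j_test, j_use, t_test_c, t_use_c,
                              n=2.0, ea_ev=0.8):
    """Return MTTF_use / MTTF_test predicted by Black's equation.

    Temperatures are bump temperatures in degrees C; current densities may be
    in any unit as long as both use the same one.
    """
    t_test_k = t_test_c + 273.15
    t_use_k = t_use_c + 273.15
    current_term = (j_test / j_use) ** n
    thermal_term = math.exp(
        (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_test_k)
    )
    return current_term * thermal_term


if __name__ == "__main__":
    # Hypothetical case: test at twice the use current density, 150 C vs 90 C.
    af = black_acceleration_factor(j_test=2.0, j_use=1.0,
                                   t_test_c=150.0, t_use_c=90.0)
    print(f"Acceleration factor: {af:.1f}")
```

Because the thermal term is exponential, a modest error in Ea produces a large error in predicted field life, which is one reason a single test condition is defensible only when confidence in the constants is high.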
Additional mechanisms of interest for the 2000T or any other similar package architecture would be creep and fatigue of the interconnects. In this case, there are three levels of interconnections (micro bumps, solder bumps, and solder balls). It is well known that temperature cycling is the common JEDEC test used to characterize the creep and fatigue mechanisms of solder in semiconductor packaging. However, this basic approach has been called into question recently by differences in specifications and changes in materials.
To start, creep and fatigue behavior tends to be captured by quantifying the strain range experienced during a temperature cycle. For most connections, this has been through

$$\Delta\gamma = \frac{L_D}{h_s}\,\Delta\alpha\,\Delta T$$

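where LD is the diagonal distance, hs is the solder joint height, Δα is the difference in CTE between two structures (die and substrate, substrate and board, etc.), and ΔT is the change in temperature. More recently, foundation stiffness models have been added to the mix to capture some of the complex mechanics that influence the stresses being applied to the solder connections:

$$(\alpha_2 - \alpha_1)\,\Delta T\,L_D = F\left(\frac{L_D}{E_1 A_1} + \frac{L_D}{E_2 A_2} + \frac{h_s}{A_s G_s} + \frac{h_c}{A_c G_c} + \frac{2 - \nu}{9\,G_b\,a}\right)$$

where F is the shear force, LD is length, E is the elastic modulus, A is the area, h is the thickness, G is the shear modulus, a is the edge length of the bond pad, and ν is Poisson’s ratio. The subscripts 1 and 2 denote the component and the board, s the solder joint, c the bond pad, and b the board.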
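As a minimal sketch of the first-order strain range calculation, the snippet below assumes the simple form Δγ = (LD/hs)·Δα·ΔT with no correction factor. All input values are hypothetical: LD is taken as half of the 63mm corner-to-corner dimension mentioned earlier (that is, measured from the package center), while the joint height and CTE mismatch are placeholders rather than measured properties of the 2000T.

```python
def shear_strain_range(l_d_mm, h_s_mm, delta_alpha_ppm_per_c, delta_t_c):
    """First-order shear strain range: (L_D / h_s) * delta_alpha * delta_T.

    l_d_mm and h_s_mm must share the same length unit; the CTE mismatch is
    given in ppm per degree C, so it is scaled by 1e-6 here.
    """
    return (l_d_mm / h_s_mm) * (delta_alpha_ppm_per_c * 1e-6) * delta_t_c


if __name__ == "__main__":
    # Hypothetical inputs for a large package cycled from 0 to 100 C.
    strain = shear_strain_range(
        l_d_mm=31.5,                 # half of a 63 mm diagonal (assumed)
        h_s_mm=0.4,                  # assumed solder joint height
        delta_alpha_ppm_per_c=10.0,  # assumed CTE mismatch
        delta_t_c=100.0,
    )
    print(f"Shear strain range: {strain:.4f}")
```

The linear dependence on LD/hs is the point to notice: a large body on short joints sees proportionally more strain. Because this expression ignores the compliance of the component, board, and pads, it tends to overstate the strain, which is the gap the foundation stiffness approach is meant to close.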
The need to understand the physics behind this degradation phenomenon has been recently heightened due to differences between JEDEC JESD47 Stress-Test-Driven Qualification of Integrated Circuits and IPC 9701 Performance Test Methods and Qualification Requirements for Surface Mount Solder Attachments. The former specification, favored by the component industry, recommends 2300 cycles of 0 to 100°C. The latter specification, favored by the OEMs, recommends 6000 cycles of 0 to 100°C (if reliability physics has not been used to develop a tailored test plan). One can immediately see that the supply chain is now, potentially, testing under conditions up to two-thirds more benign than the OEM is expecting. On top of this, JESD47 makes no mention of the test board dimensions. Especially for a large component like the Virtex 7, testing on smaller, thinner printed boards could extend times to failure by 2 to 3X.
On top of this discrepancy, there has been a critical change in packaging materials. Because of the brittle nature of low-k dielectric, some OSAT vendors have migrated from underfills with high glass transition temperatures (Tg) (greater than 110°C) to underfills with low glass transition temperatures (less than 80°C). This one modification, in one packaging material, has in one stroke invalidated the classic approach to temperature cycling.
The reason for this complete rethink in qualification practices is that the underfill does not undergo uniform changes as it approaches its Tg. As shown in Figure 2, the coefficient of thermal expansion (CTE) increases more rapidly than the elastic modulus decreases. This difference in timing arises because changes in the CTE of polymers tend to be driven by changes in the free volume, whereas changes in modulus tend to be driven by increases in translational/rotational movement of the polymer chains. Because lower levels of energy are required to increase free volume than to increase movement along the polymer chains, CTE changes before modulus.
The downside to this little bit of physics is that the underfill will now expand significantly while still having the stiffness necessary to push against the flip chip, causing a significant rise in tensile stresses within the interconnect (Figure 3). The fact that this stress state exists over a very narrow temperature range can create situations where small changes in temperature can induce failure in a far shorter period of time than the traditional -40°C/125°C requirements. In the case of the 2000T, there are two levels of underfill (chip to interposer and interposer to substrate) to consider. The actual underfill material is unknown to the authors, but reliability physics tells us that one or more low Tg underfills could make a JEDEC temperature cycle test inadequate for some applications.
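The narrow-window effect can be illustrated with a toy model. In the sketch below, every material value is hypothetical and chosen only to mimic the qualitative behavior described above: the CTE steps up at a lower temperature than the modulus steps down, and both transitions are modeled as smooth logistic curves. The product of CTE and modulus is used as a crude proxy for the stress generated per degree of heating in a fully constrained material.

```python
import math


def logistic(x):
    """Smooth step from 0 to 1 centered at x = 0."""
    return 1.0 / (1.0 + math.exp(-x))


def cte_ppm_per_c(temp_c, onset_c=60.0, below=30.0, above=100.0, width_c=3.0):
    """Hypothetical underfill CTE (ppm/C) that rises around onset_c."""
    return below + (above - below) * logistic((temp_c - onset_c) / width_c)


def modulus_gpa(temp_c, onset_c=75.0, below=8.0, above=0.1, width_c=3.0):
    """Hypothetical underfill modulus (GPa) that falls around onset_c."""
    return below + (above - below) * logistic((temp_c - onset_c) / width_c)


def stress_rate_proxy(temp_c):
    """CTE x modulus; ppm/C times GPa works out to kPa per degree C."""
    return cte_ppm_per_c(temp_c) * modulus_gpa(temp_c)


# Scan 0 to 150 C in 0.5 C steps and locate the worst-case temperature.
TEMPS_C = [0.5 * i for i in range(301)]
PEAK_TEMP_C = max(TEMPS_C, key=stress_rate_proxy)

if __name__ == "__main__":
    print(f"Proxy at 25 C:  {stress_rate_proxy(25.0):7.1f} kPa/C")
    print(f"Proxy at peak:  {stress_rate_proxy(PEAK_TEMP_C):7.1f} kPa/C "
          f"(at {PEAK_TEMP_C:.1f} C)")
    print(f"Proxy at 125 C: {stress_rate_proxy(125.0):7.1f} kPa/C")
```

With these made-up curves the proxy peaks between the two transition temperatures and falls off sharply on either side, which is the kind of behavior that would motivate cycling in a narrow band around Tg instead of relying only on wide excursions.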
The appropriate approach in this situation for design verification and qualification depends on whether you are an eventual user of this technology (component manufacturer) or a customer of this technology (OEM). In both situations, you will want to fully characterize the underfill CTE and modulus as a function of temperature. For a user, this can be done by performing thermomechanical analysis (TMA) and dynamic mechanical analysis (DMA) on the raw underfill. For the OEM, more sophisticated techniques, such as nanoindentation and digital image correlation (DIC), are required to reverse engineer the key material properties.
Once this information is obtained, simulations need to be performed assuming global and local temperature rise. Local temperature rise, where power dissipation is confined to a small area, is likely the greater risk as it can induce a complex stress state that may result in tensile stresses being applied to the interconnects. Tensile stresses can reduce time to failure by an order of magnitude compared to classic shear and represent a deviation from the standard physics of failure formula for solder joint fatigue. The results from these simulations may drive modifications to thermal cycling, including the introduction of mini-cycles around the Tg.
While advances in packaging technology allow for continued progress in Moore’s Law, they also require additional due diligence from users and customers. While in all likelihood this new packaging architecture is very robust, using reliability physics to validate designs and develop qualification plans is a common-sense approach to risk mitigation for all customers (reliability is application-specific). By forcing the right questions to be asked, reliability physics can hopefully prevent the industry-wide escapes that tend to periodically plague the electronics industry (see low ESR capacitors, red phosphorus encapsulants, etc.). Always remember: an educated consumer is the best customer.
Craig Hillman received his BS in metallurgical engineering and materials science and engineering and public policy from Carnegie Mellon and his PhD in materials science from the U. of California – Santa Barbara and is CEO at DfR Solutions.
Nathan Blattau received his BS in civil engineering and his PhD in mechanical engineering from the U. of Maryland and is SVP at DfR Solutions.