No Fault Found (NFF) determinations are a time-consuming and expensive issue for most repair stations. In most cases, it is difficult to recoup the related labor expenses from customers: no one wants to pay for your time if they think you didn’t do any work. This article from the February/March 2014 edition of Aviation Maintenance magazine details how some service facilities are adopting additional measures to reduce NFF findings. These measures include extensive environmental testing, additional test equipment, and analysis methods.
Getting Serious About No Fault Found
Written By Charlotte Adams: Aviation Maintenance
In these days of tight defense budgets and thin airline margins, the perennial no fault found (NFF)—or as the military calls it, cannot duplicate (CND)—problem is like a weight that is dragged along because there seems to be no alternative. It affects military, commercial and business aviation, reducing readiness and bloating expenses. Yet NFFs are hard to pinpoint and eliminate because they may be momentary, like the flickering of a light bulb before it fails.
An NFF occurs when a reported fault cannot be duplicated in the avionics shop and therefore cannot be fixed. The box that malfunctioned is returned to service but eventually reappears for testing. Ultimately, if the problem cannot be identified and fixed after repeated attempts, the unit is quarantined and labeled as a rogue or bad actor.
NFFs may be associated with the growing complexity of avionics designs, aging avionics, the wear and tear of constant operations, and the insufficiency of some test equipment and procedures. The problem can be exacerbated by poor communications between operators and maintainers and even between designers and users. That seems to be changing, however, as designers, operators and maintainers are taking a hard look at the problem and devising new ways of attacking it.
The costs are considerable, and the pity is, it’s all wasted money. Airline experts cite costs in the millions of dollars for NFF problems. A single line replaceable unit (LRU) program—involving the removal and test of 30 units a year—might cost an airline $27,000 a year if the NFF rate was 60 percent, according to one expert.
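The arithmetic behind that single-program example can be sketched as follows. The per-removal cost is not given in the article; it is backed out here from the quoted figures on the assumption that the whole $27,000 is attributable to NFF removals, so treat it as an illustration rather than an industry number.

```python
def nff_annual_cost(units_removed, nff_rate, cost_per_nff_removal):
    """Yearly cost of removals that end in a no-fault-found result."""
    return units_removed * nff_rate * cost_per_nff_removal


# Figures quoted in the article for one LRU program.
units_removed = 30           # removals and tests per year
nff_rate = 0.60              # share of removals that come back NFF
quoted_annual_cost = 27_000  # dollars per year

# Derived values (assumption: all of the quoted cost comes from NFF removals).
nff_removals = units_removed * nff_rate
implied_cost_per_removal = quoted_annual_cost / nff_removals

print(f"NFF removals per year: {nff_removals:.0f}")
print(f"Implied cost per NFF removal: ${implied_cost_per_removal:,.0f}")

# The same function shows how the bill would scale if the rate were reduced.
for rate in (0.60, 0.40, 0.15):
    cost = nff_annual_cost(units_removed, rate, implied_cost_per_removal)
    print(f"NFF rate {rate:.0%}: ${cost:,.0f} per year")
```

Multiplied across the many LRU programs an airline runs, a per-program figure of this size is consistent with the fleet-level costs in the millions that the experts cite.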
The problem in military aviation rises to a higher level. According to an article on avionics chassis wire testing in the Spring 2011 newsletter of the Joint Services Wiring Action Group (JSWAG), a cross-service forum for resolving wiring, interconnect and fiber optic system problems, the Office of the Secretary of Defense estimates that intermittent avionics faults inside weapons replaceable assemblies (WRAs) cost the Defense Department (DoD) more than $2 billion annually. (See http://www.navair.navy.mil/jswag/docs/newsletters/JSWAG_Spring_2011.pdf.) The JSWAG bulletin adds, “There is no standardized, automated, DoD approved process which capitalizes on a logistically supportable solution.”
A 2010 press release from the U.S. Naval Air Systems Command (NAVAIR) facility at Lakehurst, N.J., lamented the “lack of advanced avionics diagnostics and prognostics capabilities” in Navy and Marine Corps aircraft maintenance activities. It estimated that about 72 percent of total Navy and Marine Corps maintenance actions at the time were related to avionics and described CND conditions as “wasting numerous maintenance-man-hours, increasing aircraft downtime, and increasing logistics and testing costs.”
Eclypse International of Corona, Calif., offers a wiring diagnostic device called the Automatic Wire Test Set (AWTS). The equipment has been fielded by NAVAIR, for one. NAVAIR has applied an AWTS test program set, which the agency developed with Eclypse, to the radar receiver chassis of the F/A-18 (that is, the WRA with all the shop replaceable assemblies, or SRAs, removed), according to a JSWAG newsletter. The AWTS includes programmable power energization that allows verification of electrical system integrity with the fewest disconnections, the company maintains. AWTS also includes a “Self Learn” function that determines the circuits’ condition, Eclypse says.
A more recent statistic from the civil aviation world was supplied by the NFF Steering Committee, a panel under the aegis of the American Institute of Aeronautics and Astronautics (AIAA) that is looking at NFF issues in both avionics and mechanical systems. In response to our questions, the group’s chairwoman, Lori Fischer, estimated that organizations—at least on the commercial side—without an NFF focus experienced an approximately 40 percent NFF rate for mechanical LRU returns and an approximately 60 percent rate for avionic LRU returns. The NFF rates are significantly lower for organizations with aggressive NFF reduction strategies such as the methodology outlined in ARINC 672 and approaches like Conserve All Serviceable Hardware (CASH) and Holistic Systems Maintenance (HSM), Fischer says.
The two-year-old AIAA group emphasizes that it is a technical committee and is not promoting a quick fix for avionics NFF problems. The committee encourages teaming between organizations such as airlines, airframers, and their suppliers at various levels. NFF is a difficult issue because it involves daunting, cross-disciplinary complexity and requires data sharing, something that isn’t always popular in a highly competitive industry. The AIAA committee is working with aerospace industry partners who—through a Lean-like process of self-analysis—will be able to identify ways to reduce the NFF problem. Their findings could be incorporated into the ARINC document.
NFF rates vary widely by product type, design or age, says Joe Kenney, vice president of product integrity for avionics manufacturer and service provider, Honeywell, focusing his comments on the business aviation sector. Units that record faults in memory and that can include flight conditions, for example, are easier to diagnose than units with no memory, no fault logs, and no system-testing available in the repair shops, he explains. Industry data shows that NFF rates can vary between 15 percent at best to more than 60 percent at worst, depending on these factors, Kenney adds.
Outside the Box
One of the trends in aviation maintenance is to look beyond the box to the system of which the LRU is a part. Technicians at Honeywell, for example, are receiving additional systems training in order to understand how a unit may interact with other units in the aircraft and produce a failure that is not detectable by testing the unit in a “singular fashion,” that is, on a bench without that system interaction, Kenney says. “Many times, having the larger systems knowledge is the key to understanding why a fault is reported by a customer but cannot be seen on the bench when the unit is tested in isolation from other units it needs to ‘communicate with’ in an integrated avionics architecture,” he explains.
Many of the NFFs returned from customers failed on installation due to system integration issues, Kenney says. “The integration of these components together is one of the main reasons that NFFs are such an issue.” Currently, testing is done on each component separately rather than on the full system. This is changing, however, as testing is being updated to include subassemblies and a better system integration approach, he says.
To show how difficult the troubleshooting problem can get in integrated systems, Kenney cites a recent example involving a Honeywell box used in a business aviation aircraft. It turned out, as the company learned from the aircraft manufacturer, that the problems experienced by the Honeywell box were being caused by an intermittent in another manufacturer’s box.
In this context Honeywell identified an industry trend of designing more nonvolatile memory into devices than they require for their own purposes. The point of these “mini-flight data recorders” is to help in troubleshooting problems like NFFs and to provide additional data in an accident investigation. The additional nonvolatile memory can be used to collect data from the data bus that relates to the integrated system rather than just to an individual manufacturer’s component of the architecture.
From a mechanical LRU standpoint, one advancement involves the testing of an LRU in development at the system level instead of at the component level only. This approach allows changes to be made to the design before the LRU goes into production, Fischer says.
Scott McKenzie, an avionics/instruments technical representative with Duncan Aviation, also associates NFFs with integrated avionics systems. Although NFFs are often a symptom of aging wiring components affected by corrosion, constant vibration, and temperature swings, NFFs occur in newer equipment, as well. “Generally speaking, the more components that are a part of an integrated system, the higher the percentage of NFFs,” he observes vis-à-vis business aviation aircraft.
He says that NFFs, as a percentage of total faults in Duncan Aviation’s bench environment, can vary from as low as five percent for basically self-contained components like gyros to over 30 percent for components that are part of an integrated system such as an autopilot or a radar.
Digital ATE works well much of the time. It applies stimulus, measures and compares a circuit or wire one at a time, explains Ken Anderson, vice president of sales and business development for Universal Synaptics, an Ogden, Utah, avionics test equipment company that has taken a hardware-based analog neural network approach to the no fault found (NFF) problem. Although conventional ATE can multiplex rapidly between circuit measurements, intermittents can be nanoseconds in duration. So finding the culprits can be hit or miss. (The company uses digital processing simultaneously on the back-end for packaging and reporting data.)
Anderson divides avionics maintenance cases into two categories, each representing about half of the total problem. The first and harder half is the NFFs. Hard failures, where a component or a wire fails completely, form the second half. Conventional ATE can find the second, but rarely the first, Anderson asserts. Conventional ATE, moreover, is digital—it samples and averages inputs from the unit under test. This means that the peaks and displacements—the “tremors” that are the intermittents—are lost, Anderson maintains.
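Anderson's point about averaging can be illustrated with a toy resistance trace. Every number below (nominal resistance, glitch size, pass limit, sample period, window length) is a hypothetical value chosen for the illustration and does not describe any particular tester.

```python
# Hypothetical values for one conductive path observed over a 1 ms window.
NOMINAL_OHMS = 0.5          # healthy path resistance
GLITCH_OHMS = 1000.0        # resistance during the momentary discontinuity
LIMIT_OHMS = 1.0            # pass/fail limit applied to the reading
SAMPLE_PERIOD_NS = 10       # time between samples
WINDOW_NS = 1_000_000       # length of the measurement window
GLITCH_START_NS = 400_000   # when the glitch begins
GLITCH_LENGTH_NS = 50       # how long the glitch lasts


def build_trace():
    """Return the sampled resistance values for the whole window."""
    trace = []
    for t in range(0, WINDOW_NS, SAMPLE_PERIOD_NS):
        in_glitch = GLITCH_START_NS <= t < GLITCH_START_NS + GLITCH_LENGTH_NS
        trace.append(GLITCH_OHMS if in_glitch else NOMINAL_OHMS)
    return trace


trace = build_trace()

# A tester that reports the average over the window.
averaged_reading = sum(trace) / len(trace)
averaged_test_passes = averaged_reading <= LIMIT_OHMS

# A tester that reacts to the worst instantaneous value.
peak_reading = max(trace)
peak_test_passes = peak_reading <= LIMIT_OHMS

print(f"averaged reading: {averaged_reading:.4f} ohm, passes: {averaged_test_passes}")
print(f"peak reading: {peak_reading:.1f} ohm, passes: {peak_test_passes}")
```

In this sketch the glitch occupies only five of the 100,000 samples, so it barely moves the average and the averaged reading stays under the limit, while the peak reading does not.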
Universal Synaptics takes a comprehensive approach to locating NFFs—the little “ohmic glitches,” as the company calls them—in the thousands of electrical distribution components found in the typical line replaceable unit (LRU), or box. Instead of testing circuits sequentially, the company’s Intermittent Fault Detection & Isolation System (IFDIS) tester monitors them at the same time, Anderson says.
To illustrate the complexity of the task, an LRU might have 20 to 100 wires coming into it, Anderson says. Inside the box, however, there might be 1,000 to 8,000 wires and other electrical distribution components, such as connectors, connector contacts, wire wrap, solder joints, and circuit traces which degrade over time and become corroded, cracked or loose—in other words, intermittently effective. The Universal Synaptics tester simultaneously and continuously monitors every single electrical path in the unit under test (UUT), while exposing the UUT to a simulated operational environment. The IFDIS can detect any intermittent discontinuities on any circuit, at durations as short as 50 nanoseconds (0.00000005 seconds), Anderson says.
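A back-of-the-envelope model shows why one-at-a-time scanning is hit or miss at these path counts. It is a simplified sketch that assumes the tester dwells equally on each path with no dead time, that each glitch is far shorter than the dwell time, and that glitches arrive independently of the scan position. None of these assumptions comes from the article.

```python
def catch_probability(num_paths, num_glitches):
    """Chance that a sequential scanner sees at least one of the glitches.

    Under the stated assumptions, the scanner is looking at the affected
    path for 1/num_paths of the time, so each glitch is caught with that
    probability.
    """
    per_glitch = 1.0 / num_paths
    return 1.0 - (1.0 - per_glitch) ** num_glitches


# Path counts taken from the range quoted for the inside of an LRU.
for paths in (1_000, 8_000):
    for glitches in (1, 10, 100):
        p = catch_probability(paths, glitches)
        print(f"{paths:>5} paths, {glitches:>3} glitches during the test: "
              f"{p:.2%} chance of catching one")
```

Under the same assumptions, a tester that watches every path continuously does not depend on this coincidence at all: it misses a glitch only if the event is shorter than the tester's own detection floor.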
The initial application of the company’s IFDIS depot tester was the Modular Low Power Radio Frequency (MLPRF) unit found in the radar system of the U.S. Air Force’s F-16 fighter aircraft. During this project, USAF technicians were able to save $62 million, or about 28 times the approximately $2.2 million they spent on two IFDIS units, Anderson says. The tester found intermittent coaxial cable lines, broken wires (held together with heat shrink tubing that masked their intermittent condition), cracked solder joints, loose crimps and unsoldered pins. One hundred and thirty-eight previously “unrepairable” MLPRFs were returned to service and the unit’s reliability tripled, he says. Savings were realized from reducing the number of boxes going to the depot and returning the previously “unrepairable” LRUs to the field, he says.
Universal Synaptics also has been working with the U.S. Navy on an F/A-18 generator converter unit (GCU). This demonstration project involved testing six units over a one-month period. Eighty percent of the units the company tested had one or more intermittents, Anderson says, and “every one of these units had been certified as ‘ready for installation,’” he recalls.
This complex test equipment seems expensive compared to conventional ATE, ranging from $149,000 to around $2 million. However, based on the high cost of the military’s NFF problem, the price seems much more reasonable, Anderson says. The price is based on the number of test point modules that are required. In addition to the processor, the IFDIS also requires the fabrication of “interfaces” that are essentially replicas of the cards minus the chips and other components. During testing, the UUT is powered off and the IFDIS supplies the stimulus and monitors the UUT for any impedance changes. A portable version of the technology, a flight-line tester called the Ncompass-Voyager, sells for about $100,000. It is used to test an aircraft’s wiring system, looking for problems such as opens, shorts, incorrect wiring and intermittents. With 256 test points, the Ncompass-Voyager is actually a single module of the larger system, Anderson explains.
Universal Synaptics also has worked with commercial aviation. A U.S. airline client in 2012 was experiencing a 40-50 percent NFF rate on the engine controller unit of an auxiliary power unit (APU) on one of its Boeing 757s. The unit had undergone six prior removals although it was being tested and returned to service by the original equipment manufacturer, Anderson says. But when the unit was tested with IFDIS during a project with the customer, nine intermittent faults were found. After repairs were made, that APU has now been on wing more than 300 days, with over 3,000 consecutive flying hours, he says. The larger population of APUs, meanwhile, had registered 112 removals in a year, he adds.
A common way of pinpointing incipient failures is to apply environmental stimulus in an endeavor to simulate the conditions a box was experiencing at the time a failure was noted.
Sometimes problems lie in the conditions present at the time the squawk occurs, Duncan’s McKenzie explains. But environmental stimulus is not a panacea. “While we do our very best to be able to simulate all possible conditions that are present, such as extreme temperatures, atmospheric pressure, humidity or vibration, it may not recreate the factors that helped to cause the original fault,” he says. “For example, if bleed through is being experienced on a VHF [very high frequency] transceiver on the aircraft, but the unit passes all tests in a bench environment, the problem could be caused by poor shielding of various coaxial cables in the aircraft. It is impossible to recreate the potential noise that is generated on the aircraft in a bench setting,” he says.
Honeywell has dealt successfully with NFF issues by using traditional environmental simulation in new ways, Kenney says. Traditionally, shops simulate the environment by using extremes of hot and cold. But a Honeywell facility in Irving, Texas, used “a precisely controlled temperature ramp rate…instead of [temperature] extremes… to see radio frequency spurs causing processor faults in the radio altimeters,” he recalls. A Honeywell shop in Olathe, Kans., likewise innovated by using altitude chambers “to duplicate…arcing failures in several cockpit display models,” he says. “Traditionally, shops use altitude simulation to check arcing for LRUs mounted outside of the pressure vessel,” Kenney explains.
Honeywell’s avionics repair shops are currently using basic Six Sigma project methods, such as statistical analysis of maintenance data, to drive the reduction of NFFs in repaired units, Kenney says. In the case of older boxes intended for federated systems, for example, Honeywell uses historical maintenance data that it has collected from customers to narrow the possible solution set. These boxes lack sophisticated fault and performance recording capability.
Initial results from a project conducted in 2013 at the company’s Wichita, Kans., facility, using Six Sigma tools and testing enhancements, showed a 50 percent reduction in NFFs in the targeted defect category from the original project work, Kenney says. The initiative focused on “escapes”—zero-flight hour or installation failures.
ATE Improvements & Wish List
Sometimes home-baked is better. Duncan Aviation’s McKenzie cites a test set that was designed in-house for use on the MRO’s gyro bench. The test set “goes above and beyond what is required [by]…the overhaul manual,” he says. The company’s test set can “isolate specific circuits…to narrow down the cause of the squawk,” he explains.
But conventional ATE is limited in solving the most enigmatic intermittent problems “because intermittent failures rarely synchronize with the ATE’s scanning measurement window,” AIAA panel members observe.
The JSWAG journal also describes the limitations of conventional automatic test equipment: “Basic programs of…ATE such as [the] Consolidated Automated Support System (CASS) assume that conductive paths within the chassis are good. This assumption has resulted in numerous false shop replaceable assembly (SRA) replacements, and inabilities to fault isolate the true cause of the WRA [weapons replaceable assembly] failure.”
The AIAA committee would like to see a tester “that uses intermittent fault detection circuitry to monitor every electrical path simultaneously and continuously in the equipment under test,” while exposing the unit under test (UUT) to a simulated operational environment.
Another item on the NFF committee’s wish list is the design of components that are able to track and measure, in real-time, data about the environment that they experience in service.
Another approach is to use human factors-designed tools to troubleshoot NFF components, Honeywell’s Kenney says.
“We are looking at mobile applications that interface between the maintenance technician and our maintenance BIT data to…improve NFF troubleshooting.” Honeywell apps made for iPhones, for example, allow company product support engineers to assist maintenance technicians, especially on older systems, by giving them the top things to check on an aircraft. The goal is to help the maintenance technicians take the guesswork out of where a failure occurred and to replace the correct part the first time around, he says. This would greatly reduce costs throughout the company and, in return, increase customer satisfaction because of reduced turnaround time, Kenney says.
AFI KLM E&M Q&A on No Fault Found
We asked Taco Vingerhoed, Avionics and Accessories director at AFI KLM E&M, to give us his take on our NFF questions. Here’s what one of the world leaders in MRO had to say.
AM: Have there been any advances in testing technology that will make NFFs less common?
Vingerhoed: This is a continuous process. In the first place, the OEMs are developing much better BITE (built-in test equipment). They claim that at least 80 percent of internal failures of so-called “homesick unit” components can be detected in the LRUs just by running self-tests in the aircraft. That prevents a lot of removals. The test equipment in the shop is becoming more sophisticated too.
AM: Where does the problem lie—in the design of a part, interconnections between parts, operation of part or the test of the part?
Vingerhoed: At AFI KLM E&M we call it 3C: Cable-Connector-Contact. This is the most common cause of intermittent problems. If the LRU has tested OK and we have more than three removals per year, we start the 3C process. The probability that we will discover a bad connection by exercising 3C is quite high.
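The trigger Vingerhoed describes can be written down as a simple rule. This is a minimal sketch of one reading of it, not AFI KLM E&M's actual system: the record layout, the finding codes, the trailing 365-day window and the choice to count removals per serial number are all assumptions made for the illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Removal:
    """One removal of a unit from an aircraft (hypothetical record layout)."""
    serial_number: str
    removed_on: date
    shop_finding: str  # assumed codes: "NFF" or "FAULT_CONFIRMED"


REMOVAL_LIMIT_PER_YEAR = 3  # "more than three removals per year"


def needs_3c_check(removals, serial_number, today):
    """Return True when the cable-connector-contact check should start.

    Reading of the rule: the unit's latest shop test found nothing wrong,
    and it has been removed more than three times in the trailing year.
    """
    window_start = today - timedelta(days=365)
    recent = [
        r for r in removals
        if r.serial_number == serial_number
        and window_start <= r.removed_on <= today
    ]
    if not recent:
        return False
    latest = max(recent, key=lambda r: r.removed_on)
    return latest.shop_finding == "NFF" and len(recent) > REMOVAL_LIMIT_PER_YEAR


# Made-up history for two units.
history = [
    Removal("SN-1001", date(2013, 3, 4), "NFF"),
    Removal("SN-1001", date(2013, 6, 18), "NFF"),
    Removal("SN-1001", date(2013, 9, 2), "NFF"),
    Removal("SN-1001", date(2013, 12, 11), "NFF"),
    Removal("SN-2002", date(2013, 11, 5), "FAULT_CONFIRMED"),
]
today = date(2014, 1, 15)

flag_1001 = needs_3c_check(history, "SN-1001", today)
flag_2002 = needs_3c_check(history, "SN-2002", today)
print("SN-1001 needs 3C check:", flag_1001)
print("SN-2002 needs 3C check:", flag_2002)
```

Because 3C looks at cables, connectors and contacts, an operator might equally count removals per aircraft installation position instead of per serial number; the rule itself would stay the same.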
AM: Do you have a system or philosophy for dealing with this problem?
Vingerhoed: At AFI KLM E&M we have a reliability program that will trigger NFF trends on which we act. It tracks every part turned in for repair by serial number. We watch the fleet and monitor them to understand what the data says. For the great majority of NFFs, AFI KLM E&M monitors the data and does a basic investigation to determine whether a cause can be found for the failure. For these, AFI KLM E&M undertakes a detailed process analysis of what went wrong. Direct feedback from ground engineers, our maintenance control centers and our avionic shops also function as triggers for us to act upon.
AM: Can you give any examples of where you have dealt successfully with NFF issues?
Vingerhoed: At a certain moment we noticed that some components from passenger service systems were having an increased NFF rate. The CMM test was always passed, but performance in the aircraft was poor. After investigation we discovered that some tantalum capacitors were failing two or three hours after power up. We decided to replace those capacitors during the next shop visit on every unit and thus sustainably decreased the NFF rate.
AM: Can you give a ballpark estimate of the cost of NFF as a percentage of total avionics maintenance cost or total maintenance cost?
Vingerhoed: The costs can be huge for NFFs. There are the costs of unnecessary logistics and additional LRUs for inventory, man-hours for unnecessary repairs and, worst of all, possible delays of flights for unnecessary removals. Therefore, it is very difficult to provide NFF costs in advance or in general. Once an NFF situation is isolated and solved, the costs can be established.
AM: Is repetitive testing necessary?
Vingerhoed: Sometimes, yes, but you must be careful. What would you do if you test it three times and it fails two times but the last time is OK? Would you deliver the LRU? The formal answer is “yes” because it passed the test. But we know that there is something hidden in the LRU. Therefore, it needs our special attention.
AM: Are better testers necessary? List of desired capabilities?
Vingerhoed: Not necessarily. The tester will do exactly what the CMMs say. After successful CMM tests the LRU will be considered airworthy. If a CMM test is not sufficient, then we must report it to the OEM and request that test steps be changed or added. Better troubleshooting procedures might be required, such as flying probes, thermal probes and x-ray machines. A complete fault tree analysis on an LRU, which is not always available now, would also help substantially.
AM: Should the environment the component is experiencing be taken into account in the testing process?
Vingerhoed: It is certainly true that some NFFs are just failed units without shop confirmation of the fault. In those cases, the standard return-to-service testing fails to find the problem. However, those units will keep on failing once on-board an aircraft and are subsequently identified as a rogue unit. Most of the time shops will be able to find the source of the problem after putting the unit under extensive testing. This would include ‘shake & bake’ testing. However, since this only applies to a very small percentage of the failures one might question the economic viability of putting every unit under extensive testing.
AM: Is deep-dive analysis of historical maintenance data, to see whether problems have been building up, beneficial?
Vingerhoed: Typically we look into both historic shop reports and aircraft fault reports to get a complete picture of the unit’s history and the problem. Depending on the actual situation we also consult service bulletins, component maintenance manuals, modification history and training manuals.
AM: Do any of the failure prediction technologies in on-board maintenance computers solve most of the issues for you?
Vingerhoed: The NFF problem would be solved if we had on-board maintenance computers that could identify units as NFF. Working together with avionics shops, component vendors, aircraft manufacturers and ground engineers will help us create maintenance tooling, instructions and guidance. This in turn will help the ground engineer to identify the correct source of the problem on board the aircraft, limiting NFF rates. It is because of this that AFI KLM E&M is able to lower NFF rates and improve the reliability of components.
Adams, Charlotte. “Getting Serious About No Fault Found.” Aviation Maintenance (avm-mag.com), February/March 2014, pp. 26-32.