Flight data resilience

Andreas
Nov 29, 2022

There is plenty of evidence that sensor malfunctions can be harmful to the safe conduct of flight. In fact, pilots are taught from early on that they might face some sort of “misleading” indication. Engineers do the same on their part: they critically analyze failure modes and try to develop systems that are fault tolerant.

And yet, we see a seemingly never-ending line-up of accidents and incidents caused by faulty sensor readings. Therefore, the question must be asked: Can we do more?

Pilots have very useful techniques for aircraft handling, such as the “control and performance technique”, sometimes referred to as “Pitch/N1” or “Pitch/Power”. Additionally, instrument cross-checks are part of every pilot’s scan. Nevertheless, it is clear that human brains have limitations: there is only so much complexity a human pilot can handle. See here for a dramatic example. Additionally, many aircraft employ Higher Order Flight Controls (HOFC) or some form of envelope protection, and these must be able to deal with faulty sensors too.

In this article we will focus on the aircraft design aspect. Many manufacturers have developed advanced concepts, and EASA recently launched a special research effort for enhanced fault detection in air data systems under the lead of Airbus and TU Delft [1]. We will look at the basic ideas of such concepts without getting entangled in mathematics. We start our discussion with Angle-of-Attack (AOA) sensors, but the concepts here are independent of the parameter.

The simple design

The simplest way to monitor a flight parameter is to hook up one sensor to one computer. It will be clear to everybody that this kind of design is not very robust.

Figure 1: Simple layout: One sensor, one computer. A faulty value will simply be used by the computer.

Much to the shock of the industry, this layout was used for MCAS on the B737 MAX [2].

Hardware redundancy

The examples that follow explore the concept of hardware redundancy. This is widely used in industry, but certainly not without drawbacks, as we will see.

Two sensors, two computers: A little bit better…

The parameter of interest (here AOA) is measured by a second sensor and analyzed by two computers. This kind of layout is capable of detecting a “discrepancy” between the two sensor signals. However, it may then be difficult to tell which one is correct if no other information is used. These systems typically issue a “miscompare” alert [3].

Figure 2: Two individual readings are compared by the computers.
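The comparison itself is trivial, which is exactly why it cannot say more than “something is wrong”. A minimal Python sketch, assuming a hypothetical function name and an arbitrary threshold of 2° (neither is taken from a real system):

```python
def miscompare(aoa_1_deg: float, aoa_2_deg: float, threshold_deg: float = 2.0) -> bool:
    """Return True when the two AOA readings differ by more than the threshold.

    The 2 degree default is an illustrative assumption, not a certified value.
    """
    return abs(aoa_1_deg - aoa_2_deg) > threshold_deg


# Both vanes healthy: small difference, no alert.
healthy = miscompare(4.4, 4.6)

# One vane stuck or damaged: the alert is raised, but the check alone
# cannot tell which of the two readings is the faulty one.
faulty = miscompare(4.5, 9.0)

print(healthy, faulty)
```

Note that the function returns only a flag. Deciding which sensor to trust requires additional information, which is where the following layouts come in.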

Three sensors, three computers: Better still…

Many transport category aircraft measure critical parameters in three or more different locations [3]. This certainly adds robustness to the system, but is by no means the “gold standard”. It should be emphasized that this layout does not guard against “common cause” failures, such as freezing of multiple sensors, nor is the identification of the correct value a simple undertaking. The famous and very common “two-out-of-three” philosophy can get it wrong, as demonstrated by the incident described below.

Figure 3: Very common: Three sensors and three or more computers.

Many computers calculate the mean value of the sensors and then “reject” a sensor if it deviates significantly from the mean. There is just one problem: Maybe “the outlier” is the correct one…

This is precisely what happened on November 5th, 2014 near Pamplona (Spain) [4]: After an uneventful takeoff and initial climb, an Airbus A321 was breaking cloud cover around FL200 when the commander noticed an unusual movement of the “alpha protection” band on his PFD. A little later (around FL310), the aircraft went into a descent which could not be arrested by flight crew inputs. The F/O was PF and tried to counteract this motion after disconnecting the autopilot. After a short discussion the commander took control, but faced the same difficulty. The aircraft descended to approximately FL270, where the crew were able to establish “level flight” while maintaining a constant aft stick-input. In this stricken state, the crew sent an AOC message to the maintenance department of the operator. The technicians did a marvelous job looking at the aircraft data and concluded that the AOA values 1 and 2 seemed incorrect. They then directed the crew to switch off the corresponding air data computers. The aircraft reverted to “alternate law” and the aft stick-input was no longer required. The flight continued to its destination and landed without problems.

What had happened?

It turns out that AOA 1 and AOA 2 were indeed frozen at a similar position and only AOA 3 was correct. However, due to the system logic, AOA 3 was rejected as the “outlier” without any crew alert. When the Mach number became high enough, the envelope protection activated. This also explains why the aircraft levelled off at FL270: the Mach number decreased during the descent. See Figure 4 below for the flight data:

Figure 4: Flight data of the uncommanded pitch-down in cruise [4]

The important events can be located as follows (use time reference at the bottom):

- 07:04 AOA 1 and 2 are frozen at approx. 4.5° and alpha prot. activates. AOA 3 shows the correct value, but is rejected by the computers.

- 07:08 Aircraft levels off at approx. FL270 with constant aft stick-input

- 07:36 Crew switches ADR 3 OFF and ON with no effect

- 08:00 Crew switches ADR 2 OFF. Aircraft changes to alternate law, aft stick-input no longer required

- 08:28 AOA 1 and 2 become valid again

Note that the aircraft manufacturer has since changed the software and this malfunction can no longer occur, as the system now uses a more capable analysis method (see analytical redundancy below).
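The “average and reject the outlier” logic described in this section can be reproduced in a few lines of Python. This is only a sketch of the general idea and not the actual flight control software: the function name, the tolerance of 1.5° and the AOA 3 value of 1.0° are assumptions chosen for illustration. Only the frozen value of roughly 4.5° comes from the timeline above.

```python
from statistics import mean


def vote_mean_reject(readings_deg, tolerance_deg=1.5):
    """Average the readings, rejecting the one farthest from the mean
    if its deviation exceeds the tolerance (illustrative value).

    Returns (voted_value, index_of_rejected_sensor_or_None).
    """
    avg = mean(readings_deg)
    deviations = [abs(r - avg) for r in readings_deg]
    worst = max(range(len(readings_deg)), key=lambda i: deviations[i])
    if deviations[worst] > tolerance_deg:
        kept = [r for i, r in enumerate(readings_deg) if i != worst]
        return mean(kept), worst
    return avg, None


# AOA 1 and AOA 2 frozen at about 4.5 deg, AOA 3 (index 2) reads correctly.
value, rejected = vote_mean_reject([4.5, 4.5, 1.0])
print(value, rejected)
```

With two sensors frozen at the same position, the mean is pulled towards the wrong value, the only healthy sensor (index 2) is rejected, and the voted result is the frozen 4.5°. The voter behaves exactly as designed and is still wrong, because it has no information other than the three readings themselves.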

Analytical redundancy

By now it should be evident that simply adding more sensors to measure the same parameter is neither an efficient nor a robust design. It primarily adds complexity, and the identification of the correct value can be tricky. A smarter way to make a design more resilient is therefore to use some form of analytical redundancy. This approach is particularly relevant for small UAVs due to the limited space available [5].

Figure 5: Analytical redundancy methods, based on [5]

Analytical redundancy can largely be grouped into three methods:

The model-based approach involves detailed knowledge of the aircraft dynamics during the entire flight, including environmental effects, such as icing. This is usually not easy to obtain and therefore this variant is rarely implemented.

The model-free analysis relies on much more basic kinematic relations, which are independent of specific aircraft dynamics. This concept is readily transferable to other aircraft and has been proposed for several applications [5].

Purely data-driven methods rely on mathematical procedures to isolate “inconsistent” data without knowledge of kinematic relations. The mathematics can get very complex here, but a simple example would be a basic “trend monitor” that detects a “jump” in a measured parameter.
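As an illustration of the simplest data-driven case, the “trend monitor” mentioned above could look like the following Python sketch. The function name, the sample values and the step limit of 2° per sample are hypothetical.

```python
def detect_jumps(samples, max_step):
    """Return the indices at which the change from the previous sample
    exceeds max_step (a simple trend monitor)."""
    return [
        i
        for i in range(1, len(samples))
        if abs(samples[i] - samples[i - 1]) > max_step
    ]


# Hypothetical AOA recording in degrees with a sudden jump at index 4.
aoa_deg = [2.0, 2.1, 2.0, 2.2, 7.5, 7.4, 7.6]
jumps = detect_jumps(aoa_deg, max_step=2.0)
print(jumps)  # -> [4]
```

Such a monitor needs no knowledge of the aircraft at all, but it only reacts to abrupt changes. A sensor that freezes in place produces no jump and would pass this check unnoticed.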

The concept of analytical redundancy based on kinematic relations

A typical implementation is shown below. The flight control computers gather parameters from different sensors. For simplicity, only one computer and one set of sensors are shown here.

Figure 6: Collection of vital flight parameters

Regression, residual and Extended Kalman Filter…

Here comes the magic trick: instead of simply being used “as measured”, each parameter is subjected to a statistical integrity check. The mathematics gets very involved here, and the interested reader is directed to [5] and [6] for a more thorough description. To grasp the concept, it is sufficient to look at the following example:

Using basic kinematic relations, an “estimate” is created for every parameter of interest, based on other flight parameters. The process to identify the most useful other parameters is called “regression analysis” [5].

Figure 7: Measured and estimated AOA, residual calculation

Now, the measured AOA is compared with the estimated AOA. The difference between the two is referred to as the “residual” (see Figure 7). In a perfect world, this would always be zero.

There are some peculiarities here with regard to discrete time and signal noise [6], but that is not relevant for the basic understanding.
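To make the residual tangible, here is a minimal Python sketch. It assumes one particular kinematic relation that the article does not name: in wings-level flight without vertical wind, AOA is approximately pitch attitude minus flight path angle, with the flight path angle obtained from vertical speed and true airspeed. The function names and input values are hypothetical, and a real system would combine several such relations with proper filtering.

```python
import math


def estimate_aoa_deg(pitch_deg, vertical_speed_ms, true_airspeed_ms):
    """Estimate AOA as pitch attitude minus flight path angle.

    Assumes wings-level flight and no vertical wind (simplification).
    """
    flight_path_deg = math.degrees(
        math.asin(vertical_speed_ms / true_airspeed_ms)
    )
    return pitch_deg - flight_path_deg


def residual_deg(measured_aoa_deg, pitch_deg, vertical_speed_ms, true_airspeed_ms):
    """Residual = measured AOA minus estimated AOA."""
    estimate = estimate_aoa_deg(pitch_deg, vertical_speed_ms, true_airspeed_ms)
    return measured_aoa_deg - estimate


# Hypothetical climb: pitch 6 deg, 10 m/s vertical speed, 200 m/s TAS.
estimate = estimate_aoa_deg(6.0, 10.0, 200.0)   # roughly 3.13 deg
residual = residual_deg(3.2, 6.0, 10.0, 200.0)  # small, sensor looks plausible
print(round(estimate, 2), round(residual, 2))
```

The point of the design is that the estimate uses none of the AOA vanes. A vane that disagrees with it can therefore be challenged even if the other vanes show the same wrong value.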

Monitoring the residual

A continuous residual monitor can be used to detect any deviation of the residual from zero. Such a deviation indicates a discrepancy between the measured value and the estimated one [5], so a faulty sensor value is readily identified. Many systems, particularly in the UAV domain, in fact never use the measured value as such, but always some “processed” value from a sophisticated filter such as an Extended Kalman Filter (EKF) [5]. On transport category aircraft, many manufacturers offer the possibility to display “synthetic air data” once a fault has been identified in a sensor.

To see the concept of residual monitoring “in action”, refer to Figure 8 below. It depicts a hypothetical AOA data set of an aircraft in climb to cruise altitude. At some point, the AOA sensor freezes in position. The residual then starts to grow as the estimated value drifts away from the measured value. At the residual alarm level, the AOA sensor would be “flagged” as unreliable.

Figure 8: Residual calculation and alarm level
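The behaviour shown in Figure 8 can be imitated with a short Python sketch. All numbers are made up, the function name is hypothetical, and the requirement that the alarm level be exceeded for several consecutive samples is an added assumption (a common way to avoid nuisance alarms caused by noise), not something taken from the figure.

```python
def first_alarm_index(measured, estimated, alarm_level, persistence=3):
    """Return the first sample index at which |measured - estimated| has
    exceeded alarm_level for `persistence` consecutive samples, else None."""
    count = 0
    for i, (m, e) in enumerate(zip(measured, estimated)):
        if abs(m - e) > alarm_level:
            count += 1
            if count >= persistence:
                return i
        else:
            count = 0
    return None


# Hypothetical climb: the estimated AOA decreases by 0.25 deg per sample.
estimated = [5.0 - 0.25 * k for k in range(20)]

# The sensor follows the estimate, then freezes at sample 5.
freeze_at = 5
measured = [
    estimated[k] if k < freeze_at else estimated[freeze_at]
    for k in range(20)
]

alarm_at = first_alarm_index(measured, estimated, alarm_level=1.0)
print(alarm_at)  # -> 12
```

The residual stays at zero until the freeze and then grows by 0.25° per sample. It first exceeds the 1.0° alarm level at sample 10, and the sensor is flagged at sample 12 once the persistence condition is met.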

Compare the data of Figure 8 with the AOA data from Figure 4 and see how the concept identifies the frozen sensor. The neat thing about analytical redundancy is that it does not require nearly as much installed equipment as the conventional hardware redundancy.

From a practical perspective, the best choice is usually a mix of hardware redundancy and analytical redundancy, taking into consideration the specific airframe limitations. The UAV industry is heavily involved here, given its hardware constraints. We can expect to see increased usage of this concept on transport category aircraft to drastically reduce the probability of misleading data being used by either pilots or computers.

References

[1] EASA, “Enhanced fault detection and diagnosis solutions for air data systems,” 2022. [Online]. Available: https://www.easa.europa.eu/en/research-projects/enhanced-fault-detection-and-diagnosis-solutions-air-data-systems. [Accessed 19 Nov 2022].

[2] The House Committee on Transportation and Infrastructure, “Final committee report: Boeing 737 MAX,” 2020.

[3] I. Moir, A. Seabridge and M. Jukes, Civil Avionics Systems, 2nd ed., Wiley, 2013.

[4] German Federal Bureau of Aircraft Accident Investigation, “Interim report: BFU 6X014-14, serious incident A321 near Pamplona,” 2015.

[5] K. Sun and D. Gebre-Egziabher, “Air data fault detection and isolation for small UAS using,” Wiley Institute of Navigation, vol. 68, pp. 577-600, 2021.

[6] Z. Li, Y. Cheng, H. Wang and H. Wang, “Fault detection approach applied to inertial navigation system / air data system integrated navigation system with time‐offset,” IET Radar, Sonar & Navigation, vol. 15, pp. 945-956, 2021.

www.engineeringpilot.com
