The tragic crash of Air France 447 (AF447) in 2009 sent shock waves around the world. The loss was difficult to understand given the remarkable safety record of commercial aviation. How could a well-trained crew flying a modern airliner so abruptly lose control of their aircraft during a routine flight?
AF447 precipitated the aviation industry’s growing concern about such “loss of control” incidents, and whether they’re linked to greater automation in the cockpit. As technology has become more sophisticated, it has taken over more and more functions previously performed by pilots, bringing huge improvements in aviation safety.
Loss of control typically occurs when pilots fail to recognize and correct a potentially dangerous situation, causing an aircraft to enter an unstable condition. Such incidents are typically triggered by unexpected, unusual events – often comprising multiple conditions that rarely occur together – that fall outside the normal repertoire of pilot experience. For example, this might be a combination of unusual meteorological conditions, ambiguous readings or behavior from the technology, and pilot inexperience – any one or two of which might be manageable, but which together can overwhelm a crew. Safety scientists describe this as the “Swiss cheese” model of failure, in which the holes in organizational defenses line up in ways that had not been foreseen. These incidents require rapid interpretation and responses, and it is here that things can go wrong.
Our research examines how automation can limit pilots’ abilities to respond to such incidents, as becoming more dependent on technology can erode basic cognitive skills. By reviewing expert analyses of the disaster and analyzing data from AF447’s cockpit and flight data recorders, we found that AF447, and commercial aviation more generally, reveal how automation may have unanticipated, catastrophic consequences that, while unlikely, can emerge in extreme conditions.
Loss of AF447
AF447 was three and a half hours into a night flight over the Atlantic. Transient icing of the speed sensors on the Airbus A330 caused inconsistent airspeed readings, which in turn led the flight computer to disconnect the autopilot and withdraw flight envelope protection, as it was programmed to do when faced with unreliable data. The startled pilots now had to fly the plane manually.
A string of messages appeared on a screen in front of the pilots, giving crucial information on the status of the aircraft. All that was required was for one pilot (Pierre-Cédric Bonin) to maintain the flight path manually while the other (David Robert) diagnosed the problem.
But Bonin’s attempts to stabilize the aircraft had precisely the opposite effect. This was probably due to a combination of being startled and inexperienced at manually flying at altitude, and having reduced automatic protection. At higher altitudes, the safe flight envelope is much more restricted than at lower altitudes, which is why pilots rarely hand-fly there. He attempted to correct a slight roll that occurred as the autopilot disconnected but over-corrected, causing the plane to roll sharply left and right several times as he moved his side stick from side to side. He also pulled back on the stick, causing the plane to climb steeply until it stalled and began to descend rapidly, almost in free-fall.
The AF447 tragedy starkly reveals the interplay between sophisticated technology and its human counterparts. This began with the abrupt and unexpected handover of control to the pilots, one of whom, unused to hand flying at altitude, made a challenging situation much worse. A simulation exercise after the accident demonstrated that with no pilot inputs, AF447 would have remained at its cruise altitude following the autopilot disconnection.
With the onset of the stall, many cues about what was happening were available to the pilots. But they were unable to assemble these cues into a valid interpretation, perhaps because they believed that a stall was impossible (since fly-by-wire technology would normally prevent pilots from causing a stall), or perhaps because the technology usually did most of the “assembling” of cues on their behalf.
The possibility that an aircraft could be in a stall without the crew realizing it was also apparently beyond what the aircraft system designers imagined. Features designed to help the pilots under normal circumstances now added to their problems. For example, to avoid the distractions of false alarms, the stall warning was designed to shut off when the forward airspeed fell below a certain speed, which it did as AF447 made its rapid descent. However, when the pilots twice made the correct recovery actions (putting the nose down), the forward airspeed increased, causing the stall alarm to reactivate. All of this contributed to the pilots’ difficulty in grasping the nature of their plight. Seconds before impact, Bonin can be heard saying, “This can’t be true.”
Implications for Organizations
This idea – that the same technology that allows systems to be efficient and largely error-free also creates systemic vulnerabilities that result in occasional catastrophes – is termed “the paradox of almost totally safe systems.” This paradox has implications for technology deployment in many organizations, not only safety-critical ones.
One implication is the importance of managing handovers from machines to humans, something that went so wrong in AF447. As automation has increased in complexity and sophistication, so have the conditions under which such handovers are likely to occur. Is it reasonable to expect startled and possibly out-of-practice humans to be able to instantaneously diagnose and respond to problems that are complex enough to fool the technology? This issue will only become more pertinent as automation further pervades our lives, for example as autonomous vehicles are introduced to our roads.
Second, how can we capitalize on the benefits offered by technology while maintaining the cognitive capabilities necessary to handle exceptional situations? Pilots undergo intense training, with regular assessments, drills, and simulations, yet loss of control remains a source of concern. Following the AF447 disaster, the FAA urged airlines to encourage more hand-flying to prevent the erosion of basic piloting skills, and this points to one avenue that others might follow. Regular, hands-on engagement and control build and maintain system knowledge, enabling operators, managers, and others who oversee complex systems to identify anomalies, diagnose unfamiliar situations, and respond quickly and appropriately. Structured problem-solving and improvement routines that prompt us to constantly interrogate our environment can also help with this.
Commercial aviation offers a fascinating window into automation, because the benefits, as well as the occasional risks, are so visible and dramatic. But everyone has their equivalent of autopilot, and the main idea extends to other environments: when automation keeps people completely safe almost all of the time, they are more likely to struggle to reengage when it abruptly withdraws its services.
Organizations must now consider the interplay of different types of risk. More automation reduces the risk of human errors, most of the time, as shown by aviation’s excellent and improving safety record. But automation also leads to the subtle erosion of cognitive abilities that may only manifest themselves in extreme and unusual situations. However, it would be short-sighted to simply roll back automation, say by insisting on more hand-flying, as that would increase the risk of human error again. Rather, organizations need to be aware of the vulnerabilities that automation can create and think more creatively about ways to patch them.