SYSTEM ACCIDENT

A system accident is an accident that involves "the unanticipated interaction of multiple failures" in a complex system. The complexity can be technological or organizational, and is frequently some combination of the two. Because of this complexity, such accidents are virtually impossible to predict, yet the problems that cause them appear blatant in hindsight.[1] These accidents resemble Rube Goldberg devices in the way that small errors of judgment, flaws in technology, and seemingly insignificant damage combine to form an emergent disaster. System accidents were first defined by Charles Perrow in 1984; Perrow also uses the term "normal accident". William Langewiesche has also been an influential writer on system accidents.
Once an enterprise passes a certain size, with many employees, specialization, backup systems, double-checking, detailed manuals, and complex communication, employees tend to fall back on rigid protocol and habit. The close-knit nature of some groups within a large office or hospital can also lead to the phenomenon known as groupthink.
A key insight is that this can happen with safety systems just as with any other aspect of an endeavor, especially if the social norm is that one cannot question "safety", as happened in the Space Shuttle Challenger disaster. There, a group of engineers raised the question of whether the O-rings would malfunction at the low temperatures forecast for launch day. They were overruled because NASA officials did not want to delay the launch further (it had already been pushed back five times, to six days after its originally scheduled date). Challenger broke apart shortly after liftoff because the O-rings failed in the cold weather.[2]
Often processes and events are opaque rather than transparent (Perrow terms this ''incomprehensibility''[3]). When processes are opaque, there is little communication between the top of the chain of command and the people actually carrying out orders. Upper management then has little realistic information about the current situation, while those at the lowest level have only their last orders, protocol, and their own intuition to act on.

Contents
Examples of system accidents
Three Mile Island
ValuJet 592, The Everglades, May 1996
Space Shuttle Columbia as a System Accident
See also
References
External links

Examples of system accidents


Three Mile Island

The failure of Unit 2 occurred at 4 am on 28 March 1979 while the reactor was operating at 97% power. A relatively minor malfunction in the secondary cooling circuit caused the temperature in the primary coolant to rise. This in turn caused the reactor to shut down automatically, which took about one second. At this point a relief valve, which releases steam in a manner similar to a teakettle, failed to close after releasing the excess heat,[4] but its sensor reported that it was closed. The backup pumps had started immediately after the main pump shut down, but the water lines had been closed for testing two days earlier and were never reopened. There was no sensor for water-line flow in the control room, and the pumps' sensors reported that they were functioning properly. The control room operators had no direct information on the water level in the reactor and instead judged it by pressure, which was being stabilized by the open valve that they believed was closed. The water-line problem went unnoticed for eight minutes. By the time the valves were opened, "steam voids" (places where water cannot flow through a pipe because it has turned to steam) had formed in the cooling system, rendering it useless. Three of these four failures had each occurred separately before with no harm, but the coincidental combination of human error and mechanical failure created a near-disaster.[5]
In their analysis of the Three Mile Island nuclear reactor accident, Cantelon and Williams (1982, p. 122) note that the failure was caused by a combination of mechanical and human errors, but the recovery worked 'because professional scientists made intelligent choices that no plan could have anticipated.'[6]

ValuJet 592, The Everglades, May 1996

Mechanics removed oxygen canisters from three older aircraft and installed new ones. Most of the emphasis was on installing the new canisters correctly rather than on disposing of the old ones properly. The old canisters were simply put into cardboard boxes and left on a warehouse floor for a number of weeks. The canisters were green-tagged to mean serviceable. A shipping clerk was later instructed to get the warehouse in shape for an inspection. (This very human motive to keep up appearances often plays a contributory role in accidents.) The clerk mistakenly took the green tags to mean non-serviceable and further concluded that the canisters were therefore empty. The safety manual helped neither him nor the mechanics, since it used only the technical jargon of "expired" canisters and "expended" canisters. The five boxes of canisters were categorized as "company material" and, along with two large tires and a smaller nose tire, were loaded into the plane's forward cargo hold for a flight on the afternoon of Saturday, May 11, 1996. A fire broke out minutes after take-off, causing the plane to crash. All five crew members and 105 passengers were killed.
If the oxygen generators had been better labeled, making clear that they generate oxygen through a chemical reaction that produces heat and were therefore unsafe to handle roughly, let alone to fly in a cargo hold, the crash might have been averted.
Many writers on safety have noted that the mechanics did not use the plastic safety caps that protect the fragile nozzles on the canisters and, in fact, were not provided with them. The mechanics, unaware of the hazardous nature of the canisters, simply cut the lanyards and taped them down. ValuJet was in a sense a virtual airline, with maintenance contracted out, and the contracting firms often took this a level further by subcontracting the work themselves. The individual mechanics, in William Langewiesche's words, "inhabited a world of boss-men and sudden firings," hardly an environment in which people can confidently raise concerns. The disaster could also have been prevented by proper disposal or labeling of the canisters.

Space Shuttle Columbia as a System Accident

The Columbia disaster began long before Columbia even left the ground. The shuttle's external fuel tank was insulated with foam. During launch, a chunk of this foam broke off and hit the shuttle's left wing, damaging the heat shield. A damage assessment program, ''Crater'', predicted damage to the wing, but its prediction was dismissed as an overestimate by NASA management. In a risk-management scenario similar to the Challenger disaster, NASA management failed to recognize the relevance of engineering concerns for safety. Two examples of this were the failure to honor engineers' requests for imaging to inspect possible damage, and the failure to respond to engineers' requests about the status of an astronaut inspection of the left wing. NASA's chief thermal protection system (TPS) engineer was concerned about left wing TPS damage and asked NASA management whether an astronaut would visually inspect it. NASA managers never responded.
NASA managers felt a rescue or repair was impossible, so they saw no point in trying to inspect the vehicle for damage while on orbit. However, the CAIB determined that either a rescue mission or an on-orbit repair, though risky, might have been possible had NASA verified severe damage within the first five days of the mission. (Report of Columbia Accident Investigation Board, Volume I, chapter 6, page 173; Columbia Accident Investigation Board, In-Flight Options Assessment, Volume II, appendix D.12.)

See also


References


1. [1]
2. Report of the Presidential Commission on the Space Shuttle ''Challenger'' Accident, Volume 1, chapter 5 (Rogers Commission report).
3. [2]
4. [3]
5. [4]
6. [5]


★ Alan Cooper, ''The Inmates Are Running the Asylum: Why High Tech Products Drive Us Crazy and How to Restore the Sanity'', 1999, Sams, ISBN 0-672-31649-8.

★ Robert L. Helmreich, Anatomy of a system accident: The crash of Avianca Flight 052, ''International Journal of Aviation Psychology'', vol. 4, no. 3, pp. 265–284, 1994.

★ William Langewiesche, The Lessons of ValuJet 592, ''The Atlantic Monthly'', March 1998, pp. 81–98.

★ William Langewiesche, Columbia's Last Flight, ''The Atlantic Monthly'', Nov. 2003, pp. 58–87.

★ Charles Perrow, ''Normal Accidents: Living with High-Risk Technologies'', New York: Basic Books, 1984. Paperback reprint, Princeton, N.J.: Princeton University Press, 1999, ISBN 0-691-00412-9.

External links



[6] An eight-page paper titled "Organizationally Induced Catastrophes" by Charles Perrow. This is a good overview of his general approach. It also includes his analysis, some of which is quoted above, of Three Mile Island.

[7] Three Mile Island. "… it is the same way every partial core melt-down has gone. People haven't believed the instrumentation as they went along."

[8] And, a different view. The author attributes Three Mile Island to plain old bad management.

[9] William Langewiesche's article on the ValuJet crash.

[10] ValuJet. Brian Stimpson breaks the accident into seven steps, none of which was blatant, but all of which together led to the accident.

[11] ValuJet, a CNN story six months later regarding tests on how the generating canisters may have ignited in the forward cargo hold. The context is the second day of National Transportation Safety Board hearings.

[12] Langewiesche's article on Space Shuttle Columbia.

[13] The author helps to develop the STAMP remedy — Systems Theoretic Accident Modeling and Process.

[14] An American Airlines flight from Miami to Cali, Colombia. The navigational system was perfectly accurate, but not at all informative. The accident was written off as "human error."

[15] Fred Wolf and Eli Berniker, “Validating Normal Accident Theory: Chemical Accidents, Fires, and Explosions in Petroleum Refineries,” presented at High Consequences System Surety Conference, Sandia National Laboratory, November 1999.

This article provided by Wikipedia.