Disaster by Management:
The Columbia Accident and September 11th
Christine M. Rodrigue
Department of Geography
California State University, Long Beach
Paper presented to:
Hazards and Disasters: Management and Mitigation
Special session sponsored by the Hazards Specialty Group
Association of American Geographers
Philadelphia, 17 March 2004
This paper compares the structure of human errors in two disasters with
sociogenic causes: the Columbia Shuttle accident and the FBI failure to act
on intelligence presaging the 9/11 terrorist attack. In each case, technical
information suggesting disaster was weakly transmitted within an elaborate
bureaucracy, and high-level decision-makers failed to authorize action that
might have prevented tragedy. The result was truly "disaster by management."
Themes from the prior literature on organizational failure are traced
through both cases below.
Loss of the Shuttle Columbia
The Space Shuttle Columbia was launched from the Kennedy Space Center on
January 16th, 2003. Some 81.7 seconds later, a large piece of foam insulation
detached from the External Tank and hit the left wing's leading edge at a
relative speed of some 671-922 km/hr. The foam piece was substantial, about
the size of a box fan or a suitcase, and weighed about 0.76 kg. The impact
penetrated one of the dark Reinforced Carbon-Carbon panels or RCCs that
protect the internal structure of the wings from the superheated air created
by the Orbiter's entry into the atmosphere. At 8:44 a.m. on February 1st,
Columbia re-entered the atmosphere and, about 16 minutes later, the Orbiter
broke up over Texas, killing all seven crew members.
The Columbia Accident Investigation Board, however, found that this tragedy
had its roots more in NASA's internal organizational structure and history
and in Congressional and White House pressures than in this sequence of
mechanical failure.
After the successful Apollo 11 moon landing in July 1969, Nixon ordered NASA's
budget cut as far as politically possible: Its geopolitical legitimation
function had been met. NASA was able to salvage only the Shuttle from its
post-Apollo ambitions, by promising that the vehicle could eventually become
a highly cost-effective, ideally self-supporting launch system and that it
would be able to function as a scientific platform in its own right.
The Columbia was the first Shuttle put into service, after several delays, in
April 1981. Despite Reagan's announcement in July 1982 that the program was
now "fully operational," the Shuttle remained very much an experimental
vehicle.
It took longer to get a returned Shuttle ready for its next mission than
anticipated because of safety-compromising developments revealed after each
flight. Congress and the White House became increasingly impatient with the
constant schedule delays and cost overruns, exerting pressure for NASA to get
its act together and manage its resources in a more businesslike manner:
essentially a call for managerialism.
These schedule and budgetary pressures led to the managerial decision to
launch Challenger in January 1986, over the concerns of engineers at
contractors and at Kennedy, with lethal results. The Rogers Commission report
on the accident forcefully exposed the managerialist logic that had led to
this disaster and the tendency of NASA managers to normalize anomaly on the
basis that no Shuttle had crashed yet.
The Shuttle Program returned to flight 32 months later, in September 1988.
NASA implemented many reforms listed in the Rogers Commission report, but the
Shuttle was now organizationally linked with the International Space Station
in the Human Space Flight Initiative. The Shuttle budget became progressively
more constrained by a 1994 edict from the OMB requiring that ISS cost
overruns be taken from the other part of the Human Space Flight Initiative,
meaning the Shuttle. From 1993 to 2003, while the overall NASA budget fell
13%, the Shuttle's budget fell 40% in inflation-corrected dollars. The
Shuttle workforce was reduced from 30,091 in 1993 to just 17,462 in 2002, a
loss of 42% of the labor force! "Faster, better, cheaper"?
Meanwhile, schedule pressure had resumed in the Shuttle Program, due to its
coupling with the International Space Station. By the time the current Bush
administration was in office, the ISS program was $4 billion over projected
budget. The US scaled back its ISS contributions to just the completion of
the US Node 2, which would allow Europe and Japan to connect their modules to
the ISS. NASA decided that Node 2 had to be installed by February 19th, 2004.
This deadline set up pressure to sequence Shuttle missions very tightly: Any
delay would add to the cost of the ISS, and that was unacceptable in the
managerialist ethos that pervaded NASA's external political milieu and, thus,
Shuttle management.
Budgetary and scheduling forces combined to destroy Columbia and its crew. On
the day after the launch, routine examination of launch videos showed the foam
strike. The Intercenter Photo Working Group e-mailed a video clip all over
the Shuttle Program and requested that Kennedy Shuttle Program management
obtain higher resolution imagery from the Department of Defense, which the
manager at Kennedy attempted to do. Engineers, following NASA procedure,
formed a Debris Assessment Team.
Boeing engineers on the Team arranged for their Houston operation to run a
model called "Crater," designed to predict the depth of penetration through
tiles or RCC by impacting pieces of foam or ice. This was the first
application of Crater since its move to Houston from Boeing's Huntington
Beach, California, operation. The model had been calibrated on small,
popcorn-sized cylindrical pieces and had never been used on anything so large
as the Columbia foam piece. It was known to overestimate penetration, to err
on the side of caution. This time, it predicted penetration clear through to
the wing interior. The team that ran it was not confident in the results, yet
did not consult with Huntington Beach. That lack of confidence played into
management's existing preference to see foam strikes as ordinary action items
that trigger maintenance hassles, rather than as a safety-of-flight emergency
that would trigger crew rescue actions.
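The danger of such an extrapolation can be illustrated with a minimal sketch
of a generic, Crater-style empirical penetration model. Every coefficient,
exponent, and input below is a hypothetical placeholder, not an actual Crater
parameter; the point is only that a power-law curve fitted to popcorn-sized
test debris says nothing reliable about a suitcase-sized block.

    # Minimal sketch of a generic empirical penetration model of the
    # Crater type. All constants are hypothetical placeholders, NOT
    # the actual Crater parameters.
    def predicted_penetration_cm(length_cm, diameter_cm,
                                 density_g_cm3, velocity_m_s):
        """Power-law fit typical of empirical impact models."""
        C, a, b, c = 0.002, 0.45, 0.27, 0.67  # fitted constants (invented)
        return (C * (length_cm / diameter_cm) ** a * diameter_cm
                * density_g_cm3 ** b * velocity_m_s ** c)

    # Calibration regime: popcorn-sized debris.
    small = predicted_penetration_cm(2.0, 1.5, 0.04, 200.0)
    # The Columbia strike: a suitcase-sized foam block at roughly
    # 186-256 m/s (671-922 km/hr), far outside the fitted data.
    large = predicted_penetration_cm(60.0, 30.0, 0.04, 230.0)
    print(small, large)  # the second figure is an extrapolation, not a prediction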
The Debris Assessment Team was so concerned that it also made two separate
requests for Department of Defense imagery, both through the Johnson Space
Center Engineering Management Directorate, that is, up the engineering
chain of command. They thought this would be faster than going through the
elaborate Shuttle Management hierarchy. The Johnson manager complied and also
notified the Chair of the Mission Management Team.
The Chair knew that a particularly bad strike had happened on the previous
mission three months earlier, but that Shuttle management had not grounded the
fleet pending investigation. The Shuttle had, after all, come down safely,
and the Shuttle Program was under very tight schedule pressure to meet the
February 19, 2004, ISS deadline. She, thus, felt that foam strikes were not
all that serious a problem. She had normalized anomaly in her own mind and
was dominated by managerialist concerns for flight scheduling and budget.
She was dismayed that the Photo Working Group had gone directly to the DoD
without going through the Shuttle management channels, not knowing the Debris
Assessment Team had, too. So, she contacted DoD and told them to stop working
on these requests. The Debris Assessment Team, when it heard about this, took
that as a final order, not a point of debate, given the mechanistic and
hierarchical organization of NASA management.
The Chair, however, had expressed concerns to others in Shuttle Management,
apparently looking for reassurance that the foam strike was not a safety-of-
flight issue. She also commented that imagery was no longer being pursued,
"since even if we saw something, we couldn't do anything about it. The
Program didn't want to spend the resources." This statement expresses the
dilemma faced by risk managers making decisions under conditions of
uncertainty, caught between the Type II error of dismissing a very serious
problem and the Type I error of taking a problem too seriously and
squandering resources and opportunities. There were, however, at least two
scenarios for crew rescue, had NASA scrambled into emergency mode.
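The asymmetry in this dilemma can be made concrete with a toy expected-cost
calculation. The probability and cost figures below are invented for
illustration, not drawn from NASA documents; they show only how a
managerialist accounting that prices the Type I error in schedule dollars,
while leaving the Type II error off the books, biases the choice.

    # Toy expected-cost comparison for the imagery decision.
    # Every number here is invented for illustration only.
    p_fatal = 0.05       # assumed chance the strike is safety-of-flight
    cost_type_i = 10e6   # cost of acting on a false alarm (delays, DoD time)
    cost_type_ii = 5e9   # cost of dismissing a real emergency (crew, Orbiter)

    # Act (get imagery, prepare rescue): pay the Type I cost if the
    # strike proves to be a false alarm.
    expected_cost_act = (1 - p_fatal) * cost_type_i      # = 9.5e6
    # Dismiss: pay the Type II cost if the problem is real.
    expected_cost_dismiss = p_fatal * cost_type_ii       # = 2.5e8

    print(expected_cost_act, expected_cost_dismiss)
    # Even at a 5% chance of disaster, precaution dominates by more than
    # an order of magnitude; a manager who budgets only the Type I line
    # item, however, sees just the 10e6.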
With only the results of Crater to go on and without the high-resolution data
they needed from DoD, the Debris Assessment Team's presentation on the ninth
day after launch was riddled with large uncertainties. Management found
nothing in it to bend their concern toward the Type II error, away from their
natural inclination to worry more about Type I opportunity costs. And the
rest is history.
FBI Headquarters' Response to Field Office Concerns before 9/11
The second analysis focuses on FBI Headquarters' response to the anxieties of
field officers about the sudden uptick in Middle Easterners' interest in
aviation and other odd behavior. The organizational context of this failure
must be seen in light of the FBI's history. The FBI has alternated between
centralized and mechanistic organization and decentralized and organic
organization, with highly publicized abuses of investigatory power in both
organizational modes. The FBI has had trouble finding a healthy balance
between field offices and Headquarters.
Three incidents in the 9/11 context show this dilemma in action. They took
place at a time of extremely heightened concern at the highest echelons of the
Federal government about impending al-Qaeda attacks. On July 5th, Richard
Clarke, then the National Coördinator for Security, Infrastructure
Protection, and Counter-terrorism, spoke to a White House gathering of senior
officials from the FBI and other Federal agencies about signs that al-Qaeda
was planning what he called a "really spectacular" attack on Americans in the
very near future, briefing them on the indications. He counseled them to
cancel vacations, defer non-vital travel, put off scheduled exercises, and
place domestic rapid response teams on much shorter alert. For some reason,
that urgent concern was not communicated down the hierarchy at FBI
Headquarters, much less diffused out to the field offices around the country.
In light of that failure, the behavior of Headquarters analysts and operations
specialists makes sense.
On July 10th, 2001, a Special Agent with the Phoenix field office sent an
"electronic communication," or EC, detailing his suspicions about the
inordinate number of Middle Eastern men taking lessons at flight schools. He
requested that Headquarters open investigations on certain of these named
individuals because of their possible links to terrorist organizations.
This particular message was sent to the Counterterrorism Division at
Headquarters, where it was eventually relayed to Intelligence Operations
Specialists or IOSs in the Usama Bin Ladin Unit or UBLU. Several IOSs there
evaluated the request for investigation of the visa status of the individuals
named in the EC. They discussed the legality of the request in terms of
racial profiling. They reported thinking that the aviation issue was just a
speculative scenario.
Clearly, none of these IOSs had received information about the high-level
concern over an impending "really spectacular" al-Qaeda attack. Without that
context to alter their conventional view of the world, their worries about
past FBI abuses, and their risk perception, their decision on August 7th to
close the case with no further action was not unreasonable.
Internal communication problems come through in the second incident, too. An
agent from the New York Field Office was conducting a criminal investigation
of the USS Cole bombing of October 2000. On August 29th, 2001, he asked FBI
Headquarters to allow New York to use all its criminal investigative resources
to find one Khalid al-Midhar, who had apparently recently met with a suspect
connected to the Cole attack and had, furthermore, entered the United States
in July 2001. Al-Midhar was apparently the coördinator for the 9/11
operation.
The agent was told that the FBI National Security Unit had advised that this
could not be done: There was a "Wall" barring the sharing of information
between criminal and intelligence investigations, because use of intelligence
information in a criminal case can compromise the sources of intelligence. The
agent, hearing this, was furious, arguing "someday someone will die," but
Headquarters would not relent.
Information flow within the FBI organization, thus, involves not only
impediments between Headquarters and field offices but also impediments
within two parallel, vertical stovepipes: criminal investigations and
intelligence investigations. The Wall makes conceptual sense, given the way
that communication between these two functions could disrupt either of them.
The third incident entails another failure of communication between
Headquarters and a field office. On August 15th, 2001, employees of the Pan
Am International Flight Academy near Minneapolis called the local FBI field
office to report their concerns that Zacarias Moussaoui was acting very oddly
and might be a threat to national security.
The Minneapolis Field Office contacted French intelligence and learned that he
had ties with extremist Islamist groups, including one of the Chechen rebel
groups. They arrested him the next day on immigration violations and asked
Headquarters for a Foreign Intelligence Surveillance Act or FISA search
warrant. They were turned down because an agent in the Radical
Fundamentalist Unit or RFU at Headquarters felt that the Minneapolis Field
Office had not shown sufficient cause for a FISA search warrant, especially
in light of an investigation of possible abuses of the FISA search warrant
process initiated by Attorney General John Ashcroft half a year earlier. The
RFU agent was acting to prevent abuse of the FISA search process by a Field
Office supervisor he felt had a flimsy ad hoc case as well as a habit of
resorting to FISA. The RFU agent was balancing the Type II error of letting a
terrorist go on to commit a crime against the Type I error of authorizing a
request for a FISA search without sufficient cause, and the earlier
investigation of FISA search abuses no doubt sensitized him more to the
latter possibility than to the former. Sadly, a search of Moussaoui's
possessions would have revealed the 9/11 plan in detail and possibly allowed
its prevention.
In all three cases, hindsight reveals failures of communication within the FBI
through tensions between Headquarters and field offices and through the Wall
dividing intelligence and criminal investigations. In all three of these
cases, the decisions of Headquarters personnel seem reasonable in light of
what they knew at the time and their legitimate concerns about due process and
proper investigation. What is disturbing is the failure to communicate on the
part of the highest levels in the FBI, the people who were privy to the
meeting in the White House on July 5th, in which Richard Clarke asked them to
alter their behavior drastically to lower their exposure to an imminent and
spectacular attack. Given that kind of concern, why was the situation not
communicated down to all Headquarters staff, if not directly all the way out
to the field offices? Had this information been conveyed to them as
forcefully as Clarke had conveyed it at the White House meeting, the
personnel at Headquarters might have been much more sensitive to the Type II
risk of failing to stop an imminent terrorist attack than to the Type I risks
of abusing FISA search powers or breaching the Wall between intelligence and
criminal investigation.
Conclusions
To conclude, both the Columbia accident and the FBI handling of field office
concerns before 9/11 seem to validate normal accident theory. Communication
about risks appears to have been hog-tied in complex bureaucracies.
Unpredictable external constraints acted on both agencies and led to a shift
in risk managers' perception of the relative importance of the precautionary
principle and of the opportunity costs its application can impose.
In NASA's case, the failure in communication can be traced to its external
political environment and funding base, its geographically ornate and
hierarchical structure, and the lower status and timidity of risk assessors
compared with managers. In the FBI's case, the most egregious failure of
communication was between the most senior levels of the Bureau and the
lower-ranked personnel at Headquarters, which affected their decision-making
concerning the distant field offices.
In both agencies, there were, additionally, parallel chains of command and
communication. At NASA, individuals may find themselves wearing hats as
engineers, as technical staff within the Shuttle Program, and as employees
within the line structure of a NASA center, and it may not be clear to them
which chain they should jerk to call attention to a safety-of-flight issue.
At the FBI, intelligence and criminal investigation functions are kept
strictly separated and compartmentalized.
The consequence of these barriers to communication along hierarchies,
between chains of command, and across space was an imbalanced focus on the
managerialist concerns of efficiency, budget, scheduling, and rules and
regulations, instead of on the risk to human life. Managers had normalized
anomaly and resisted data that contradicted their biases in perception,
leading to what one NASA engineer called "worlds of pain."
Sources
The Columbia Accident Investigation Board report is available at:
<http://www.spaceflight.nasa.gov/shuttle/investigation/index.html>
Congressional testimony about 9/11 is collected at:
<http://www.fas.org/irp/congress/2002_hr/>