Disaster by Management:
The Columbia Accident and September 11th
Christine M. Rodrigue
Department of Geography
California State University, Long Beach
Paper presented to:
Hazards and Disasters: Management and Mitigation
Special session sponsored by the Hazards Specialty Group
Association of American Geographers
Philadelphia, 17 March 2004
This paper compares the structure of human errors in two disasters with
sociogenic causes: the Columbia Shuttle accident and the FBI failure to act
on intelligence presaging the 9/11 terrorist attack. In each case, technical
information suggesting disaster was weakly transmitted within an elaborate
bureaucracy, and high-level decision-makers failed to authorize action that
might have prevented tragedy. The result was truly "disaster by management."
Themes from the prior literature on organizational failure are traced
through both cases below.
Loss of the Shuttle Columbia
The Space Shuttle Columbia was launched from the Kennedy Space Center on
January 16th, 2003. Some 81.7 seconds later, a large piece of foam insulation
detached from the External Tank and hit the left wing's leading edge at a
relative speed of some 671-922 km/hr. The foam piece was substantial, about
the size of a box fan or a suitcase, and weighed about 0.76 kg. The impact
penetrated one of the dark Reinforced Carbon-Carbon panels or RCCs that
protect the internal structure of the wings from the superheated air created
by the Orbiter's entry into the atmosphere. At 8:44 a.m. on February 1st,
Columbia re-entered the atmosphere and, about 16 minutes later, the Orbiter
broke up over Texas, killing all seven crew members.
The Columbia Accident Investigation Board, however, found that this tragedy
had its roots more in NASA's internal organizational structure and history
and in Congressional and White House pressures than in this sequence of
mechanical failure.
After the successful Apollo 11 moon landing in July 1969, Nixon ordered NASA's
budget cut as far as politically possible: Its geopolitical legitimation
function had been met. NASA was able to salvage only the Shuttle from its
post-Apollo ambitions, by promising that the vehicle could eventually become
a highly cost-effective, ideally self-supporting launch system and that it
would be able to function as a scientific platform in its own right.
The Columbia was the first Shuttle put into service, after several delays, in
April 1981. Despite Reagan's announcement in July 1982 that the program was
now "fully operational," the Shuttle remained very much an experimental
vehicle.
It took longer to get a returned Shuttle ready for its next mission than
anticipated because of safety-compromising developments revealed after each
flight. Congress and the White House became increasingly impatient with the
constant schedule delays and cost overruns, exerting pressure for NASA to get
its act together and manage its resources in a more businesslike manner:
essentially a call for managerialism.
These schedule and budgetary pressures led to the managerial decision to
launch Challenger in January 1986, over the concerns of engineers at
contractors and at Kennedy, with lethal results. The Rogers Commission report
on the accident forcefully exposed the managerialist logic that had led to
this disaster and the tendency of NASA managers to normalize anomaly on the
basis that no Shuttle had crashed yet.
The Shuttle Program returned to flight 32 months later, in September 1988.
NASA implemented many reforms listed in the Rogers Commission report, but the
Shuttle was now organizationally linked with the International Space Station
in the Human Space Flight Initiative. The Shuttle budget became progressively
more constrained by a 1994 edict from the OMB requiring that ISS cost
overruns be taken from the other part of the Human Space Flight Initiative,
meaning the Shuttle. From 1993 to 2003, while the overall NASA budget fell
13%, the Shuttle's budget fell 40% in inflation-corrected dollars. The
Shuttle workforce was reduced from 30,091 in 1993 to just 17,462 in 2002, a
loss of 42% of the labor force! "Faster, better, cheaper"?
Meanwhile, schedule pressure had resumed in the Shuttle Program, due to its
coupling with the International Space Station. By the time the current Bush
administration was in office, the ISS program was $4 billion over projected
budget. The US scaled back its ISS contributions to just the completion of
the US Node 2, which would allow Europe and Japan to connect their modules to
the ISS. NASA decided that Node 2 had to be installed by February 19th, 2004.
This deadline set up pressure to sequence Shuttle missions very tightly: Any
delay would add to the cost of the ISS, and that was unacceptable in the
managerialist ethos that pervaded NASA's external political milieu and, thus,
Shuttle management.
Budgetary and scheduling forces combined to destroy Columbia and its crew. On
the day after the launch, routine examination of launch videos showed the foam
strike. The Intercenter Photo Working Group e-mailed a video clip all over
the Shuttle Program and requested that Kennedy Shuttle Program management
obtain higher resolution imagery from the Department of Defense, which the
manager at Kennedy attempted to do. Engineers, following NASA procedure,
formed a Debris Assessment Team.
Boeing engineers on the Team arranged for their Houston operation to run a
model called "Crater," designed to predict the depth of penetration through
tiles or RCC by impacting pieces of foam or ice. This was the first
application of Crater since its move to Houston from Boeing's Huntington
Beach, California, operation. The model had been calibrated on small,
popcorn-sized cylindrical pieces and had never been used on anything so large
as the Columbia foam piece. It was known to overestimate penetration, to err
on the side of caution. This time, it predicted penetration clear through to
the wing interior. The team that ran it was not confident in the results, yet
did not consult with Huntington Beach. That lack of confidence played into
management's existing preference to see foam strikes as ordinary action items
that trigger maintenance hassles, rather than as a safety-of-flight emergency
that would trigger crew rescue actions.
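The danger of such an extrapolation can be illustrated with a minimal sketch
of a generic, Crater-style empirical penetration model. Every coefficient,
exponent, and input below is a hypothetical placeholder, not an actual Crater
parameter; the point is only that a power-law curve fitted to popcorn-sized
test debris says nothing reliable about a suitcase-sized block.

    # Minimal sketch of a generic empirical penetration model of the
    # Crater type. All constants are hypothetical placeholders, NOT
    # the actual Crater parameters.
    def predicted_penetration_cm(length_cm, diameter_cm,
                                 density_g_cm3, velocity_m_s):
        """Power-law fit typical of empirical impact models."""
        C, a, b, c = 0.002, 0.45, 0.27, 0.67  # fitted constants (invented)
        return (C * (length_cm / diameter_cm) ** a * diameter_cm
                * density_g_cm3 ** b * velocity_m_s ** c)

    # Calibration regime: popcorn-sized debris.
    small = predicted_penetration_cm(2.0, 1.5, 0.04, 200.0)
    # The Columbia strike: a suitcase-sized foam block at roughly
    # 186-256 m/s (671-922 km/hr), far outside the fitted data.
    large = predicted_penetration_cm(60.0, 30.0, 0.04, 230.0)
    print(small, large)  # the second figure is an extrapolation, not a prediction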
The Debris Assessment Team was so concerned that it also made two separate
requests for Department of Defense imagery, both through the Johnson Space
Center Engineering Management Directorate, that is, up the engineering
chain of command. They thought this would be faster than going through the
elaborate Shuttle Management hierarchy. The Johnson manager complied and also
notified the Chair of the Mission Management Team.
The Chair knew that a particularly bad strike had happened on the previous
mission three months earlier, but that Shuttle management had not grounded the
fleet pending investigation. The Shuttle had, after all, come down safely,
and the Shuttle Program was under very tight schedule pressure to meet the
February 19, 2004, ISS deadline. She, thus, felt that foam strikes were not
all that serious a problem. She had normalized anomaly in her own mind and
was dominated by managerialist concerns for flight scheduling and budget.
She was dismayed that the Photo Working Group had gone directly to the DoD
without going through the Shuttle management channels, not knowing the Debris
Assessment Team had, too. So, she contacted DoD and told them to stop working
on these requests. The Debris Assessment Team, when it heard about this, took
that as a final order, not a point of debate, given the mechanistic and
hierarchical organization of NASA management.
The Chair, however, had expressed concerns to others in Shuttle Management,
apparently looking for reassurance that the foam strike was not a safety-of-
flight issue. She also commented that imagery was no longer being pursued,
"since even if we saw something, we couldn't do anything about it. The
Program didn't want to spend the resources." This statement expresses the
dilemma faced by risk managers making decisions under conditions of
uncertainty, caught between the Type II error of dismissing a very serious
problem and the Type I error of taking a problem too seriously and
squandering resources and opportunities. There were, however, at least two
scenarios for crew rescue, had NASA scrambled into emergency mode.
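The asymmetry in this dilemma can be made concrete with a toy expected-cost
calculation. The probability and cost figures below are invented for
illustration, not drawn from NASA documents; they show only how a
managerialist accounting that prices the Type I error in schedule dollars,
while leaving the Type II error off the books, biases the choice.

    # Toy expected-cost comparison for the imagery decision.
    # Every number here is invented for illustration only.
    p_fatal = 0.05       # assumed chance the strike is safety-of-flight
    cost_type_i = 10e6   # cost of acting on a false alarm (delays, DoD time)
    cost_type_ii = 5e9   # cost of dismissing a real emergency (crew, Orbiter)

    # Act (get imagery, prepare rescue): pay the Type I cost if the
    # strike proves to be a false alarm.
    expected_cost_act = (1 - p_fatal) * cost_type_i      # = 9.5e6
    # Dismiss: pay the Type II cost if the problem is real.
    expected_cost_dismiss = p_fatal * cost_type_ii       # = 2.5e8

    print(expected_cost_act, expected_cost_dismiss)
    # Even at a 5% chance of disaster, precaution dominates by more than
    # an order of magnitude; a manager who budgets only the Type I line
    # item, however, sees just the 10e6.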
With only the results of Crater to go on and without the high-resolution data
they needed from DoD, the Debris Assessment Team's presentation on the ninth
day after launch was riddled with large uncertainties. Management found
nothing in it to bend their concern toward the Type II error, away from their
natural inclination to worry more about Type I opportunity costs. And the
rest is history.
FBI Headquarters' Response to Field Office Concerns before 9/11
The second analysis focuses on FBI Headquarters' response to the anxieties of
field officers about the sudden uptick in Middle Easterners' interest in
aviation and other odd behavior. The organizational context of this failure
must be seen in light of the FBI's history. The FBI has alternated between
centralized and mechanistic organization and decentralized and organic
organization, with highly publicized abuses of investigatory power in both
organizational modes. The FBI has had trouble finding a healthy balance
between field offices and Headquarters.
Three incidents in the 9/11 context show this dilemma in action. They took
place at a time of extremely heightened concern at the highest echelons of the
Federal government about impending al-Qaeda attacks. On July 5th, Richard
Clarke, then the National Coördinator for Security, Infrastructure
Protection, and Counter-terrorism, spoke to a White House gathering of senior
officials from the FBI and other Federal agencies about signs that al-Qaeda
was planning what he called a "really spectacular" attack on Americans in the
very near future, briefing them on the indications. He counseled them to
cancel vacations, defer non-vital travel, put off scheduled exercises, and
place domestic rapid response teams on much shorter alert. For some reason,
that urgent concern was not communicated down the hierarchy at FBI
Headquarters, much less diffused out to the field offices around the country.
In light of that failure, the behavior of Headquarters analysts and operations
specialists makes sense.
On July 10th, 2001, a Special Agent with the Phoenix field office sent an
"electronic communication," or EC, detailing his suspicions about the
inordinate number of Middle Eastern men taking lessons at flight schools. He
requested that Headquarters open investigations on certain of these named
individuals because of their possible links to terrorist organizations.
This particular message was sent to the Counterterrorism Division at
Headquarters, where it was eventually relayed to Intelligence Operations
Specialists or IOSs in the Usama Bin Ladin Unit or UBLU. Several IOSs there
evaluated the request for investigation of the visa status of the individuals
named in the EC. They discussed the legality of the request in terms of
racial profiling. They reported thinking that the aviation issue was just a
speculative scenario.
Clearly, none of these IOSs had received information about the high-level
concern over an impending "really spectacular" al-Qaeda attack. Without that
context to alter their conventional view of the world, their worries about
past FBI abuses, and their risk perception, their decision on August 7th to
close the case with no further action was not unreasonable.
Internal communication problems come through in the second incident, too. An
agent from the New York Field Office was conducting a criminal investigation
of the USS Cole bombing of October 2000. On August 29th, 2001, he asked FBI
Headquarters to allow New York to use all its criminal investigative resources
to find one Khalid al-Midhar, who had apparently recently met with a suspect
connected to the Cole attack and had, furthermore, entered the United States
in July 2001. Al-Midhar was apparently the coördinator for the 9/11
operation.
The agent was told that the FBI National Security Unit had advised that this
could not be done: There was a "Wall" barring the sharing of information
between criminal and intelligence investigations, because use of intelligence
information in a criminal case can compromise the sources of intelligence. The
agent, hearing this, was furious, arguing "someday someone will die," but
Headquarters would not relent.
Information flow within the FBI organization, thus, involves not only
impediments between Headquarters and field offices but also impediments
within two parallel, vertical stovepipes: criminal investigations and
intelligence investigations. The Wall makes conceptual sense, given the way
that communication between these two functions could disrupt either of them.
The third incident entails another failure of communication between
Headquarters and a field office. On August 15th, 2001, employees of the Pan
Am International Flight Academy near Minneapolis called the local FBI field
office to report their concerns that Zacarias Moussaoui was acting very oddly
and might be a threat to national security.
The Minneapolis Field Office contacted French intelligence and learned that he
had ties with extremist Islamist groups, including one of the Chechen rebel
groups. They arrested him the next day on immigration violations and asked
Headquarters for a Foreign Intelligence Surveillance Act or FISA search
warrant. They were turned down because an agent in the Radical
Fundamentalist Unit or RFU at Headquarters felt that the Minneapolis Field
Office had not shown sufficient cause for a FISA search warrant, especially
in light of an investigation of possible abuses of the FISA search warrant
process initiated by Attorney General John Ashcroft half a year earlier. The
RFU agent was acting to prevent abuse of the FISA search process by a Field
Office supervisor he felt had a flimsy ad hoc case as well as a habit of
resorting to FISA. The RFU agent was balancing the Type II error of letting a
terrorist go on to commit a crime against the Type I error of authorizing a
request for a FISA search without sufficient cause, and the earlier
investigation of FISA search abuses no doubt sensitized him more to the
latter possibility than to the former. Sadly, a search of Moussaoui's
possessions would have revealed the 9/11 plan in detail and possibly allowed
its prevention.
In all three cases, hindsight reveals failures of communication within the FBI
through tensions between Headquarters and field offices and through the Wall
dividing intelligence and criminal investigations. In all three of these
cases, the decisions of Headquarters personnel seem reasonable in light of
what they knew at the time and their legitimate concerns about due process and
proper investigation. What is disturbing is the failure to communicate on the
part of the highest levels in the FBI, the people who were privy to the
meeting in the White House on July 5th, in which Richard Clarke asked them to
alter their behavior drastically to lower their exposure to an imminent and
spectacular attack. Given that kind of concern, why was the situation not
communicated down to all Headquarters staff, if not directly all the way out
to the field offices? Had this information been conveyed to them as
forcefully as Clarke had conveyed it at the White House meeting, the
personnel at Headquarters might have been much more sensitive to the Type II
risk of failing to stop an imminent terrorist attack than to the Type I risks
of abusing FISA search powers or breaching the Wall between intelligence and
criminal investigation.
Conclusions
To conclude, both the Columbia accident and the FBI handling of field office
concerns before 9/11 seem to validate normal accident theory. Communication
about risks appears to have been hog-tied in complex bureaucracies.
Unpredictable external constraints acted on both agencies and led to a shift
in risk managers' perception of the relative importance of the precautionary
principle and of the opportunity costs its application can impose.
In NASA's case, the failure in communication can be traced to its external
political environment and funding base, its geographically ornate and
hierarchical structure, and the lower status and timidity of risk assessors
compared with managers. In the FBI's case, the most egregious failure of
communication was between the most senior levels of the Bureau and the
lower-ranked personnel at Headquarters, which affected their decision-making
concerning the distant field offices.
In both agencies, there were, additionally, parallel chains of command and
communication. At NASA, individuals may find themselves wearing hats as
engineers, as technical staff within the Shuttle Program, and as employees
within the line structure of a NASA center, and it may not be clear to them
which chain they should jerk to call attention to a safety-of-flight issue.
At the FBI, intelligence and criminal investigation functions are kept
strictly separated and compartmentalized.
The consequence of these barriers to communication along hierarchies,
between chains of command, and across space was an imbalanced focus on the
managerialist concerns of efficiency, budget, scheduling, and rules and
regulations, instead of on the risk to human life. Managers had normalized
anomaly and resisted data that contradicted their biases in perception,
leading to what one NASA engineer called "worlds of pain."
Sources
The Columbia Accident Investigation Board report is available at:
<http://www.spaceflight.nasa.gov/shuttle/investigation/index.html>
Congressional testimony about 9/11 is collected at:
<http://www.fas.org/irp/congress/2002_hr/>