An Analysis of the Patriot Missile System

 

            The Patriot Missile System is an example of how a software project can change over time. It was proposed as a way to protect against Soviet planes, then Soviet missiles, and twenty years later it was used against Iraqi missiles. The system was meant to be temporary, but was used in permanent settlements. The Patriot was, in effect, an attempt to prove that a system can be patched to perfection.

 

            The Patriot Missile project began in the late 1960s as a portable system meant to protect local airspace against intrusion by enemy aircraft. One of the selling points of the project was that although it would not initially be programmed to shoot down missiles, the system could later be altered to serve as an anti-ballistic missile system. The system was first tested against a drone plane in 1974. Later, in 1986, the system’s developers at Raytheon Labs modified it to be used as a portable, short-term defense against Soviet missiles and aircraft. [1]

 

            The system was first deployed in 1990 and shot down its first ballistic missile in January of 1991. That missile was an Al-Hussein, more commonly known as a SCUD missile. The Patriot was never designed to handle SCUDs, which have an estimated maximum speed nearly twice that of the Soviet missiles for which the system was designed. This mismatch, along with several other inconsistencies between the system requirements and actual usage, should lead a responsible programmer to ask whether his or her software will be used for purposes not stated in the requirements and, if so, what the unwritten requirements for the system are.

 

            In mid-February of 1991, Israeli troops discovered a defect in the Patriot missile system: if the system runs for long periods of time, it becomes inaccurate. They estimated that after twenty hours of operation, the system would become too inaccurate to successfully target, track, and hit a ballistic missile. The U.S. military denied the significance of the discovery, stating that the system was meant to be portable and to provide a short-term defense against missiles, and that nobody would ever run it for more than twenty hours at a time.

 

            On February 16th, a bug fix was released, but it could not immediately reach all units because of wartime difficulties in transportation. So, on February 21st, the Army released a memo stating that the system was not to be run for “very long times.” The military did not specify how long a “very long run time” would be.

 

            On February 25th, 1991, a Patriot missile system that had been running for over 100 hours at Dhahran, Saudi Arabia, failed to intercept a SCUD missile. The SCUD hit an Army barracks, killing 28 Americans. The next day, the bug fix for the system arrived at Dhahran.[2]

 

            This bug was caused by a problem with storing time in a 24-bit register. Time was counted in tenths of a second, but computers store numbers in binary rather than decimal, and 1/10 has no exact binary representation. A 24-bit register therefore cannot hold 0.1 precisely, so a small fraction of each tenth of a second is lost. As a result, the register used to keep track of time falls behind by about 0.0001% of the amount of time that the system has been in operation.
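
            The truncation described above can be illustrated with a short sketch. It is not the Patriot's actual code, which was written in assembly language. The register layout (23 bits after the binary point) is an assumption borrowed from commonly cited analyses of the failure, and the function names are hypothetical.

```python
from fractions import Fraction

# Assumption: 23 of the register's 24 bits hold the fractional part.
FRACTION_BITS = 23


def binary_fraction_digits(value: Fraction, bits: int) -> str:
    """Return the first `bits` binary digits of a value between 0 and 1."""
    digits = []
    for _ in range(bits):
        value *= 2
        bit = int(value)          # 0 or 1: the next binary digit
        digits.append(str(bit))
        value -= bit
    return "".join(digits)


def chop(value: Fraction, bits: int) -> Fraction:
    """Truncate (not round) a positive value to `bits` binary fraction digits."""
    scale = 2 ** bits
    return Fraction(int(value * scale), scale)


tenth = Fraction(1, 10)
digits = binary_fraction_digits(tenth, FRACTION_BITS)
stored = chop(tenth, FRACTION_BITS)
error_per_tick = tenth - stored                 # seconds lost per 0.1 s tick
relative_percent = float(error_per_tick / tenth) * 100

if __name__ == "__main__":
    print("0.1 in binary, first 23 digits: 0." + digits)
    print("value actually stored:", float(stored))
    print("lost per tick (s):    ", float(error_per_tick))
    print("relative error (%):   ", relative_percent)
```

            Under this assumed layout, the repeating pattern 0011 is cut off after 23 digits, and the relative error comes out just under 0.0001%, which agrees with the figure quoted above.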

 

Figure 1*

 

 

 

Step 1. nTimer represents the amount of time the missile has been in operation. Time represents the current time.

Step 2. 1/16th of a second is added to nTimer as each 1/10th of a second passes.

Step 3. When an enemy missile is spotted, the current time is converted to the format of nTimer.

Step 4. The Converted Time from Step 3 is used to calculate the upcoming position of the enemy missile.

Step 5. The Patriot Missile is aimed.

Step 6. When nTimer and the Converted Time are equal, the Patriot is fired.

 

 

            But the real problem was not inaccuracy but inconsistency. During one of the updates, Raytheon Labs, the developer of the Patriot missile, had fixed the inaccuracy described above by writing code that used a pair of 24-bit registers to make the time calculations accurately. The problem was that most, but not all, of the time calculations made by the system were replaced by calls to the newer, more accurate function. The system therefore worked with two clocks. It used the accurate one to track an incoming missile, determine where it was and how fast it was moving, aim itself, and decide exactly when to launch its own missiles. But while waiting to fire, it kept track of the current time with the older function, which loses time in much the same way that a clock with a weak battery gradually does. It was estimated that after the system had run for 100 hours, as the unit at Dhahran had, the calculations made using the old algorithm and those made using the new algorithm differed by as much as 1/3 of a second[3]. A SCUD missile can travel more than one mile per second.
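
            The growth of the disagreement between the two clocks can be sketched as follows. This is an illustration under stated assumptions, not the system's real arithmetic: the old routine is modeled as 0.1 chopped to 23 binary fraction digits, the new routine is modeled as exact, and the SCUD speed is the one-mile-per-second lower bound mentioned in the text. All names are hypothetical.

```python
from fractions import Fraction

TICKS_PER_HOUR = 3600 * 10            # the clock ticks every tenth of a second
OLD_TENTH = Fraction(838860, 2**23)   # assumption: 0.1 chopped to 23 fraction bits
NEW_TENTH = Fraction(1, 10)           # assumption: newer routine modeled as exact
SCUD_MILES_PER_SECOND = 1.0           # lower bound quoted in the text


def old_clock_seconds(ticks: int) -> Fraction:
    """Elapsed time as computed by the older, truncating conversion."""
    return ticks * OLD_TENTH


def new_clock_seconds(ticks: int) -> Fraction:
    """Elapsed time as computed by the newer conversion."""
    return ticks * NEW_TENTH


def mismatch_seconds(hours: int) -> float:
    """How far apart the two conversions are after `hours` of uptime."""
    ticks = hours * TICKS_PER_HOUR
    return float(new_clock_seconds(ticks) - old_clock_seconds(ticks))


def target_travel_miles(hours: int) -> float:
    """Minimum distance a SCUD covers during the mismatch."""
    return mismatch_seconds(hours) * SCUD_MILES_PER_SECOND


if __name__ == "__main__":
    for hours in (1, 8, 20, 100):
        print(f"{hours:>3} h uptime: clocks disagree by "
              f"{mismatch_seconds(hours):.4f} s, target moves at least "
              f"{target_travel_miles(hours):.3f} miles")
```

            The mismatch in this model grows in direct proportion to uptime, which is why a unit that is restarted regularly never sees a large error while one left running for days does.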

 

            Had the same piece of code been used for all time conversions, the inaccuracy of the Patriot would not have grown over time the way it did. Instead, every time calculation would have been off by approximately 0.000001 seconds, and the system would have been much more likely to defend against any missiles launched at it. This is a good argument for reusing code whenever possible. Although the developers at Raytheon Labs had tried to replace all time conversions with calls to the new function, they missed a few, and the result was a system less reliable than it would have been had they chosen to ignore the conversion error.
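
            A small sketch can show why consistency matters more than accuracy here. It measures a one-second interval late in a long run, once with the same conversion applied to both timestamps and once with the two conversions mixed. The constants and the choice of which timestamp uses which routine are assumptions made for illustration only.

```python
from fractions import Fraction

OLD_TENTH = Fraction(838860, 2**23)   # assumption: 0.1 chopped to 23 fraction bits
NEW_TENTH = Fraction(1, 10)           # assumption: newer routine modeled as exact


def interval_error(start_ticks: int, end_ticks: int,
                   start_tenth: Fraction, end_tenth: Fraction) -> float:
    """Error in a measured interval when each timestamp is converted
    with its own value for one tenth of a second."""
    measured = end_ticks * end_tenth - start_ticks * start_tenth
    true_interval = Fraction(end_ticks - start_ticks, 10)
    return float(abs(measured - true_interval))


uptime_ticks = 100 * 3600 * 10        # 100 hours of uptime, in tenths of a second
later_ticks = uptime_ticks + 10       # one second later

# Same (inaccurate) routine for both timestamps: the accumulated error cancels.
consistent = interval_error(uptime_ticks, later_ticks, OLD_TENTH, OLD_TENTH)

# Old routine for one timestamp, new routine for the other: the whole
# accumulated drift shows up in the interval.
mixed = interval_error(uptime_ticks, later_ticks, OLD_TENTH, NEW_TENTH)

if __name__ == "__main__":
    print(f"consistent conversion: {consistent:.9f} s")
    print(f"mixed conversion:      {mixed:.9f} s")
```

            In this model the consistent case is off by roughly a millionth of a second no matter how long the system has been running, while the mixed case carries the full drift accumulated since startup.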

 

            Part of the reason this error was not found sooner was that the program had been written in assembly language fifteen years earlier. Over time, it was patched and new features were added. Because the system was written in assembly code, it was difficult to understand and maintain. And because it was fifteen years old and had been patched several times, the very people who had written the code were not as familiar with it as they would have been had it been written more recently. Then, during the Gulf War, the system had to be modified to handle SCUD missiles, and time was a critical factor. The developers may have been influenced by the fact that prolonged testing could have caused a disaster of its own by keeping a necessary system out of the hands of soldiers in a time of war.

            The Software Engineering Code of Ethics and Professional Practice states that a responsible software engineer should "Approve software only if they have a well-founded belief that it is safe, meets specifications, passes appropriate tests..." (sub-principle 1.03) and "Ensure adequate testing, debugging, and review of software...on which they work" (sub-principle 3.10). Unfortunately, defects did make their way into the system.

 

            Perhaps one of the lessons to be learned from this case is to write code that is easy to maintain, and to acknowledge the difficulties that may be inherent in maintaining it. For example, the Patriot Missile system was altered in 1986 to be capable of tracking missiles as well as aircraft. During that project, the developers had time to re-code the system in a high-level language. If they had done so, the patches required during the Gulf War would have been easier to make and less prone to defects. More importantly, by spending extra time to improve the system during times of peace, the designers would have decreased the amount of time needed to update it during times of war, when it would be needed most.

 

            The Software Engineering Code of Ethics also states that a responsible software engineer should "Treat all forms of software maintenance with the same professionalism as new development." The Patriot Missile System is a good example of how a small change can break an existing program. Raytheon Labs should not have been patching and re-patching this code. On a safety-critical project, developers must be familiar with the code on which they are working.

 

            But ethically speaking, the people at Raytheon Labs had some tough decisions to make. How much testing do you perform when the tests require the destruction of functioning missiles and aircraft? It would be easy for most of us to say that you perform as much as the system requires, and that you do not stop until you are one hundred percent sure. But if that means re-enacting a twenty-hour battle using real aircraft and missiles, the decision becomes more difficult. This was the situation faced by the team at Raytheon Labs. During the initial testing, a very long and expensive battle was re-enacted. Had they re-run this test every time the system was patched, the problem would not have occurred, but the decision to do so would have been a very difficult one to make, and an even more difficult one to justify.

 

            One cannot say that Raytheon was blameworthy, because it cannot be said that Raytheon was guilty of negligence or malpractice. They were responsible in a causal sense because they introduced the bug into the system, but the details show that the problem was not necessarily the developers, but rather that the system was modified often and in inconsistent ways.

 

 

Copyright 2002 Tom Morgan and Jason Roberts.

This case may be published without permission and at no cost as long as it carries the copyright notice.

 



[1] Team Redstone Patriot Missile System Chronology

http://www.redstone.army.mil/history/systems/PATRIOT.html

 

[2] General Accounting Office Report Number B-247094

http://www.fas.org/spp/starwars/gao/im92026.htm

 

* The numbers used in this table are not meant to represent the exact computations used in the patriot missile system. They are intended to demonstrate the concept of why the patriot system failed.

[3] Robert Skeel, “Roundoff Error and the Patriot Missile.”

http://www.siam.org/siamnews/general/patriot.htm