One of the Greybeard Stories.

This is a story about why it's essential to think through all the possible failure modes when you install a fix. Just because you've fixed the problem doesn't mean you haven't introduced a new one.

During World War II there were numerous problems encountered with torpedoes. A torpedo is a complex piece of machinery, with steering, depth control, magnetic detectors, impact detection, and an explosive warhead. It would be surprising if it always went exactly to plan.

One clever bit was that the ship (or boat, as a submarine was always called) didn't have to be lined up and pointing exactly in the direction the torpedo was to run. When the torpedo was set for launch the gyros were spun up, set to north, and then a desired course relative to the gyro was set. For example, the torpedo could be set to run in a westerly direction, even if the submarine or ship was travelling southwards.

One problem, though, was that sometimes the gyro would stick or jam, so that as the torpedo turned, the gyro turned with it. As a result the torpedo would continue to turn, so the gyro would continue to turn, and the torpedo would run in a big circle. That meant the torpedo would always come back to haunt, sorry, hit the ship. Problem.

To solve the problem it was proposed that a second gyro be installed. If the torpedo ever turned through 180 degrees, the second gyro would detect it and cause a self-destruct. It was reasoned that the chance of both gyros jamming was small enough not to worry about, and if just the second gyro jammed then there would be no problem.

Going through all the anticipated failure modes seemed to show that the torpedo couldn't fail in the old way (except in very rare cases) and that there were apparently no new failure modes.

So it was implemented.

As I was told the story, a memo went out to the ships and subs to say that the problem had been solved, but no details were given as to how. And as far as the captains and crew were concerned, all seemed well.

And then came the unanticipated. In action, a torpedo was prepared for launch, the gyros spun up and set, torpedo loaded, outer doors opened, and the order to fire given.

And the torpedo jammed in the tube.

This was a known problem. It wasn't common, but basically it put that tube out of commission until the boat returned to the yard for refit or overhaul. Except in this case there was an unanticipated and very nasty extra. As the boat made a course change and turned to go back the way it had come, the second gyro detected the 180-degree change of course, and, as it were, hit the self-destruct button.

If the captain had waited long enough, the gyros would've spun down and there wouldn't've been a problem. But the captain didn't know about the second gyro, and probably wouldn't've thought to hold course until the gyros spun down, even assuming he would know how long that would take.

I was never told explicitly, but one has to assume that the nose of the sub was blown off with the loss of the vessel, and probably all hands. I've never seen the story written up, and don't have the resources to track it down and check.

But even if the story isn't true, the lessons are valid. Unanticipated failure modes are dangerous and hard to guard against, while at the same time, anticipated failure modes can sometimes lead to "fixes" that make the problem worse.
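
One way to lean less on imagination is to enumerate the combinations mechanically. What follows is only a toy sketch of the story as told above, not a model of any real torpedo: the four flags, the outcome labels, and the rule that a jammed second gyro never triggers are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical condition flags and outcome labels, invented for this sketch.
FLAGS = ("gyro1_jammed", "gyro2_jammed", "stuck_in_tube", "boat_reverses")
DANGEROUS = {"circles back to the boat", "detonates in the tube"}


def outcome(gyro1_jammed, gyro2_jammed, stuck_in_tube, boat_reverses):
    """Return what happens to the torpedo under one combination of conditions."""
    if stuck_in_tube:
        # A torpedo stuck in the tube turns only when the boat carrying it turns.
        turned_180 = boat_reverses
    else:
        # Assumption: a free-running torpedo circles only if the steering gyro jams.
        turned_180 = gyro1_jammed

    # Assumption: a jammed second gyro turns with the torpedo and never triggers.
    if turned_180 and not gyro2_jammed:
        if stuck_in_tube:
            return "detonates in the tube"
        return "self-destructs away from the boat"
    if stuck_in_tube:
        return "tube out of commission"
    return "circles back to the boat" if gyro1_jammed else "runs true"


def hazards():
    """List every combination of conditions whose outcome endangers the boat."""
    return [
        dict(zip(FLAGS, combo))
        for combo in product([False, True], repeat=len(FLAGS))
        if outcome(*combo) in DANGEROUS
    ]


if __name__ == "__main__":
    for case in hazards():
        print(outcome(**case), case)
```

The enumeration only finds the jammed-tube hazard because `stuck_in_tube` and `boat_reverses` are among the inputs. Leave them out, as the original analysis apparently did, and the sketch reports only the rare double-jam case, which is exactly the trap in the story.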

How do you analyse your failure modes?
