How one line of code caused a $60 million loss
▻https://engineercodex.substack.com/p/how-one-line-of-code-caused-a-60
On January 15th, 1990, AT&T’s New Jersey operations center detected a widespread system malfunction, shown by a plethora of red warnings on their network display.
Despite attempts to rectify the situation, the network remained compromised for 9 hours, leading to a 50% failure rate in call connections.
AT&T lost over $60 million as a result, and over 60,000 Americans were left with fully disconnected phones.
Furthermore, 500 airline flights were delayed, affecting 85,000 people.
AT&T’s long-distance network was supposedly a paragon of efficiency, handling a substantial portion of the nation’s calls with its advanced electronic switches and signaling system. This system usually completed call routing within seconds.
However, on this day, a fault originating in a New York switch cascaded through the network. The cause was a critical bug in a recent software update that affected the network’s 114 switches. When the New York switch reset itself and sent out signals, this bug caused a domino effect, leading to widespread network disruption.
Interestingly, this small software patch was not tested. Testing was bypassed at management’s request because the code change was small.
#Informatique #Codage #Bug #AT&T #1990