Delta Outage Spotlights Technology Risks


Delta’s computer outage on Jan. 29 was over by midnight, but its effects have extended into the week, resulting in 170 cancellations on Sunday, grounding more than 100 flights on Monday and causing many delays. Adding to the frustration, the company’s mobile apps were also not working.

This latest incident follows another computer outage for Delta in mid-August, when flights were canceled for two days, leaving thousands of passengers stranded.

Such outages can be costly. A Southwest Airlines outage in July caused more than 2,000 flights to be canceled and cost about $54 million. The August Delta outage, which involved a fire, resulted in the cancellation of 2,300 flights over three days and cost the airline $150 million in lost revenue, according to USA Today.

Jim Corridore, an analyst at CFRA Research, told USA Today on Monday that Delta’s computer outage puts a “spotlight on risks of airline technology infrastructure, much of which is old and patched with differing systems.” He said that airlines build new programming over old software, especially after a merger, when computer languages may differ. Programmers’ assumptions about how software will work are sometimes wrong.

While large companies such as Delta would have fewer outages with more testing of their systems, this is an expensive proposition.

According to USA Today:

Gil Hecht, CEO of Continuity Software, which tests computer systems for large banks and insurers, compared the construction of complex computer systems to a layer cake, with web servers, database software, storage and possibly interaction with other systems such as government computers that check whether passengers are allowed to fly.

“Testing should be done by every single layer and every single business service that participates in the critical infrastructure, and some of them are simply not under the airline’s control,” Hecht said.

He compared one way of testing to running a car into a tree to see whether the airbags work, which isn’t possible while keeping a computer system working. Instead, testing for a large financial institution or airline must confirm that each layer is configured to work well with all the others, he said.

“In order to do that, critical infrastructure operators must do much more testing, whether it’s manual by humans or by technology or by any means possible,” Hecht said. “Yes, it costs money. Quite a lot. But if more money and more effort will be driven into testing, we will have far less down time and data-loss events.”

Delta Limping Back to Normalcy

After two days of cancellations due to a system-wide outage, leaving thousands of customers stranded, Delta today announced it will return to normal operation by mid-to-late afternoon. It added a caveat, however, that “a chance of scattered thunderstorms expected in the eastern U.S. may have the potential to slow the recovery.”

Delta said that by late morning on Wednesday it had canceled 255 flights while 1,500 departed. About 800 flights were canceled on Tuesday and there were around 1,000 cancellations on Monday. It also extended its travel waiver and continued to provide hotel vouchers, of which more than 2,300 were issued Tuesday night in Atlanta alone.

“The technology systems that allow airport customer service agents to process check-ins, conduct boarding and dispatch aircraft are functioning normally with the bulk of delays and cancellations coming as a result of flight crews displaced or running up against their maximum allowed duty period following the outage,” Delta said.

The company’s chief operating officer, Gil West, said on Aug. 9:

Monday morning a critical power control module at our Technology Command Center malfunctioned, causing a surge to the transformer and a loss of power. The universal power was stabilized and power was restored quickly. But when this happened, critical systems and network equipment didn’t switch over to backups. Other systems did. And now we’re seeing instability in these systems. For example we’re seeing slowness in a system that airport customer service agents use to process check-ins, conduct boarding and dispatch aircraft. Delta agents today are using the original interface we designed for this system while we continue with our resetting efforts.

Reuters reported:

Like many large airlines, Delta uses its proprietary computer system for its bookings and operations, and the fact that other airlines appeared unaffected by the outage also pointed to the company’s equipment, said independent industry analyst Robert Mann.

Critical computer systems have backups and are tested to ensure high reliability, he said. It was not clear why those systems had not worked to prevent Delta’s problems, he said.

“That suggests a communications component or network component could have failed,” he said.

The airline has not yet detailed the financial impact of the event.

EgyptAir Flight MS 804 Crash Confirmed, Killing 66

Egyptian authorities believe they have found debris from EgyptAir Flight MS 804, but the search remains on for the wreckage of the Airbus A320 traveling from Paris to Cairo that vanished from the radar and crashed into the Mediterranean early this morning.

According to Greece’s defense minister, Greek controllers attempted to contact the aircraft when it crossed through the country’s airspace but could not get a response. The plane made “sudden swerves” before dropping from 37,000 to 15,000 feet and disappearing from radar. The small commercial jet was about half full, carrying 66 people, including 56 passengers from a range of nations: 30 from Egypt, 15 from France, two Iraqis, and one person each from Britain, Belgium, Kuwait, Saudi Arabia, Sudan, Chad, Portugal, Algeria and Canada.

No cause has been officially identified, but many security analysts and government officials believe that an act of terrorism may have downed the plane. There were no documented red flags before the plane disappeared: local weather was good, the plane was on its fifth flight of the day, the pilot and copilot had logged a significant amount of flying experience, and Greek aviation officials said the pilots did not mention any issues.

According to Reuters, Egyptian Prime Minister Sherif Ismail said it was too early to rule out any possible explanation, and French President Francois Hollande told reporters, “No hypothesis can be ruled out, nor can any be favored over another.” Egypt’s civil aviation minister said a terrorist attack was more likely than a technical failure, however. Two U.S. officials told CNN that the government is operating on an initial theory the flight was taken down by a bomb, but cautioned this is not yet supported by a “smoking gun.” No terrorist groups have yet claimed responsibility for the crash.

As Time noted:

Egypt has been the victim of terrorism in the skies relatively recently. Last October, a Metrojet charter plane filled with Russian tourists crashed into the Sinai Desert shortly after taking off from the Egyptian Red Sea resort of Sharm el-Sheikh, headed to St. Petersburg, Russia. All 224 passengers died in the crash. Investigators quickly speculated that a home-made bomb had been placed aboard the aircraft and in February the Islamic State, or ISIS, claimed responsibility, saying that it had indeed smuggled an explosive device aboard the aircraft.

In March, a passenger aboard an EgyptAir plane flying from Alexandria to Cairo hijacked the plane wearing a fake suicide belt, an incident that raised deep concerns among aviation authorities about the anti-terrorist measures in place on EgyptAir flights, and at Egyptian airports.

Beyond the region, a number of high-profile losses have hit the aviation industry as a whole over the past two years, including the disappearance of Malaysia Airlines flight MH370 and the crash of MH17, a Boeing 777 shot down over Ukraine. As we reported at the time, however, crashes actually continue to decrease. While the insured losses from a plane crash can be significant, the capacity in the aviation insurance market has continued to keep rates stable and relatively low.

In the terrorism insurance market, recent losses have likewise not yet had a concrete impact on rates or capacity. While some European markets have recently reduced their underwriting appetite, terrorism coverage has primarily broadened, with significant capacity and rates that remain relatively low.

As Business Insurance recently reported, the terror attacks in Paris and Brussels have prompted an increase in the take-up rate for event coverage to add to buyers’ terrorism insurance programs. Tim Davies, head of sabotage and terrorism at London specialty insurer Sompo Canopius, told the magazine that many buyers have been adding liability and event cancellation coverage, prompted by the continued relatively low rates. Despite the spike in attacks in Europe, Richard Sawyer, director and head of North American terrorism at Aon Risk Solutions, told AM Best last week that rates for terror coverage should remain relatively stable unless the frequency of attacks escalates.

JetBlue Pilot’s Meltdown Tests Emergency Procedures

A JetBlue flight from New York to Las Vegas had to be diverted to Texas yesterday after the plane’s captain had an apparent breakdown. Emergency procedures swung into action, and the pilot was locked out of the cockpit and restrained by passengers and crew.

According to reports, the incident began when the co-pilot noticed that Captain Clayton Osbon was “acting erratically” in the cockpit, flipping switches unnecessarily and seeming incoherent. The co-pilot persuaded Osbon to leave the cockpit, then locked the door behind him and changed the security code. Osbon became more agitated and began running up and down the aisle before banging on the cockpit door and demanding to be let back in. Crew members attempted to calm him down, but he became more irate and reportedly began screaming about Iran, Iraq, Afghanistan and Al Qaeda and that the plane was “going to be taken down.” Eventually a group of passengers, led by security personnel who were on their way to a conference in Las Vegas, tackled Osbon, restrained him with seat belts and sat on him for the remainder of the flight. An off-duty pilot who was traveling as a passenger helped the co-pilot safely land the plane in Amarillo, Texas, where Osbon was taken to a local hospital for observation. None of the 131 passengers or six crew members were harmed.

Osbon, a 12-year veteran of JetBlue and a flight standards captain in charge of cockpit and safety procedures, was described as a “consummate professional” by company CEO Dave Barger and had no history of incidents. The FAA requires medical checks every year for pilots under the age of 40 and every six months for pilots older than 40. Although there is no formal psychiatric evaluation, these assessments include mental health questions, and fellow crew members are trained to be on the lookout for any signs of mental distress.

Judging by the quick-thinking actions of the co-pilot and crew, with a big assist from the passengers, the system worked:

“I’d say the system functioned properly,” said Dave Funk, a retired Northwest Airlines captain and an aviation consultant with Laird & Associates. “There’s a reason we have two pilots. There’s a reason we have flight attendants. … One healthy pilot on the flight deck who’s qualified would have no problem landing the plane.”

This was the second incident this month in which passengers had to subdue unruly airline personnel. On March 9, passengers helped restrain an American Airlines flight attendant who got on the intercom before takeoff and ranted about 9/11 and airline safety before finally being removed from the plane.