Visa claims "rare" failure in networking switch caused payments meltdown

Letter to House of Commons Treasury Committee explains cause of payments glitch earlier this month

The Visa payments system meltdown at the beginning of the month was due to a "rare" failure in a key networking switch.

The admission was made in a letter to Nicky Morgan MP, chair of the House of Commons Treasury Committee, in response to a series of questions.

The outage caused Visa credit and debit card payments to be rejected at retailers across the UK throughout the afternoon of Friday 1 June.

Visa Europe now admits that the disruption affected payments for more than ten hours, from 2.35pm to 12.45am in the early hours of Saturday 2 June, although most of the problems were cleared up by 8.15pm.

"Visa connects the financial institutions that issue cards to their customers (issuers) with other financial companies who ensure that merchants are able to safely connect to the network (acquirers)," explained the organisation in its letter.

The company's data centre operations team became aware of what it describes as a "partial degradation" in Visa's processing system at 2.35pm.

"We immediately… initiated a response based on protocols we have in place for addressing any type of critical incident; the first step was a Technical Response Team assessment meeting.

"Soon thereafter, we escalated the matter in alignment with our crisis management protocol. Ninety minutes after our first indication of a systems issue, and having confirmed the underlying facts as part of our crisis management protocols, we provided a public statement to the media," the letter added.

It continued: "We operate two redundant data centres in the UK, meaning that either one can independently handle 100 per cent of the transactions for Visa in Europe.

"In normal circumstances, the systems are synchronised, and either centre can take over from the other immediately. The centres communicate with each other through messages regarding the system status, in order to remain synchronised.

"Each centre has built into it multiple forms of backup in equipment and controls. Specifically relevant to this incident, each data centre includes two core switches... a primary switch and a secondary switch. If the primary switch fails, in normal operation the backup switch would take over.

"In this instance, a component within a switch in our primary data centre suffered a very rare partial failure which prevented the backup switch from activating.

"As a result, it took far longer than it normally would to isolate the system at the primary data centre; in the interim, the malfunctioning system at the primary data centre continued to try to synchronise messages with the secondary site. This created a backlog of messages at the secondary data centre, which, in turn, slowed down that site's ability to process incoming transactions."

As a result, a number of software applications at the primary site needed to be shut down and restarted, and message backlogs cleared, both manually and via "automatic means".
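
The backlog effect Visa describes can be illustrated with a very simple queue model. This is a sketch only: the capacity and message volumes are invented numbers, and the function name simulate_backlog is hypothetical, not anything taken from Visa's letter. It shows how extra synchronisation messages on top of normal traffic can push a site past what it can process, so that a queue builds up.

```python
def simulate_backlog(capacity, transactions, sync_messages, ticks):
    """Return the queue length after each tick.

    All figures are illustrative assumptions, not Visa data:
    capacity      -- items the site can process per tick
    transactions  -- incoming payment messages per tick
    sync_messages -- extra synchronisation messages per tick
    """
    backlog = 0
    history = []
    for _ in range(ticks):
        # Whatever cannot be processed this tick carries over to the next.
        backlog = max(0, backlog + transactions + sync_messages - capacity)
        history.append(backlog)
    return history


# Normal operation: load is below capacity, so no queue forms.
normal = simulate_backlog(capacity=100, transactions=80, sync_messages=0, ticks=3)

# Faulty peer keeps sending sync messages: load exceeds capacity, queue grows.
degraded = simulate_backlog(capacity=100, transactions=80, sync_messages=50, ticks=3)
```

In this toy model the queue only stops growing once the extra messages stop, which is consistent with Visa's account that the secondary site recovered as the malfunctioning system was isolated.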

The explanation concludes: "It took until approximately 19:10 to fully deactivate the system causing the transaction failures at the primary data centre. By that time, the secondary data centre had begun processing almost all transactions normally. The impact was largely resolved by 20:15, and we were processing at normal service levels in both data centres by Saturday morning at 00:45."
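
To see why a partial failure can be harder to handle than a complete one, consider the following minimal Python sketch. It is an illustration under stated assumptions, not a description of Visa's systems: the letter does not say how failover is triggered, so the health-check mechanism, the Switch class and the function names here are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Switch:
    name: str
    answers_health_checks: bool = True  # assumption: failover is driven by health checks
    forwards_traffic: bool = True       # whether the switch actually passes traffic


def active_switch(primary: Switch, backup: Switch) -> Switch:
    """Naive failover: promote the backup only if the primary stops responding."""
    return primary if primary.answers_health_checks else backup


def traffic_flows(primary: Switch, backup: Switch) -> bool:
    """True if the switch currently in charge is really forwarding traffic."""
    return active_switch(primary, backup).forwards_traffic


backup = Switch("backup")

# Complete failure: the primary goes silent, so the backup takes over.
dead_primary = Switch("primary", answers_health_checks=False, forwards_traffic=False)
complete_failure_ok = traffic_flows(dead_primary, backup)

# Partial failure: the primary still looks alive but no longer forwards traffic,
# so the backup is never promoted.
sick_primary = Switch("primary", answers_health_checks=True, forwards_traffic=False)
partial_failure_ok = traffic_flows(sick_primary, backup)
```

In this simplified model, a switch that fails outright is replaced cleanly, while one that fails only partially stays in charge and blocks the backup, which is the broad pattern Visa's letter describes.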

The glitch affected a minority of payment attempts, Visa claims. Failure rates fluctuated throughout the ten-hour period, but an average of nine per cent of transactions failed to process on the cardholder's first attempt, the organisation writes in its letter.

Disruption peaked, it added, between 3.05pm and 3.15pm, and between 5.40pm and 6.30pm. At those times, an average of 35 per cent of transactions failed.

Since the failure, Visa asserted, it had updated its incident response processes "by applying any lessons learned", following a post-mortem conducted to identify "all necessary steps to prevent a reoccurrence".

The organisation adds that no cardholder should be charged for a transaction that did not complete, including instances where the transaction failed to process, but a 'hold' for a pending transaction was placed on the cardholder's account by the issuer.