Between 10:18 and 12:00 UTC on 13 May 2020, impacted merchants encountered timeouts or increased latency when attempting to reach the Braintree Gateway API endpoint. Some merchants may also have seen an elevated rate of HTTP 500-level responses to API requests. Control Panel users may also have received error messages or had difficulty downloading reports.
Braintree routes traffic to the Gateway API through multiple internet service providers (ISPs). During the incident, an ISP serving one of our data centers had an upstream problem that resulted in packet loss for incoming requests. At the same time, our engineers were alerted to asymmetric routing issues that also caused timeouts. Because two issues were in progress at once, it was not immediately clear that the ISP was the sole cause of the symptoms we observed. As a result, it took time to determine the root cause and mitigate the impact. At 11:58 UTC, Braintree engineers rerouted traffic around the problematic ISP, restoring service for impacted traffic.