Prolonged production downtime / degradation?


Yesterday, starting at around 2022-10-10T10:25:00Z (UTC), we began seeing significant performance degradation when requesting jobs on production: periodic 503s and 504s, as well as response times longer than 100 seconds (the maximum timeout on our end). The issues persisted until around 2022-10-10T11:24:00Z.

Some of the 503 responses included the following message content (note the raw HTML, as opposed to the expected API error response format):

<h2>This website is under heavy load (queue full)</h2><p>We're sorry, too many people are accessing this website at the same time. We're working on this problem. Please try again later.</p>

Sample 504 response content:

<html> <head><title>504 Gateway Time-out</title></head> <body> <center><h1>504 Gateway Time-out</h1></center> </body> </html>
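For anyone handling these responses client-side, here is a minimal Python sketch of how a client might tell a raw HTML gateway page apart from a structured API error and decide whether to retry. It is only an illustration under stated assumptions: the function names, the choice of 503/504 as retryable, the backoff values, and the idea that structured errors arrive as a JSON object are all hypothetical and not taken from Stuart's documentation.

```python
import json

# Assumption: 503 and 504 are treated as transient and worth retrying.
RETRYABLE_STATUSES = {503, 504}


def classify_error(status: int, body: str) -> dict:
    """Label an error response as a structured API error or a raw gateway page."""
    try:
        payload = json.loads(body)
        # Hypothetical: a structured API error is assumed to be a JSON object.
        structured = isinstance(payload, dict)
    except ValueError:
        # HTML bodies such as the 503/504 samples above fail JSON parsing.
        payload = None
        structured = False
    return {
        "status": status,
        "source": "api" if structured else "gateway",
        "retryable": status in RETRYABLE_STATUSES,
        # Keep a short excerpt of non-JSON bodies for logging.
        "detail": payload if structured else body.strip()[:200],
    }


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential backoff schedule in seconds, capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

A caller could log the `source` field to separate load-balancer failures from application errors, and sleep for each value in `backoff_delays(...)` between retries while staying inside its own overall timeout.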

We’re still assessing how much impact this presumed Stuart downtime had on operations. While we continue our post mortem, could you please confirm the incident on your end and share any additional details (e.g. duration, root cause, extent of impact)?

Hi team, just following up here. We are awaiting your response so that we can provide some clarity to our own customer base.

Hello @mirek,

We apologise for the delayed response.

Indeed, yesterday we experienced issues with our API.

You probably received a notification at the email address you use with your Stuart account.
If this is not the case, you can reach out to and we will add you to the list so that you receive such notifications in the future, as well as the postmortem that we will be releasing in the next few days.

For more information on related Tips & Best practices, please see our post Incidents & Outages.

Thank you for your understanding

Hi @Adrien, thank you, will do. Looking forward to the post mortem.

Hi @Adrien, we still haven’t seen any postmortem come through. Has it not been sent yet (in which case, when should we expect it), or should we check whether there was an issue with getting added to the notification mailing list?

Hi @mirek,

The postmortem was sent last week.
You are probably not on our list yet. Could you please reach out to,
so that we can send you the postmortem and add you to the list?

Thank you in advance