IATI Discuss Downtime

(IATI Technical Team) #1

At 12pm GMT+1 today (8th May 2019), the IATI technical team ran a maintenance upgrade on the IATI Discuss service. We experienced problems during the upgrade and had to restore a backup so that service could be resumed. Unfortunately, any posts and other changes made between 12pm and 2pm GMT+1 have been lost. We sincerely apologise for any inconvenience this may have caused.

In technical terms, we have isolated the upgrade failure to an outdated Docker image and a memory leak within it. The memory leak is a common issue with Docker, and it prevented us from interacting with any of the micro-services that Discuss runs. Pruning stale Docker images led to further problems, so we restored the backup as a last resort to get Discuss up and running again.

We will attempt the upgrade again on Monday 13th May at 2pm GMT as a planned maintenance task. To ensure a smooth upgrade, we will also carry out the further steps listed below during this time. This inevitably means that users will experience some downtime.

Next steps include:

  • Increase the memory of our server

  • Explore the Docker memory issue, and ways that we can safely reclaim space

  • Solidify upgrade processes for the service

  • Define an upgrade schedule

  • Upgrade the Docker image

  • Upgrade the Discuss software
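
As an illustration of how the "solidify upgrade processes" step might look in practice, here is a minimal sketch of a pre-upgrade check that refuses to proceed without a fresh backup and enough free disk space. The function names and the 5 GiB threshold are hypothetical and are not taken from the IATI team's actual process.

```python
import shutil
from typing import List

# Hypothetical threshold for illustration only; not a figure from the IATI team.
MIN_FREE_BYTES = 5 * 1024**3  # 5 GiB


def has_enough_disk(path: str = "/", min_free_bytes: int = MIN_FREE_BYTES) -> bool:
    """Return True if the filesystem holding `path` has at least `min_free_bytes` free."""
    return shutil.disk_usage(path).free >= min_free_bytes


def preflight(backup_taken: bool, path: str = "/",
              min_free_bytes: int = MIN_FREE_BYTES) -> List[str]:
    """Collect reasons not to start the upgrade; an empty list means go ahead."""
    problems = []
    if not backup_taken:
        problems.append("no fresh backup")
    if not has_enough_disk(path, min_free_bytes):
        problems.append("low disk space")
    return problems


if __name__ == "__main__":
    issues = preflight(backup_taken=True)
    print("OK to upgrade" if not issues else "Blocked: " + ", ".join(issues))
```

A check like this only covers disk space and the existence of a backup. Memory pressure inside the containers would need separate monitoring.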

The technical team thank you for your patience on this matter.