Storage interruption zone02.ams02 Wednesday 6th October 2021 10:27:00

We're currently experiencing issues on the network storage solution serving our zone02.ams02 cloud infrastructure. Our technicians are investigating the problem and we're working very hard to resolve the issues as quickly as possible. Our sincere apologies for the inconvenience.

UPDATE 1: The implicated server has been fixed and the cluster has been rebalanced. We will investigate this further on our side.

UPDATE 2: We are still experiencing some slow IOPS. We are keeping an eye on the cluster and will act when needed.

UPDATE 3: We have decided to bring down the hosts one by one and run LevelDB’s compaction algorithm on the RocksDB databases.
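
For readers curious what a one-host-at-a-time procedure looks like, here is a minimal sketch. It is an illustration only and not the tooling used for this maintenance: `stop_host`, `compact_host`, `start_host` and `cluster_healthy` are hypothetical callables standing in for whatever a given storage cluster provides.

```python
from typing import Callable, Iterable, List


def rolling_compaction(
    hosts: Iterable[str],
    stop_host: Callable[[str], None],      # hypothetical: take one host out of service
    compact_host: Callable[[str], None],   # hypothetical: compact that host's database offline
    start_host: Callable[[str], None],     # hypothetical: bring the host back into the cluster
    cluster_healthy: Callable[[], bool],   # hypothetical: True when it is safe to proceed
) -> List[str]:
    """Process hosts strictly one at a time and return the hosts completed."""
    done: List[str] = []
    for host in hosts:
        # Never take a second host down while the cluster is still recovering.
        if not cluster_healthy():
            break
        stop_host(host)
        try:
            compact_host(host)
        finally:
            # Always bring the host back, even if compaction failed.
            start_host(host)
        done.append(host)
    return done
```

The health check before each host is the important design choice here: it keeps at most one host offline at any time, so the rest of the cluster can keep serving I/O.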

UPDATE 3: All databases have been compacted and the cluster is running optimally with very low latency.

UPDATE 4: It seems we are still having issues with random slow IOPS. Our next step is upgrading the cluster and the clients to the newest version.

UPDATE 5: Upgrading the clients does not seem to be enough. We are going to upgrade the whole cluster next.

After finishing the upgrade we will continue with the next phase.

After talking with a consultant, we have decided to redeploy all data disks to update the internal "data structure" index. While testing this procedure, we noticed that removing a disk from the pool can cause a short hiccup. After checking with our customers, we didn't see any significant impact on their servers.

We expect this phase to take the better part of this week and next.

Phase one is complete.

Upgrading the machines is finished and went well.

Following the consultation, we have some possible fixes to apply. We will apply these in two steps.

  • At 15:00 we will begin applying the first part, which we expect to finish around 16:00.
  • From 18:00 until 18:30 we will apply the second set of changes.

In both cases we don't expect downtime or interruptions. In case we notice any interruptions, we will reschedule these activities to a less busy time.