iwstack outage

Post by Admin » Fri Jul 04, 2014 12:04 pm

Hello!

A network blip earlier today triggered a large number of HA events, and the affected instances are being restarted on nodes that were not hit by the outage.
Hundreds of restarts are queued, so this may take some time.
We are investigating the issue to find ways to avoid it in the future and are deeply sorry for this unfortunate event.

Update: We have temporarily disabled HA so this does not happen again while we investigate why the mass disconnect occurred.

Re: iwstack outage

Post by Admin » Fri Jul 04, 2014 5:17 pm

RFO already sent to all customers:

Hello!

Today at 10:01 AM CEST the iwStack orchestrator received disconnection events from several hosts because a faulty network card flooded the VLAN. This triggered a massive High Availability recovery procedure for more than 600 instances.
At 10:50, while most instances were running again, a couple hundred were stuck in the starting state, waiting for the network setup to complete.
At 11:10, in an attempt to speed up the process, we forced a network restart (including a VR rebuild), but this turned out to be the wrong solution and caused further delay.
Finally, at 13:00, all the queued instances had been started.

If your instances are still in the stopped state, just start them. Please open a ticket if any instance doesn't start.

At present we have disabled the HA setting on all instances while we investigate the incident.

We are sorry for any inconvenience this issue may have caused.
