| |
Currently Open Posts
| Email and Web Disk Storage Maintenance - Open |
5 Mar 21:00:00 |
Details 5 Mar 13:49:51 |
Please see http://status.aa.net.uk/1750 for information regarding planned work that will affect web hosting on the evening of March 7th.
|
| Started |
5 Mar 21:00:00 |
| Maidenhead Core Switch Upgrade Sunday 10th Feb - Open |
10 Feb 13:00:00 |
Details 15 Jan 14:19:37 |
We will be performing a swap of the core network switches in our Maidenhead datacentre on the afternoon of Sunday 10th February.
The work is expected to take up to half an hour, during which there will be periods of time when service is affected.
The affected services will be:
- Email, both incoming and outgoing
- VoIP calls
- Ethernet services terminated in Maidenhead
- Web page hosting
- Control panel and Billing system access
- Customer hosted servers
General internet connectivity over DSL and Ethernet terminated in London will be unaffected by this work.
This status page will be updated during the work.
|
Update
10 Feb 12:25:39
|
This work will be done a little earlier than initially scheduled. The work will take place between 12:30pm and 2pm, but actual outage is only expected to be a couple of minutes.
|
| Started |
10 Feb 13:00:00 |
| At Risk - Carrier Transit Work - Open |
15 Jun 2012 00:01:00 |
Details 14 Jun 2012 17:54:03 |
One of the carriers that provides us with connectivity in Maidenhead has planned (short notice) maintenance on parts of their network that provide us connectivity.
This will happen between midnight and 2pm.
We are not expecting this to impact customers, but it should be considered 'at-risk'.
|
| Started |
15 Jun 2012 00:01:00 by Datacentre |
| Database Server Maintenance - Open |
01 Jun 2012 17:00:00 |
Details 01 Jun 2012 13:29:22 |
We'll be rebooting one of our database servers shortly after 5pm today. This is to replace a faulty disk drive.
There will be a period of a few minutes where some usage/logs won't be viewable from our control pages.
|
Update
01 Jun 2012 17:09:52
|
This is happening now, it should take about 10 minutes.
|
Update
01 Jun 2012 17:12:52
|
The server has booted up ok.
|
| Started |
01 Jun 2012 17:00:00 |
| IPv6 changes - Open |
21 Apr 2012 12:00:00 |
Details 21 Apr 2012 13:55:09 |
Apologies for some minor disruption with IPv6 today in relation to services we run in the Maidenhead Data Centre. There have been a few minutes during today where routing has not been quite right for IPv6 for some of the services.
We have renumbered our office on to a separate /48 to improve native IPv6 routing. (A /48 is visible to more of the Internet).
Over the next few days or weeks we will be renumbering services in the data centre from 2001:8b0::/48 to 2001:8b0:1::/48 addresses. Thanks to a customer (Lawrence) for allowing us to change his allocation to free that up.
If you want to trust an IPv6 address as being from A&A infrastructure you should be able to trust 2001:8b0::/47 as this includes the /48 we have in London and the /48 we have in Maidenhead and our offices.
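As a rough illustration of the /47 check described above, here is a minimal Python sketch using the standard `ipaddress` module. The function name `in_trusted_range` and the test addresses are hypothetical; only the three prefixes come from this post. It checks prefix membership only, so it does not account for the reverse DNS check or for the handful of non-A&A machines mentioned in this post.

```python
import ipaddress

# Prefixes quoted in this post.
TRUSTED = ipaddress.ip_network("2001:8b0::/47")
FIRST_48 = ipaddress.ip_network("2001:8b0::/48")
SECOND_48 = ipaddress.ip_network("2001:8b0:1::/48")

# A /47 splits into exactly two /48s, which are the two blocks named above.
halves = list(TRUSTED.subnets(new_prefix=48))


def in_trusted_range(addr: str) -> bool:
    """Hypothetical helper: True if addr falls inside 2001:8b0::/47."""
    return ipaddress.ip_address(addr) in TRUSTED


if __name__ == "__main__":
    print(halves)
    print(in_trusted_range("2001:8b0:1::1"))  # inside the second /48
    print(in_trusted_range("2001:8b0:2::1"))  # just outside the /47
```

An IPv4 address passed to the helper simply returns False, since an IPv4 address is never a member of an IPv6 network.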
At present there may be a handful of machines which are not technically A&A infrastructure within this space that will be renumbered this week as well. A&A infrastructure should have reverse DNS under aa.net.uk as well.
We aim to parallel run the old and new IPv6 during DNS changes so that there are no issues with any services. If anyone has any issues please contact support.
We are not planning any IPv4 changes.
Note: customers who have IPv6 on fibre from Maidenhead can now request a /48 if needed, routed via their existing /64 link.
|
| Started |
21 Apr 2012 12:00:00 |
| Previously expected |
01 May 2012 |
| IPv6 routing improvements - Info |
18 Apr 2012 14:51:28 |
Details 18 Apr 2012 14:51:28 |
We are pleased to confirm the hosting in the Maidenhead Data Centre now has full 1500 byte MTU native IPv6 transit in operation. This improves the MTU to our various servers, to customer servers, and to Ethernet transit customers. If you have any issues, please do contact support, but this should be completely seamless.
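To show what a full 1500 byte MTU means in practice, the sketch below works out the largest TCP payload per packet over IPv6. The header sizes are the standard fixed sizes (40 bytes for IPv6, 20 bytes for TCP or IPv4 without options). The comparison with a 6in4 tunnel is an assumption for illustration only; this post does not say what the previous arrangement was.

```python
IPV6_HEADER = 40  # fixed IPv6 header, bytes
TCP_HEADER = 20   # TCP header without options, bytes
IPV4_HEADER = 20  # IPv4 header without options (outer header of a 6in4 tunnel)


def tcp_mss_over_ipv6(path_mtu: int) -> int:
    """Largest TCP payload that fits in one IPv6 packet of size path_mtu."""
    return path_mtu - IPV6_HEADER - TCP_HEADER


# Native IPv6 at the full 1500 byte MTU described in this post.
native_mss = tcp_mss_over_ipv6(1500)

# Assumed comparison: IPv6 carried inside IPv4 (6in4) over a 1500 byte link,
# which leaves a 1480 byte MTU for the inner IPv6 packet.
tunnelled_mss = tcp_mss_over_ipv6(1500 - IPV4_HEADER)

if __name__ == "__main__":
    print(native_mss, tunnelled_mss)
```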
We may also be making some changes to the IPv6 addresses for A&A servers in Maidenhead in due course. We'll post more details when that is all decided. This should not have any impact as we will, of course, overlap assignments and make necessary changes to DNS records.
|
| Started |
18 Apr 2012 14:00:00 |
All Maidenhead Data Centre Posts
| Web and email outage - Closed |
5 Apr 09:14:46 |
Details 4 Apr 15:47:25 |
This is ongoing. We're investigating.
|
Update
4 Apr 16:07:41
|
This should now be fixed. Please let support know if you see any problems, or have any questions.
|
Update
5 Apr 09:15:04
|
This was resolved yesterday afternoon.
|
| Started |
4 Apr 15:46:11 |
| Closed |
5 Apr 09:14:46 |
| At risk - switch upgrade - Closed |
10 Feb 14:00:00 |
Details 7 Feb 08:57:38 |
We have a switch upgrade planned for Sunday - but it seems that somehow it knows!
There was some sort of issue yesterday at 9am and again at 11:20pm. Monitoring suggests something not quite right since 9am yesterday and seems to point to the switch which is going to be upgraded on Sunday.
We are monitoring carefully and if there are any issues during the day we will have staff carry out the upgrade as an emergency during the day.
This could affect several services, including VoIP. Hopefully this will not be an issue, and the upgrade will go ahead as planned on Sunday.
|
Update
10 Feb 12:57:20
|
Work is progressing, some servers are already up and working on the new switch.
|
Update
10 Feb 13:21:42
|
Everything seems to be working OK - customers may have seen an outage of around 2 minutes.
|
Update
10 Feb 13:52:56
|
Some customer hosted servers are offline at the moment; we are checking their configuration.
|
| Started |
7 Feb 09:00:00 |
| Previously expected |
10 Feb 12:00:00 |
Closed 10 Feb 14:00:00 |
The swap over has been successful. Most services had a couple of minutes of outage, and some hosted servers were down for longer due to a configuration issue. |
| Connectivity problems in Maidenhead - Closed |
6 Feb 12:49:46 |
Details 6 Feb 09:15:19 |
There's a connectivity problem in Maidenhead. We're investigating.
|
Update
6 Feb 09:34:51
|
Connectivity looks normal again now. We're still investigating the cause of the problem.
|
Closed 6 Feb 12:49:46 |
The problem was resolved by power cycling some hardware. Although we're still not sure exactly what caused this, we are suspicious of one of our switches. We already have planned work to replace the switch on Sunday. |
| Disk server issues - Closed |
22 Aug 2012 21:18:59 |
Details 22 Aug 2012 21:18:04 |
Disk server playing up, affecting web pages and email.
|
Update
22 Aug 2012 21:18:49
|
Staff are working on this.
|
| Started |
22 Aug 2012 21:00:00 |
Closed 22 Aug 2012 21:18:59 |
All sorted |
| Email and Web Server Disk Storage Problem - Closed |
21 Aug 2012 14:50:03 |
Details 21 Aug 2012 14:47:26 |
We are just rebooting one of our Disk storage servers due to a problem affecting email and web page hosting services.
We expect services to be restored in a few minutes.
|
Update
21 Aug 2012 14:49:24
|
Service restored.
|
| Started |
21 Aug 2012 14:44:03 |
Closed 21 Aug 2012 14:50:03 |
Service restored within a few minutes, apologies for this outage. |
| IPv6 packet loss affecting AAISP offices - at risk period - Closed |
25 Jul 2012 20:45:22 |
Details 25 Jul 2012 19:29:39 |
We're seeing packet loss affecting our offices at the moment.
We are investigating and have an engineer on their way to site, and so this should be considered an at-risk period for connectivity in Maidenhead until we isolate the problem.
|
Update
25 Jul 2012 20:45:34
|
This has now been resolved.
|
| Started |
25 Jul 2012 19:24:31 |
| Closed |
25 Jul 2012 20:45:22 |
| IPv6 Routing Problem in Maidenhead Datacentre - Closed |
27 Jun 2012 13:27:00 |
Details 27 Jun 2012 09:30:41 |
There is currently a problem with IPv6 routing in our Maidenhead datacentre. This will be affecting Ethernet customers as well as access to some of our servers over IPv6.
|
Update
27 Jun 2012 09:32:01
|
IPv6 is now routing again.
|
Update
27 Jun 2012 09:35:47
|
This looks like one of our transit links was announcing IPv6, but not routing it. We've taken the link down for the time being whilst we investigate further.
|
Update
27 Jun 2012 09:42:25
|
Transit provider confirm that they are investigating.
|
| Started |
27 Jun 2012 09:24:00 |
Closed 27 Jun 2012 13:27:00 |
There was a problem with a link to a transit provider. They have now resolved this, and are looking into how this problem can be prevented in the future. |
| Email & Web Server Problem - Closed |
20 Jun 2012 12:00:09 |
Details 20 Jun 2012 11:54:42 |
One of our storage servers used for email and web pages needs to be rebooted. This is happening now.
Web pages and email access will be affected for a couple of minutes.
|
Update
20 Jun 2012 11:57:32
|
The Disk server is booting back up now, service will be restored very shortly.
|
Update
20 Jun 2012 12:00:03
|
Web and email servers are now back online.
|
| Started |
20 Jun 2012 11:54:02 |
Closed 20 Jun 2012 12:00:09 |
This problem was caused by some kind of file system problem which we will investigate. |
| Power outage in Maidenhead data centre - Closed |
31 May 2012 10:20:57 |
Details 29 May 2012 22:23:38 |
It looks like a few servers rebooted, which suggests a power problem. Trying to get some more information now.
It would affect web and email hosted services as well as VOIP.
|
Update
30 May 2012 00:09:53
|
It has been confirmed that this was a problem with power. It is being worked on.
There are a few minor problems, like one of our spam checking servers is offline (the other servers should cope), one of our customer facing SMTP relays died (no longer in DNS so not customer affecting) and customised voicemail messages are not working on our VoIP service.
The accounts system web interface is down too (although the accounts system itself is fine).
These are not major problems, and we'll continue to work on them in the morning.
|
Update
30 May 2012 00:54:11
|
It looks like there are some ongoing connectivity problems with Pulsant's transit feed. They are manifesting themselves as brief outages, but it doesn't look too serious.
We'll post details when we know more.
|
Update
30 May 2012 08:58:27
|
Power went off at 22:07:47 and was restored at 22:10:59
Engineers are on site this morning to resolve remaining issues with some servers.
Transit seems to be stable now.
|
Update
30 May 2012 09:55:53
|
Most services are working correctly - some issues with voicemail/announcement messages at present, but this is being worked on. One of the outgoing mail servers has not come back to life and will be worked on this morning so that the mail queue can be sent ASAP.
|
Update
30 May 2012 18:30:13
|
All services are working again now. The data centre have completed their investigations and have determined that the outage was caused by a faulty UPS component which failed during planned maintenance.
The power feed to the data centre is currently running without UPS backup and should be seen as at-risk in the event of a mains power outage. The data centre staff expect to have the UPS back in-line by 22:00 this evening.
|
| Started |
29 May 2012 22:07:47 by AAISP automated checking |
| Closed |
31 May 2012 10:20:57 |
| Maidenhead transit planned work - Completed |
31 May 2012 10:19:05 |
Details 25 May 2012 08:59:04 |
Pulsant are carrying out work on transit in Maidenhead.
This work will affect all services we host in Maidenhead, such as email, web and VOIP.
The work is scheduled for between:
23:00 30/05/2012 and 04:00 31/05/2012
A 10 minute disruption to service is expected.
|
| Started |
30 May 2012 23:00:00 by Datacentre |
| Previously expected |
31 May 2012 04:00:00 |
| Closed |
31 May 2012 10:19:05 |
| Power blip - Closed |
29 May 2012 22:11:08 |
Details 29 May 2012 22:25:56 |
Not sure what happened, but both core routers reported a power outage.
We'll try and find what happened.
|
| Started |
29 May 2012 22:07:47 |
| Closed |
29 May 2012 22:11:08 |
| Intermittent Packet Loss affecting Maidenhead Hosted Services - Closed |
16 May 2012 12:00:00 |
Details 16 May 2012 10:59:36 |
We are seeing intermittent packet loss affecting services in Maidenhead - this will be most noticeable on VoIP calls where there may be short periods of silence during a call.
The loss is sporadic: it lasts a few seconds and there are long periods of time where the loss is not present. As yet there is no clear pattern to pinpoint the cause - a tricky one to track down!
We are investigating and working with the Datacentre to fix this.
|
| Started |
16 May 2012 09:00:00 |
Closed 16 May 2012 12:00:00 |
We had put in a temporary fix whilst we get to the bottom of the problem. |
| Web & email storage problem - Closed |
04 May 2012 21:15:57 |
Details 04 May 2012 19:00:58 |
There's a problem with web and email disk storage in Maidenhead.
We're investigating.
|
Update
04 May 2012 19:16:08
|
Now back. We're investigating the cause of this.
|
| Closed |
04 May 2012 21:15:57 |
| Routing issue - Closed |
21 Apr 2012 12:05:00 |
Details 21 Apr 2012 12:06:35 |
There was a rather odd issue with routing in Maidenhead affecting several of our servers there. We think this may be an issue with a switch, but we have managed to get things working again. There may be some planned maintenance as a result later.
|
| Started |
21 Apr 2012 12:00:00 |
| Closed |
21 Apr 2012 12:05:00 |
| Email and Web Server Disk Storage Change - Completed |
29 Mar 2012 20:12:03 |
Details 22 Mar 2012 16:35:18 |
Email and web servers will have about 30 minutes of down time as we move the file server that they use.
During this time access to email and web pages will not be available.
Date: Thursday 29th
Starting time: 8pm
Reason: Moving to secondary file server to allow for software updates
|
Update
29 Mar 2012 19:52:27
|
This work will be starting shortly.
|
Update
29 Mar 2012 20:00:33
|
Web and incoming email (pop3/imap) servers will be offline for about 30 minutes from now.
|
Update
29 Mar 2012 20:08:51
|
Our webserver is now back online.
|
Update
29 Mar 2012 20:11:29
|
Email services are back up now.
|
| Started |
29 Mar 2012 20:00:00 by AAISP Staff |
| Previously expected |
29 Mar 2012 21:00:00 |
Closed 29 Mar 2012 20:12:03 |
The move has been completed as planned, services were restored within about 11 minutes. |
| Maidenhead Datacentre Packetloss - Affecting VoIP/Email/Web Server - Closed |
27 Mar 2012 15:10:25 |
Details 27 Mar 2012 14:54:33 |
There is a network problem in our Maidenhead datacentre at the moment. This will be affecting some of our services, mainly VoIP, email (sending and receiving), our web servers and hosted servers.
VoIP customers will be most affected and may have some calls with audio problems (breaking up).
|
Update
27 Mar 2012 14:55:10
|
We are tracking down the source of the problem
|
Update
27 Mar 2012 15:04:17
|
We have tracked the source and are in the process of shutting down the compromised customer machine.
|
Update
27 Mar 2012 15:12:16
|
The compromised machines have been shut down.
|
| Started |
27 Mar 2012 14:22:47 |
Closed 27 Mar 2012 15:10:25 |
Compromised customer servers have been shut down. |
| Email and Web Server Problems - Closed |
22 Mar 2012 16:30:00 |
Details 22 Mar 2012 16:13:47 |
Our main disk storage server is having problems at the moment, this is affecting email and web server storage.
Engineers are investigating.
|
Update
22 Mar 2012 16:19:44
|
One of our disk servers is being rebooted.
|
Update
22 Mar 2012 16:20:29
|
POP/IMAP services back
WWW Service back
|
Update
22 Mar 2012 16:24:10
|
We believe we know what caused this. We will do planned work to apply an update.
|
Update
27 Mar 2012 10:18:24
|
Please see http://status.aa.net.uk/apost.cgi?incident=1450 for planned works
|
| Started |
22 Mar 2012 16:10:00 |
| Closed |
22 Mar 2012 16:30:00 |
| UPS Work (No user impact) - Completed |
16 Jan 2012 23:59:00 |
Details 09 Jan 2012 12:32:21 |
We have been informed by the Maidenhead datacentre that they have some planned work on their UPS systems.
They will be carrying out a replacement exercise on end-of-life capacitors in the UPS.
The unit will be taken off-line for the duration of the work; however, due to the configuration of the installation, there will be no impact to the electrical supply or resilience to equipment.
The work will be carried out on Monday 16th January, starting at 21:00 and should last 5 hours.
This work is factory advised and forms part of their robust Planned and Preventative maintenance regime.
|
| Started |
16 Jan 2012 21:00:00 |
| Closed |
16 Jan 2012 23:59:00 |
| Possible Routing Problem - Closed |
09 Feb 2012 14:57:09 |
Details 09 Feb 2012 14:49:23 |
We're investigating reports of strange routing problems involving access to/from our Maidenhead data centre.
|
| Started |
09 Feb 2012 14:35:00 |
Closed 09 Feb 2012 14:57:09 |
Post moved over to http://status.aa.nu/apost.cgi?incident=1394 |
| Problem in Maidenhead - affecting VoIp/ethernet/email/etc - Closed |
26 Jan 2012 13:10:00 |
Details 24 Jan 2012 21:16:14 |
We are seeing a major issue in Maidenhead - high levels of transit packet loss that will be affecting VoIP, email, and Ethernet customers (as well as our offices).
|
Update
24 Jan 2012 21:47:11
|
Looking like some sort of denial of service attack.
|
Update
24 Jan 2012 22:34:04
|
Sorry for the delay posting more details. This affects our links directly, which is making it difficult. The problem appears to be a huge denial of service attack involving tens of thousands of sessions and filling gigabit links.
We have identified the target and disconnected it, black holed the target and even tried to divert traffic but to no avail as yet.
We are still working on this.
|
Update
24 Jan 2012 23:03:38
|
Just an update to say that this is still being worked on...
|
Update
24 Jan 2012 23:23:30
|
Still working on this
|
Update
24 Jan 2012 23:28:57
|
The problem is now only affecting Ethernet customers on the same block as the address being DDOS'd. We are still working on the issue. Most other services will be working fine now.
|
Update
24 Jan 2012 23:57:55
|
Some side effects on other services from Maidenhead, but we are still working on narrowing down the issue.
|
Update
25 Jan 2012 00:01:48
|
DOS attacks are, thankfully, rare. This has to be the biggest we have seen.
We will, of course, be talking to the customer who is being DOSed to find what could have provoked such a major attack. There is usually a reason.
|
Update
25 Jan 2012 13:20:48
|
The blackhole for the target machine was removed at 1pm today, however the traffic was still being sent and affected VoIP, email and Ethernet services.
The block is now in place again, and we'll continue to investigate.
|
| Started |
24 Jan 2012 21:00:00 |
| Closed |
26 Jan 2012 13:10:00 |
| Maidenhead datacentre problems again - Closed |
25 Jan 2012 13:14:06 |
Details 25 Jan 2012 13:12:18 |
Similar to last night, access to servers and services in Maidenhead has high packet loss.
This will affect email, voip and Ethernet services.
Update to follow shortly.
|
Update
25 Jan 2012 13:13:19
|
Datacentre staff are working to blackhole the IP address that is the target of this attack.
|
Update
25 Jan 2012 13:14:43
|
The target IP address has been blackhole'd and service has been restored.
|
| Started |
25 Jan 2012 13:02:00 |
Closed 25 Jan 2012 13:14:06 |
We'll update the initial post from yesterday with further updates to this. http://status.aa.net.uk/apost.cgi?incident=1364 |
| Server Moving - Not Customer Affecting - Completed |
19 Jan 2012 13:00:00 |
Details 18 Jan 2012 14:48:38 |
We'll be moving some servers around between racks in the Maidenhead datacentre on 19th January. These servers are mainly related to email services, but we are not expecting customers to notice anything whilst the moves take place.
The servers moving will be:
- tertiary-mx.co.uk (the backup email relay)
- A couple of the Spam checking servers (there are many, so one being down at a time won't cause a problem)
- One of the outgoing SMTP servers (which is offline at the moment)
|
| Started |
19 Jan 2012 11:00:00 |
Closed 19 Jan 2012 13:00:00 |
This work has been completed. |
| Web Services - Closed |
28 Nov 2011 13:46:02 |
| Email and Web Server Work - Completed |
20 Nov 2011 11:29:06 |
Details 17 Nov 2011 16:09:45 |
On Sunday morning we'll be moving over the storage server that the email and web services use. The work should take less than 30 minutes, but as the work is carried out access to email and web sites will be unavailable.
We aim to make this move as quickly as possible.
|
Update
20 Nov 2011 11:01:03
|
FTP & RSYNC access to our web server has been stopped.
The work of moving over the storage servers will begin shortly. We anticipate this to take less than 30 minutes.
|
Update
20 Nov 2011 11:04:20
|
The Work is starting now, access to websites we host and email will be unavailable whilst this is carried out.
|
Update
20 Nov 2011 11:23:16
|
Web pages are now being served again.
|
Update
20 Nov 2011 11:24:28
|
Email is back up
|
Update
20 Nov 2011 11:28:02
|
FTP and rsync access to our web server is now running.
|
Update
20 Nov 2011 11:30:38
|
This work is almost complete, customers should be back on email and their web pages being served again.
|
| Started |
20 Nov 2011 11:00:00 |
| Previously expected |
20 Nov 2011 12:00:00 |
| Closed |
20 Nov 2011 11:29:06 |
| Network glitch affecting voice and ethernet - Closed |
02 Nov 2011 11:25:49 |
Details 02 Nov 2011 11:30:52 |
There appears to have been a severe network glitch affecting both diverse routes out of the Maidenhead data centre. Routing is recovering now, but this would have affected Ethernet customers and VoIP customers the most. Some authentication of DSL lines may have been delayed. Access to our email and web servers and other hosted services would also have been affected.
The incident appears to have lasted a few minutes. We are trying to get more details.
|
Update
02 Nov 2011 11:49:03
|
The carriers have confirmed they had an outage and should send an explanation shortly.
|
Update
02 Nov 2011 21:49:20
|
Carriers explain the fault as:
The cause of this incident was traced to events on the network which caused high CPU load on the transit routers. This then resulted in router protocol instability which affected transit services. We have since stabilised the network and are developing solutions to be implemented which should reduce the impact of such events in the future.
Loss of connectivity was detected at 11:24 with service restored by 11:27.
Please accept our apology for any inconvenience caused.
|
| Started |
02 Nov 2011 11:24:02 |
| Closed |
02 Nov 2011 11:25:49 |
| Moving Servers - Legacy Email, Primary DNS, Wiki - Completed |
23 Sep 2011 15:50:12 |
Details 07 Jun 2011 09:54:26 |
We'll be moving a couple of servers between datacentres. Their IP addresses will not be changing, it's just a physical move.
The servers are:
- A.Hopeless - one of our legacy email servers. Receiving email will not be possible during this time.
- Primary-dns.co.uk - one of our internet facing DNS resolvers. We don't anticipate that service will be affected as the secondary DNS server will be available.
- Customer Wiki - access to wiki.aaisp.org.uk will be unavailable during this time.
A couple of other internal use servers will be moved at the same time.
Start Time: 17:30
Duration: 1 hour
|
Update
10 Jun 2011 18:15:40
|
This work is nearly complete - just waiting for various email services to start up.
|
| Started |
10 Jun 2011 17:30:00 by AAISP Staff |
| Closed |
23 Sep 2011 15:50:12 |
| Minor blip over night - Closed |
08 Sep 2011 03:08:00 |
Details 08 Sep 2011 03:46:51 |
Our routers in the Maidenhead data centre had some issues over night. These may have caused a few seconds outage in some services and Ethernet access. Broadband services not affected.
|
| Started |
08 Sep 2011 03:02:00 |
| Closed |
08 Sep 2011 03:08:00 |
| At risk - router upgrade - Completed |
19 Jul 2011 17:38:58 |
Details 19 Jul 2011 17:25:55 |
We are again upgrading a router this evening - it should have little or no impact as the system should fall back to the secondary router. There is, as always, a risk.
These upgrades, when complete on both routers (not both done together!) will mean faster fallback in the event of a failure in future as we are upgrading to support VRRP3 with sub-second timing.
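To give a feel for why sub-second VRRP timing speeds up fallback, the sketch below applies the timer formulas from the VRRPv3 specification (RFC 5798), where intervals are expressed in centiseconds. The priority and advertisement intervals used here are hypothetical; this post does not state what values the routers are configured with, and integer arithmetic is an assumption of this sketch.

```python
def skew_time_cs(priority: int, adver_interval_cs: int) -> int:
    """Skew_Time = ((256 - priority) * Master_Adver_Interval) / 256, in centiseconds."""
    return ((256 - priority) * adver_interval_cs) // 256


def master_down_interval_cs(priority: int, adver_interval_cs: int) -> int:
    """Master_Down_Interval = 3 * Master_Adver_Interval + Skew_Time, in centiseconds.

    This is how long a backup router waits without hearing an advertisement
    before it takes over as master.
    """
    return 3 * adver_interval_cs + skew_time_cs(priority, adver_interval_cs)


# Hypothetical backup router with priority 100.
one_second_adverts = master_down_interval_cs(100, 100)  # adverts every 1 s
sub_second_adverts = master_down_interval_cs(100, 10)   # adverts every 100 ms

if __name__ == "__main__":
    print(one_second_adverts / 100, "seconds")
    print(sub_second_adverts / 100, "seconds")
```

With these example values the takeover delay drops from 3.6 seconds to 0.36 seconds.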
|
Update
19 Jul 2011 17:34:25
|
Arrrg, why is this never simple.
|
Update
19 Jul 2011 17:39:06
|
Abandoned for now - maybe later.
|
Update
19 Jul 2011 18:17:34
|
All sorted
|
| Started |
19 Jul 2011 17:30:00 |
| Previously expected |
19 Jul 2011 17:35:00 |
| Closed |
19 Jul 2011 17:38:58 |
| Router upgrade (short notice) - Completed |
17 Jul 2011 19:08:56 |
Details 17 Jul 2011 18:45:47 |
We will be upgrading one of the main routers this evening to the latest release. Sorry for short notice. As usual this should have little or no impact on services.
|
Update
17 Jul 2011 19:05:24
|
Not playing the game quite - so may have a few seconds disruption...
|
Update
17 Jul 2011 19:09:04
|
Completed
|
| Started |
17 Jul 2011 19:00:00 |
| Previously expected |
17 Jul 2011 19:05:00 |
| Closed |
17 Jul 2011 19:08:56 |
| Scheduled Power Systems Maintenance 10 March Evening - Completed |
10 Mar 2011 23:00:00 |
Details 09 Mar 2011 11:02:56 |
Please be aware that testing of the redundant power ATS (Automatic Transfer Switch) equipment at the Maidenhead datacentre will be carried out on Thursday 10th March at 19:00 and conclude by 23:00 on the same day.
Whilst the work taking place is non-intrusive and power redundancy will still be available at all times, customers should treat this as an at risk period.
|
| Started |
10 Mar 2011 19:00:00 by Datacentre |
| Closed |
10 Mar 2011 23:00:00 |
| Incident in maidenhead - Closed |
18 Mar 2011 11:54:30 |
Details 17 Mar 2011 10:22:00 |
We have lost comms with Maidenhead and we have an engineer going to site now. We are not sure what the issue is, but it may be power related.
Email, VOIP and some other services will be affected.
This is also affecting Ethernet customers and hosted servers in Maidenhead.
There appears to have been a fire alarm that has gone off and the data centre has been evacuated. There is no evidence of a fire, but power is down.
|
Update
17 Mar 2011 10:25:11
|
Staff are just approaching the data centre now.
|
Update
17 Mar 2011 10:37:59
|
Power is being restored now
|
Update
17 Mar 2011 10:49:15
|
Our engineers are on site and power has been restored, servers of ours are coming back on line, further updates will be posted when we get them
|
Update
17 Mar 2011 10:56:15
|
Not all power has been restored yet. Some services (control pages, VOIP, web) are still down. They should be restored shortly.
|
Update
17 Mar 2011 11:13:10
|
VoIP and control pages are back. Email and web should be back soon.
|
Update
17 Mar 2011 11:22:00
|
The A VoIP server is still down.
|
Update
17 Mar 2011 11:56:02
|
Email servers are mostly back, and web services are back. We've still got some voip problems and are working on it.
|
Update
17 Mar 2011 11:58:56
|
The A voip server has a database problem, and won't let customers register.
|
Update
17 Mar 2011 12:02:12
|
There is now a database problem on C SIP server too. Investigating.
|
Update
17 Mar 2011 12:08:07
|
Database fixed on C.
|
Update
17 Mar 2011 12:21:07
|
Database problems fixed on A and C servers.
|
Update
17 Mar 2011 15:44:43
|
Most services are back up now; we have had a number of hardware failures as part of the power outage incident.
Currently the main problem is our email ticketing server - this is affecting emails to support/sales/accounts etc - and so is causing a delay in email replies.
There are also problems with:
- The online ordering system
- ADSL usage reporting
- ADSL line status on Clueless
Other servers still have problems which we are working through, but other servers are managing with the load (many services have multiple servers).
|
Update
17 Mar 2011 17:15:23
|
The odd effect with lines not showing as on-line properly on clueless is fixed, and lines will clear properly over night as a result. PPP restarts of lines are needed but this is done automatically in stages to minimise disruption.
|
Update
17 Mar 2011 17:15:37
|
On-line ordering restored a little while ago.
|
Update
17 Mar 2011 17:18:23
|
I would just like to say that I am very pleased with how my staff have handled this today - tackling the issues in a sensible priority and updating status pages. This is a major issue with not just a power outage, but issues with access to the building, and possibly even a power surge, as several pieces of equipment have failed totally. The backup arrangements for critical systems have worked as expected, as has the maintenance of broadband internet access, DNS, and RADIUS authentication. Well done everyone. We'll try and get a more detailed explanation from the data centre in due course. Staff are working on the last of the issues now.
|
Update
17 Mar 2011 18:38:16
|
thankless (ticketing) still down and being rebuilt now.
|
Update
18 Mar 2011 00:46:47
|
We have now got our email ticketing system back online - we do apologise for the time this has taken, and the delay this has caused to email to support, sales and accounts.
|
Update
18 Mar 2011 11:55:08
|
We'll close this incident for now - but will add the official response from BlueSquare when they have let us know.
|
Update
21 Mar 2011 11:50:27
|
This is the official report from BlueSquare (Our racks are in the building called BS2)
This is a Reason for Outage Report with details regarding the power supply in BS2/3 with BlueSquare Data Services Ltd.
At 10:06 on Thursday 17th March one of the six UPS modules located in BlueSquare 2/3 suffered a critical component failure which resulted in a dead short on the output side (critical load side) of the UPS. This failure also caused an amount of smoke to be released by the failed UPS system which resulted in the fire alarm activating and the fire service attending. Once the fire service was happy with the situation we were able to restore power to the site via the generators with the UPS system bypassed whilst we investigated the fault further.
Due to the short circuit occurring on the output side of the UPS this meant the other UPS’s immediately went into an overload condition which then switched all modules into bypass mode, as per the design of the system. This overload then transferred to the raw mains and tripped the main incomer to the site. This caused the overload condition to cease and power was lost to the site. The UPS manufacturers then worked to check all the remaining UPS modules to ensure the same component was within specification, and to fully test each UPS system, replacing some components where necessary. No further faults were found on the remaining UPS modules, and load was then switched back to full UPS protection at approx 02:15 and building load was transferred back from the generators to utility mains at approx 02:25.
Due to the size of the failure we have commissioned an independent organisation to forensically examine the failed UPS module. This work is scheduled to be completed next week and we will provide further details once we receive their report. This was an extremely unusual type of failure and the manufacturers have not experienced such a problem before, despite over 3,000 similar UPS units being deployed. This suggests there isn’t an inherent design problem in the units but we will not reach any conclusions until the forensic examination is complete.
The failed UPS module will be replaced within the next 4 weeks and until that time we will remain on ‘N’ redundancy level at BlueSquare 2 & 3. Further updates will be provided before this replacement work takes place.
A number of customers have asked why this failure could occur when we operate an N+1 UPS architecture. The reason is that all six UPS modules in BlueSquare 2/3 are paralleled together as one large UPS system. BlueSquare 2/3 only requires 5 modules to hold the critical load to the site; the additional unit provides redundancy in the event of a UPS module failure. However, as this failure was on the common critical load side of the UPS (the same output that feeds the distribution boards, which in turn feed the racks) and all the UPS systems are paralleled together, it had the effect of causing all UPS modules to go down.
As an example, in an N+N configuration, such as in our Tier IV Milton Keynes site, a failure of this nature would not be possible, as two banks of independent UPS systems operate, providing true A&B feeds to each rack.
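The difference between the two architectures can be illustrated with a small model. This is only a rough sketch: the class and function names are made up for illustration, and the module counts for the N+N site are assumed (the report only gives the 6-installed, 5-required figures for BlueSquare 2/3).

```python
from dataclasses import dataclass


@dataclass
class UpsBank:
    """One set of UPS modules paralleled onto a single shared output bus."""
    modules: int   # modules installed in the bank
    required: int  # modules needed to hold the critical load


def site_stays_up(banks, failed_modules=0, bus_fault_on=None):
    """Return True if at least one bank can still carry the load.

    failed_modules is applied to every bank (a pessimistic assumption).
    bus_fault_on is the index of a bank whose shared output bus has
    shorted; such a fault takes out every module in that bank.
    """
    for index, bank in enumerate(banks):
        if index == bus_fault_on:
            continue  # an output-side fault drops the whole paralleled bank
        if bank.modules - failed_modules >= bank.required:
            return True
    return False


# N+1 as described for BlueSquare 2/3: six modules, five required, one bus.
n_plus_1 = [UpsBank(modules=6, required=5)]

# N+N with independent A and B banks (module counts are assumed here).
n_plus_n = [UpsBank(modules=5, required=5), UpsBank(modules=5, required=5)]

print(site_stays_up(n_plus_1, failed_modules=1))  # single module failure
print(site_stays_up(n_plus_1, bus_fault_on=0))    # fault on the common output
print(site_stays_up(n_plus_n, bus_fault_on=0))    # same fault, N+N site
```

In this model the N+1 bank rides through the loss of one module but not a fault on its common output, while the N+N arrangement still has an independent second bank feeding each rack.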
|
| Started |
17 Mar 2011 10:00:20 |
| Previously expected |
17 Mar 2011 11:20:20 |
| Closed |
18 Mar 2011 11:54:30 |
| Urgent router maintenance - Completed |
11 Mar 2011 17:04:54 |
Details 11 Mar 2011 16:31:37 |
We are expecting to do some work on routers in Maidenhead. This should have minimal impact as we can work on one at a time.
Sorry for the very short notice.
|
Update
11 Mar 2011 17:28:54
|
We are actually doing another restart now.
|
| Started |
11 Mar 2011 17:00:00 |
| Closed |
11 Mar 2011 17:04:54 |
| Web Server Disk Storage Work - Completed |
30 Jan 2011 22:54:32 |
Details 26 Jan 2011 11:08:57 |
We will be doing some work on the storage servers that are used by our Web Server on Sunday. This will mean a couple of hours where FTP and RSYNC access will be disabled.
|
Update
30 Jan 2011 21:26:13
|
This work is now starting. FTP and rsync access to our webserver will be unavailable for the next couple of hours.
|
| Previously expected |
30 Jan 2011 23:00:00 (Last Estimated Resolution Time from AAISP) |
Closed 30 Jan 2011 22:54:32 |
This work has been completed. The webserver is now using new storage servers.
Please report any problems to support. |
| VoIP and Email Problems due to Datacentre Connectivity - Closed |
29 Dec 2010 12:43:00 |
Details 29 Dec 2010 12:35:23 |
We currently have routing problems to our datacentre in Maidenhead. This will be affecting access to:
- Email - incoming and outgoing
- VoIP
- Hosted server
- Control Pages (Clueless)
We have engineers looking into this at the moment, and will post another update shortly.
|
Update
29 Dec 2010 12:47:47
|
This is now working. It seems to be some routing/peering problem outside of our and BlueSquare's networks. If we get any more details we'll post an update.
|
| Started |
29 Dec 2010 12:20:00 |
| Closed |
29 Dec 2010 12:43:00 |
| Router restart - at risk - Completed |
11 Oct 2010 17:11:51 |
Details 11 Oct 2010 11:52:45 |
We are restarting one of the routers in Maidenhead this evening. It should be seamless in terms of routing, though we have seen blips on IPv6 when doing this in the past, so there is a risk of disruption for a few seconds.
Maidenhead handles Ethernet customers, our offices, and access to many of our servers, including VoIP for call set up and recorded calls.
|
| Closed |
11 Oct 2010 17:11:51 |
| Switch Reboot - Completed |
02 Mar 2010 19:10:00 |
Details 02 Mar 2010 19:05:36 |
We are just about to reboot a switch in our Maidenhead datacentre. This will affect a few hosted customer servers and some AAISP servers. It's expected to only mean a few minutes of downtime. Sorry for the short notice.
|
| Started |
02 Mar 2010 19:06:07 |
| Previously expected |
02 Mar 2010 19:10:00 |
| Closed |
02 Mar 2010 19:10:00 |
| Switch problem in Maidenhead, affecting some Hosted customers - Closed |
02 Mar 2010 21:21:52 |
Details 02 Mar 2010 20:56:44 |
Another switch, not the one we rebooted earlier this evening, is currently being rebooted. This was not planned and is not related to the work we've been doing this evening with email.
This will affect some hosted customers for a few minutes; we do apologise.
|
Update
02 Mar 2010 21:02:49
|
This will also affect outgoing mail (sending mail via smtp.aaisp.net.uk).
It's expected to be resolved in a few minutes though.
|
Closed 02 Mar 2010 21:21:52 |
This took rather longer than expected, but has now been done and servers are now visible again. |
| Loss of interconnect routing - Closed |
15 Feb 2010 20:15:55 |
Details 15 Feb 2010 19:50:00 |
Some minor router reconfiguration work this evening, intended to add routing resilience, had an unexpected side effect that took a few minutes to rectify.
This specifically affected the routing between London and Maidenhead, meaning broadband customers lost access to email and VoIP, and Ethernet customers lost access to A&A servers in London and broadband lines.
|
| Started |
15 Feb 2010 19:50:00 by AAISP Staff |
Closed 15 Feb 2010 20:15:55 |
Routing configuration has been corrected. |
|
|