| |
Currently Open Posts
| Per IP Stats - Open |
3 Jan 09:59:26 |
Details 3 Jan 09:59:26 |
The 'per IP stats' usage pages are not working very well at the moment. Some customers are not seeing any usage, some are seeing very little usage, and others are seeing correct amounts of usage!
This is being looked into.
This does not affect the 'Line' usage or any billing based usage records.
|
Update
11 Apr 12:25:10
|
We are sorry for how long this has been ongoing. We are still trying to resolve the issue, however it is taking a lot longer than we expected.
We do not currently have an ETA however we will update this status once we have further news.
|
| Started |
3 Jan 09:56:05 |
All Broadband Minor Outage Posts
| BE Outages at Shepherds Bush, Kensington Gardens and Acton Exchanges - Closed |
17 May 09:22:49 |
Details 16 May 15:25:06 |
O2 have informed us that there is currently an outage at the Shepherds Bush (LWSHE), Kensington Gardens (WRKGDN) and Acton (LWACT) exchanges.
Engineers are currently investigating and O2 expect to have an update within the next 2 hours.
|
Update
17 May 09:23:23
|
This is now resolved. We have not been given any information as to the cause or fix for this outage.
|
| Started |
16 May 15:23:05 |
| Closed |
17 May 09:22:49 |
| 1/3rd of lines PPP restarted - Closed |
15 May 14:01:02 |
Details 15 May 13:48:38 |
Due to human error, we have managed to clear PPP on around 1/3 of lines (B.gormless) today.
We are looking into how procedures can be tightened up to avoid this in future.
|
Update
15 May 13:55:50
|
This was a controlled restart of the LNS not a crash, and as such the PPP restarts were even quicker as devices did not have to wait for LCP timeout to restart.
And yes, we are seriously looking at the best way to avoid this sort of error happening again.
And yes, the human concerned was me - RevK - so I owe a few people a pint...
|
| Started |
15 May 13:45:00 |
Closed 15 May 14:01:02 |
The answer is css - the control pages of the various machines currently all look identical apart from the host name. Using some css we can make all "live" boxes stand out very clearly. |
| Lines dropped: 100% 21CN-REGION-21CN-BRAS-RED10-L-STE and 100% 21CN-REGION-21CN-BRAS-RED13-L-STE and 99% 21CN-REGION-L-STE - Closed |
14 May 09:33:53 |
Details 14 May 02:21:02 |
Lines: 100% 21CN-REGION-21CN-BRAS-RED10-L-STE and 100% 21CN-REGION-21CN-BRAS-RED13-L-STE and 99% 21CN-REGION-L-STE dropped at 2013-05-14 02:19:57. We have advised BT. This is likely to have affected multiple internet providers using BT.
|
Update
14 May 09:35:21
|
After chasing BT on the cause of this outage, it would appear that this was caused by planned works at London Stepney Green.
The aim of these planned works was to increase capacity.
|
| Broadband Users Affected |
3% |
| Started |
14 May 02:19:57 by AAISP automated checking |
| Closed |
14 May 09:33:53 |
| Cause |
BT |
| Issues with c.gormless - Closed |
2 May 02:14:25 |
Details 30 Apr 19:18:22 |
We are trying to get one of our LNSs working on latest code, and this is proving a struggle as there are some major upgrades. We hope to finish this as per the planned work during this evening. This may mean loss of graphs and some PPP restarts for some customers.
|
Update
30 Apr 21:32:21
|
This is being an issue, and we will be working on this over night we expect.
|
Update
1 May 00:46:20
|
We think working over night on this may have paid off - we're monitoring now, but we hope we have cracked the issue at last.
|
Update
1 May 01:03:25
|
c.gormless is now running at normal load (1/3 of customers) and showing no leaked RADIUS slots or stuck sessions or processes. This is good news.
|
Update
1 May 06:30:12
|
Monitoring is showing the LNS appears to be stable, but we have seen some issues with stats being recorded on our core RADIUS still. This is somewhat easier to address, and being looked at still.
|
Update
1 May 17:17:34
|
It looks like the work last night has paid off. This evening we have the "D" LNS working on the newest code, and are letting lines trickle over to it when they reconnect for any reason. We plan to switch over more lines over night.
But so far it is looking like it is working as planned. We are checking the performance of the new code as well to make sure this was all worth it.
Thank you all for your patience on this.
|
Update
2 May 02:09:03
|
We have switched traffic to d.gormless over night as planned.
The LNSs are looking stable - we are still seeing a few sessions from the previous few days which have not had proper accounting (so free usage) which are being cleared up.
We are also seeing the odd BT issue from time to time, and it is nice to say that a blip is not, in fact, my fault - more "situation normal".
We're still looking into why people are getting the odd up/down emails and tweets as it is all related. Now that the LNSs are stable we can investigate this somewhat more easily.
|
Update
3 May 04:57:05
|
The scheduled LNS switch overs to move on to the new code base seem to be working as planned. Tonight's has gone through without any apparent issues. We will be rolling out the change over the next couple of nights to cover all LNSs.
Once again, thank you all for your patience and understanding.
|
| Started |
30 Apr 19:17:18 |
| Closed |
2 May 02:14:25 |
| Packet loss on BE lines - Closed |
7 May 21:00:00 |
Details 7 May 13:46:00 |
There is packet loss on BE lines throughout the country. We have reported this to BE and they are going to investigate.
|
Update
7 May 14:54:25
|
We've had reports that BE Retail customers are also seeing a similar problem, so this seems like a general BE network/backhaul problem.
We have reported this to BE.
|
Update
7 May 15:58:26
|
BE are aware of a possible core network problem and they are investigating.
|
Update
7 May 16:12:51
|
BE have said that this is due to a technical fault on their network which they are investigating.
|
Update
8 May 08:38:02
|
This was resolved just before 9pm - repairs to fibres were made within the BE network.
|
| Started |
7 May 12:18:00 by AAISP Staff |
| Closed |
7 May 21:00:00 |
| Pink status and odd emails - Closed |
2 May 02:14:51 |
Details 18 Apr 18:35:10 |
It looks like we have a blip in RADIUS around 5pm which has resulted in lines showing "pink" or "salmon" on the management pages and some people getting odd status update emails.
Services are working - this is an accounting issue.
We're not sure what happened exactly, but this is a scenario that the systems are designed to cope with relatively sensibly. Some usage may not be metered for the next few hours and some lines will be PPP restarted over night.
We are also doing an LNS switch over tonight, but there are issues with that which means it is likely to all happen later than usual, i.e. around 7am.
As it happens we are working on some updates to RADIUS which is hoped to be phased in over the next few weeks. Incidents like this, whilst minor, and not service affecting, are a nuisance, and they are being incorporated in to the new design to make such issues less likely.
|
Update
18 Apr 18:58:48
|
We're clearing the pink lines now, ppp restart.
|
Update
28 Apr 21:12:03
|
Some customers on C or D gormless will have gaps in their graphs this afternoon and this evening. The lines have been connected through these gaps but have had some PPP restarts.
|
| Started |
18 Apr 17:00:00 |
| Previously expected |
19 Apr 07:00:00 |
| Closed |
2 May 02:14:51 |
| Session restarts - Closed |
29 Apr 08:31:03 |
Details 29 Apr 08:31:52 |
Some customers will have had PPP restarts around 08:30 and possibly moving from the D to the C LNS.
Sorry for any inconvenience.
|
| Started |
29 Apr 08:20:00 |
| Closed |
29 Apr 08:31:03 |
| RADIUS issues - Closed |
28 Apr 18:09:25 |
Details 28 Apr 12:30:40 |
We seem to have an issue with RADIUS accounting this morning. We have restarted things and we are investigating the cause.
There have been some odd tweets and emails about the issue, but it lists the previous "AdminReset" from a previous LNS switch over from earlier in the week. This seems to be an odd side effect of the problem which we also need to investigate.
|
Update
28 Apr 12:33:22
|
P.S. As usual, this also means some free usage, and will mean some PPP restarts later today and over night.
|
Update
28 Apr 12:33:58
|
It also means the lines show "salmon pink" on the control pages. But the lines are working, of course.
|
Update
28 Apr 16:34:57
|
We are manually clearing some of the remaining sessions for which there is no accounting. This means a brief PPP restart for some customers.
|
Update
28 Apr 17:26:29
|
There is definitely something up with RADIUS which we are still chasing.
Two thirds of lines are all sorted and accounting cleanly now.
One third is still being worked on.
|
| Started |
28 Apr 08:00:00 |
Closed 28 Apr 18:09:25 |
We think it is all sorted for now |
| BE Congestion - Closed |
17 Apr 11:18:24 |
Details 5 Apr 12:38:03 |
We have noticed that for the last few days there has been increased latency on quite a few of our BE lines. Initially the latency was occurring between 18:00 - 00:00, however today we have seen the latency start at 09:00.
BE have been informed and this has been passed to O2 to investigate possible network issues.
|
Update
5 Apr 20:51:53
|
Looks like the increased latency suddenly stopped around 1:30. Neither BE nor O2 could see a cause for this.
We will monitor over the weekend and early next week to make sure the issue is actually resolved.
|
Update
11 Apr 11:05:47
|
The increased latency has not returned as of yet, however we are still chasing BE for an explanation of what the issue was.
|
Update
17 Apr 11:19:12
|
We finally have an update as to what the cause of this issue was:
"I can confirm that the underlying cause had been narrowed down to a network card in our provider's core network. The card was found to be just looping and generating high amounts of traffic. As a countermeasure, an engineer physically rebooted the device and thereby resolved the issue. The card itself has remained stable since and no further concerns have been registered."
|
| Started |
5 Apr 11:58:06 by AAISP Pro Active Monitoring Systems |
Closed 17 Apr 11:18:24 |
Card Reboot |
| Cause |
BE |
| Lines dropped: 100% 21CN-BRAS-RED5-LS-BAS-HUDDERSFIELD - Closed |
11 Apr 11:03:36 |
Details 5 Apr 02:06:02 |
Lines: 100% 21CN-BRAS-RED5-LS-BAS-HUDDERSFIELD dropped at 2013-04-05 02:05:39. We have advised BT. This is likely to have affected multiple internet providers using BT.
|
Update
5 Apr 03:05:03
|
Lines: 100% 21CN-BRAS-RED5-LS-BAS-HUDDERSFIELD dropped again at 2013-04-05 03:04:35.
|
Update
11 Apr 11:03:52
|
This appears to have been caused by a BT PEW.
|
| Started |
5 Apr 02:05:39 by AAISP automated checking |
| Closed |
11 Apr 11:03:36 |
| Cause |
BT |
| LNS issue - loss of graphs - Closed |
4 Apr 19:45:25 |
Details 4 Apr 15:40:06 |
One of our LNSs had an issue today; lines blipped and reconnected automatically. Graphs are lost from before the incident.
This is being investigated.
This only affected one of the three live LNSs.
|
| Broadband Users Affected |
33% |
| Started |
4 Apr 15:03:00 |
Closed 4 Apr 19:45:25 |
This happened again, and we are investigating the cause. Thankfully the fallback systems are very quick and efficient these days, but this should not happen! |
| Packet loss is back on lines on the MANOR PARK exchange - Closed |
1 Apr 10:35:28 |
Details 5 Mar 10:58:09 |
Packet loss is back on lines on the MANOR PARK exchange. We have reported this to BT and BTO are going to investigate.
|
Update
22 Mar 11:18:27
|
BT's IP department are currently investigating and monitoring this issue.
|
| Started |
2 Feb 12:57:22 by AAISP Staff |
| Closed |
1 Apr 10:35:28 |
| Lines dropped: 100% 21CN-REGION-21CN-BRAS-RED10-L-NWS - Closed |
1 Apr 10:34:08 |
Details 27 Mar 16:27:02 |
Lines: 100% 21CN-REGION-21CN-BRAS-RED10-L-NWS dropped at 2013-03-27 02:24:36. We have advised BT. This is likely to have affected multiple internet providers using BT.
|
| Started |
27 Mar 02:24:36 by AAISP automated checking |
| Closed |
1 Apr 10:34:08 |
| Cause |
BT |
| LINX peering loss - Closed |
1 Apr 10:33:41 |
Details 23 Mar 19:02:10 |
An issue was reported this evening with packet loss at LINX. We have suspended LINX peering for the time being until this was resolved.
We'd like to thank customers for alerting us to this with correct use of our MSO text system.
|
| Started |
23 Mar 18:37:00 |
| Previously expected |
23 Mar 18:42:00 |
| Closed |
1 Apr 10:33:41 |
| Home::1 restrictions - Closed |
1 Apr 09:57:09 |
Details 1 Apr 09:47:59 |
Not quite an April fool's joke, sadly, but some Home::1 users have seen their lines restricted this morning. This appears to be due to a delay between bill for this month being issued and DD being scheduled. This is silly! It has been fixed for next month.
Quite separately, something is not right with RADIUS accounting which means a lot of users are not seeing any metered usage, i.e. download is free for many customers this morning, including Home::1 users. This too is being fixed now.
|
| Started |
1 Apr 06:00:00 |
Closed 1 Apr 09:57:09 |
All affected lines have had the restriction lifted. |
| LNS issue, loss of graphs - Closed |
29 Mar 20:39:42 |
Details 29 Mar 20:42:11 |
One of our LNSs had an issue today, lines blipped and reconnected automatically. Graphs are lost from before the incident, and usage for the hour up to the incident will not have been metered/billed.
We have some clear ideas what caused this and it is being investigated.
This only affected one of the three live LNSs.
|
Update
29 Mar 21:03:53
|
Note we cleared some of the affected lines back to the original LNS at 9pm causing a PPP reconnect.
|
| Broadband Users Affected |
33% |
| Started |
29 Mar 18:42:00 |
| Closed |
29 Mar 20:39:42 |
| Evening Packet loss on BRADFORD-ON-AVON - Closed |
5 Mar 16:11:44 |
Details 14 Jan 15:57:15 |
For the last few nights there appears to have been packet loss on all 21CN BRADFORD-ON-AVON lines. The packet loss seems to last between 17:00 - 00:00.
We are currently reporting this to BT.
|
Update
14 Jan 16:43:31
|
BT have raised a works request.
|
Update
15 Jan 10:39:27
|
BT operate are still investigating, we should have an update tomorrow morning.
|
Update
5 Mar 16:11:59
|
This looks to be all fixed.
|
| Started |
14 Jan 15:55:26 by AAISP Staff |
| Closed |
5 Mar 16:11:44 |
| Cause |
BT |
| Packet loss on lines on the MANOR PARK exchange - Closed |
28 Feb 09:03:42 |
Details 27 Feb 13:17:30 |
There is packet loss on lines on the MANOR PARK exchange. We have reported this to BT and BTO are going to investigate.
|
| Started |
2 Feb 12:57:22 by AAISP Staff |
Closed 28 Feb 09:03:42 |
Service was fully restored this morning after loop testing BT performed cleared alarms. |
| Cause |
BT |
| Lines Down on BT BRAS 21CN-BRAS-RED10-SL and 21CN-BRAS-RED9-SL - Closed |
27 Feb 03:03:03 |
Details 27 Feb 02:31:51 |
Since 02:06 today, all lines connecting through the BT BRAS '21CN-BRAS-RED10-SL' and '21CN-BRAS-RED9-SL' have been down.
This is due to BT planned work on the Slough BRAS.
This will be affecting other ISPs too, and is affecting about 0.6% of our customers.
|
| Broadband Users Affected |
0.30% |
| Started |
27 Feb 02:06:45 |
Closed 27 Feb 03:03:03 |
Lines came back online at around 3am. |
| At risk - wholesale routing issues - Closed |
14 Feb 20:12:23 |
Details 26 Jan 11:51:26 |
Following loss of one external link at 4:31 this morning we can see that there are some areas which are not quite falling back as planned. This means some of the wholesale customer links have needed some manual reconfiguration. We believe this is all resolved now, but as it is running on only one link the wholesale interconnects are at-risk until the link is fixed. We are chasing this with the suppliers.
This should have no impact on our broadband customers generally, though you may find you switch LNS if you reconnect at all today as part of the work around for this has been a change of active LNSs. The impact of this is loss of graphs during the day, but should be correct historically from tomorrow.
In the longer term we will be contacting wholesale customers to review the fallback arrangements in their network and ours to ensure such a link failure in future would fall-back seamlessly.
|
| Started |
26 Jan 04:31:00 |
| Closed |
14 Feb 20:12:23 |
| Packet loss on FTTP line on BRADWELL ABBEY exchange - Closed |
11 Feb 14:15:43 |
Details 21 Jan 16:24:43 |
There is packet loss on FTTP lines on the BRADWELL ABBEY exchange, we have reported this to BT.
|
Update
22 Jan 09:26:50
|
The packet loss appears to have disappeared at around 22:00 last night. We're awaiting further updates from BT later on today.
|
Update
1 Feb 23:50:25
|
This is still a problem, and we're seeing low levels of packet loss which will affect speed.
The latest from BT is that they have planned work on 6th Feb to increase capacity.
We will try to get more details and a clarification as to whether this upgrade will fix the problems that our monitoring and customers are seeing.
|
| Started |
16 Jan 14:20:02 by AAISP Staff |
Closed 11 Feb 14:15:43 |
BT's planned capacity increase on 6th Feb appears to have resolved the issue. |
| Network issue affecting some DSL and Wholesale customers - Closed |
30 Jan 16:59:47 |
Details 30 Jan 13:19:23 |
Similar to the problem at the weekend, one of our ports to Datahop is flapping - we're disabling the port and moving the traffic over to our other port.
More details to follow
|
Update
30 Jan 16:59:47
|
Datacentre staff are investigating the faulty cable, but service was restored in about 15 minutes of the fault happening.
|
| Started |
30 Jan 13:10:18 |
Closed 30 Jan 16:59:47 |
Service restored. The cable has since been connected to a different port at the far end. |
| Home::1/SIM usage graphic - Closed |
26 Jan 18:41:58 |
Details 26 Jan 17:20:26 |
The main www.aa.net.uk web page normally has a graphic on the home page for Home::1 and quota limited SIM users.
This is not working at present for most customers as a result of a link failure.
Once we have sorted the link failure we are going to be making a few changes so this does not happen again.
It also means the top-up screen for users on Home::1 is missing, please text support if you need topup over the weekend.
|
| Started |
26 Jan 04:31:00 |
Closed 26 Jan 18:41:58 |
Resolved by a reconfiguration for now |
| Network blip - Closed |
26 Jan 04:46:10 |
Details 26 Jan 04:45:54 |
Nagios has gone mental telling me about a port blip at 04:31 in London, and I can see that a handful of customers had an issue at the same time. I'm investigating what is going on, but overall things look fine, traffic is flowing, and not a peep out of customers on irc.
|
Update
30 Jan 13:20:29
|
This port started flapping again on 30th Jan 13:10 more info here: http://status.aa.nu/apost.cgi?incident=1726
|
| Started |
26 Jan 04:31:00 |
Closed 26 Jan 04:46:10 |
The issue relates to a specific link, and the lines that blipped were mobile data SIMs. The link has been shut down (thanks Paul). We're investigating this further, but panic over for now. |
| LNS restart - Closed |
18 Jan 12:03:24 |
Details 18 Jan 12:05:17 |
The "B" LNS restarted unexpectedly - lines reconnected to other LNSs as expected, but graphs are lost.
|
| Started |
18 Jan 12:03:13 |
| Closed |
18 Jan 12:03:24 |
| HAMPTON exchange - Closed |
16 Jan 11:15:23 |
Details 16 Jan 09:13:58 |
Lines on the Hampton exchange have been down since around 18:04. This is linked to the Southampton 7750 router/Multi-Service Interconnect Link failing; an engineer is currently working to replace the on-board Flash card.
|
| Started |
15 Jan 18:04:14 by AAISP Staff |
Closed 16 Jan 11:15:23 |
Service was restored at 11:15:55 |
| GISBURN exchange drop - Closed |
14 Jan 18:21:00 |
Details 14 Jan 16:33:45 |
Some lines on the GISBURN exchange dropped at 15:56:52
|
Update
14 Jan 16:47:04
|
A card is failing on MSIP equipment at Preston. A remote reset has been attempted but did not restore service. A BT engineer is being tasked to perform a reseat or change hardware.
|
| Started |
14 Jan by AAISP Staff |
Closed 14 Jan 18:21:00 |
BT Incident Reference:2357. Service was restored at 18:21 following the change of a failed card at Preston.
|
| BE LAC restart? - Closed |
6 Jan 02:00:00 |
Details 6 Jan 01:59:53 |
One of Be's LACs appears to have restarted, affecting a handful of customers. Lines reconnected.
|
| Started |
6 Jan 01:57:00 |
| Closed |
6 Jan 02:00:00 |
| Unexpected LNS restart - Closed |
6 Jan 01:45:00 |
Details 6 Jan 01:56:58 |
The "D" LNS restarted affecting 1/3 of customers - it is the same cause code as before and we are looking to see if the extra diagnostics have provided more clues and how we can add even more to track this next time. This is very infrequent and we have failed to reproduce this in the lab, so it is taking time to track it down and fix it. Lines reconnected immediately as they are supposed to, and graphs from before the restart are lost.
|
| Started |
6 Jan 01:41:00 |
| Closed |
6 Jan 01:45:00 |
| Home::1 and Direct Debit - Closed |
1 Jan 09:22:24 |
Details 1 Jan 09:17:47 |
Again we have run in to a snag with Direct Debits and Home::1 which has meant there is a race condition where several Home::1 accounts show as not being paid before the Direct Debit notices are sent out. We believe we have found the underlying cause of this at last, and we are getting lines back on line now. Sorry for the inconvenience.
|
| Started |
1 Jan 09:00:00 |
Closed 1 Jan 09:22:24 |
The start of month bill run took longer than expected which created the race condition, we have adjusted the logic to try and avoid this issue in future. |
| Network blip - Closed |
31 Dec 2012 02:25:00 |
Details 31 Dec 2012 02:32:57 |
There was some sort of blip that caused problems with 3 of our 4 external routers. It is not clear exactly what at this stage. They all recovered automatically.
|
| Started |
31 Dec 2012 02:20:00 |
| Closed |
31 Dec 2012 02:25:00 |
| Be Lines Blip - Closed |
17 Dec 2012 15:49:43 |
Details 14 Dec 2012 14:52:45 |
Some Be lines just blipped, they have come back though. More info to follow.
|
Update
14 Dec 2012 14:54:11
|
This looks like a LAC within the Be network. We're contacting Be for more details, however lines came back online within a couple of minutes.
|
Update
14 Dec 2012 15:03:56
|
Another blip happened at 15:02, we are chasing Be.
|
Update
14 Dec 2012 15:14:23
|
Be are aware of the fault and suspect a router on the Be network - they are investigating the cause.
|
Update
17 Dec 2012 15:49:43
|
The fault has passed now and we've asked Be for an update as to what they have done.
|
| Started |
14 Dec 2012 14:43:00 |
Closed 17 Dec 2012 15:49:43 |
This was caused by a router restart within Be. |
| LNS Blip - Closed |
18 Dec 2012 11:38:31 |
Details 18 Dec 2012 11:37:50 |
We've just had an LNS reboot which has meant about a third of our lines dropped.
They will reconnect in the next minute or so.
|
Update
18 Dec 2012 11:38:31
|
This is similar to the problem yesterday which affected a different LNS.
We are investigating.
|
Update
18 Dec 2012 13:52:33
|
Lines came back online within a minute or two, we are investigating the cause of these blips.
|
| Closed |
18 Dec 2012 11:38:31 |
| Congestion issue on Redruth Exchange - Closed |
20 Dec 2012 18:55:28 |
Details 19 Dec 2012 13:37:44 |
The Redruth Exchange is experiencing packet loss in the evenings from around 7pm to 11pm. It has been reported to BT
|
Update
20 Dec 2012 18:57:02
|
BT say they have resolved this problem and that their links last night were showing good levels of traffic without problems.
We'll continue to monitor, but hopefully the latency problem has now been fixed.
|
| Started |
19 Dec 2012 13:34:53 |
Closed 20 Dec 2012 18:55:28 |
BT say they have resolved this problem and that their links last night were showing good levels of traffic without problems.
We'll continue to monitor, but hopefully the latency problem has now been fixed.
|
| BT Lines on Stepney node up and down all night - Closed |
20 Dec 2012 07:12:11 |
Details 20 Dec 2012 05:27:30 |
Sorry for lack of post earlier, it should have been automatic.
BT clearly have some major issues with their Stepney node which has meant lines have been up and down all night.

|
| Broadband Users Affected |
5% |
| Started |
20 Dec 2012 01:35:26 |
Closed 20 Dec 2012 07:12:11 |
BT say this is fixed, and lines all look stable |
| BT Fibre Blip - Closed |
18 Dec 2012 22:28:00 |
Details 18 Dec 2012 22:26:30 |
One of our links to BT 'blipped' causing around a third of our customers to disconnect. Lines are reconnecting now though.
|
Update
18 Dec 2012 22:30:09
|
Most lines are back online now.
|
| Started |
18 Dec 2012 22:18:00 |
Closed 18 Dec 2012 22:28:00 |
The issue was indeed the link to BT, and we are awaiting an explanation from BT as to what happened. It is possible they will not know either. |
| LNS Blip - Closed |
17 Dec 2012 17:05:00 |
Details 17 Dec 2012 17:01:36 |
One of our LNSs just rebooted, causing DSL lines connected to it to blip.
We are looking into the cause, but lines will reconnect within a minute or two.
|
Update
18 Dec 2012 05:59:13
|
Lines did reconnect immediately - if nothing else this is a good test of the fall-back and redundant systems. Only a third of customers were affected and the main problem is loss of the monitoring graphs. We do apologise for any inconvenience.
|
| Started |
17 Dec 2012 16:59:30 |
Closed 17 Dec 2012 17:05:00 |
This is indeed the same issue we saw last time, and is still being investigated. |
| Graphs today - Closed |
15 Dec 2012 23:59:59 |
Details 12 Dec 2012 07:23:44 |
As part of the preparation for tomorrow's planned work, we are running on two LNSs today rather than three. This allows the switch change to go ahead tomorrow.
Normally an LNS change means some people will see their graph start in the early hours, but graph history is correct and shows the graphs join up the next day.
Today is slightly special and some people will change LNS if they re-connect at all during the day. The result is a graph that starts from that first reconnect. Again, over night, the historical graphs will glue it all together. It just means some people with line issues will not see the start of day today on their graphs today.
Anyone with multiple lines bonded that reconnects may change LNS too, in which case the other lines will be moved within a couple of minutes to ensure bonding continues normally.
|
Update
13 Dec 2012 16:44:05
|
We are switching back to three LNSs tonight - but the issue with graphs changing on a reconnect will still apply until we have cleared lines to new LNSs over the next few nights.
|
| Started |
12 Dec 2012 |
| Previously expected |
15 Dec 2012 23:59:59 |
| Closed |
15 Dec 2012 23:59:59 |
| LNS restart - Closed |
16 Dec 2012 18:13:00 |
Details 16 Dec 2012 18:10:28 |
One of our LNSs restarted unexpectedly (c.gormless) and we are looking in to it now. Lines reconnecting automatically as expected.
|
Update
16 Dec 2012 18:12:31
|
The diagnostics are not that clear and this will take a bit longer to get to the bottom of this. The immediate issue is over with an automatic restart, and all four LNSs are in operation.
|
| Started |
16 Dec 2012 18:07:00 |
Closed 16 Dec 2012 18:13:00 |
We are adding more diagnostics and will be including this in LNS updates over coming weeks. This should help track what happened if this ever happens again. |
| BT issues in Manchester - Closed |
16 Dec 2012 16:18:00 |
Details 16 Dec 2012 18:38:29 |
There appear to have been a couple of blips with BT 20CN lines in Manchester. No word from BT.
|
| Started |
16 Dec 2012 15:51:00 |
| Closed |
16 Dec 2012 16:18:00 |
| Router upgrade - Closed |
13 Dec 2012 19:40:00 |
Details 12 Dec 2012 19:28:41 |
We have identified a reason some upgrades have caused a few seconds of down time when not expected to cause any. We have just done upgrades to rectify this before tomorrow's switch change over.
Some customers may have seen a few seconds of issues but hopefully we have managed to sort this once and for all now.
Sorry for any inconvenience.
|
| Started |
12 Dec 2012 19:20:00 |
| Closed |
13 Dec 2012 19:40:00 |
| Latency on some BT 21CN lines on Guildford BRASs - Closed |
12 Dec 2012 21:00:00 |
Details 11 Dec 2012 19:31:28 |
Since 7pm there has been notably high latency on many BT 21CN lines on Guildford BRASs. We are investigating with BT.

|
Update
11 Dec 2012 20:02:38
|
Our analysis shows this to be a faulty LAG in the BT core network somewhere near the Guildford BRASs. We are chasing this with BT.
|
Update
11 Dec 2012 20:46:21
|
We have managed to manually juggle IP addresses of LNS L2TP endpoints to work around this on most affected lines now.
|
Update
12 Dec 2012 04:06:19
|
Looks to be fixed at 21:00
|
Update
12 Dec 2012 10:17:42
|
Here is some more information about this particular issue (originally posted to Facebook)
I do think we should write up more about some of this stuff. At 7pm last night we had over 100 customers' Internet start being unusable with massive latency. See graph below, shows it.
Within 20 minutes we had identified which lines were affected, and were on to BT (as they were all BT lines). Customers were updated by status pages and on irc.
We managed to work out that there was a pattern in the affected lines.
Not only were they all BT lines, on a specific area metro node (Guildford), and all 21CN lines, there was a pattern where sets of lines on combinations of LNS at our end and BRASs at BT's end had issues.
We tested changing a line to a different IP on the same LNS our end, and as we expected, the problem went away. We were also able to confirm with affected customers on irc that they saw the fix work as well.
We were able to tell BT that the problem must be in a link aggregation group (LAG) somewhere in the back-haul from Guildford. Basically, this means multiple links are used to handle the total capacity. The equipment picks the link based on a hash from things like IP addresses, so the IP address of the BRAS and LNS together decide the link. One of the links was ill and so affected a set of lines.
We then proceeded to manually change affected lines to a different IP for the LNS (not a change to the customer IP, this is internal). This caused their service to go back to normal.
BT did fix the issue (still waiting for a detailed report from them) at 9pm.
From experience, this is the sort of issue BT have no way to detect themselves. Maybe they have better monitoring now, but when we reported it they did not know about it, and other ISPs had not spotted it.
It is worth bearing in mind, this fault will have affected every ISP using BT 21CN lines, not just us. The fact we got BT on to it so quickly helped all of the affected lines on all affected ISPs. You're welcome!
It is only because we have this detailed monitoring and graphing of every line that we can identify and diagnose issues like this at all, let alone so quickly.
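The link selection described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the real hash used by BT's equipment is vendor specific and not known to us, so SHA-256 is used purely as a stand-in, and the function names and the documentation-range IP addresses are hypothetical.

```python
import hashlib
import ipaddress


def pick_link(bras_ip: str, lns_ip: str, n_links: int) -> int:
    """Pick a LAG member link from a hash of the two tunnel endpoint IPs.

    Stand-in hash (SHA-256); real equipment uses its own algorithm.
    """
    key = ipaddress.ip_address(bras_ip).packed + ipaddress.ip_address(lns_ip).packed
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links


def affected_pairs(bras_ips, lns_ips, n_links, bad_link):
    """List the (BRAS, LNS) endpoint pairs whose traffic lands on the bad link."""
    return [
        (b, l)
        for b in bras_ips
        for l in lns_ips
        if pick_link(b, l, n_links) == bad_link
    ]


def find_working_lns_ip(bras_ip, candidate_lns_ips, n_links, bad_link):
    """Return the first LNS endpoint IP that hashes away from the bad link, or None."""
    for lns_ip in candidate_lns_ips:
        if pick_link(bras_ip, lns_ip, n_links) != bad_link:
            return lns_ip
    return None


if __name__ == "__main__":
    bras = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]                         # hypothetical BRAS addresses
    lns = ["198.51.100.1", "198.51.100.2", "198.51.100.3", "198.51.100.4"]  # hypothetical LNS endpoint IPs
    bad = pick_link(bras[0], lns[0], 4)  # pretend this link is the ill one
    print("affected pairs:", affected_pairs(bras, lns, 4, bad))
    print("alternative LNS IP:", find_working_lns_ip(bras[0], lns, 4, bad))
```

Because the chosen link depends only on the pair of addresses, a fault on one link shows up as a fixed set of BRAS and LNS combinations, and moving a line to a different LNS endpoint IP can move it on to a healthy link without touching the customer's own IP.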
|
| Started |
11 Dec 2012 19:00:00 |
| Closed |
12 Dec 2012 21:00:00 |
| Line Drop and Packet Loss on SHOREDITCH Exchange 21CN Lines - Closed |
11 Dec 2012 16:36:20 |
Details 11 Dec 2012 11:58:37 |
Temporary loss of service followed by packet loss on the Shoreditch Exchange. BT currently investigating the issue.
|
Update
11 Dec 2012 16:36:38
|
IPMB card switchover carried out under ALS 500428 at 12:53, BT fixed fault.
|
| Started |
11 Dec 2012 11:56:17 |
Closed 11 Dec 2012 16:36:20 |
IPMB card switchover carried out under ALS 500428 at 12:53 |
| Some 21CN Guildford RASs down - Closed |
11 Dec 2012 16:32:11 |
Details 11 Dec 2012 15:21:07 |
At roughly 3pm a few Guildford RASs went down.
We are reporting this to BT at the moment.
|
Update
11 Dec 2012 15:23:19
|
The RASs that are affected are:
21CN-BRAS-RED9-GI-B
21CN-BRAS-RED10-GI-B
21CN-BRAS-RED12-GI-B
|
Update
11 Dec 2012 15:28:46
|
BT have raised an incident.
Trying to get more information.
|
Update
11 Dec 2012 15:35:03
|
BT say diagnostics are still ongoing; the next update should be around 16:00.
|
Update
11 Dec 2012 16:06:28
|
Most lines are back up now.
|
Update
11 Dec 2012 16:32:43
|
Fault fixed: the missing VLAN configuration was reapplied.
|
| Started |
11 Dec 2012 15:00:00 by AAISP Pro Active Monitoring Systems |
Closed 11 Dec 2012 16:32:11 |
Missing vlan configuration was reapplied |
| Cause |
BT |
| Status pages, etc, updating - Closed |
06 Dec 2012 09:39:28 |
Details 19 Nov 2012 16:50:31 |
We carried out some major "behind the scenes" updates to the main control systems over the weekend, and whilst most of these have gone seamlessly, a few have not quite gone to plan.
As such, some of the controls and buttons and text in the information packs are not quite right.
Feel free to point out any issues on irc and we'll catch them all.
Sorry for any inconvenience.
|
| Started |
18 Nov 2012 12:00:00 |
| Closed |
06 Dec 2012 09:39:28 |
| Home::1 users offline briefly - Closed |
06 Dec 2012 09:39:13 |
Details 05 Dec 2012 11:18:12 |
Home::1 users were offline for about 5 minutes from 11:00 this morning.
It's now fixed.
We've given all Home::1 users a 1 gigabyte extra quota this month by way of apology.
|
| Closed |
06 Dec 2012 09:39:13 |
| 21CN Outage at the SOUTHAMPTON Exchange - Closed |
03 Dec 2012 15:02:19 |
| 21CN outage at WANDSWORTH exchange for line on the 21CN-BRAS-RED3-SL RAS - Closed |
03 Dec 2012 15:01:55 |
| 21CN outage at UPPER HOLLOWAY 21CN-BRAS-RED12-L-WAT - Closed |
03 Dec 2012 15:01:41 |
|
|