
8 Apr 16:58:41
Details
8 Apr 16:58:41
Some lines on the BT LEITH exchange have gone down. BT are aware and are investigating at the moment.
Started 8 Apr 16:30:20 by Customer report
Update was expected 8 Apr 17:40:20

3 Apr 15:57:14
Details
01 Nov 2013 15:05:00
We have identified an issue that appears to be affecting some customers with FTTC modems. The issue is stupidly complex, and we are still trying to pin down the exact details. The symptoms appear to be that some packets are not passing correctly, some of the time.

Unfortunately, one of the types of packet that refuses to pass correctly is FireBrick FB105 tunnel packets. This means customers relying on FB105 tunnels over FTTC are seeing issues.

The workaround is to unplug the Ethernet lead from the modem and then reconnect it. This seems to fix the issue, at least until the next PPP restart. If you have remote access to a FireBrick, e.g. via its WAN IP, and need to do this, you can change the Ethernet port settings to force it to re-negotiate, which has the same effect. This only works if the FireBrick is directly connected to the FTTC modem, as the fix does need the modem's Ethernet to restart.
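
If you have a Linux machine plugged directly into the FTTC modem instead of a FireBrick, the same idea can be scripted. This is just a rough sketch, assuming the interface facing the modem is called eth0 and that ethtool is installed:

    import subprocess

    IFACE = "eth0"  # assumed name of the interface facing the FTTC modem

    # "ethtool -r" restarts auto-negotiation on the port, which bounces the
    # Ethernet link to the modem much like unplugging and replugging the lead.
    subprocess.run(["ethtool", "-r", IFACE], check=True)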

We are asking BT about this, and we are currently assuming this is a firmware issue on the BT FTTC modems.

We have confirmed that modems re-flashed with non-BT firmware do not have the same problem, though we don't usually recommend doing this as it is a BT modem and part of the service.

Update
04 Nov 2013 16:52:49
We have been working on getting more specific information regarding this, we hope to post an update tomorrow.
Update
05 Nov 2013 09:34:14
We have reproduced this problem by sending UDP packets using 'Scapy'. We are doing further testing today, and hope to write up a more detailed report about what we are seeing and what we have tested.
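For illustration, a minimal sketch of the sort of Scapy probe involved (the destination address and port ranges here are placeholders, not the exact values we used):

    from scapy.all import IP, UDP, Raw, send

    DEST = "198.51.100.10"  # placeholder destination address

    # Send a batch of UDP packets over a range of source/destination ports,
    # then drop and re-establish PPP and resend to see which ports still pass.
    for i in range(500):
        pkt = IP(dst=DEST) / UDP(sport=10000 + i, dport=20000 + i) / Raw(load=b"probe %d" % i)
        send(pkt, verbose=False)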
Update
05 Nov 2013 14:27:26
We have some quite good demonstrations of the problem now, and it looks like it will mess up most VPNs based on UDP. We can show how a whole range of UDP ports can be blacklisted by the modem somehow on the next PPP restart. It is crazy. We hope to post a little video of our testing shortly.
Update
05 Nov 2013 15:08:16
Here is an update/overview of the situation. (from http://revk.www.me.uk/2013/11/bt-huawei-fttc-modem-bug-breaking-vpns.html )

We have confirmed that the latest code in the BT FTTC modems appears to have a serious bug that is affecting almost anyone running any sort of VPN over FTTC.

Existing modems seem to be upgrading, presumably due to a roll-out of new code by BT. An older modem that has not been online for a while is fine. A re-flashed modem with non-BT firmware is fine. A modem that had been working on the line for a while suddenly stopped working, presumably having been upgraded.

The bug appears to be that the modem manages to "blacklist" some UDP packets after a PPP restart.

If we send a number of UDP packets, using various UDP ports, then cause PPP to drop and reconnect, we then find that around 254 combinations of UDP IP/ports are now blacklisted. I.e. they no longer get sent on the line. Other packets are fine.

If we send 500 different packets, around 254 of them will not work again after the PPP restart. It is not necessarily the first or last 254 packets - the affected ones can be in the middle - but it does seem to be 254 combinations. They work as much as you like before the PPP restart, and then never work after it.

We can send a batch of packets, wait 5 minutes, PPP restart, and still find that packets are now blacklisted. We have tried a wide range of ports, high and low, different src and dst ports, and so on - they are all affected.

The only way to "fix" it, is to disconnect the Ethernet port on the modem and reconnect. This does not even have to be long enough to drop PPP. Then it is fine until the next PPP restart. And yes, we have been running a load of scripts to systematically test this and reproduce the fault.

The problem is that a lot of VPNs use UDP and use the same set of ports for all of the packets, so if that combination is blacklisted by the modem the VPN stops after a PPP restart. The only way to fix it is manual intervention.

The modem is meant to be an Ethernet bridge. It should not know anything about PPP restarting or UDP packets and ports. It makes no sense that it would do this. We have tested swapping working and broken modems back and forth. We have tested with a variety of different equipment doing PPPoE and IP behind the modem.

BT are working on this, but it is a serious concern that this is being rolled out.
Update
12 Nov 2013 10:20:18
Work on this is still ongoing. We have tested this on a standard BT retail FTTC 'Infinity' line, and the problem cannot be reproduced. We suspect this is because a different IP address is allocated each time the PPP re-establishes, so whatever is doing the session tracking does not match the new connection.
Update
12 Nov 2013 11:08:17

Here is an update with a more specific explanation of the problem we are seeing:

On WBC FTTC, we can send a UDP packet inside the PPP and then drop the PPP a few seconds later. After the PPP re-establishes, UDP packets with the same source and destination IP and ports won't pass; they do not reach the LNS at the ISP.

Further to that, it's not just one src+dst IP and port tuple which is affected. We can send 254 UDP packets using different src+dst ports before we drop the PPP. After it comes back up, all 254 port combinations will fail. It is worth noting here that this cannot be reproduced on an FTTC service which allocates a dynamic IP that changes each time the PPP is re-established.

If we send more than 254 packets, only 254 will be broken and the others will work. It's not always the first 254 or last 254, the broken ones move around between tests.
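
To make that concrete, here is a rough sketch of the sort of test script we have been running. It is illustrative only: the destination address is a placeholder for a host beyond the LNS that simply logs which probes arrive.

    import socket

    DEST = "203.0.113.5"   # placeholder: a host beyond the LNS that logs arrivals
    COMBOS = [(10000 + i, 20000 + i) for i in range(500)]   # src/dst port pairs

    def send_probes(tag):
        # Send one UDP probe per port combination, labelled with its index.
        for i, (sport, dport) in enumerate(COMBOS):
            s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            s.bind(("", sport))
            s.sendto(("%s %d" % (tag, i)).encode(), (DEST, dport))
            s.close()

    send_probes("before")
    input("Now drop and re-establish PPP, then press Enter...")
    send_probes("after")
    # Comparing which "before" and "after" probes the far end logged shows
    # which port combinations the modem has blacklisted.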

So it sounds like the modem (or, less likely, something in the cab or exchange) is creating state table entries for packets it is passing which tie them to a particular PPP session, and then failing to flush the table when the PPP goes down.
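
To be clear, the following is purely a toy model of that suspected behaviour, not anything we have seen in the firmware; it just shows how stale per-session state would produce exactly the symptoms above:

    class ToyModem:
        # Illustrative model only: state entries keyed on the UDP 4-tuple are
        # tagged with the PPP session they were learned on and never flushed.
        def __init__(self):
            self.session = 0
            self.state = {}   # (src_ip, sport, dst_ip, dport) -> session id

        def ppp_restart(self):
            self.session += 1            # new PPP session, stale entries remain

        def forwards(self, src_ip, sport, dst_ip, dport):
            key = (src_ip, sport, dst_ip, dport)
            owner = self.state.setdefault(key, self.session)
            return owner == self.session   # stale entry -> packet dropped

    m = ToyModem()
    print(m.forwards("192.0.2.1", 1194, "198.51.100.9", 1194))   # True before restart
    m.ppp_restart()
    print(m.forwards("192.0.2.1", 1194, "198.51.100.9", 1194))   # False afterwards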

This is a little crazy in the first place. It's a modem. It shouldn't even be aware that it's passing PPPoE frames, let alone be looking inside them to see that they are carrying UDP.

This only happens with an Openreach Huawei HG612 modem that we suspect has been remotely and automatically upgraded by Openreach in the past couple of months. Further, an HG612 modem with the 'unlocked' firmware does not have this problem, and an HG612 modem that has probably not been automatically/remotely upgraded does not have this problem.

Side note: one theory is that the brokenness is actually happening in the street cab and not the modem, and that the new firmware in the modem which is triggering it has enabled 'link-state forwarding' on the modem's Ethernet interface.

Update
27 Nov 2013 10:09:42
This post has been a little quiet, but we are still working with BT/Openreach regarding this issue. We hope to have some more information to post in the next day or two.
Update
27 Nov 2013 10:10:13
We have also had reports from someone outside of AAISP reproducing this problem.
Update
27 Nov 2013 14:19:19
We have spent the morning with some nice chaps from Openreach and Huawei. We have demonstrated the problem and they were able to do traffic captures at various points on their side. Huawei HQ can now reproduce the problem and will investigate the problem further.
Update
28 Nov 2013 10:39:36
Adrian has posted about this on his blog: http://revk.www.me.uk/2013/11/bt-huawei-working-with-us.html
Update
13 Jan 14:09:08
We are still chasing this with BT.
Update
3 Apr 15:47:59
We have seen this affect SIP registrations (which use UDP port 5060 as both source and destination)... Customers can contact us and we'll arrange a modem swap.
Resolution BT are testing a fix in the lab and will deploy in due course, but this could take months. However, if any customers are adversely affected by this bug, please let us know and we can arrange for BT to send a replacement ECI modem instead of the Huawei modem. Thank you all for your patience.
Started 25 Oct 2013

22 Mar 07:36:41
Details
22 Mar 07:36:41
We started to see yet more congestion on BT lines last night. This again looks a bit like a link aggregation issue (where one leg of a multiple-link trunk within BT is full). The pattern is not as obvious this time. Looking at the history we can see that some of the affected lines have had slight loss in the evenings. We did not spot this with our tools because of the rather odd pattern. Obviously we are trying to get this sorted with BT, but we are pleased to confirm that BT are now providing data that shows which network components each circuit uses within their network. We plan to integrate this soon so that we can correlate some of these newer congestion issues and point BT in the right direction more quickly.
Started 21 Mar 18:00:00

21 Mar 10:19:24
Details
11 Mar 10:11:55
We are seeing multiple exchanges with packet loss over BT wholesale. We are chasing BT on this and will update as and when we have updates. GOODMAYES CANONBURY HAINAULT SOUTHWARK LOUGHTON HARLOW NINE ELMS UPPER HOLLOWAY ABERDEEN DENBURN HAMPTON INGREBOURNE COVENTRY 21CN-BRAS-RED6-SF
Update
14 Mar 12:49:28
This has now been escalated to the next level for further investigation.
Update
17 Mar 15:42:38
BT are now raising faults on each individual exchange.
Update
21 Mar 10:19:24
Below are the exchanges/RAS which have been fixed by capacity upgrades. We are hoping for the remaining four exchanges to be fixed in the next few days.
HAINAULT
SOUTHWARK
LOUGHTON
HARLOW
ABERDEEN DENBURN
HAMPTON
INGREBOURNE
GOODMAYES
RAS 21CN-BRAS-RED6-SF
Update
21 Mar 15:52:45
COVENTRY should be resolved later this evening when a new link is installed between Nottingham and Derby. CANONBURY is waiting for CVLAN moves that begin 19/03/2014 and will be completed 01/04/2014.
Update
25 Mar 10:09:23
CANONBURY - Planned engineering works took place on 19.3.14, and three more are planned for 25.3.14, 26.3.14 and 1.4.14.
COVENTRY - Is now fixed
NINE ELMS and UPPER HOLLOWAY - Still suffering from packet loss; BT are investigating further.
Update
2 Apr 15:27:11
BT are still investigating congestion on Canonbury, Nine Elms and Upper Holloway.
Broadband Users Affected 1%
Started 9 Mar 10:08:25 by AAISP Pro Active Monitoring Systems

20 Mar 11:10:57
Details
17 Feb 20:13:09
We are seeing packet loss at peak times on some lines on the Crouch End exchange. It's a small number of customers, and it looks like a congested SVLAN. This has been reported to BT.
Update
18 Feb 10:52:26
Initially BT were unable to see any problem: their monitoring was not showing any congestion, and they wanted us to report individual line faults rather than this being dealt with as a specific BT network problem. However, we have spoken to another ISP who confirms the problem. BT have now opened an Incident and will be investigating.
Update
18 Feb 11:12:47
We have passed all our circuit details and graphs to proactive to investigate.
Update
18 Feb 16:31:17
TSO will investigate overnight
Update
20 Feb 10:15:02
No updates from TSO, proactive are chasing.
Update
27 Feb 13:24:38
There is still congestion, we are chasing BT again.
Update
28 Feb 09:34:50
It appears the issue is on the MSE router. Lines connected to the MSE are due to be migrated on 21st March, however BT are hoping to get this done sooner.
Broadband Users Affected 0.10%
Started 17 Feb 20:10:29

3 Apr 12:26:40
Details
25 Mar 09:55:20

We are seeing customer routers being attacked this morning, which is causing them to drop. This was previously reported in the status post http://status.aa.net.uk/1877 where we saw that the attacks were affecting ZyXEL routers, as well as other makes.

Since that post we have updated the configuration of customer ZyXEL routers where possible, and these are no longer being affected. However, these attacks are affecting other types of router.

We suggest that customers with lines that are dropping check their router configuration and disable access to the router's web interface from the internet, or at least change the port used (e.g. to one in the range 1024-65535).

Please speak to Support for more information.
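
As a rough self-check, something like the sketch below will tell you whether the web interface answers on the standard port. It is only a sketch: it must be run from a host outside your own network, it assumes the interface is on TCP port 80, and the address is a placeholder for your own WAN IP.

    import socket

    WAN_IP = "203.0.113.42"   # placeholder: replace with your own WAN IP

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(5)
    try:
        s.connect((WAN_IP, 80))
        print("Port 80 is reachable from the internet - the web interface may be exposed")
    except OSError:
        print("Port 80 does not appear to be reachable from outside")
    finally:
        s.close()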

Update
28 Mar 10:13:13
This is happening again, do speak to support if you need help changing the web interface settings.
Customers with ZyXELs can change the port from the control pages.
Started 25 Mar 09:00:40
Closed 3 Apr 12:26:40

1 Apr 10:00:00
Details
1 Apr 12:13:31
Some TalkTalk connected lines dropped at around 09:50 and reconnected a few minutes after. It looks like a connectivity problem between us and TalkTalk on one of our connections to them. We are investigating further.
Started 1 Apr 09:50:00
Closed 1 Apr 10:00:00

31 Mar 15:03:25
Details
31 Mar 09:40:40
Some TalkTalk line diagnostics (signal graphs and line tests), as available from the Control Pages, are not working at the moment. This is being looked into.
Update
31 Mar 15:03:17
This is resolved. The TalkTalk side appears to have a bug relating to timezones.
Resolution This is resolved. The TalkTalk side appears to have a bug relating to timezones.
Started 31 Mar 09:00:00
Closed 31 Mar 15:03:25

20 Mar 11:17:21
Details
20 Mar 08:38:52
Customers will be seeing what looks like 'duplicated' usage reporting on the control pages for last night and this morning. This has been caused by a database migration that is taking longer than expected. The apparent 'duplication' is caused by usage reports being missed; on subsequent hours the missed usage has been spread equally across the missed hours.
This means that overall the usage reporting will be correct, but an individual hour will be incorrect.
This has also affected a few other related things such as the Line Colour states.
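As a toy illustration of that behaviour (this is not our actual billing code, just the arithmetic):

    # Four hours of real usage, but three hourly reports were missed.
    actual = [120, 80, 100, 90]      # MB really used in each hour
    reported = [120, 0, 0, 0]        # only the first hourly report arrived

    # The missed usage is later spread equally across the missed hours.
    missed = sum(actual) - sum(reported)
    reported[1:] = [missed / 3] * 3  # -> [120, 90.0, 90.0, 90.0]

    assert sum(reported) == sum(actual)   # the overall total is still correct,
                                          # even though individual hours differ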
Update
20 Mar 11:17:55
Usage reporting is now back to normal.
Started 19 Mar 18:00:00
Closed 20 Mar 11:17:21

2 Mar 11:33:29
Details
1 Mar 04:24:02
Lines: 100% 21CN-REGION-GI-B dropped at 2014-03-01 04:22:17
We have advised BT
This is likely to have affected multiple internet providers using BT
Update
1 Mar 04:25:06
Lines: 100% 21CN-REGION-GI-B dropped again at 2014-03-01 04:23:21.
Broadband Users Affected 2%
Started 1 Mar 04:22:17 by AAISP automated checking
Closed 2 Mar 11:33:29
Cause BT

11 Mar 09:32:42
Details
6 Mar 13:07:51

We have had a small number of reports from customers who have had the DNS settings on their routers altered. The IPs we are seeing set are 199.223.215.157 and 199.223.212.99 (there may be others)

This type of attack is called Pharming. In short, it means that any internet traffic could be redirected to servers controlled by the attacker.

There is more information about pharming on the following pages:

At the moment we are logging when customers try to access these IP addresses, and we are then contacting those customers to make them aware.

To solve the problem we are suggesting that customers replace the router or speak to their local IT support.
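
As a rough self-check, you can compare the DNS servers your machine is actually using against the rogue addresses above. This sketch assumes your computer picks up its DNS servers from the router and that the resolver list is in /etc/resolv.conf, as on Linux or Mac:

    ROGUE = {"199.223.215.157", "199.223.212.99"}   # rogue DNS servers seen so far

    # Collect the nameserver addresses this machine is configured to use.
    with open("/etc/resolv.conf") as f:
        servers = {line.split()[1] for line in f
                   if line.startswith("nameserver") and len(line.split()) > 1}

    if servers & ROGUE:
        print("WARNING: rogue DNS servers in use:", servers & ROGUE)
    else:
        print("No rogue DNS servers configured:", servers)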

Update
6 Mar 13:33:10
Changing the DNS settings back to auto, changing the administrator password and disabling WAN side access to the router may also prevent this from happening again.
Update
6 Mar 13:48:14
Also reported here: http://www.pcworld.com/article/2104380/
Resolution We have contacted the few affected customers.
Started 6 Mar 09:00:00
Closed 11 Mar 09:32:42

7 Mar 15:08:45
Details
7 Mar 15:10:59
Some broadband lines blipped at 15:05. This was a result of one of our LNSs restarting. Lines are back online and we'll investigate the cause.
Started 7 Mar 15:03:00
Closed 7 Mar 15:08:45

27 Feb 20:40:00
Details
27 Feb 20:29:14
We are seeing some TT lines dropping and a routing problem.
Update
27 Feb 20:39:20
Things are ok now, we're investigating. This looks to have affected some routing for broadband customers and caused some TT lines to drop.
Resolution We are not entirely sure what caused this, however we do believe it to be related to BGP flapping. This also looks to have affected other ISPs and networks too.
Started 27 Feb 20:18:00
Closed 27 Feb 20:40:00

16 Feb 17:59:00
Details
16 Feb 18:12:15
All lines reconnected right away as per normal backup systems, but graphs on the "B" LNS have lost history before the reset. The exact cause is not obvious yet, but at the same time there is yet another of these quite regular attacks on ZyXEL routers which adds to confusion. As advised on another status post there are changes to ZyXEL router config planned to address the issue.
Broadband Users Affected 33.33%
Started 16 Feb 17:58:00
Closed 16 Feb 17:59:00

24 Feb 12:00:00
Details
11 Jan 08:42:32
Since around 2am, as well as a short burst last night around 19:45, we have seen some issues with some lines. This appears to be specific to certain types of router being used on the lines. We are still investigating this.
Update
11 Jan 10:53:53
At the moment, we have managed to identify at least some of the traffic and the affected routers and block it temporarily. We'll be able to provide some more specific advice on the issue and contact affected customers in due course.
Update
13 Jan 14:07:56
We blocked a further IP this morning.
Update
15 Jan 08:17:47
The issue is related to specific routers, and is affecting many ISPs. In our case it is almost entirely ZyXEL routers that are affected. It appears to be some sort of widespread and ongoing SYN flood attack that is causing routers to crash, resulting in loss of sync. We are operating some source IP blocking temporarily to address these issues for the time being, and will shortly have a simple button on our control pages to reconfigure ZyXEL routers for affected customers.
Update
7 Feb 10:24:07
Last night and this morning there was another flood of traffic causing ZyXELs to restart. We suggest changing the web port to something other than 80, details can be found here: http://wiki.aa.org.uk/Router_-_ZyXEL_P660R-D1#Closing_WAN_HTTP
Update
13 Feb 10:44:41
We will be contacting ZyXEL customers by email over the next few days regarding these problems. Before that though, to verify our records of the router type, we will be performing a 'scan' of customers' WAN IP addresses. This scan will involve downloading the index page from the WAN address.
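By way of illustration, the check amounts to something like the sketch below. This is not our actual scanning code, and the address is a placeholder:

    import urllib.error
    import urllib.request

    wan_ip = "203.0.113.42"   # placeholder WAN IP

    try:
        # Download the index page and look for a ZyXEL banner to confirm the router type.
        with urllib.request.urlopen("http://%s/" % wan_ip, timeout=5) as resp:
            body = resp.read(4096).decode(errors="replace")
        if "zyxel" in body.lower():
            print(wan_ip, "looks like a ZyXEL web interface")
        else:
            print(wan_ip, "responded on port 80, but not obviously a ZyXEL")
    except urllib.error.HTTPError as e:
        print(wan_ip, "responded on port 80 with HTTP status", e.code)
    except OSError:
        print(wan_ip, "did not respond on port 80")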
Update
20 Feb 21:34:54
Customers with ZyXELs online have been contacted this week regarding this issue.
Update
24 Feb 11:17:13
As per email to affected customers, we are updating the http port on ZyXEL routers today - Customers will be emailed as their router is updated.
Resolution Affected customers have been notified, tools are in place on the Control Pages for customers to manage the HTTP port, and where appropriate ZyXEL routers have had their HTTP port and WAN settings changed.
Broadband Users Affected 5%
Started 11 Jan 02:00:00
Closed 24 Feb 12:00:00

22 Feb 08:00:00
Details
22 Feb 07:56:22
There seems to have been something going on between 2am and 3am. We even had some incidents in BT, but whatever was going on managed to cause an unexpected restart of one of our LNSs ("B") just after 3am, so graphs from before then are lost. At 7:55 lines that had ended up on the "D" LNS were moved back to the "B" LNS, causing a PPP restart.
Broadband Users Affected 33.33%
Started 22 Feb 03:00:00
Closed 22 Feb 08:00:00
Previously expected 22 Feb 08:00:00

20 Feb 18:18:00
Details
20 Feb 09:20:19
We are seeing some lines unable to log in since a blip at 02:49. We are contacting BT. These lines are in sync, but PPP is failing. It looks like a number of BT RASs are affected, including 21CN-BRAS-RED9-GI-B and 21CN-BRAS-RED1-NT-B.
Update
20 Feb 09:31:18
BT were already aware of the problem and are investigating.
Update
20 Feb 12:23:12
These lines are still down, we are chasing BT.
Update
20 Feb 13:21:20
BT believed this issue had been fixed. We have supplied them with all of our circuits that are down. This is being supplied to TSO and we should have an update in the next hour.
Update
20 Feb 14:26:44
A new incident has been raised as BT thought the issue was fixed.
Update
20 Feb 14:27:56
The issue is apparently still being diagnosed.
Update
20 Feb 21:17:48
BT fixed this at 18:18 this evening.
Update
20 Feb 21:34:04
BT say:
BT apologises for the problems experienced today by WMBC customers and are pleased to advise the issue has been fully resolved following the back out of a planned work completed overnight. BT is aware and understands the fault which occurred and have engaged vendor support to commence urgent investigations to identify the root cause.
The BT Technical Services teams have monitored the network since the corrective actions taken at 18:04 and have confirmed the network has remained stable.
Broadband Users Affected 0.20%
Started 20 Feb 03:49:00
Closed 20 Feb 18:18:00

20 Feb 10:00:00
Details
20 Feb 10:24:43
In addition to https://status.aa.net.uk/1891 there is a UK wide problem with lines logging in. This is affecting other ISPs, and affecting a small number of lines. BT are already aware.
Update
20 Feb 11:07:55
BT are saying this is now fixed. We saw affected lines come back online just after 10am. BT say about half of the UK 21CN WBC lines were affected, however, we only saw a few dozen lines affected.
Started 20 Feb 09:00:00
Closed 20 Feb 10:00:00

1 Feb 09:00:00
Details
1 Feb 03:38:03
Lines: 100% 21CN-REGION-PR dropped at 2014-02-01 03:36:28
We have advised BT
This is likely to have affected multiple internet providers using BT
Broadband Users Affected 1%
Started 1 Feb 03:36:28 by AAISP automated checking
Closed 1 Feb 09:00:00
Cause BT

6 Feb 10:00:00
Details
6 Feb 02:07:02
Lines: 100% 21CN-REGION-DY dropped at 2014-02-06 02:05:49
We have advised BT
This is likely to have affected multiple internet providers using BT
Broadband Users Affected 1%
Started 6 Feb 02:05:49 by AAISP automated checking
Closed 6 Feb 10:00:00
Cause BT

11 Feb 22:27:34
Details
3 Feb 16:19:38

We have a fault open with BT regarding the Harvington Exchange. We are seeing packet loss, typically between 8am and 2am, and getting up to 20% at peak times in the evening.

BT have already tried resetting the line card, but this has not worked.

BT are still investigating.

Update
3 Feb 16:22:45
Example graph:
Update
3 Feb 20:34:34
This has been escalated within BT. Other ISPs are seeing a similar issue. Currently, BT's 'Technical Services' are investigating the problem.
Update
5 Feb 10:16:58

BT worked at the exchange in the early hours of this morning to try to resolve the issue. We will have to wait until around 3pm today to see if the heavy packet loss has been fixed.

The details from BT are as follows: "The technical team have worked all night on this issue. An engineer was sent to the exchange in the early hours of this morning and has reseated several IML cables in the network to see if this alleviates the issue. Ping testing has been carried out extensively since the reseats and where there was small packet loss seen prior to the reseat these are now proving to be totally clear."

Update
5 Feb 16:43:59
Looks like the amount of loss is increasing. BT are still investigating.
Update
6 Feb 11:30:34
From BT: Will get this info back over now and ensure tech services are involved to get to the bottom of this issue as agree this is really frustrating that we cannot find the route cause here
Update
7 Feb 09:06:05
Chasing BT for an update
Update
7 Feb 09:56:20
The controller card was reset, rather than changed, at 02:56 this morning, and TSO are now waiting for confirmation as to whether this has made a difference.
Update
10 Feb 09:22:15
BT's efforts over the weekend have not fixed the problem. We will be chasing BT again.
Update
10 Feb 10:01:42
BT are looking to see if it is possible to move our affected lines on to a different SVLAN. (in short, a different link out of the Exchange). We'll update this post when we get an update from BT.
Update
10 Feb 17:00:06
BT are planning to move the lines to a different SVLAN, we're not sure when this will be done yet though. We'll update again when we have further information.
Update
11 Feb 22:28:06
BT have moved the lines to a different SVLAN, and the packet loss problem has gone away.
Started 21 Jan 09:00:00
Closed 11 Feb 22:27:34

4 Feb 16:00:00
Details
3 Feb 14:45:01

One of our authoritative DNS servers, secondary-dns.co.uk, has died. We are carrying out an emergency migration on to new hardware.

DNS services are still running, albeit only on the Primary Name server. This time period is considered 'at risk'.

The new server should be up and running in the next hour or so.

Update
3 Feb 22:09:46
The replacement server is serving zones correctly, but customers who use secondary-dns.co.uk as a secondary to their own nameserver may not have their zones served yet. This is because the backup of zone files that we slave for customers is out of date; rather than serve potentially incorrect records, we are simply not serving those zones. Affected customers should send secondary-dns.co.uk a NOTIFY from their own nameserver to trigger a zone transfer. This does not affect domains for which we provide primary DNS.
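As a sketch of one way to do that, using the dnspython library (the zone name and server address below are placeholders; your primary nameserver's own notify mechanism, such as BIND's "rndc notify", works just as well):

    import dns.message
    import dns.opcode
    import dns.query
    import dns.rdatatype

    ZONE = "example.co.uk."        # placeholder: the zone we slave for you
    SECONDARY = "203.0.113.53"     # placeholder: address of secondary-dns.co.uk

    # Build a DNS NOTIFY for the zone's SOA and send it to the secondary, which
    # prompts it to refresh the zone from your primary. Note that the secondary
    # will normally only accept a NOTIFY coming from the configured primary.
    notify = dns.message.make_query(ZONE, dns.rdatatype.SOA)
    notify.set_opcode(dns.opcode.NOTIFY)
    dns.query.udp(notify, SECONDARY)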
Started 3 Feb 14:40:41
Closed 4 Feb 16:00:00

21 Jan 12:59:59
Details
21 Jan 09:44:47
As of 8:30 this morning most 20CN lines connected to a Sheffield RAS have either been dropping or showing packet loss. We are reporting this to BT at the moment.
Update
21 Jan 10:45:51
BT Engineer on the way to site.
Update
21 Jan 11:07:59
Example graph:
Update
21 Jan 12:58:09
Lines are looking better now.
Update
21 Jan 12:58:22
Lines are looking stable again. No news yet from BT.
Update
21 Jan 13:00:23
BT have reset a card, to resolve the issue.
Resolution Card reset.
Started 21 Jan 09:43:03 by AAISP Pro Active Monitoring Systems
Closed 21 Jan 12:59:59

8 Jan 10:20:00
Details
8 Jan 10:13:35
We are seeing a small number of lines flapping (dropping and reconnecting).
We are investigating.
Update
8 Jan 10:30:20
The dropping has stopped for the moment, lines are back to normal.
Started 8 Jan 09:49:00
Closed 8 Jan 10:20:00

27 Dec 2013 19:50:00
Details
27 Dec 2013 14:58:36
We're seeing some issues with some of BT's BRASs. It looks like it affects most of those in Slough (the BRASs ending -SL).
Update
27 Dec 2013 16:11:03
BT are investigating.
Update
27 Dec 2013 17:46:20
Graphs look like this:
Update
27 Dec 2013 18:18:33
BT have an incident open for this fault, and are investigating.
Update
27 Dec 2013 21:10:13
Lines cleared up at about 19:50. Looking back to normal now. No word from BT yet though.
Update
27 Dec 2013 21:45:39
Lines now looking normal:
Broadband Users Affected 5%
Started 27 Dec 2013 14:38:00
Closed 27 Dec 2013 19:50:00

27 Dec 2013 16:11:38
Details
12 Dec 2013 04:25:03
Lines: 100% 21CN-REGION-NT-B dropped at 2013-12-12 04:23:00
We have advised BT
This is likely to have affected multiple internet providers using BT
Broadband Users Affected 1%
Started 12 Dec 2013 04:23:00 by AAISP automated checking
Closed 27 Dec 2013 16:11:38
Cause BT

27 Dec 2013 16:11:29
Details
13 Dec 2013 22:36:02
Lines: 100% 21CN-BRAS-RED9-L-FAR-MONUMENT dropped at 2013-12-13 22:34:19
We have advised BT
This is likely to have affected multiple internet providers using BT
Update
14 Dec 2013 03:29:02
Lines: 100% 21CN-BRAS-RED9-L-FAR-MONUMENT dropped again at 2013-12-14 03:27:17.
Started 13 Dec 2013 22:34:19 by AAISP automated checking
Closed 27 Dec 2013 16:11:29
Cause BT

27 Dec 2013 16:11:15
Details
26 Dec 2013 04:41:03
Lines: 50% 21CN-REGION-L-FAR and 100% 21CN-BRAS-RED4-L-FAR-MOORGATE dropped at 2013-12-26 04:39:59
We have advised BT
This is likely to have affected multiple internet providers using BT
Broadband Users Affected 2%
Started 26 Dec 2013 04:39:59 by AAISP automated checking
Closed 27 Dec 2013 16:11:15
Cause BT

20 Nov 2013 06:28:33
Details
20 Nov 2013 06:25:03
Lines: 100% 21CN-REGION-21CN-BRAS-RED10-GI-B and 100% 21CN-REGION-21CN-BRAS-RED11-GI-B and 100% 21CN-REGION-21CN-BRAS-RED12-GI-B and 100% 21CN-REGION-21CN-BRAS-RED13-GI-B and 100% 21CN-REGION-GI-B dropped at 2013-11-20 06:23:33
We have advised BT
This is likely to have affected multiple internet providers using BT
Broadband Users Affected 4%
Started 20 Nov 2013 06:23:33 by AAISP automated checking
Closed 20 Nov 2013 06:28:33
Cause BT

27 Nov 2013 05:10:33
Details
27 Nov 2013 05:07:06
Lines: 100% 21CN-REGION-21CN-BRAS-RED10-WV and 98% 21CN-REGION-WV dropped at 2013-11-27 05:05:33
We have advised BT
This is likely to have affected multiple internet providers using BT
Broadband Users Affected 1%
Started 27 Nov 2013 05:05:33 by AAISP automated checking
Closed 27 Nov 2013 05:10:33
Cause BT

27 Nov 2013 03:18:05
Details
27 Nov 2013 03:15:02
Lines: 100% 21CN-REGION-21CN-BRAS-RED10-WV and 100% 21CN-REGION-WV dropped at 2013-11-27 03:13:05
We have advised BT
This is likely to have affected multiple internet providers using BT
Broadband Users Affected 1%
Started 27 Nov 2013 03:13:05 by AAISP automated checking
Closed 27 Nov 2013 03:18:05
Cause BT

19 Nov 2013 23:24:45
Details
19 Nov 2013 23:21:03
Lines: 94% 21CN-REGION-21CN-BRAS-RED10-WV and 100% 21CN-REGION-WV dropped at 2013-11-19 23:19:45
We have advised BT
This is likely to have affected multiple internet providers using BT
Broadband Users Affected 1%
Started 19 Nov 2013 23:19:45 by AAISP automated checking
Closed 19 Nov 2013 23:24:45
Cause BT

28 Nov 2013 19:30:34
Details
28 Nov 2013 19:27:02
Lines: 100% 21CN-REGION-CF-C dropped at 2013-11-28 19:25:34
We have advised BT
This is likely to have affected multiple internet providers using BT
Broadband Users Affected 2%
Started 28 Nov 2013 19:25:34 by AAISP automated checking
Closed 28 Nov 2013 19:30:34
Cause BT

19 Nov 2013 11:15:05
Details
02 Oct 2013 09:07:54
Our monitoring has picked up evening latency on BT 21CN lines on the Bishopsgate exchange. This is happening between 8pm and 1am. We have opened an Incident with BT. This is not affecting TalkTalk Wholesale lines.
Update
02 Oct 2013 09:51:39
BT will be monitoring and testing these affected lines this evening.
Update
03 Oct 2013 10:34:07
This is still happening.
Update
07 Oct 2013 15:24:31
BT have no updates, but are escalating this issue internally.
Update
08 Oct 2013 12:52:33
BT investigated this last night and advised that they did not observe any excessive utilisation, so they are liaising with BT Technical Services today.
Update
09 Oct 2013 14:31:14
Update from BT: BT Technical Services have suggested to BT TSO Surveillance that a Switch of equipment would need to be facilitated and then further tests implemented after the switch has taken place.
Update
11 Oct 2013 08:17:26
The latest update, at 8.36am on 10.10.2013, is that the equipment switch last night did not resolve the issue and that BT TSO have emailed BT Technical Services to reinvestigate the broadband latency problem.
Update
11 Oct 2013 15:53:48
Update from BT: BT TSO, whilst liaising with BT Technical Services, have now involved BT Design to investigate this issue further. All three organisations are now working together to find a way forward to resolve this for you and your customers.
Resolution Issue resolved.
Started 02 Oct 2013 09:00:00
Closed 19 Nov 2013 11:15:05

04 Oct 2013 09:00:00
Details
04 Oct 2013 09:07:46
Lines are down on the Manor Park exchange. BT are aware of a fault affecting lines on the exchange, area code 01928.
Resolution BT fixed by 9am same day.
Started 04 Oct 2013 02:00:00
Closed 04 Oct 2013 09:00:00

04 Oct 2013 08:54:32
Details
02 Oct 2013 14:48:48
We are seeing continuous packet loss (~10-20%) on TalkTalk Wholesale lines on the Kennford Exchange. This has been reported to TalkTalk and they have reported back that they see a possible network problem at their end.
Update
03 Oct 2013 10:33:51
This is still happening.
Update
03 Oct 2013 13:58:06
BT have updated us saying there are no further updates, but they have escalated it and they will send an update tomorrow.
Resolution Fault fixed. Problem was with line card. TalkTalk are still doing some diagnostics on these lines though.
Started 02 Oct 2013 14:46:27
Closed 04 Oct 2013 08:54:32

30 Sep 2013 10:25:21
Details
03 Jan 2013 09:59:26

The 'per IP stats' usage pages are not working very well at the moment. Some customers are not seeing any usage, some are seeing very little, and others are seeing correct amounts!

This is being looked in to.

This does not affect the 'Line' usage or any billing based usage records. 

Update
11 Apr 2013 12:25:10

We are sorry for how long this has been ongoing.
We are still trying to resolve the issue, however it is taking a lot longer than we expected.

We do not currently have an ETA however we will update this status once we have further news.

Update
02 Aug 2013 09:13:39
The IP Stats are now back online. We'd be interested to hear from customers where the per-IP stats are reporting very different usage compared to the hourly usage tables. Remember that the IP Stats are a rough guide; they are not used for billing, but are useful for seeing usage on a per-IP basis for diagnostics etc.
Update
07 Aug 2013 10:53:32
At the moment, Per IP Stats are only being recorded for customers on half of our LNSs - Customers on A and C Gormless should see stats, but B and D will not. The CQM graph will show which Gormless you are on.
Update
30 Sep 2013 10:26:39
We will close this post for the moment and the job is with our Engineering team. At the moment per IP stats is working on some but not all lines. Rather than withdraw the feature, we'll keep it as it is and this will be further investigated.
Started 03 Jan 2013 09:56:05
Closed 30 Sep 2013 10:25:21

14 Aug 2013 10:00:00
Details
13 Aug 2013 21:26:59
We are investigating...
Update
13 Aug 2013 21:41:56
About 50% of our BE lines are seeing a mix of low-level packet loss, higher latency and latency spikes. Some lines are also seeing very frequent disconnects. We have informed BE.
Update
13 Aug 2013 21:50:35
BE have reported problems regarding 7 exchanges which may be related to this. They say: "We are currently experiencing an unplanned outage at our Crownhill (WWCRWN), Devonport (WWDPRT), Plympton (WWPTON), Saltash (WWSALT), St Budeaux (WWSBUD), Plymouth (WWPYTH) and Plymstock (WWPSTK) exchanges. Our engineers are informed and investigating. Apologies for the inconvenience caused."
Update
13 Aug 2013 22:14:17
Lines have been back to normal since about 10:05pm.
Update
13 Aug 2013 22:16:38
We suspect BE may have been re-routing traffic in their network as a result of the exchange problems which meant many other lines were being routed over congested links. We've not had this confirmed yet though.
Update
13 Aug 2013 22:17:29
(This was also affecting non-AA BE lines.)
Update
13 Aug 2013 22:59:05
We have a very small number of lines that have an increase in their minimum latency, the latency had jumped up by about 10-20ms. We have passed these lines on to BE for their comment.
Update
13 Aug 2013 23:00:12
An update from BE regarding their problems with the exchanges: "Crownhill (WWCRWN), Devonport (WWDPRT), Plympton (WWPTON), Saltash (WWSALT), St Budeaux (WWSBUD), Plymouth (WWPYTH) and Plymstock (WWPSTK): The links are back restoring all connectivity. Engineers are still investigating for the cause so future failures can be prevented. "
Update
14 Aug 2013 21:15:07
We've had an update from BE as they have been investigating this, here is the information "I have spoken to O2 throughout the day about this case and it is believed that the cause may have been due to the exchange outages in the south-west areas, as you suspected. The case has been elevated to O2's Network team who have since passed it to their Transport Services team that will be responsible for investigating the exact cause of the issue and also reaffirming that the problem has been resolved."
Resolution BE did have an outage affecting a number of exchanges (listed above) in the south of England. We suspect this also caused re-routing of traffic within their network which caused latency and loss on other lines.
Started 13 Aug 2013 21:15:26
Closed 14 Aug 2013 10:00:00

10 Aug 2013 10:00:00
Details
09 Aug 2013 19:43:07
Lines: 100% 21CN-REGION-21CN-BRAS-RED10-L-WAT dropped at 2013-08-09 19:41:19
We have advised BT
This is likely to have affected multiple internet providers using BT
Resolution Lines came up shortly after, we did notify BT.
Started 09 Aug 2013 19:41:19 by AAISP automated checking
Closed 10 Aug 2013 10:00:00
Cause BT

08 Aug 2013 10:10:46
Details
08 Aug 2013 02:54:10
Lines: 100% 21CN-REGION-21CN-BRAS-RED10-MQD and 100% 21CN-REGION-21CN-BRAS-RED11-MQD and 96% 21CN-REGION-MQD dropped at 2013-08-08 02:52:16
We have advised BT
This is likely to have affected multiple internet providers using BT
Resolution Line came back up shortly after incident. We did let BT know.
Broadband Users Affected 4%
Started 08 Aug 2013 02:52:16 by AAISP automated checking
Closed 08 Aug 2013 10:10:46
Cause BT

08 Aug 2013 09:00:00
Details
08 Aug 2013 02:53:10
Lines: 100% 21CN-REGION-21CN-BRAS-RED10-MQD and 100% 21CN-REGION-21CN-BRAS-RED11-MQD and 94% 21CN-REGION-MQD dropped at 2013-08-08 02:52:16
We have advised BT
This is likely to have affected multiple internet providers using BT
Broadband Users Affected 4%
Started 08 Aug 2013 02:52:16 by AAISP automated checking
Closed 08 Aug 2013 09:00:00
Cause BT

12 Jul 2013 06:41:47
Details
12 Jul 2013 05:59:31
The overnight regrades did not quite go to plan, so some graphs will be lost for the start of today and some lines are being moved now (6am).
Update
12 Jul 2013 06:42:38
The upgrade has worked and lines have been stable since just after 6am, but we will be rolling out further overnight upgrades over the weekend to address a RADIUS issue affecting the distribution of lines between LNSs.
Started 12 Jul 2013 05:57:43
Closed 12 Jul 2013 06:41:47
Previously expected 12 Jul 2013 06:30:00

11 Jul 2013 19:03:11
Details
08 Jul 2013 02:36:02
Lines: 100% 21CN-REGION-21CN-BRAS-RED13-L-FAR and 100% 21CN-REGION-L-FAR dropped at 2013-07-08 02:34:20
We have advised BT
This is likely to have affected multiple internet providers using BT
Broadband Users Affected 5%
Started 08 Jul 2013 02:34:20 by AAISP automated checking
Closed 11 Jul 2013 19:03:11
Cause BT

05 Jul 2013 03:15:00
Details
04 Jul 2013 03:25:12

Our BT links blipped at 3am.

All BT-based lines were affected, but they started to come back straight away, and most are back online now.

If your line appears to be affected by this, and is still down, please reboot your router.

Started 04 Jul 2013 03:00:01 by AAISP Pro Active Monitoring Systems
Closed 05 Jul 2013 03:15:00
Previously expected 05 Jul 2013 03:15:00

01 Jul 2013 18:23:00
Details
07 Jun 2013 20:34:02
Lines: 100% 20CN-REGION-EALING dropped at 2013-06-07 20:32:29
We have advised BT
This is likely to have affected multiple internet providers using BT
Resolution

This was a blip of some sort

Started 07 Jun 2013 20:32:00 by AAISP automated checking
Closed 01 Jul 2013 18:23:00
Cause BT

02 Jun 2013 11:40:00
Details
02 Jun 2013 08:04:10

It looks like quite a lot of Be lines are suffering serious latency now. It seems mostly London but some other areas are affected.

Some lines had loss and latency from just before 21:00 to just after 00:00 and then were fine. Some have seen high latency from just after 00:00 until 05:30, and were then OK briefly before the latency started again at 06:00.

 

Update
02 Jun 2013 08:12:31

I have put this as "minor" as it only affects Be lines, and probably under half of them. Almost all users with Be lines are using them alongside a BT line and so are not offline. However, such customers may wish to take the Be line out of their bonding until this is fixed.

Update
02 Jun 2013 09:09:47

Be/O2 are looking in to this.

Update
02 Jun 2013 11:42:25

Looks OK now

Started 01 Jun 2013 21:00:00
Closed 02 Jun 2013 11:40:00

02 Jun 2013 09:09:17
Details
26 May 2013 03:13:06

It seems that some customers have been suffering severe problems, notably between around 8pm and 11pm last night.

This looks to be customers with older ZyXEL routers. We are still shipping ZyXEL P660s as PPPoE bridges, and that configuration is not affected. However, some years ago we sold the ZyXELs simply as broadband routers.

Over the last few months these have been the target (well, intermediary) for DNS amplification attacks, resulting in some customers having high usage (and in some cases bills).

Yesterday at around 00:36 we saw an attack start, which is why we did emergency upgrades on our infrastructure overnight. It now seems that the attack is either directed at, or coincidentally affecting, these older ZyXEL routers and causing them to reboot.

The attack is hitting lots of ISPs and appears to be happening in bursts, sometimes lasting many hours.

In the long run the solution to both issues may be customers updating to newer routers. This will have the side effect of also getting customers on to IPv6.

If we find a workaround in the meantime, I'll post more details.

Update
26 May 2013 19:31:47

The attack started again at 6pm Sunday.

Update
26 May 2013 19:56:33

The attack appears to be broken TCP port 80 packets. It may be that a config change on affected routers will avoid this specific issue. If we find more details we'll post them.

Update
27 May 2013 10:03:41

Using the web interface on the ZyXEL P660, go to Advanced > Remote MGMT and set all services to LAN only.

Resolution

The attacks seem to have stopped for now.

Started 25 May 2013 20:00:00
Closed 02 Jun 2013 09:09:17

02 May 2013 02:14:51
Details
18 Apr 2013 18:35:10

It looks like we had a blip in RADIUS at around 5pm, which has resulted in lines showing "pink" or "salmon" on the management pages and some people getting odd status update emails and SMSs.

Services are working - this is an accounting issue.

We're not sure what happened exactly, but this is a scenario that the systems are designed to cope with relatively sensibly. Some usage may not be metered for the next few hours and some lines will be PPP restarted over night.

We are also doing an LNS switch over tonight, but there are issues with that which means it is likely to all happen later than usual, i.e. around 7am.

As it happens, we are working on some updates to RADIUS which we hope to phase in over the next few weeks. Incidents like this, whilst minor and not service affecting, are a nuisance, and they are being taken into account in the new design to make such issues less likely.

Update
18 Apr 2013 18:58:48

We're clearing the pink lines now with a PPP restart.

Update
28 Apr 2013 21:12:03

Some customers on C or D gormless will have gaps in their graphs this afternoon and this evening. The lines have been connected through these gaps but have had some PPP restarts.

Started 18 Apr 2013 17:00:00
Closed 02 May 2013 02:14:51
Previously expected 19 Apr 2013 07:00:00

17 May 2013 09:22:49
Details
16 May 2013 15:25:06

O2 have informed us that there is currently an outage at the Shepherds Bush (LWSHE), Kensington Gardens (WRKGDN) and Acton (LWACT) exchanges.

Engineers are currently investigating and O2 expect to have an update within the next 2 hours.

Update
17 May 2013 09:23:23

This is now resolved, we have not been given any information as to the cause or fix for this outage.

Started 16 May 2013 15:23:05
Closed 17 May 2013 09:22:49

15 May 2013 14:01:02
Details
15 May 2013 13:48:38

Due to human error, we have managed to clear PPP on around a third of lines (B.gormless) today.

We are looking in to how procedures can be tightened up to avoid this in future.

Update
15 May 2013 13:55:50

This was a controlled restart of the LNS, not a crash, and as such the PPP reconnects were even quicker as devices did not have to wait for an LCP timeout.

And yes, we are seriously looking at the best way to avoid this sort of error happening again.

And yes, the human concerned was me - RevK - so I owe a few people a pint...

Resolution

The answer is CSS - the control pages of the various machines currently all look identical apart from the host name. Using some CSS we can make all "live" boxes stand out very clearly.

Started 15 May 2013 13:45:00
Closed 15 May 2013 14:01:02