I think I have a broken AT&T route?
Posting for ideas/advice, if anyone has any, as I'm unsure of where else to turn.
I rent a VPS (named "Bucket") on which I self-host a few services, along with a home server (named "Vergil") that lives under my basement stairs and hosts many more. At 2:01 AM today I got a notification from Bucket that my Plex (hosted on Vergil) was down/unreachable. I'm assuming that's when this issue started.
When investigating, I found that Plex wasn't down; Bucket simply couldn't reach Vergil. Further investigation showed that it wasn't just Bucket: nothing outside my network could reach Vergil. At first I thought it was an issue with my router, as I have my gateway set up in IP Passthrough mode and manage my network with my third-party router (UDM-Pro). But after digging through its logs for automated blocks from misclassified intrusion attempts, I realized that none of my connection attempts were even reaching the router. So I checked the route, and that's where I found what I think is the problem.
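For anyone wanting to tell "service is down" apart from "host is unreachable" the same way, here is a minimal Python sketch. The function name `probe_tcp` is made up for illustration, and 32400 is only Plex's default port, so adjust it if yours differs. A "refused" result means the far end answered, so the path works but nothing is listening. A "timeout" means nothing came back at all, which is consistent with packets being dropped somewhere along the way (though a firewall that silently drops traffic looks the same).

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify one TCP connection attempt as 'open', 'refused', or 'timeout'."""
    try:
        # create_connection performs the full TCP handshake or raises.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host (or something in front of it) replied with a reset.
        return "refused"
    except socket.timeout:
        # No reply within the timeout: packets were dropped somewhere.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. "no route to host" or a DNS failure.
        return f"error: {exc}"
```

Run from Bucket, something like `probe_tcp("99.42.115.109", 32400)` would be the equivalent of the check that triggered my notification.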
Running mtr to trace the route from Vergil to Bucket gives full resolution of the route:
mtr -rwzbc 10 45.79.209.169
Start: 2024-12-19T16:49:53-0500
HOST: Vergil.goose.ws Loss% Snt Last Avg Best Wrst StDev
1. AS??? 192.168.2.1 0.0% 10 0.1 0.1 0.1 0.2 0.0
2. AS??? 192.168.99.254 10.0% 10 0.5 0.6 0.4 0.8 0.1
3. AS7018 45-26-156-1.lightspeed.tukrga.sbcglobal.net (45.26.156.1) 0.0% 10 4.4 3.6 2.0 5.9 1.2
4. AS7018 107.212.169.24 0.0% 10 5.2 3.7 1.6 6.1 1.5
5. AS7018 12.242.113.31 0.0% 10 2.2 3.7 2.2 5.3 1.0
6. AS7018 12.247.68.178 0.0% 10 2.8 3.8 2.2 5.8 1.2
7. AS20940 ae6.r21.atl01.mag.netarch.akamai.com (23.192.0.94) 0.0% 10 3.2 4.3 2.3 5.7 1.1
8. AS20940 ae0.r21.atl01.icn.netarch.akamai.com (23.192.0.65) 0.0% 10 3.7 4.1 1.9 6.5 1.5
9. AS20940 ae1.r21.atl01.ien.netarch.akamai.com (23.207.235.35) 0.0% 10 4.2 3.5 1.9 5.6 1.1
10. AS20940 ae22.gw3.atl1.netarch.akamai.com (23.203.144.39) 0.0% 10 5.2 5.0 2.4 8.8 2.0
11. AS??? ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
12. AS??? ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
13. AS??? ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
14. AS63949 bucket.goose.ws (45.79.209.169)
However, routing from Bucket to Vergil does not:
mtr -rwzbc 10 99.42.115.109
Start: 2024-12-19T16:49:13-0500
HOST: Bucket.goose.ws Loss% Snt Last Avg Best Wrst StDev
1. AS??? 10.204.3.155 0.0% 10 0.2 0.3 0.1 0.8 0.2
2. AS??? 10.204.35.16 0.0% 10 0.4 0.4 0.3 0.5 0.1
3. AS??? 10.204.32.2 0.0% 10 0.7 9.4 0.4 74.3 23.2
4. AS63949 lo0-0.gw4.atl1.us.linode.com (74.207.239.106) 0.0% 10 0.7 0.5 0.4 0.7 0.1
5. AS20940 ae45.r22.atl01.ien.netarch.akamai.com (23.203.144.36) 0.0% 10 0.4 0.4 0.4 0.6 0.1
6. AS20940 ae4.r22.atl01.mag.netarch.akamai.com (23.192.0.98) 0.0% 10 0.6 0.7 0.6 0.8 0.1
7. AS20940 ae1.r24.atl01.ien.netarch.akamai.com (23.192.0.103) 0.0% 10 0.5 0.4 0.4 0.6 0.0
8. AS7018 12.247.68.177 0.0% 10 1.0 1.0 0.8 1.2 0.1
9. AS??? ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
10. AS7018 107.212.169.25 0.0% 10 1.4 1.4 1.4 1.5 0.0
11. AS??? ???
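To make the comparison between traces less of an eyeball exercise, here is a rough Python sketch that pulls the last responding hop out of an mtr report. It assumes the `-rwzbc` layout pasted above (hop number, AS column, then host, with `???` for a hop that never answered); the names `parse_hops`, `last_responding_hop`, and `reached_destination` are made up for illustration. Note that a silent hop in the middle of a trace only means that router doesn't answer probes. What matters is whether the final hop responds.

```python
import re
from typing import List, NamedTuple, Optional


class Hop(NamedTuple):
    number: int
    asn: str
    host: str
    responded: bool


# Matches lines like "10. AS7018 107.212.169.25 0.0% 10 ..." (assumed layout).
HOP_RE = re.compile(r"^\s*(\d+)\.\s+(AS\S+)\s+(\S+)")

# The tail of the Bucket -> Vergil trace from this post.
SAMPLE = """\
HOST: Bucket.goose.ws Loss% Snt Last Avg Best Wrst StDev
8. AS7018 12.247.68.177 0.0% 10 1.0 1.0 0.8 1.2 0.1
9. AS??? ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
10. AS7018 107.212.169.25 0.0% 10 1.4 1.4 1.4 1.5 0.0
11. AS??? ???
"""


def parse_hops(report: str) -> List[Hop]:
    """Turn the hop lines of an mtr report into Hop records."""
    hops = []
    for line in report.splitlines():
        match = HOP_RE.match(line)
        if match is None:
            continue  # skip header lines such as "Start:" and "HOST:"
        number, asn, host = match.groups()
        hops.append(Hop(int(number), asn, host, host != "???"))
    return hops


def last_responding_hop(hops: List[Hop]) -> Optional[Hop]:
    """Return the furthest hop that answered, or None if none did."""
    responding = [hop for hop in hops if hop.responded]
    return responding[-1] if responding else None


def reached_destination(hops: List[Hop]) -> bool:
    """True when the final hop in the report answered."""
    return bool(hops) and hops[-1].responded
```

On the sample above, `last_responding_hop(parse_hops(SAMPLE))` is hop 10 (107.212.169.25) and `reached_destination` is false, which matches what I see by hand: the trace dies inside AS7018 right after that hop.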
Calling the tier 1 number for AT&T residential support was less than helpful. They kept wanting to send a tech out to the house, claiming there was an issue with the line. I thanked them for their efforts and gave up, then tried emailing the contact address from the WHOIS record for the AT&T datacenter/core router at the last successful hop of the trace from Bucket to Vergil. I doubt I'll hear anything back, but I'm unsure who else to turn to or what else to try. I've never seen a route broken in only one direction like this, and because of it I can't access any of my devices or services from outside my house. Hoping someone has an idea or suggestion.
Edit:
Well, after about 38 hours of this issue, the power went out at my house. My networking equipment is on a UPS, so it did not go down. But when the power returned, the route began resolving again, and I am reachable again. I don't know whether an area power outage rebooted some nearby AT&T equipment; I would imagine their stuff is also on UPS. But who knows?
For the non-believers who doubt that my route was previously complete:
[goose@Bucket: ~ ] $ mtr -rwzbc 10 99.42.115.109
Start: 2024-12-20T15:20:23-0500
HOST: Bucket.goose.ws Loss% Snt Last Avg Best Wrst StDev
1. AS??? 10.204.3.155 0.0% 10 0.1 0.2 0.1 0.2 0.0
2. AS??? 10.204.35.16 0.0% 10 0.2 0.3 0.2 0.4 0.1
3. AS??? 10.204.32.2 0.0% 10 0.6 1.8 0.4 9.9 2.9
4. AS63949 lo0-0.gw4.atl1.us.linode.com (74.207.239.106) 0.0% 10 0.4 2.0 0.3 15.6 4.8
5. AS20940 ae45.r22.atl01.ien.netarch.akamai.com (23.203.144.36) 0.0% 10 0.4 0.4 0.3 0.5 0.1
6. AS20940 ae4.r21.atl01.mag.netarch.akamai.com (23.192.0.90) 0.0% 10 0.8 0.7 0.6 0.9 0.1
7. AS20940 ae0.r24.atl01.ien.netarch.akamai.com (23.192.0.95) 0.0% 10 0.4 0.5 0.4 0.5 0.0
8. AS7018 12.247.68.177 0.0% 10 0.8 0.9 0.8 1.2 0.1
9. AS??? ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
10. AS7018 107.212.169.25 0.0% 10 1.4 1.5 1.4 1.6 0.1
11. AS??? ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
12. AS7018 99-42-115-109.lightspeed.tukrga.sbcglobal.net (99.42.115.109) 0.0% 10 3.6 3.2 2.1 4.9 0.9
[goose@Bucket: ~ ] $