Timeout when connecting to a local webserver through the internet, but only on WiFi
I've recently moved, so I have a new ISP and I've also switched to new network hardware. I've been pulling my hair out trying to understand why I keep getting 100% timeouts when connecting to a locally hosted website. To make it more complicated, it only happens when I’m on WiFi.
Hardware setup is:
ISP router/modem -> Ubiquiti Cloud Gateway -> U7 Pro AP -> Laptop
-> Webserver
The issue is opening https://foo.bar.baz:58443 when on WiFi. This domain points to my home (not really bar.baz, but you get the idea). There is a port forwarding rule to get to the local server. With tcpdump, I see the request coming in on the webserver, an SSL handshake is completed, and then there are a bunch of TCP retransmissions.
Some observations:
- If the machine with the browser is connected to a cable and not WiFi, everything is fine, no timeouts.
- Opening https://192.168.1.123:58443 (webserver address) is fine (WiFi or wired).
- Opening https://10.0.1.123:58443 (gateway address) is fine (WiFi or wired).
I thought it would be MTU related, but haven’t had any luck with changing it to a lower size. I’m not positive I’ve done this correctly, though, so it may still be MTU related.
I know there are people here that know way more than I do about networking, so I hope somebody can point me in the right direction.
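On the MTU suspicion above, a small sketch of the header arithmetic may help when picking test sizes. It assumes plain IPv4 with no IP or TCP options (20-byte IPv4 header, 8-byte ICMP header, 20-byte TCP header); the function names are made up for illustration, and the MTU values in the loop are just common examples, not values taken from this network.

```python
# Header sizes in bytes, assuming IPv4 without options.
IPV4_HEADER = 20
ICMP_HEADER = 8
TCP_HEADER = 20


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def tcp_mss(mtu: int) -> int:
    """Largest TCP segment payload that fits in one packet of this MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    # Example MTUs only: Ethernet default, typical PPPoE, a conservative value.
    for mtu in (1500, 1492, 1400):
        print(f"MTU {mtu}: ping payload {max_ping_payload(mtu)}, MSS {tcp_mss(mtu)}")
```

The ping payload number is the size to try with a don't-fragment ping (on Linux that would be something like `ping -M do -s 1472 <host>`). If that size fails on WiFi but a smaller one goes through, the path MTU is lower than the interface MTU suggests.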
Check the security settings on your Ubiquiti system. The fact that you can hardwire in, and that the direct local IP works, tells me that something there is stopping the connection.
Just to confirm, do you mean that’s fine on WiFi as well as wired?
Correct, either works (added it to the initial post)
I’m wondering if the router is getting mixed up between whether it’s an internal or external connection, and that’s interacting in an unexpected way with the firewall zones? The defaults only allow return traffic in a lot of cases, but if the initial connection is external and then the router takes a shortcut once it realises both IPs are internal, that could be changing how it’s hitting the policies (e.g. blocking the internal connection because the allowed return traffic was expected to be on the external connection). It could explain why it’s happy when you use the internal IP directly but not the external URL.
Have you checked the “Flows” panel on the UniFi console? Might be worthwhile to get an idea of how the router is interpreting the connections - whether they’re being blocked, what it thinks the IPs at each end are, which networks/VLANs it thinks it’s routing between.
Dumb question, but where should this panel live? I only see network/default/insights/flows, which has blocked/threats flows. Both are empty.
Nah, not at all - much as I love where they sit on the price vs capabilities vs integration Venn diagram, the ui UI (heh) has enough quirks, inconsistencies, and straight up bugs that finding the right thing can be a crapshoot. I keep meaning to move to managing mine with terraform, actually, just need to find the time...
Anyway, rambling aside, you're in the right place already! For some reason that page doesn't always load the latest data for me even on browser refresh, so first thing I'd try is clearing all filters (bottom left) and then hitting the tiny refresh icon by the search field (top left). Even if the flows to/from your server aren't marked as blocked and are failing for some other reason, selecting the rows individually will give you a lot of additional info about what the router sees as the connection endpoints, VLANs, IPs, security zones, etc. that can be very helpful for debugging this kind of thing.
If those flows aren't showing at all, it probably means the log level isn't turned up high enough to capture what you need; if you go to
/network/default/settings/cybersecure/traffic-logging
you can flip "Flow Logging" to record all traffic, which should start populating the previous page pretty much immediately. Then you can make some test connections to your web server again and inspect them easily from the router's perspective, and you can just disable the logging again as soon as you're done if you don't want it keeping those records more generally.
No kidding, the UI is a maze. It's not been a smooth transition to this setup. Same with this "Flow Logging" setting: after a good hour of looking, it turns out flow logging isn't available on the Cloud Gateway Ultra. With all the extra costs, like PoE+ injectors, I ended up getting the most basic model Cloud Gateway. I thought it was expensive enough as it is. Annoying to find out that also means some features are locked away.
Okay, no more negativity, just a little buyer's remorse... I finally threw in the towel and added a DNS Host (A) record to the policy engine pointing to the local server. This way traffic won't go out to the interwebs and we can all access our calendars again, both inside the house and when we are on the road. Family happy and I might be able to forget about my failure in the future.
Ah crap, sorry about that, I didn’t realise it was a restricted feature - and honestly even if I had I probably wouldn’t have remembered which way around “Ultra” and “Max” sit in the product line!
Glad to hear you’ve got it working though, any fix that does the job without breaking anything else is a good one as far as I’m concerned. In my experience this kind of setup on a new project always turns up the esoteric bits that feel like they’re missing, there’s a ton of second guessing of the choices that got there, and then a week later I forget it was ever an issue and it works totally fine for the next however many years.
Since you say the SSL handshake is completed, my first guess is to look at some kind of filtering or interception occurring on your laptop.
Some troubleshooting questions - are you able to reach other servers or websites on the internet? Including nonstandard ports? Ping?
When you're on wired networking are you sure your traffic is hairpinning out to the internet and back in to your server? Are you sure you aren't just routing internally?
Is anyone else able to reach your server across the internet?
Yes, no other issues.
How would I verify this? Traceroute?
Using the mobile network I can access my server just fine. With a hotspot I can use the same laptop to access the site.
Likely Noise: Double NAT?
There is double NAT, that’s right. Unfortunately it’s not possible to put the ISP’s router in bridge mode. What noise are you thinking of?
As in Tildes-tagged noise, I would guess. Read as: "This is probably bogus, but:"
Yep.
@Greg is giving better advice than I could with the specific hardware involved but I’m getting old double NAT vibes.
Thanks, I think this quote from the help page has broken my will to really figure it out. But without kidding, thanks so much for diving deeper than I did! This might just be the real reason why it's not working.
I've given up and have just added a DNS record to the gateway that points to the local server. This way it works inside the network and outside too. It's not pretty, but it works.
Rule 0: It doesn't need to be pretty as long as it works.
Might be worth checking browser settings. I know most browsers have built-in security features these days, like automatic-VPNs, secure DNS stuff, web filtering and such. I can't say I've had this specific issue, but I've definitely run into issues trying to get to my own internal systems (but like internal to internal, not going out to the net to get back in), ready to perform some "percussive maintenance" on EVERYTHING with a metal bat, only to realize my browser had some stupid security settings enabled.
Though I'm not sure why access via domain would work on hardwire but not WiFi.
Browser caching has definitely tripped me up in the past, so I can totally relate to this. I think we can rule the browser out though. The same issue is also happening with the VPN server I host on a different machine that's sitting beside the webserver. I see traffic coming in, but no connection is established on the client side.
Testing this again now, I see the traffic coming into the VPN server takes forever when on WiFi (haven't checked with a wire). When I'm on mobile, the logging of the VPN server shows instant negotiation. I assumed the issue was something with the return traffic, but that's not the case. Something to look into later.
I would look at the AP, since it is another box between you and the destination on the wifi path, and probably has a web UI and some ability to firewall. What does it think its own IP is? Is it somehow doing triple NAT?
**EDIT:** I just noticed this comment in your post, and from here this sounds like DNS issues. My original reply is underneath.
Opening https://192.168.1.123:58443 (webserver address) is fine (WiFi or wired).
Opening https://10.0.1.123:58443 (gateway address) is fine (WiFi or wired).
Original reply:
From your updated comments (cited below) it sounds suspiciously like a NAT or port forwarding issue. Or something similar to asymmetric routing issues. Some of your packets are getting through and some aren't.
Using the mobile network I can access my server just fine. With a hotspot I can use the same laptop to access the site.
I've given up and have just added a DNS record to the gateway that points to the local server. This way it works inside the network and outside too. It's not pretty, but it works.
The same issue is also happening with the VPN server I host on a different machine that's sitting beside the webserver. I see traffic coming in, but no connection is established on the client side.
Testing this again now, I see the traffic coming into the VPN server takes forever when on WiFi (haven't checked with a wire). When I'm on mobile, the logging of the VPN server shows instant negotiation. I assumed the issue was something with the return traffic, but that's not the case. Something to look into later.
So my rambling thoughts - when you query DNS for your server name (or VPN) do you get the same results on wired and wireless? Different results inside the network vs outside? Because typically, outside and inside DNS results should be separate, and different. If you're trying to reach http://foo.bar and it turns up both an internal and external (NAT) address then you're going to get mixed up. Also, you don't want servers overlapping. I assume your servers are all using the same (public) IP and this is why you are using nonstandard ports to differentiate servers?
Your wireless DNS resolution should be identical to your wired DNS (both internal). Also, clients can spread their queries across the DNS servers they have configured, so if you have two different DNS server addresses, you may variously get results from both. If those servers aren't returning the same answer you will have issues.
On the same lines, is the behavior any different if you only connect to ip_address:port and don't use DNS at all?
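One rough way to compare the DNS answers described above is to run the same lookup from the wired and the wireless connection and see whether the name comes back as an internal or an external address. This is only a sketch: `classify` and `resolve` are illustrative names, it uses whatever resolver the OS is configured with, and "private" here follows Python's `ipaddress.is_private`, which covers the RFC 1918 ranges plus some other reserved blocks.

```python
import ipaddress
import socket
import sys


def classify(addresses):
    """Label each IP string as 'private' or 'public'."""
    labelled = []
    for addr in addresses:
        ip = ipaddress.ip_address(addr.split("%")[0])  # drop any IPv6 zone id
        labelled.append((addr, "private" if ip.is_private else "public"))
    return labelled


def resolve(host, port):
    """Return the unique addresses the local resolver gives for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


# Only does a real lookup when called as: python script.py <hostname> <port>
if __name__ == "__main__" and len(sys.argv) == 3:
    host, port = sys.argv[1], int(sys.argv[2])
    for addr, kind in classify(resolve(host, port)):
        print(f"{host} -> {addr} ({kind})")
```

Run it once on the cable and once on WiFi with the real hostname and port. If one run shows a private address and the other a public one, or a single run shows both, the two paths are not seeing the same DNS view.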
For further testing I would recommend disconnecting your internet and making sure all of your traffic is routable internally (wired and wireless both). Can you hit the VPN or the web server?
With NAT, the nonstandard TCP port (internal) may be getting forwarded out to the internet rather than to internal devices. With your setup I would ensure ALL internal traffic is routed internally first and never go out to the carrier unless that specific traffic needs to get to the internet for some reason.
It really sounds like when you're on wireless, some of your traffic is going out to the internet and coming back in, and some of your traffic is trying to hit the server internally, and all of your devices are getting confused.
Try disconnecting your internet service and see if you can reach everything internally.
If you aren't already set up this way I would consider doing this: (Carrier) <-> (dedicated firewall) <-> (dedicated internal router) == (everything else internal)
Rather than having an all-in-one device to handle all the routing and firewall and wireless etc. Your network is complex enough to justify the separation and this would make it much easier to troubleshoot. In other words, keep the internet traffic routed to a separate dedicated device outside of your internal network. And aside from that, all your traffic should be perfectly routable inside. Your internal router should handle all DNS, DHCP and everything else, and anything that specifically needs internet access (like DNS) should be owned and cached on the router but handed off to the firewall if needed.
One last consideration I would recommend to anyone for internal networking: consider setting up your internal IP range as 172.16.0.0 to 172.31.255.255 instead of 192.168 or 10.x. The reason is that everyone uses 10.x or 192.168, and picking something different makes it much easier to avoid any potential overlap. Network devices sometimes use 192.168 as the default setup for configuring via network. If that default IP happens to be the same as one of your servers, it's a pain to set it up. But setting up 172.16 right from the get-go will preempt overlapping issues. It isn't nearly as popular.
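For reference, the three private ranges and the overlap problem mentioned above can be checked with Python's standard `ipaddress` module. This is just an illustration: the `describe` and `overlaps` helpers are made-up names, and the subnets in the last lines are example values, not anyone's actual network.

```python
import ipaddress

# The three RFC 1918 private IPv4 blocks.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def describe(net):
    """Show a network alongside its first and last address."""
    return f"{net}: {net[0]} to {net[-1]}"


def overlaps(a, b):
    """True if two subnets (given in CIDR notation) share any addresses."""
    return ipaddress.ip_network(a).overlaps(ipaddress.ip_network(b))


if __name__ == "__main__":
    for net in RFC1918:
        print(describe(net))
    # Example subnets: a device's factory default vs. a home LAN.
    print(overlaps("192.168.1.0/24", "192.168.1.0/24"))
    print(overlaps("172.20.0.0/24", "192.168.1.0/24"))
```

The middle block is the 172.16.0.0 to 172.31.255.255 range suggested above. The `overlaps` check is the kind of thing worth doing before connecting two networks, for example a home LAN and a remote network reached over VPN.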