Is there a proxy/vpn setup that can compress data in situ?
I've been wondering about this for a while whenever I'm on a metered connection or a capped one.
It'd be cool if I could use my VPS to save data in exchange for latency. Having it download and compress any compressible material before serving it to me would be a godsend, but it sounds very edge-case-y given how places like YouTube deliver videos in bite-size pieces.
Does something like this sound at all possible, or should I just assume it's too niche and look for other ways to save data?
This is quite challenging to do. Let's say you're building such a compression system for web browsing.
Most of the data you download is already well compressed: HTML files are sent gzipped (or Brotli-compressed), and images and videos are compressed (wherever feasible) in lossy ways that trade off quality for data size.
Some mobile providers on metered plans have tried to add their own compression on top. This typically means applying even more lossy compression to already-compressed images, which can noticeably reduce fidelity. The general challenge is: how can your general-purpose compression system beat the trade-offs made by the developers of each unique website you visit?
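To see why a second layer of lossless compression buys so little, here is a minimal Python sketch. The input is synthetic word soup that I generate as a stand-in for a page's text (an assumption, not real web traffic), and exact sizes will differ for real pages. It gzips the text once, the way a web server would, then gzips that output again, the way a naive compressing proxy would.

```python
import gzip
import random

# Synthetic "page text": random words from a fixed vocabulary.
# This is a stand-in for real HTML, chosen so the demo is reproducible.
rng = random.Random(0)
letters = "abcdefghijklmnopqrstuvwxyz"
vocab = [
    "".join(rng.choice(letters) for _ in range(rng.randint(3, 9)))
    for _ in range(500)
]
page = " ".join(rng.choice(vocab) for _ in range(20_000)).encode()

once = gzip.compress(page)   # what the origin server already sends
twice = gzip.compress(once)  # what a proxy would get by compressing again

print(f"raw:          {len(page)} bytes")
print(f"gzipped once: {len(once)} bytes")
print(f"gzipped twice:{len(twice)} bytes")
```

The first pass shrinks the text a lot; the second pass has almost nothing left to remove, because well-compressed output looks close to random to the compressor.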
Let's say the developers did a poor job. Can your compression system help? The Cloudflare CDN can improve poorly optimized sites by combining and optimizing their JavaScript files, but this takes time and CPU resources. That leads to another problem: compression also trades off compression ratio against compression time, and the best results introduce latency (so they are best done once by a CDN, cached, and reused thousands of times).
One final challenge is that VPNs introduce their own overhead, too. Data packets are split up, padded, and encrypted, which adds latency and increases the amount of data transferred (for instance, Google says the VPN they offer in Project Fi adds about 10%).
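The ratio-versus-time trade-off is easy to observe with Python's built-in `zlib`. This is only a rough sketch on synthetic text that I generate for the demo (an assumption, not real site assets); timings depend entirely on your machine, so it prints them rather than promising specific numbers.

```python
import random
import time
import zlib

# Synthetic text payload, reproducible via a fixed seed.
rng = random.Random(0)
letters = "abcdefghijklmnopqrstuvwxyz"
vocab = [
    "".join(rng.choice(letters) for _ in range(rng.randint(3, 9)))
    for _ in range(500)
]
text = " ".join(rng.choice(vocab) for _ in range(50_000)).encode()

def measure(level: int) -> tuple[int, float]:
    """Return (compressed size in bytes, seconds taken) for a zlib level."""
    start = time.perf_counter()
    out = zlib.compress(text, level)
    return len(out), time.perf_counter() - start

for level in (1, 6, 9):
    size, seconds = measure(level)
    print(f"level {level}: {size} bytes in {seconds * 1000:.1f} ms")
```

Higher levels search harder for repeated strings, so they usually produce smaller output at the cost of more CPU time. A proxy compressing on the fly pays that cost on every request, while a CDN can pay it once and cache the result.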
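A quick back-of-the-envelope check shows what that overhead means for a compressing VPN. The sketch below assumes the roughly 10% figure applies uniformly to every byte sent through the tunnel, which is a simplification; the function names are just illustrative.

```python
def net_bytes(original: float, compression_saving: float,
              vpn_overhead: float = 0.10) -> float:
    """Bytes actually transferred after compressing, then tunnelling.

    compression_saving: fraction removed by the proxy (0.25 = 25% smaller).
    vpn_overhead: fraction added by the VPN (assumed uniform, 0.10 = 10%).
    """
    return original * (1 - compression_saving) * (1 + vpn_overhead)

def break_even_saving(vpn_overhead: float = 0.10) -> float:
    """Compression saving needed just to cancel out the VPN overhead."""
    return 1 - 1 / (1 + vpn_overhead)

print(f"break-even saving: {break_even_saving():.1%}")
print(f"1 MB page, 5% saved:  {net_bytes(1_000_000, 0.05):,.0f} bytes")
print(f"1 MB page, 30% saved: {net_bytes(1_000_000, 0.30):,.0f} bytes")
```

Under that assumption, the proxy has to shave a bit over 9% off already-compressed traffic before you save a single byte, which is a high bar given the points above.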
Given all of this, the potentially most powerful way to save data is to do it in the browser -- run a version of your web browser in a place with better connectivity and have it download everything your browser would, combine, and pre-process it. Loading a webpage often happens in multiple stages, as resources downloaded trigger downloads of additional ones. Your remote browser in a speedy place can do these phases of loading much faster, batch them up, and send them down to you. I believe Chrome offers this kind of functionality with its "Data Saver" service.
Edit: Here's a whitepaper on how Chrome Data Saver works. It describes which optimizations are most effective and the latency impacts in great detail.
If you want to save data, install uBlock/uMatrix. On some websites this can cut out 3/4 of the page size.