How computationally expensive is filtering and ignoring?
Hello,
I realised that I don't ignore and/or filter out topics and/or tags because I think they are costly for the servers.
Is there something to it or is it almost free?
Can I help alleviate the cost by going back and "unignoring" posts that won't clutter my frontpage any longer since they aren't active?
Other comments point out some great things, here's some Tildes specific details:
The way Tildes ignores work is by doing a left join on the Topic and TopicIgnore tables and then filtering that no ignore exists. So it's two primary keys being joined in the database, basically.
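To make the left-join idea concrete, here is a minimal sketch using Python's built-in sqlite3 rather than PostgreSQL. The table names, column names and the `visible_topics` helper are made up for illustration and are not taken from the Tildes codebase; only the shape of the query (left join, then keep rows where no ignore matched) follows the description above.

```python
import sqlite3


def visible_topics(conn, user_id):
    """Return ids of topics the user has NOT ignored (hypothetical schema)."""
    rows = conn.execute(
        """
        SELECT t.topic_id
        FROM topics AS t
        LEFT JOIN topic_ignores AS i
               ON i.topic_id = t.topic_id AND i.user_id = ?
        WHERE i.topic_id IS NULL      -- no matching ignore row was found
        ORDER BY t.topic_id
        """,
        (user_id,),
    ).fetchall()
    return [row[0] for row in rows]


# Tiny in-memory database; schema and data are assumptions for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE topics (topic_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE topic_ignores (
        user_id  INTEGER,
        topic_id INTEGER,
        PRIMARY KEY (user_id, topic_id)
    );
    INSERT INTO topics VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO topic_ignores VALUES (7, 2);  -- user 7 ignores topic 2
    """
)

print(visible_topics(conn, 7))  # [1, 3]
```

Both sides of the join are looked up by primary key, which is why this kind of query stays cheap.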
If we take this topic's base 36 ID "1u2o" and convert it to base 10 we can know there are at least 85632 topics in the database. Even in the worst case that you're somehow showing all topics at once on a page I don't expect any databases these days to struggle with joining fewer than 100k records. Especially not PostgreSQL which is what Tildes uses.
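The base 36 conversion is easy to check yourself. A quick sketch in Python, where the variable name is just for illustration:

```python
# Python's int() accepts a base from 2 to 36; digits run 0-9 then a-z.
topic_number = int("1u2o", 36)

# Same thing by hand: 1*36**3 + 30*36**2 + 2*36 + 24  (u = 30, o = 24)
by_hand = 1 * 36**3 + 30 * 36**2 + 2 * 36 + 24

print(topic_number)  # 85632
```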
Tags are more complex but even they are optimized well by using PostgreSQL's ltree data type for all of its functionality, so I can't imagine this being that expensive either.
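As a rough illustration of what ltree-style matching means, here is a pure-Python sketch of a "descendant or equal" check on dot-separated label paths, similar in spirit to ltree's `<@` operator. The function names and the example tags are hypothetical, and this is not how Tildes or PostgreSQL implement it internally.

```python
def is_descendant_or_equal(tag, ancestor):
    """True if `tag` equals `ancestor` or sits below it in the hierarchy."""
    tag_labels = tag.split(".")
    ancestor_labels = ancestor.split(".")
    # Compare whole labels, so "sciencefiction" is not under "science".
    return tag_labels[: len(ancestor_labels)] == ancestor_labels


def topic_is_filtered(topic_tags, filtered_tags):
    """True if any tag on the topic falls under any filtered tag."""
    return any(
        is_descendant_or_equal(tag, filtered)
        for tag in topic_tags
        for filtered in filtered_tags
    )


print(topic_is_filtered(["science.physics"], ["science"]))  # True
print(topic_is_filtered(["sciencefiction"], ["science"]))   # False
```

In the real database this comparison is done by the ltree type itself, typically with index support, so it does not loop in application code like this.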
There's no need to worry about the computation need of this at all. :P Ignore and filter to your heart's content!
Relevant code snippets:
Thank you, this was exactly what I wanted to hear <3
I haven't looked at the Tildes codebase or infrastructure, so the following is generalised from my experience as a software developer rather than deep knowledge of this particular stack. So anyone who chimes in with more specialised knowledge is probably more correct than I!
It's going to be a completely negligible difference for any remotely real-world usage scenario you might have. The database queries are still going to need to go and check whether or not you have ignored topics / tags even if you don't use the feature. Because it's the 'expected' way for users to use the site, I'd be quite surprised if those queries weren't fairly well optimised with whatever indexes were needed to make it quick and efficient to do so. Depending on how it's set up, maybe if you did something crazy like ignore every single topic that had ever existed that might have more of an impact? But that's not really what you were asking.
If it'll make your experience more pleasant I'd say use it and don't stress about it, I'd be pretty confident it's absolutely dwarfed by other areas of hosting costs and is as you say almost free.
It is, as most things, proportional to size. However, we have to bear in mind that an unfiltered topic list on Tildes is smaller than your average JPEG.
And when it comes to size and sorting data, we don't generally start getting into 'need to optimize' territory until at least 20,000 records. Especially for the homepage which will 100% be in memory.
Performing a full-text search is far more intensive. A single pageload of Reddit is even more.
None of this is remotely close to generative AI, which is using heavy CPU and GPU compute in an exponential way.
I'm fairly certain you can offset your entire Tildes electrical consumption by lowering your screen brightness 5%.
Or better yet, cut just 5 minutes from your daily shower (or take a cold shower), and you'll have compensated for most of the digital activity you could reasonably do as a regular person. Digital tech is actually fairly frugal on a per-user basis, so long as you keep your device as long as possible (it might not stay that way forever if we keep using bigger and bigger LLMs), and heating water is actually really expensive!
Gotta get one of those heat pump water heaters. Mine consumes about as much electricity per year as my GPU.
You can donate directly to Tildes to help keep the platform running. Notice that there isn't any advertisement here. I give a small amount each month.
The link to the Tildes donation page is here
Thanks, will look into this!
Practically free
I'd go as far as saying that if you ever used AI for anything, you have burned more energy than with all these queries combined.
It may be a bit harder on hardware, but I suppose Tildes is so small that it could run fast on a Raspberry Pi. The performance cost of these queries would be negligible for a site this size.
Well, I haven't used an LLM on purpose, anyway. But I was more concerned about Tildes and my fellow tildatians rather than resource usage at large.
Even if you haven't used AI for anything, the cost of watching anything on a streaming service would also dwarf anything like this.
It's there for a reason, and I think it's silly to concern oneself with something like that.
It wasn't yesterday that someone called me silly, thanks <3
I do understand that you didn't mean it in the endearing way but I'll take it anyway!