New search capabilities available: phrases, excluding terms, alternatives ("or")
On Sunday, I took the site down briefly to upgrade the database from PostgreSQL 10 to 12. One of the main reasons for the upgrade was to get access to a new search function, and I've now switched to using it, so we have several nice new search capabilities available.
These should all be pretty familiar since a lot of other search systems and search engines have similar capabilities with the same syntax:
- As before, by default, searching for multiple words will be treated as "all of these terms". So if you search ~games for steam play, you'll get all topics that have both "steam" and "play" in them.
- Phrases can now be searched for by putting double quotes around them. Searching ~games for "steam play" in quotes will only find topics that specifically have "steam play".
- Terms can be excluded by putting a minus sign in front of them. For example, if you wanted to try to find ~games posts about Blizzard and exclude the recent China controversy, you could search for blizzard -china.
- Alternatives can be searched for by using "or". This changes to "any of these terms" instead of "all of these terms". For example, searching for overwatch or diablo will find any topic with either of those terms, instead of both.
- These capabilities can be combined, so you can exclude phrases, use "or" with phrases, and so on. For example: blizzard -"hong kong" or diablo.
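To make the rules above concrete, here is a rough sketch in Python of how such queries could be evaluated against a piece of text. This is not how PostgreSQL or Tildes implements it: the function names are made up for illustration, there is no stemming or stop-word handling, and the only precedence assumption is that "or" splits the query into alternatives whose remaining terms must all match.

```python
import re
from typing import List, Tuple

# One search term: (negated?, words). A single word is a one-word phrase.
Term = Tuple[bool, List[str]]

# Matches an optionally negated "quoted phrase", or any bare token.
TOKEN_RE = re.compile(r'(-?)"([^"]*)"|(\S+)')


def parse_query(query: str) -> List[List[Term]]:
    """Split a query into "or"-separated groups of terms that must all match."""
    groups: List[List[Term]] = [[]]
    for match in TOKEN_RE.finditer(query):
        minus, phrase, bare = match.groups()
        if phrase is not None:
            negated = minus == "-"
            words = re.findall(r"\w+", phrase.lower())
        elif bare.lower() == "or":
            groups.append([])  # start a new alternative
            continue
        else:
            negated = bare.startswith("-")
            words = re.findall(r"\w+", bare.lower())
        if words:
            groups[-1].append((negated, words))
    return [group for group in groups if group]


def contains_phrase(words: List[str], phrase: List[str]) -> bool:
    """True if the phrase's words appear consecutively in the word list."""
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))


def matches(query: str, text: str) -> bool:
    """True if any alternative has all of its terms satisfied by the text."""
    words = re.findall(r"\w+", text.lower())
    return any(
        all(contains_phrase(words, phrase) != negated for negated, phrase in group)
        for group in parse_query(query)
    )
```

For example, matches('blizzard -"hong kong" or diablo', text) accepts a text that mentions "diablo", or one that mentions "blizzard" without the phrase "hong kong".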
This all works through both the main site topic search (at the top of the sidebar) and the new search for your own topics/comments.
I'm going to write a page for the Docs with info about these capabilities, but I think I want to try to find a full specification of what's supported first to make sure I cover it properly. The PostgreSQL docs are pretty vague about it, so I'll probably need to take a look in the actual code.
Please let me know if you notice any issues with it, or if anything's confusing that I should make sure to document.
And as usual, I've given everyone 10 invites, accessible on the invite page.
Now this place has better search than reddit ever will. :P
Almost! (And being able to search your own comments is already better.)
I think we actually might have superior search overall once we have the ability to search against specific "fields" by doing things like "domain:youtube.com". That's probably the main missing capability now.
Reddit can't even get past 1k results or find anything older than a year or two at this point. :/
I wonder if there's any value in creating a tag-only search mode. Right now it's kinda pointless. If Tildes ever ends up with a pile of data like reddit's history, though, it could come in very handy. A tag-only mechanism could work wonders from a much smaller dataset than comment/title searches require. It could take the load off of doing deep searches that cover a lot of groups or search over longer time periods.
Is tag-based search implemented right now? StackOverflow solves this with [tagname] syntax.
Tags are searchable, but they're basically just treated as another word in the topic. There's no way yet to search exclusively for tags. For example, if you search for "social media" you'll find any topic that has the tag "social media", or has "social media" in its title or text.
You can use the specific tag lookup by either clicking the tag on an existing topic or going to an address like https://tildes.net/~tech?tag=social_media though.
What Postgres features specifically did you use for this?
I'm using its general full-text search, but these specific features were easy to add because of the websearch_to_tsquery() function that was added in PostgreSQL 11.
I can't link directly to it, but the main description of it is at the bottom of this section of the docs: https://www.postgresql.org/docs/12/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES
Here's the commit that added it; I've just started looking through it to try to get a full understanding of what it supports: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1664ae1978bf0f5ee940dc2fc8313e6400a7e7da
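For anyone wanting to try the function in their own project, a minimal sketch of passing a user's search string to websearch_to_tsquery() from Python might look like the following. The table and column names (topics, topic_id, title, search_tsv, created_time), the 'english' text search configuration, and the build_search helper are all assumptions for illustration, not the actual Tildes schema or code.

```python
from typing import Any, Dict, Tuple

# Hypothetical schema: a "topics" table with a precomputed tsvector
# column named "search_tsv". Adjust the names to match your own tables.
SEARCH_SQL = """
    SELECT topic_id, title
    FROM topics
    WHERE search_tsv @@ websearch_to_tsquery('english', %(query)s)
    ORDER BY created_time DESC
    LIMIT %(limit)s
"""


def build_search(query: str, limit: int = 50) -> Tuple[str, Dict[str, Any]]:
    """Return the SQL and parameters for a web-style full-text search.

    The user's text is passed as a bound parameter, so quotes, minus
    signs and "or" are interpreted by PostgreSQL rather than by us.
    """
    return SEARCH_SQL, {"query": query.strip(), "limit": limit}


# Usage with a DB-API driver such as psycopg2 (not executed here):
#   sql, params = build_search('blizzard -"hong kong" or diablo')
#   cursor.execute(sql, params)
```

Keeping the raw search string as a parameter means the application never has to parse the syntax itself; all of that is left to the database function.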
Wow, that's basically everything you'll ever need for free.
Great stuff, and now I feel like I should do the 10-to-12 for like 20 projects...
It was easy using the pg_upgrade tool that's included with PostgreSQL. I wrote the steps that were needed in the commit message here: https://gitlab.com/tildes/tildes/commit/ca509b220b7db1efc5929f91fc4ded3e52c5a9dd
It'll be a little different if you're not on Ubuntu, and some of it (like the part related to salt/pillar) probably won't be applicable elsewhere, but it was quite straightforward.
They'd have to do that if they want to keep an existing vagrant box and upgrade it from 10 to 12. It's probably simpler to just destroy it and create a new one that will start on 12.
Oh wow, that's awesome.
Most of my projects are in docker containers, but some of them are on Ubuntu - this is actually super helpful. Thanks a ton!