Autocomplete for topic tagging is now available
This is something that's been requested and worked on for a very long time, and should help a lot with the consistency of tags that people use on topics. It's also another significant feature that's been added by an open-source contributor: Shane Moore (whose Tildes username I actually don't even know) has been working on this on and off since last July, and has put up with me being slow to review and requesting some major changes to it over that time.
It applies to both the tagging field for new topics as well as the one for editing existing topics' tags, and the list of tags that show up for autocompletion are the 100 most commonly-used tags in each individual group (so the suggestions are different between ~tech and ~music and so on). This is just based on pure frequency at the moment (as in, the 100 tags that are on the most topics in that group), but in the future we could probably improve this to specifically include tags that have been getting used more recently, instead of looking at all time.
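As a rough illustration of the "pure frequency" approach described above, here is a minimal Python sketch. It is not Tildes' actual code: the `top_tags_by_group` function, the `(group, tags)` input shape, and the sample topics are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def top_tags_by_group(topics, limit=100):
    """Return {group: [tag, ...]}, most-used tags first, at most `limit` per group.

    `topics` is an iterable of (group, tags) pairs (a hypothetical input shape).
    """
    counts = defaultdict(Counter)
    for group, tags in topics:
        # set() so that a tag is counted once per topic
        counts[group].update(set(tags))
    return {
        group: [
            tag
            # sort by count descending, then alphabetically to break ties
            for tag, _ in sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]
        ]
        for group, c in counts.items()
    }


# Hypothetical sample data
topics = [
    ("~tech", ["linux", "privacy"]),
    ("~tech", ["linux"]),
    ("~music", ["jazz"]),
]
suggestions = top_tags_by_group(topics, limit=1)
```

Weighting recent usage, as mentioned as a possible future improvement, would mean replacing the plain count with a score that decays with topic age.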
The interface can probably still use some work, and it's likely that there are some bugs and other issues with it, but as I've said before, Tildes is supposed to be in alpha! I haven't been adding nearly enough frustrating issues or breaking things, and we're all getting complacent with having a site where most things work!
Let me know what you think of it, and if you notice any issues. And thanks again for all the work and patience, Shane!
Thanks a ton for your contribution, Shane (and your continued hard work, Deimos). The new tagging interface with the autocomplete is great and I really like the new formatting of tag X, tag 2 X, etc X; it's nicer looking and IMO way more easily usable than the previous manually formatted comma separated value layout of tag, tag 2, etc.
I just tripped over this new feature by accident - because it has broken one of my workflows.
Sometimes when I'm posting a current article about an ongoing news item, I'll open a previous topic about that news item, open the tag field, and copy-paste the tags from the previous topic to the current topic (for instance, I was trying to copy the tags from this previous topic for this current topic). This change stops that process: the tags are no longer copyable.
Admittedly, I started doing this workflow in part because it was easier than typing out each tag manually - and now we have this handy new tag-suggestion feature to take care of that.
However, being able to copy tags is helpful when there's a particular set of tags which are relevant, but not common. Uncommon tags are not going to come up as suggestions.
Oh, and by the way... I do like the feature! I've tested it a little bit since I discovered it, and it works nicely.
Agreed, I've noticed a few inconveniences with it myself too. For example, when I went to change all the "novel" tags to "novels", it was a little annoying to need to X them all and type "novels" in full instead of just being able to add an "s" in the right spot. Fixing typos in tags (which I do fairly often) is more annoying as well.
I wonder if a simple toggle to flip it to "text mode" that just goes back to the old format might be a good way to address all the similar issues easily.
For what it's worth, backspacing into a chip highlights it right now, and another backspace deletes it (which is at least a little easier than clicking the 'x'), but I also really like this idea ^^^
I assumed you would run a script to do this in the database, rather than doing it via the user interface.
Let's not get sentimental about something that was just a temporary placeholder. There's nothing wrong with the new format. Work on that basis.
Or implement tag synonyms like Stack Overflow.
How would that change existing tags of "novel" to "novels"?
It wouldn't. Tag synonyms are often used as a crutch or interim solution on SO when the "burnination" process (deleting a tag and replacing it with more correct ones) would be a large undertaking.
In this instance, if we were to implement StackOverflow style tags (it might be the one thing they do right), I would say a burnination of novel → novels is the correct solution here.
That's what I thought - which is why I was surprised that @haykam821 suggested this as a solution.
I'd appreciate this. I make a lot of typos/brainfarts when typing text, so that is one use. But also, when I did this week's thread at ~books, I noticed that it took me almost 10x the time to add tags to the topic. Till now, I'd just hit edit tags on the last thread and copy the contents over; this time I had to hop from tab to tab and type manually. Not a huge inconvenience, as it is just a 1-second job every 14 days becoming 10 seconds long, and elsewhere it is really useful; but a toggle, or even something like double-click on the tags field would be convenient.
oh also, piggybacking off of this, the tags aren't editable or re-sortable once they go in, so whatever order you type them in is fixed unless you manually delete and retype them. this is mildly annoying if you fuck up a beginning location tag like i did on this article, because i had to retype all the tags just to add the .wellington on the end of the location tag. i'm sure that it's just because the feature is new that things like this haven't been ironed out yet, though.
I get fussy about the order of tags, too, but it's not necessary. It's nice to have the most important tags at the front of the list, and the less important tags at the back of the list, but it doesn't really matter. They're used mostly for filters and searches, so they don't really need to be in any order.
That said, it would be nice to be able to add tags anywhere in the list rather than only at the end (like when I added a "language" tag to that post because of reasons).
incidentally, i think i added that as a tag originally and then forgot to add it back in when i nuked them all to add
Well, there you go, then - another reason to be able to add tags anywhere in the list!
I don't think "tag order" is something I'd ever even want to worry about. I wonder if the site giving them an objective sort (like alphabetical, or length-based) would be a better option.
i mean, we as a collective do seem to have some sort of unspoken hierarchy of where certain tags go and what tags take precedence over others from what i've observed (and which i tend to model my tags after). location tags seem to almost always go first, followed generally by tags pertaining to the discipline the article falls in, followed by main subject and pertinent content within the article, and then whatever else is left over that people feel should be tagged about the article. so i think there is some significance, unconscious or otherwise, to tag order on here and how people use it.
I think that's very interesting. I've been on the site for over six months but had not picked up on that at all.
it's certainly an imperfect thing, to be clear, and i don't think everybody does it (and also my description of it is pretty reductive because the information contained within tags varies from section to section). i also dunno that it's necessarily something that would justify being able to sort tags since i don't know the sort of work that would go into making that a thing. but it does seem like it'd be a useful feature for some people, which is why i lumped it in there.
I too try to subscribe to the country/region tag first, content type next (e.g. video, podcast, documentary), etc. And while I definitely agree ordering of tags would be useful to have eventually, and while it wouldn't be much work to initially implement such a system, it would be a lot of work to identify and sort out what tags should go where and then to maintain that sorting as Tildes diversifies its groups and topics.
oh, to be clear i'm not thinking of any automatic sorting system here. i'm literally just thinking of like, the ability to manually sort tags within the little tag bar we're given by dragging them around or something.
Ah... okay, gotcha. Yeah, perhaps a "drag and drop" option on the tags to allow that would be nice. I added a gitlab suggestion for it:
Neither have I. I've just been doing them randomly, lol.
My technique for tagging is to think about sub-groups first. The first tag in my list of tags is the one that represents the future sub-group this topic will appear in. So:
For ~science topics, that might be "biology" (~science.biology) or "physics" (~science.physics).
For ~humanities topics, that might be "history" (~humanities.history) or "language" (~humanities.language).
For ~life topics, that might be "parenting" (~life.parenting) or "working" (~life.working).
For news topics, it's different, because news is often localised, so I tend to use location-based tags first.
But, in most groups, the first tag I apply to a topic is the one that I predict will eventually be a sub-group.
Yeah, that's pretty similar to my reasoning and methodology too... Hence why "history" is often the first tag for me on my ~humanities topics as well. ;) My only exception to that is "videos" which I often apply first so people immediately know the kind of content they are about to view, or the region/location tag so they know where ~news articles relate to.
I don't think the tag bubble style really fits with the rest of Tildes. Maybe a box would work? Other than that, I like the suggestions.
Agreed, I've gotten rid of the rounded corners. Thanks!
No problem. That looks much better.
I'm sure the most popular 100-tag set for each group was used for performance reasons, but I think that removes the potential benefit of discoverability for less frequently used tags. When tags are less frequently used and seen, it's helpful to have them suggested so that future posts can standardize a bit. When suggestions are based on popularity, the most popular will continue to dominate as they get suggested, and the less common ones may continue to splinter variations without suggestions, thus remaining unsuggested. When tag reuse is tied not only to finding posts but, eventually, to creating groups, this becomes more important.
It's not so much for performance reasons, but there needs to be a balance between discoverability and usefulness. It's not very useful to get a giant list of suggestions of tags that have only ever been used once or twice, that'll just be even more confusing for someone that isn't sure how to tag their post.
"100 most common" definitely isn't necessarily the right number or method, but there does need to be some restrictions on it to keep it manageable.
I think group creation would almost always be attached to the most popular tags, so I don't think that should be a concern at all.
It seems like the search-set is limited to the top-100. Most implementations I've seen use the entire search set, order with some weighting (factors may include most frequently used, most recently used by user, etc.), and then display N results. These work well in balancing convenience and usefulness against discoverability. The more someone types, the more niche suggestions they can get (depending on how common the prefix is, of course). Right now, there's many tags that I tried typing exactly as they appear on other topics which instead return zero suggestions.
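To make the "entire search set, weighted, display N results" idea concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the `suggest` function, the `recency_bonus` weighting, and the tag counts (apart from "herpetology" being on two topics, which is mentioned elsewhere in this thread). It is not how Tildes currently works.

```python
def suggest(prefix, tag_counts, recent_by_user=(), max_results=10, recency_bonus=5):
    """Rank every tag starting with `prefix`, then return the best `max_results`.

    Score = usage count, plus a bonus if the user used the tag recently
    (a hypothetical weighting, chosen only for this example).
    """
    recent = set(recent_by_user)
    scored = []
    for tag, count in tag_counts.items():
        if tag.startswith(prefix):
            score = count + (recency_bonus if tag in recent else 0)
            # negate the score so a plain sort puts the highest score first
            scored.append((-score, tag))
    return [tag for _, tag in sorted(scored)[:max_results]]


# Hypothetical usage counts for one group
counts = {"health": 40, "history": 25, "herpetology": 2}

broad = suggest("h", counts, max_results=2)
narrow = suggest("her", counts)
```

With these sample counts, a short prefix still surfaces only the common tags, while typing more of the word lets a rarely used tag through, which is the balance the comment describes.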
Right. My point with this was that it will be harder for new tags to grow in popularity because they're less discoverable and, without suggestions, harder to settle on a standard between multiple similar variants.
Can you give some specific examples? It would be helpful for looking at the thresholds.
Not the parent, but I just tested two from a recent submission in ~science.
zoology is found (six results), but not herpetology.
That might give some insight into the thresholds?
But zoology is a larger field than herpetology: herpetology is actually a subset of zoology. I would expect the broader field to be tagged more than the narrower field.
Sure. I didn't mean to imply that wasn't the case.
Why would you expect the less frequently used and less broadly focussed tag to be suggested as much as the more frequently used and more broadly focussed tag? A "herpetology" tag is going to be used less often than a "zoology" tag because herpetology is a narrower field which is going to be discussed less often than zoology.
We have a new feature that attempts to autocomplete/suggest tags from what you're typing based on existing tags. The tag "herpetology" already exists as a tag on two posts in ~science. When creating a new post in ~science, it is not unreasonable to expect "herpetology" to be suggested when you start typing. Currently, however, it goes "h" -> ["health", "history"], "he" -> ["health"], "her" -> []. It has nothing to do with how the field of herpetology relates to the field of zoology.
One of the points of suggesting tags is to encourage standardisation of those tags. Tagging works best if you can use them to search or filter topics, and you can't use them to search or filter if they're not standardised. Imagine you're interested in the history of World War II. You'd want a way to search for topics about that period. But, if you have to search for "world war ii" and "world war 2" and "wwii", that's just annoying. It's much easier for you to find the content you want if you can use only one tag to search for it. So, at the input end, we want to guide people towards using one standard tag for these topics. When they type in "w", they should see only one tag for this topic (which I've been standardising as "world war ii"). If they also see the "world war 2" and "wwii" tags that have appeared infrequently on some topics, they're just as likely to select those non-standard tags as the standard one we're trying to encourage them to use.
In that context, we don't want a tag-suggestion feature that's going to suggest every single tag ever used in the entire history of Tildes. We want it to suggest a limited selection of tags - preferably the most commonly used tags, as they're more likely to be the standardised tags. "world war ii" is more likely to show up in the top 100 tags in ~humanities than "world war 2" or "wwii", so people will be more likely to select it than the others... and now we're moving towards standardisation.
Of course, this means that low-usage tags like "herpetology" might be selected out. But there's nothing stopping people from still adding that tag regardless.
Also, this is a first iteration of this feature. While future iterations might have more cleverness, involving "some weighting (factors may include most frequently used, most recently used by user, etc.)", that's not happening here and now. Here and now, this is just a proof of concept: can we have a feature which suggests tags, and will people use it? For the purposes of this testing, doing a simple dump of the top 100 tags in each group is sufficient. Let's get the basic feature working first, and add the cleverness later.
I wouldn't. I don't understand why you're saying this.
That's why. You were surprised that "zoology" was suggested, but not "herpetology".
The algorithm that Deimos has implemented will display "the 100 most commonly-used tags in each individual group". Given that zoology is a broader field and therefore likely to have more posts about it than herpetology, it's logical that "zoology" would be used more often as a tag than "herpetology" - so "zoology" is more likely to make the top 100 than "herpetology", meaning that "zoology" will therefore appear in the suggestions list, while "herpetology" will not.
I'm sorry but I wasn't surprised or confused. I was just listing these results as Deimos had asked for examples.
Ah, I see now. It's using a short pre-populated list of tags for suggestions, so it is indeed a very limited search set. That seems like a performance decision vs something like a live search.
Ha, just saw this. I'm Shane! :D I have a one-"o" difference between my GitLab and Tildes usernames. Thanks all for the kudos!
Ah, we were wondering what your Tildes username was. Thanks a lot for your hard work on implementing this feature. And as someone who does a lot of topic tag maintenance, it's very much appreciated! :P
I just tried it out over in ~test and it works great! I tried it on both my ThinkPad and my smartphone, but it didn't show up on the phone. I'm guessing it is, but just to check, is it supposed to be that way?
Oh, another possibility it might be: was your phone capitalizing the tag(s) you were typing? It looks like it's case-sensitive right now so it won't have any suggestions if you start with a capital letter (which phones often do).
I'll try to fix that. Edit: fixed.
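For anyone curious what a case-insensitive match looks like, here is a minimal Python sketch. The real fix lives in the site's front-end code and may look quite different; the `matching_tags` function and the sample list are assumptions for illustration only.

```python
def matching_tags(typed, suggestions):
    """Return the suggestions that start with `typed`, ignoring letter case."""
    needle = typed.strip().casefold()
    if not needle:
        # nothing typed yet, so nothing to suggest
        return []
    return [tag for tag in suggestions if tag.casefold().startswith(needle)]


# Hypothetical suggestion list for one group
group_tags = ["health", "history", "zoology"]

# A phone keyboard that auto-capitalizes the first letter still gets matches
capitalized = matching_tags("H", group_tags)
```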
Oh that's a great idea! I didn't even consider it. I'm currently in class so I'll have to wait a bit, but I'll check that again when we're done, that might be why.
Hmm, still doesn't work, and I manually cleared the cache and restarted the browser and everything. Field just stays empty and works like before. My phone is ancient, an Epic 4G, but I'm using the built in browser of KitKat, which shouldn't be too old. Oh well, it's not a big deal!
The new system is working fine on my iPad and even my ancient iPhone 4. Have you tried hard-refreshing the submission page and/or clearing your browser cache?
Do we have to do client-side cache clearing? Is there no cache invalidation mechanism? If not, maybe that should be an issue.
In projects I’ve been slightly part of, I thought it usually involved adding a unique suffix to each changed js filename during build. Note: I have very little idea of which I speak.
There are cache-busters already in place AFAIK... but they don't always work for unknown reasons.
There should be, there's a cache-busting param on the script file:
Maybe it just didn't work for you for some reason.
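As a general illustration of a cache-busting query parameter, here is a short Python sketch. The `cache_busted_url` function, the `v` parameter name, and the file path are hypothetical; this is one common way to do it, not necessarily the mechanism Tildes uses.

```python
import hashlib


def cache_busted_url(path, content, param="v"):
    """Append a short hash of the file's bytes as a query parameter.

    The URL changes whenever the content changes, so a browser treats the
    updated script as a new resource instead of reusing its cached copy.
    """
    digest = hashlib.sha256(content).hexdigest()[:8]
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}{param}={digest}"


# Hypothetical script path and contents
old_url = cache_busted_url("/js/site.js", b"console.log('old');")
new_url = cache_busted_url("/js/site.js", b"console.log('new');")
```

This only helps if the page that references the script is itself fetched fresh; a cached copy of the page would keep pointing at the old URL, which is one possible reason a hard refresh is sometimes still needed.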
Very nice feature. Thanks for the contribution Shane!
Are there any plans to add a de-dupe step for plurals? eg. I notice both "novel" and "novels" show up in ~books.
We should probably re-tag all the "novel" ones to "novels" (I'll do that now, and the list will fix itself in an hour or so, next time it updates the suggestions). The tagging guidance page says that tags should generally be plural whenever possible.
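A re-tag like "novel" to "novels" boils down to a rename plus a de-dupe on each topic's tag list. Here is a minimal Python sketch of that step, assuming tags are stored as an ordered list per topic; the `retag` function is hypothetical and not the script actually used.

```python
def retag(topic_tags, old, new):
    """Replace `old` with `new` in a topic's tags, keeping order and dropping duplicates."""
    result = []
    for tag in topic_tags:
        tag = new if tag == old else tag
        # a topic that already had both tags should end up with just one
        if tag not in result:
            result.append(tag)
    return result


# Hypothetical topic that was tagged with both forms
fixed = retag(["fiction", "novel", "novels"], "novel", "novels")
```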
While you're at it, should video tags be changed to videos as well? I am guilty of using video and had considered going through and updating all those tags to plural (as I did with all the ~music cover_songs)... but there is just sooooo many video tagged topics that it was a rather daunting task without DB access. :P
Bah, yeah, I should be able to just do all of those through the backend.
By the way... is this really a "tag auto-complete" feature? I see it more as a "tag suggestion" feature. You type a letter or two, and Tildes suggests a list of tags starting with those letters. Tildes isn't automatically completing the tag I type, it's suggesting tags for me to choose from.
I know: I'm being pedantic. But, like I've said before, I think names matter. They influence how people perceive things. If you call something a dog, they're going to expect it to bark, not purr.
True, "suggestions" might be a better name for it. It's not really named anywhere in the user-facing interface. Spectre.css (which is the "base" for Tildes' CSS) calls these styles "autocomplete", which might be where Shane took the naming from originally: https://picturepan2.github.io/spectre/experimentals/autocomplete.html
“Auto-complete” sounds to me like it would search among all tags. A “suggestion”, on the other hand, may imply less coverage.
You're using my terminology to tell me my terminology is wrong! :P
I still see this feature as suggesting tags, rather than completing tags. But, if it's commonly known as "autocomplete", I suppose that's what people will expect.
Looks fantastic, especially with the non-rounded corners! How are the suggestions being provided? I tried the typical ask.<x> and didn't see any suggestions.
If you look in the page source under data-js-autocomplete-input= you can see all the potential suggestions, which appear to be group specific. So the reason you don't see any ask.<x> is probably because you're using ~test? A bunch of ask. suggested sub-tags show up in ~tildes submission page:
Ah okay. No, I wasn't in test, but I was in tildes.official, which probably won't have ask tags either. Very clever.
It's the 100 most common tags in the group. Were you in ~test? That won't work very well for testing this since it won't have good suggestions (it'll just be a bunch of random stuff people put on test topics).
If you write a tag, then press backspace, my browser (firefox) asks me if I want to leave the page instead of bringing the tag back into text form. I'd consider that a bug.
Added to gitlab:
I'm using Firefox as well and can't duplicate this. Is this on desktop or mobile, and does it always happen? If yes, can you explain in more detail what exactly you're doing to cause it? For example, are you writing a tag, typing a comma (which converts it to a block with an X button), and then if you push backspace after that it asks about leaving the page?
Edit: weird, definitely happens on Firefox for Windows.
Not OP, but yes, that's exactly what it's doing for me. I can backspace when a tag is incomplete just fine, but as soon as it's converted to the tag X format, backspace results in activating the "back" navigation hotkey in the browser instead of deleting the tag.
p.s. Firefox 66.0.3 on desktop. Haven't tested any other browsers yet though. Lying on my couch so may have to wait til tomorrow. ;)
Chrome and Safari (thankfully) removed the hotkey where Backspace navigated backwards, so I'd bet this is a Firefox exclusive thing. Maybe IE/Edge as well, not sure.
It's an option in Firefox: http://kb.mozillazine.org/Browser.backspace_action
Having it work as "back" is the default on Windows, but not in Linux.
Tested with Firefox on OSX, same issue. Focus on title, link, or text inputs and a backspace works as expected (deletes text until empty, then nothing). Focus on the tags input and a) backspace works when there are letters b) if the tag input is empty OR there are only tag boxes (e.g. no plain text in the input), then it does a browser back. I thought usually browser-back happens if you're not focused on a proper form input/textarea element, but <input> (and I see the blinking cursor inside it)
Thanks, the extra detail was helpful. I spotted a couple of issues and I think it should be fixed now. Give it a shot again and let me know if it still behaves strangely (@Apos and @cfabbro as well).
Like others have said in other comments, I'd argue it would be better if the tag could be broken back into text though. Then you can remove each character 1 by 1, or edit the tag.
Yep, working fine for me now. Thanks. And bonus... I finally got to see a real world application of
Exactly. Web development is fun.
This is awesome! I feel this will finally make the tagging feature useful. If people start to use tags consistently, following particular tags starts to make sense.
How often (if ever) does this list update? I've just fixed some incorrect tags in one group: there was a mix of two different tags being used which I've standardised into one tag. However, when I test posting a new topic in that group, the now non-existent tag is still showing as a suggestion.
I believe Deimos said it's updating as a background job every hour or so, to prevent it from hammering on the database. The tag changes aren't terribly time-sensitive, best to save the overhead for whatever comes in the future. It's just a form of digital housekeeping.
Huzzah! The list of tags has updated!
Cool. I'll check back later in the day.
Yeah, it updates every hour (on the hour, at X:00).