11 votes

Accentuation on tags? ("á", "ã", "é", "í", etc)

I was trying to add andré bazin as a tag on an article about the film theorist André Bazin, but the box turned red and prevented me from submitting. So I had to use the incorrect andre bazin.

I suppose there's a very rational technical reason for not allowing accented characters in tags -- even so, I believe it would be useful to support those characters, since many languages use them.

2 comments

  1. [2]
    Comment deleted by author
    1. stu2b50
      Depends on your definition of "easy". Of course it can't be a bijective map, so you'll have to pick how "smart" you want it to be. Simplistic solutions that are unlikely to confuse anyone include simply mapping non-ASCII characters to "?". On the other hand, something like unidecode, a Python library, tries to do as good a job as it can of making replacements that make sense, but edge cases can get weird and, as the library docs mention, possibly even unintentionally offensive (although that's unlikely; it's a good solution in most cases).

      So it's certainly possible to have a surjective mapping. For something like supporting search, you could run your surjective mapping (e.g. unidecode) on all tags going in, and on all text going into topic search. Of course, the person searching will have to fiddle with how the mapping replaces non-ASCII characters in edge cases.
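
To make the two ends of that spectrum concrete, here is a minimal sketch using only Python's standard library. It swaps in `unicodedata` decomposition as a stand-in for unidecode (it is not unidecode's algorithm), and the function names `ascii_or_question`, `fold_tag` and `tag_matches` are made up for illustration; nothing here is how the site actually handles tags.

```python
import unicodedata


def ascii_or_question(text: str) -> str:
    """Simplistic option: every non-ASCII character becomes '?'."""
    return text.encode("ascii", errors="replace").decode("ascii")


def fold_tag(text: str) -> str:
    """Hypothetical helper: strip combining accents, then case-fold.

    NFKD splits 'é' into 'e' plus a combining acute accent; dropping
    the combining marks leaves the base letter. This is a stand-in for
    a smarter transliterator such as unidecode.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def tag_matches(query: str, tag: str) -> bool:
    """Apply the same many-to-one folding to the stored tag and the query."""
    return fold_tag(query) == fold_tag(tag)


print(ascii_or_question("André Bazin"))
print(fold_tag("André Bazin"))
print(tag_matches("andre bazin", "André Bazin"))
```

Because the folding is many-to-one, "andré bazin" and "andre bazin" end up as the same key, so the tag could be stored with its accents and still be found by an unaccented search. Letters with no Unicode decomposition (for example "ø") pass through unchanged in this sketch, which is the kind of edge case where a transliteration library goes further.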

      4 votes
  2. culturedleftfoot
    I've encountered this a few times, as well as with hyphenated names and terms.

    3 votes