6 votes

Invisible Separator

Tags: unicode

4 comments

  1. [3]
    matejc

    Today I learned that there is a character, U+2063 INVISIBLE SEPARATOR, that has no width and is invisible, yet as far as I can tell it is handled by practically every system with Unicode support, from browsers to terminal emulators.

    Let's investigate. I injected that character between the letters of the word "abc":

    • python
    >>> len("a\u2063b\u2063c")
    5
    
    • nodejs
    > "a\u2063b\u2063c".length
    5
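To see what is actually hiding in a string like this, one option is to list every code point with its Unicode category and name. This is only a sketch: `describe` and `sample` are names made up for the illustration, and it assumes the injected character is U+2063.

```python
import unicodedata

def describe(text):
    """Return one (code point, category, name) tuple per character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]

# "abc" with U+2063 between the letters, written with escapes so it stays visible
sample = "a\u2063b\u2063c"

for row in describe(sample):
    print(row)
```

The invisible entries show up with category `Cf` (format), which is what makes them easy to spot programmatically even though they render as nothing.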
    

    Also search engines...
    Try https://www.google.com/search?q=Matej and then https://www.google.com/search?q=M⁣a⁣t⁣e⁣j. The first one is my name in plain text, but the second has invisible characters embedded between every visible character. If Tildes is not filtering invisible characters out of the Google link, the search results will be quite different.
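The two links look identical on screen, but they stop looking identical once the query is percent-encoded. A small sketch, assuming the hidden character is U+2063 (the variable names are only for illustration):

```python
from urllib.parse import quote

plain = "Matej"
# Put U+2063 between every pair of letters, like the second link
hidden = "\u2063".join(plain)

print(quote(plain))   # Matej
print(quote(hidden))  # M%E2%81%A3a%E2%81%A3t%E2%81%A3e%E2%81%A3j
```

Each U+2063 becomes the three UTF-8 bytes `E2 81 A3`, so the search engine receives a different query string even though the visible text is the same.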

    Here is my page that helps you inject this special character between the characters of any text.

    Why would someone use this?
    Well, to be honest, these are just my findings. You can use this method to make words unsearchable in text, or to obfuscate words that an automatic filter would normally flag as abusive and ban you for, while people would still see the word itself. The main use case would probably be to mess with people. Let's say you want to post somewhere that has an automatic filter for abusive words: type your dirty words into the static page I just shared, click Copy, and paste the text into the portal that would normally report or ban you automatically. But I do not recommend this kind of behaviour, for obvious reasons.
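From the filter's side, one possible countermeasure is to drop format characters before matching. This is a rough sketch and not how any particular site does it: `strip_format_chars`, `is_blocked`, and the `BLOCKED` set are all hypothetical names.

```python
import unicodedata

def strip_format_chars(text):
    """Drop every character in Unicode general category Cf (format)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

# Hypothetical blocklist, just for the demonstration
BLOCKED = {"badword"}

def is_blocked(text):
    """Check the cleaned, lowercased text against the blocklist."""
    cleaned = strip_format_chars(text).lower()
    return any(word in cleaned for word in BLOCKED)

disguised = "\u2063".join("badword")  # U+2063 between every letter

print("badword" in disguised)  # False: a naive substring check misses it
print(is_blocked(disguised))   # True: caught after stripping
```

Stripping all of `Cf` is blunt. The category also contains U+200D ZERO WIDTH JOINER and U+200C ZERO WIDTH NON-JOINER, which are meaningful in emoji sequences and in some scripts, so it is probably better to use the cleaned text only for matching and leave the stored text alone.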

    1 vote
    1. [2]
      onyxleopard

      IIRC, a BOM (U+FEFF) that is not at the beginning of a file is also supposed to be interpreted as a zero width no-break space.
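This is easy to check from Python. The sketch below looks up the official character names and shows that the standard `utf-8-sig` codec only removes a BOM at the very start of the data (the byte string is made up for the example):

```python
import codecs
import unicodedata

print(unicodedata.name("\ufeff"))  # ZERO WIDTH NO-BREAK SPACE
print(unicodedata.name("\u200d"))  # ZERO WIDTH JOINER

# One BOM at the start and one in the middle of the data
raw = codecs.BOM_UTF8 + b"ab" + codecs.BOM_UTF8 + b"c"
text = raw.decode("utf-8-sig")  # strips only the leading BOM

print(ascii(text))  # 'ab\ufeffc'
```

The U+FEFF in the middle survives decoding as an ordinary, invisible character in the string.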

      1. matejc

        Yeah, there are more characters like this in the Unicode table.

  2. Emerald_Knight

    Invisible characters are painful. I remember experiencing a weird bug on the old website I used to maintain. After a lot of digging around through the source code, I took a look at the problematic page element using Firefox's inspector and noticed that there was a non-breaking space in there that I didn't notice anywhere else. Of course, when I searched for it in the source code, I couldn't find it... because it wasn't inserted as an HTML entity, i.e. as &nbsp;. No, it was inserted like a normal space character, completely invisible and indistinguishable from all of the other bits of whitespace in the file.
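A literal non-breaking space is U+00A0, so a search for that code point finds it even when it looks like a normal space. A minimal sketch, where `find_nbsp` and the `page` string are hypothetical stand-ins for a real source file:

```python
def find_nbsp(text):
    """Return (line, column) pairs, both 1-based, for every U+00A0 in text."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch == "\u00a0":
                hits.append((line_no, col_no))
    return hits

# Made-up markup with a literal non-breaking space after "price:"
page = "<p>price:\u00a010 EUR</p>\n<p>ok</p>"

print(find_nbsp(page))  # [(1, 10)]
```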

    1 vote