8 votes

IMO the text used for formatting/markdown shouldn't count towards the character limit in user bios

I made a new bio recently and have been tweaking it for a while. I hit the 2000-character cap and had to mess with some of the formatting and wording to make it fit.

Thing is, you don't read markdown formatting. My bio has quite a bit of formatting: the text including the formatting is ~1960 characters, but the text you actually read is only ~885 characters (according to a word counter), less than half of that. I feel like that's not how it should work.
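
To make the gap between raw and visible length concrete, here is a minimal Python sketch. The `visible_text` helper and its regexes are assumptions made up for illustration; this is not how Tildes renders markdown, and it ignores edge cases such as URLs that contain parentheses.

```python
import re

def visible_text(markdown: str) -> str:
    """Rough approximation of the text a reader sees after rendering."""
    text = re.sub(r"<[^>]+>", "", markdown)               # drop inline HTML tags such as <small>
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)  # [label](url) -> label
    text = re.sub(r"(\*\*|__|~~|`)", "", text)            # drop emphasis/strikethrough/code markers
    return text

raw = "I'm bilingual ([explanation](https://example.com/some/long/link)) and **proud** of it."
shown = visible_text(raw)

# The raw source is noticeably longer than what the reader actually sees.
print(len(raw), len(shown))
```

Most of the difference in this example comes from the link target, which matches the complaint above: links and tags eat into the cap without adding anything readable.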

14 comments

  1. [3]
    Deimos
    (edited)
    Link

    I'm not necessarily opposed to increasing the limit somewhat, but most of the purpose of putting a limit on it is to prevent the length of people's bios from getting out of control by restricting them to a certain amount of "material" to work with. Your bio is already over a full screen height on my monitor, and you feel like you should be able to make it more like 2.5x that long, which would be... an extremely large bio.

    Allowing longer bios wouldn't be harmful or anything, I just don't know if we really want them to turn into massive compilations of things like interesting links, where you click on a user page and there's basically a personal Wikipedia page in their sidebar. The intention is closer to "some relatively brief information about yourself".

    We could always consider doing something separate too, like giving each user their own wiki page that they could keep random info/stuff in, and you could link to that from your bio. I'm not sure if that would be useful for many people though.

    14 votes
    1. [2]
      Kuromantis
      Link Parent

      (I'm not gonna answer the rest of your comment because I feel I might repeat myself.)

      "We could always consider doing something separate too."

      Personally, I think @Emerald_Knight's idea of a separate limit for formatting characters alongside the normal limit is a good one, mainly because it satisfies my wish for formatting characters to be treated separately. (It also acknowledges that letting formatting characters grow without any limit would be pretty abusable.)

      What do you think of that? If you like it or find it OK, what do you think the limits should be? I'd say 1500 characters each for visible text and markdown text seems like a good bet.
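
A minimal sketch of what such a two-part check could look like, in Python. Everything here is hypothetical: the function name, the 1500/1500 defaults (taken from the suggestion above), and the assumption that the renderer can hand back the visible text separately from the raw source.

```python
def within_limits(raw: str, visible: str,
                  max_visible: int = 1500, max_markup: int = 1500) -> bool:
    """Check a bio against separate caps for visible text and markup overhead."""
    visible_len = len(visible)
    # Characters spent purely on formatting (never negative).
    markup_len = max(len(raw) - visible_len, 0)
    return visible_len <= max_visible and markup_len <= max_markup

# Roughly the numbers from the original post: ~1960 raw, ~885 visible.
print(within_limits("x" * 1960, "x" * 885))
```

Note that the two caps together still bound the raw size (3000 characters with these defaults), so this does not remove the overall limit, it only splits it.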

      2 votes
      1. Emerald_Knight
        Link Parent

        Small correction: I never actually suggested a separate limit for formatting characters, so I'm being incorrectly credited. That's 100% your idea, merely inspired by me perhaps not communicating my thoughts as well as I'd intended. I do appreciate that you made sure to give credit where you believed it was due, though, so thank you for being considerate. But yeah, you own this idea, not me :)

        3 votes
  2. Emerald_Knight
    Link

    From a purely technical standpoint, a limit on formatting characters is still very much a necessity. Even if you use a DBMS with non-fixed storage like MongoDB, you would still be opening yourself up to users filling the entire 16MB document size limit with an arbitrarily large amount of formatting characters. That can lead to a host of potential (if perhaps unlikely) problems. One of Tildes' typically very tech-literate users could have a really bad day and inconvenience other users with (again, under a MongoDB system) a 16MB chunk of data, which is painful for mobile users with very limited data. Someone could also combine 16MB documents with a DDoS attack to exhaust memory.

    Limits are a necessity. What those limits should be is a matter for debate, and workarounds for larger limits can certainly be put in place, such as @Deimos' mention of personal wiki pages, but reasonable limits are essential for ensuring stability and for one of Tildes' goals: keeping the site lightweight.
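
As a rough illustration of the point, here is a Python sketch of how markup alone can balloon a bio when only visible text is capped. The helper names are made up for this example, and the MongoDB figure is only the hypothetical storage limit used in this comment, not a statement about what Tildes runs on.

```python
MAX_RAW_CHARS = 2000                 # the bio cap discussed in this thread
BSON_DOC_LIMIT = 16 * 1024 * 1024    # MongoDB's 16MB document size limit, in bytes

def nested_small(text: str, depth: int) -> str:
    """Wrap text in `depth` levels of <small> tags."""
    return "<small>" * depth + text + "</small>" * depth

def accept_bio(raw: str) -> bool:
    """Hard cap on the raw source, formatting included."""
    return len(raw) <= MAX_RAW_CHARS

# Two visible characters, about 15 million characters of markup.
bio = nested_small("hi", 1_000_000)
print(len(bio), accept_bio(bio))
```

A cap on the raw source rejects this input outright, whereas a cap on visible text alone would happily accept it.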

    7 votes
  3. [3]
    precise
    Link

    Wow, that's an impressively in-depth bio... Just out of curiosity, what else would you like to include?

    5 votes
    1. rish
      Link Parent

      Not OP. I want to populate it with links to stuff I like here and elsewhere.

      3 votes
    2. Kuromantis
      (edited)
      Link Parent

      After reading @rish's comment, adding some things I like (the 2 that come to mind are quotes and music) is definitely something I would do.

      That being said, I wasn't really thinking about adding anything when I made the post. I just didn't want to have to "crunch" my markdown so my bio fits under the 2000-character limit, even though the text you actually read is almost unchanged.

      Using my table as an example:

      Preformatted text, shortened

      |Q|A|
      |:-|-:|
      |Age?|15.|
      |Sex/Gender?|<small>~~I wish, Am I Rig-~~</small> Male.|
      |Gender & Sexual identity/orientation?|Cis-het. Use male pronouns.|
      |Vaguely lives in?|[São Paulo City](https://en.m.wikipedia.org/wiki/S%C3%A3o_Paulo), [São Paulo State](https://en.m.wikipedia.org/wiki/S%C3%A3o_Paulo_(state)), Brazil.|
      |Metaphorical profile pic?|[This, for now.](https://i.redd.it/c56kk2509tm41.jpg)|
      |Anything else that's particularly important?|I'm bilingual ([explanation](https://tildes.net/~talk/irj/tilditors_fluent_in_1_language_how_did_you_get_there#comment-43oo)) and also lightly autistic/have aspergers. I also don't have & never had much socially.|
      
      

      Preformatted text, not shortened

      | Q | A |
      |:----|----:|
      | Age? | 15. |
      | Sex/Gender? | <small><small><small> ~~I wish, amirit-~~ </small></small></small> Male. |
      | Gender and Sexual identity/orientation? | Cis-het. Use male pronouns. |
      | Vaguely lives in? | [São Paulo City](https://en.m.wikipedia.org/wiki/S%C3%A3o_Paulo), [São Paulo State](https://en.m.wikipedia.org/wiki/S%C3%A3o_Paulo_(state)), Brazil. |
      | Metaphorical profile pic? | [This, for now.](https://i.redd.it/c56kk2509tm41.jpg) |
      | Anything else that's particularly important? | I'm bilingual ([explanation](https://tildes.net/~talk/irj/tilditors_fluent_in_1_language_how_did_you_get_there#comment-43oo)) and also lightly autistic/have aspergers. I also don't have & never had much socially. |
      
      
      

      Formatted text, shortened

      | Q | A |
      |:--|--:|
      | Age? | 15. |
      | Sex/Gender? | I wish, Am I Rig- Male. |
      | Gender & Sexual identity/orientation? | Cis-het. Use male pronouns. |
      | Vaguely lives in? | São Paulo City, São Paulo State, Brazil. |
      | Metaphorical profile pic? | This, for now. |
      | Anything else that's particularly important? | I'm bilingual (explanation) and also lightly autistic/have aspergers. I also don't have & never had much socially. |

      Formatted text, not shortened

      | Q | A |
      |:--|--:|
      | Age? | 15. |
      | Sex/Gender? | I wish, amirit- Male. |
      | Gender and Sexual identity/orientation? | Cis-het. Use male pronouns. |
      | Vaguely lives in? | São Paulo City, São Paulo State, Brazil. |
      | Metaphorical profile pic? | This, for now. |
      | Anything else that's particularly important? | I'm bilingual (explanation) and also lightly autistic/have aspergers. I also don't have & never had much socially. |

      2 votes
  4. [6]
    MetArtScroll
    (edited)
    Link

    I agree with the OP and would add that non-ASCII characters most probably count as more than one character toward the limit. (Edit: according to the responses, it is characters that count, not bytes, so a multi-byte Unicode character counts as one character; still, markup tags should not count.)

    Is it feasible to re-implement user bios as a comment hack to some hidden topic, since comments can be much longer than 2000 characters?

    1 vote
    1. [5]
      Diff
      Link Parent

      "non-ASCII characters most probably count as more than one character"

      why, what's the benefit? a character is a character isn't it?

      3 votes
      1. [4]
        Weldawadyathink
        Link Parent

        Not a benefit, just a (probable) fact of how Unicode is stored in binary. In an encoding like UTF-8, every non-ASCII character is stored as a lead byte whose high bits say “more bytes follow”, plus one to three continuation bytes, so a single character can take up to four bytes in a system originally designed to store one byte per character. If multi-byte characters do count extra, then either Tildes has a naive character counter that doesn’t understand Unicode, or the database uses fixed-size storage.
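
The difference between counting characters and counting stored bytes is easy to see in Python. This is only an illustration of UTF-8 itself; the `utf8_bytes` helper is made up for the example and says nothing about how Tildes counts.

```python
def utf8_bytes(s: str) -> int:
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))

# "a", "ã", "€" and a wolf emoji, written as escapes to avoid editor encoding issues.
samples = ["a", "\u00e3", "\u20ac", "\U0001F43A"]

for s in samples:
    # len() counts code points; utf8_bytes() counts bytes on disk or on the wire.
    print(repr(s), len(s), utf8_bytes(s))
```

Each sample is one character by `len()`, but they take 1, 2, 3 and 4 bytes respectively in UTF-8, so a byte-based counter would charge them differently.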

        6 votes
        1. [2]
          Liru
          Link Parent

          The cool thing is, we can look at the Tildes source code to find out!

          The bio info is stored in the user model, and its restriction can be found here. It shows that it's using Postgres' built-in length function to determine that.

          What does Postgres' documentation say? Found here:

          Returns the number of characters in the string.

          length('jose') → 4

          Well, darn.

          ...what does Postgres itself say?

          liru=# select length('🐺');
           length
          --------
                1
          (1 row)
          

          Neat. For most intents and purposes, it seems like Unicode characters count as 1 in length.

          13 votes
          1. Deimos
            Link Parent

            Just wanted to mention that the database check constraint is more of a fallback and makes sure that overly-long data doesn't end up in the column somehow, but the main check is here, just from the max_length on that field declaration. The "schemas" use the Python marshmallow library and are used to validate/sanitize data.

            Python checks Unicode character-length properly too though, so it's the same result in the end.
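
A small Python sketch of what "character-length" means here. This is not the Tildes schema code; `bio_fits` and `MAX_BIO_LENGTH` are hypothetical stand-ins for a `len()`-based maximum-length check, using the 2000 cap from this thread.

```python
MAX_BIO_LENGTH = 2000  # cap mentioned in this thread

def bio_fits(bio: str, max_length: int = MAX_BIO_LENGTH) -> bool:
    """Length check based on len(), which counts Unicode code points, not bytes."""
    return len(bio) <= max_length

wolf = "\U0001F43A"  # wolf emoji: one code point, four bytes in UTF-8
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"  # three emoji joined by zero-width joiners

print(len(wolf), len(family))
print(bio_fits(wolf * 2000), bio_fits(wolf * 2001))
```

One caveat: `len()` counts code points, so an emoji built from several code points (like the joined family above) counts as more than one even though it displays as a single glyph.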

            8 votes
        2. Diff
          Link Parent

          Those are implementation details that should be hidden from the user though, right? In the same way that formatting shouldn't count towards the limit, it'd be similarly confusing to tell users "a character is a character unless it's most emoji or some other multi-byte character."