26 votes

How many valid JSON strings are there?

14 comments

  1. [9]
    Crespyl
    Link
    It took me a minute to realize the author was not talking about the values of strings as defined in the JSON spec (anything inside of "..." pairs), but "JSON blobs encoded as strings" also known...

    It took me a minute to realize the author was not talking about the values of strings as defined in the JSON spec (anything inside of "..." pairs), but "JSON blobs encoded as strings" also known as JSON documents.

    The article explores the number of valid JSON documents of several (small) sizes.

    Neat!

    8 votes
    1. [8]
      teaearlgraycold
      (edited )
      Link Parent
      Edit: am wrong Acktshually, a JSON document is something else. In the article the author says: A JSON primitive such as "null" is not a JSON document. A document must be either an array or object...

      Edit: am wrong


      also known as JSON documents

      Acktshually, a JSON document is something else. In the article the author says:

      Some valid JSON strings for N = 4 would be 'null', 'true', '1234', '"ab"' and '[{}]'

      A JSON primitive such as "null" is not a JSON document. A document must be either an array or object at the top level.

      4 votes
      1. [6]
        Johz
        Link Parent
        I'm not sure that's true. According to the spec, a JSON document is an element, which is a value surrounded by optional whitespace, and value can be a primitive, an array, or an object. Unless you...

        I'm not sure that's true. According to the spec, a JSON document is an element, which is a value surrounded by optional whitespace, and value can be a primitive, an array, or an object.

        Unless you are using "JSON document" in a very specific way here, but I've not come across it before like that.

        9 votes
        1. balooga
          Link Parent
          My takeaway was that the author considers it a valid string if JSON.parse() accepts it. Which it does with primitives like “null”.

          My takeaway was that the author considers it a valid string if JSON.parse() accepts it. Which it does with primitives like “null”.

          6 votes
        2. [2]
          teaearlgraycold
          Link Parent
          Oh no. Well I thought I’d read that somewhere.

          Oh no. Well I thought I’d read that somewhere.

          2 votes
          1. Johz
            Link Parent
            I think some parsers only accept objects or arrays (because 99% of the time that's what you want), and some formats like Toml that are essentially based on the JSON data structure might only allow...

            I think some parsers only accept objects or arrays (because 99% of the time that's what you want), and some formats like Toml that are essentially based on the JSON data structure might only allow objects at the top level, but spec JSON is very flexible.

            2 votes
        3. [2]
          smores
          Link Parent
          Wait, is this true? I don't see the word document anywhere in that spec. I agree that a JSON value can be a primitive, array, or object, but I've mostly seen a JSON "document" referred in the...

          Wait, is this true? I don't see the word document anywhere in that spec. I agree that a JSON value can be a primitive, array, or object, but I've mostly seen a JSON "document" referred in the context of either a document store or a JSON schema document. In both of those cases, a document is specifically an object.

          I wouldn't be surprised if there simply is no commonly accepted definition of JSON document, but I don't think it's uncommon to use it to refer to a JSON-serializable object!

          1 vote
          1. Johz
            Link Parent
            I would think of a JSON document as being a document (i.e. file) in JSON format, in which case anything goes according to the spec, but you make a good point about document stores typically having...

            I would think of a JSON document as being a document (i.e. file) in JSON format, in which case anything goes according to the spec, but you make a good point about document stores typically having more restrictions on what's allowed.

            2 votes
      2. lily
        Link Parent
        I think this might have changed at some point. From what I remember when I wrote my own JSON parser a few months ago, the latest JSON specification doesn't restrict the type of top-level values;...

        I think this might have changed at some point. From what I remember when I wrote my own JSON parser a few months ago, the latest JSON specification doesn't restrict the type of top-level values; it only mandates that there be only one. But I know some older JSON parsers only allow objects and arrays to be top-level. It's possible older versions of the spec had such a restriction, but it was removed? (To be fair, there's really not much of a point to a JSON document containing only a single string or number...)

        Honestly, the JSON specification has always been a little wishy-washy. The fact that JSON parsers are allowed to parse anything they want, as long as valid JSON documents still work correctly, feels a little weird to me. It means there are a lot of slightly different, incompatible versions of JSON out there.

        3 votes
  2. [5]
    Notcoffeetable
    Link
    As I was reading I was struggling to figure out why someone would be interested in counting these. Obviously "because it's there" is always a reasonable answer to such questions. But this is a...

    As I was reading I was struggling to figure out why someone would be interested in counting these. Obviously "because it's there" is always a reasonable answer to such questions. But this is a hard enough problem that usually you'd want some justification for this effort.

    • I think it is a great example of how we think about begin counting "uncountably finite" types of things.
    • One would want to do exactly this type of analysis when building a hash function of sorts.

    Any other ideas how one would use this information?

    4 votes
    1. Raspcoffee
      Link Parent
      Personally, I like these sort of 'thought experiments' with lack of better term because of how absurd the scale of these are. A bit like the amount of possible Go games, or solvable Sudoku's, it's...

      Personally, I like these sort of 'thought experiments' with lack of better term because of how absurd the scale of these are. A bit like the amount of possible Go games, or solvable Sudoku's, it's absurd. I like statistical mechanics when I was still a student of physics for that reason too, which is one of those fields that are always even more difficult than you think. Even if you think you got it nailed down there are always ways things can get even more complicated, fast.

      And this then kind of shows another reason someone might want to do things like this: you learn how to approach problems like that. Even getting a close estimate can be useful if you need to tackle a problem in order to get a sense of scale should you work with computing, thermodynamics, or something else that deals with a lot of possibilities.

      4 votes
    2. [2]
      cutmetal
      Link Parent
      Absurd idea: theoretically you could use this work to build a json validator for input strings below a certain length, by generating every valid string and either storing them or hashes of them in...

      Absurd idea: theoretically you could use this work to build a json validator for input strings below a certain length, by generating every valid string and either storing them or hashes of them in a lookup table. The validator would have a ridiculously low computational complexity, though the real-world speed would probably be poor due to the enormous size of the table, and it may take an unrealistic amount of memory to run. But there could be limited useful applications for this, especially if you were able to restrict the input to ascii.

      3 votes
      1. balooga
        Link Parent
        That is an absurd idea, lol. I really don’t think there are any practical use cases for this, but that’s okay. It’s just a fun thing to think about. People, or at least some people with certain...

        That is an absurd idea, lol. I really don’t think there are any practical use cases for this, but that’s okay. It’s just a fun thing to think about. People, or at least some people with certain mathematical inclinations, like to think about navigating impossibly large but finite possibility spaces. It’s a reminder that all the answers to all of the mysteries of the universe are just there, waiting to be found. It doesn’t have to have real world applications, it’s brain candy.

        4 votes
    3. vektor
      Link Parent
      Testing how close to ideal your domain-specific compression algorithm is?

      Testing how close to ideal your domain-specific compression algorithm is?

      2 votes