16 votes

YAML: probably not so great after all

Tags: yaml, markup

23 comments

  1. [3]
    spit-evil-olive-tips
    My team at work has standardized on TOML for all config files and I've become a big fan. Looks a lot like the classic .ini file structure, but with a more formally defined grammar and parser.

    11 votes
    1. calm_bomb
      I work with Ansible at my job, and I also keep a small blog that I moved to Hugo. While Hugo accepts YAML too, I chose TOML because it looks so much nicer and more readable. Sometimes I find Ansible's playbook files hard to read, especially when I get them from one of my colleagues.

      2 votes
    2. fifthecho
      I like TOML, but it has its own weirdness.

      I was building a TOML for configuring an application in Habitat just today and was trying to mirror the config file I was templat-izing in the layout of the TOML.

      ...but jumping from a TOML array-of-tables to a standalone key-value item is a no-go, as no amount of whitespace will convince TOML that the key-value isn't part of the table.

      Is it better than YAML? Yes. Does it require some thinking about how you structure your data first? Absolutely.

  2. [5]
    Deimos
    I use YAML fairly regularly, and have definitely run into my fair share of issues with it. I'm currently using it on Tildes for the Salt configuration files. For example, here's the YAML file that installs the Commonmark parser.

    I also chose it as the way for people to define configuration for AutoModerator on Reddit, which has probably caused a ton of people to write YAML who never would have normally. This has absolutely caused a ton of issues, especially related to the finicky indentation rules and some of the other tricky bits like when quotes are required, multi-line strings (for defining comments/messages), needing to mark some items with - markers, and more. A lot of help and troubleshooting has been needed overall, and people often post in /r/AutoModerator with YAML-based issues that they don't know how to solve.

    I'm still glad overall that I chose a text-based configuration method for AutoModerator (and I think Reddit is making a mistake by working on a replacement tool with GUI configuration), but I think I'd definitely look for options other than YAML if I was going to do it again.

    8 votes
    1. [2]
      Comment deleted by author
      1. Deimos
        Yeah, it's not quite as straightforward as shell scripts overall, since it lets you define dependencies, ensures the server's all in a particular state, and has a lot of other capabilities that would be pretty annoying and error-prone to recreate from scratch.

        I've mostly been happy with Salt, but the YAML is definitely a bit annoying sometimes. There are various other options for "renderers" that you can use, but I'm not sure how well they're all supported, and they seem to have their own oddities too. I'm not sure if it would be worth switching.

    2. zlsa
      Like @spit-evil-olive-tips (would make a great password btw!), I use and love TOML whenever I need a human-editable config file, when JSON would be too strict.

    3. [2]
      falc0n
      Thanks again for Automod, really changed the game way back when.

      Can you expand on why you disagree with the move to a GUI? Is it just a concern about introducing lower flexibility?

      1. Deimos
        (edited )
        I think once you get beyond the simplest cases, the GUI will start being more unwieldy than helpful, and it also makes it more difficult for people to get help and assist others. A GUI will be better for basic things like "remove posts if they have a word in their title", but lots of AutoMod conditions combine many different checks and it will be difficult to set them up through a bunch of drop-downs and modals.

        Right now, when someone that doesn't really understand AutoMod wants to do something moderately complex with it, they can post an explanation of what they're trying to do, and someone more experienced can just give them a chunk of "code" (YAML) that they paste into their config, and they're done.

        When it's a GUI, that will be replaced by someone needing to post: "okay, you need to click 'add condition' and then choose 'author karma' in the dropdown and then type '100' in the box and then click 'add condition' and then choose 'post flair' in the new dropdown and then check the 'IS NOT' box below that and then...". Setting up the configuration becomes a process that everyone has to go through, instead of just being able to receive a finished behavior.

        Some other issues, just quick thoughts in no particular order:

        • The person asking for help won't easily be able to show their current not-working setup either. Maybe with a screenshot, depending how the config ends up working, but it's not as easy as just pasting the rule (or their whole config) into a text post.
        • You won't be able to copy an AutoMod configuration across multiple subreddits quickly; you'll have to re-configure it for each of them through that whole process.
        • You probably won't be able to search the whole config easily. With text it's easy to go, "I know there's a rule that has to do with 'spoiler' in the title" and just search for the word.
        • The text config naturally keeps a whole change history because it's using the reddit wiki, and some people also use external version-control tools to track the history of their config as well. A GUI version probably won't have a history at all, and it'll be difficult to fix any bad updates (or those cases where a mod gets hacked or otherwise maliciously deletes everything). It's possible to maintain a history of GUI-based changes, but it's hard and I doubt they'll implement it.
        • I think the general flexibility and ability to combine all the capabilities will be reduced with a more restrictive configuration method.
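
The points above about history and searchability can be sketched with nothing more than the standard library. The rule text below is made up for illustration and is not a real subreddit's config; the point is only that plain text can be diffed and searched with generic tools.

```python
import difflib

# Made-up rule text, loosely in the spirit of an AutoModerator config.
old_config = """\
title: ["spoiler"]
action: remove
"""

new_config = """\
title: ["spoiler", "leak"]
action: filter
"""

# A line-by-line change history between two versions of the text.
diff = list(difflib.unified_diff(
    old_config.splitlines(),
    new_config.splitlines(),
    fromfile="config (before)",
    tofile="config (after)",
    lineterm="",
))
print("\n".join(diff))

# Searching the whole config for a word is a one-liner.
matches = [line for line in new_config.splitlines() if "spoiler" in line]
print(matches)
```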
        3 votes
  3. [9]
    joelthelion
    I just wish JSON had comments.

    6 votes
    1. [2]
      vakieh
      JSON works better when you recognise it as the serialisation format that it is. It doesn't mandate comment structure, because that is the job of the serialiser/deserialiser in the (rare if you're using it properly) case where you want comments.

      Me, I use leading underscores in field names as comment indicators. So you can have "foo", which the deserialiser will look at, and "_foo", which it won't.
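
A minimal sketch of that underscore convention, assuming the deserialiser is a small wrapper around Python's `json` module. The `strip_comments` helper and the field names are hypothetical, not part of any library.

```python
import json

raw = """
{
  "_foo": "comment: foo is the retry limit, keep it below 10",
  "foo": 3,
  "bar": {"_note": "nested comments work the same way", "baz": true}
}
"""

def strip_comments(value):
    """Recursively drop object keys that start with an underscore."""
    if isinstance(value, dict):
        return {
            key: strip_comments(item)
            for key, item in value.items()
            if not key.startswith("_")
        }
    if isinstance(value, list):
        return [strip_comments(item) for item in value]
    return value

config = strip_comments(json.loads(raw))
print(config)  # only "foo" and "bar" (with "baz") remain
```

The file stays valid JSON for every other tool, which is the appeal of doing it in the deserialiser rather than in the format.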

      4 votes
      1. Litmus2336
        Totally, but people keep trying to shoehorn JSON as a configuration tool.

        2 votes
    2. [4]
      unknown user
      I've never understood the people who complain about the lack of comments in JSON. If I have a config file with something like:

      , "db": "postgres://postgres:postgres@localhost:5432/test?sslmode=disable"
      

      And I want to comment it out, I just do:

      , "// db": "postgres://postgres:postgres@localhost:5432/test?sslmode=disable"
      

      Which works unless I have a super-strict parser that excludes unknown fields, which in my practice was never the case.
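
As a rough sketch of the difference, here are a lenient and a strict loader side by side. Both loaders and the `KNOWN_KEYS` schema are hypothetical; the `port` key was added only so that the snippet is a complete JSON object.

```python
import json

raw = """
{ "port": 8080
, "// db": "postgres://postgres:postgres@localhost:5432/test?sslmode=disable"
}
"""

KNOWN_KEYS = {"port", "db"}  # hypothetical schema for this example

def load_lenient(text):
    """Keep known keys and silently ignore everything else."""
    data = json.loads(text)
    return {key: value for key, value in data.items() if key in KNOWN_KEYS}

def load_strict(text):
    """Reject any key that is not part of the schema."""
    data = json.loads(text)
    unknown = sorted(set(data) - KNOWN_KEYS)
    if unknown:
        raise ValueError(f"unknown config keys: {unknown}")
    return data

lenient = load_lenient(raw)
print(lenient)  # the "// db" entry is simply ignored

try:
    load_strict(raw)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
print(strict_error)  # the strict loader refuses the renamed key
```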

      3 votes
      1. [2]
        spit-evil-olive-tips
        I want comments in config files not simply for commenting things out, but for adding comments to the config file. Here's some syntactically valid TOML, for example:

        # if running in foobar mode, you probably want to disable this
        enable_spline_reticulator = true
        
        # only change this if you're sure you know what you're doing. 
        value_of_pi = 3.0
        
        12 votes
        1. unknown user
          I've always thought that this information belongs in the documentation proper. But I see your point.

          1 vote
      2. whbboyd
        The combination of allowing unknown fields and allowing optional fields to be dropped is ill-advised because your software no longer has the ability to detect a large class of typos or off-schema inputs. (Not allowing optional fields to be dropped is just annoying.) If you're using it strictly as a data interchange format, never to be touched by human hands, then it matters less, but then you obviously also don't need comments.
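
The typo problem can be shown in a few lines. This is a sketch under assumptions: the `DEFAULTS` table and the `load_lenient` helper are invented for the example, standing in for any loader that both ignores unknown keys and fills in missing optional ones.

```python
import json

DEFAULTS = {"timeout": 30, "retries": 3}  # hypothetical optional fields

def load_lenient(text):
    """Unknown keys are ignored; missing optional keys fall back to defaults."""
    data = json.loads(text)
    return {key: data.get(key, default) for key, default in DEFAULTS.items()}

# The user meant "timeout" but typed "timeuot".
typo_config = '{"timeuot": 5, "retries": 1}'

loaded = load_lenient(typo_config)
print(loaded)  # timeout silently stays at its default of 30
```

No error is raised, so the misspelled setting is lost without anyone noticing.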

        1 vote
    3. [2]
      mjb
      Check out JSON5.

      2 votes
      1. DataWraith
        There's also Hjson, though I'm not sure I'd want to use either of them.

        It just seems strange to use JSON like that -- if you need to import a custom library to deal with a file format such as Hjson, you might just as well use a library that reads a file format that is easier for humans to read and modify (INI, TOML, Dhall, etc.).

        2 votes
  4. [6]
    unknown user
    It's an old article, but nobody seems to have posted it yet. I am honestly baffled at how YAML just keeps popping up everywhere, including places where people would be much better off using an imperative language instead of a declarative one, such as Ansible and GitLab CI configuration files.

    5 votes
    1. [2]
      Akir
      I would agree; YAML is a poster child for excessive minimalism. And the problem with that is that it falls apart when facing complexity.

      And to be honest, I want the concept of indentation as structure in programming and data modeling to go away. It's one thing to include them to make them human-readable, but to make it into a vital part of processing when it's not a well-defined function feels like a rather idiotic mistake, and forcing people to use n number of spaces as a standard is a bullheaded fix.

      9 votes
      1. Omnicrola
        I want the concept of indentation as structure in programming and data modeling to go away.

        I'm 100% on board with this. Spaces, tabs, braces on same line or next line, etc. These kinds of preferences should be based entirely on how teams prefer to read their code. Having languages that use the characters humans need to organize and visually lay out their code as syntax is infuriating.

        looking at you, python
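
For anyone who hasn't been bitten by it, a small illustration of indentation acting as syntax in Python. The two functions are invented for the example and differ only in how far the `return` line is indented.

```python
def sum_all(items):
    total = 0
    for x in items:
        total += x
    return total  # runs once, after the loop finishes

def sum_first_only(items):
    total = 0
    for x in items:
        total += x
        return total  # one level deeper: returns during the first iteration

print(sum_all([1, 2, 3]))         # 6
print(sum_first_only([1, 2, 3]))  # 1
```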

        4 votes
    2. [3]
      skybrian
      I think a lot of people are wary of adding a dependency on a scripting language interpreter when it's just for configuration? You can generate a config file using a script if that's how you want to do it. The conservative approach seems to be to stick with JSON, perhaps adding some minor conveniences like comments.

      The CUE language looks interesting for large configurations, though.
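
The "generate a config file using a script" approach mentioned above might look something like this. It is a sketch with made-up settings (`environments`, `replicas` and the `api-` prefix are all hypothetical): the logic lives in the script, and the application only ever reads plain JSON.

```python
import json

# Hypothetical settings: three environments that differ in a few values.
environments = ["dev", "staging", "prod"]

config = {"services": []}
for env in environments:
    service = {"name": f"api-{env}", "replicas": 1}
    if env == "prod":
        # Only production gets extra replicas.
        service["replicas"] = 3
    config["services"].append(service)

# The application that consumes this never needs a script interpreter.
rendered = json.dumps(config, indent=2)
print(rendered)
```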

      2 votes
      1. [2]
        unknown user
        The thing is, advanced-enough configuration is indistinguishable from programming. Just think of all the times you've seen “conditionals” and “loops” in configs? Every time I see them, I can only think to myself, “Why the hell isn't this written in Lua?”.

        1. skybrian
          Yes, sometimes people have essentially written programming languages on top of configuration languages, and I agree that it would make more sense at that point to switch to an actual scripting language.

          I don't think that's true of all configurations though. Sometimes it's just data.

          2 votes