8 votes

Weekly Programming Challenge - making our own data format

Hi everyone! There was no coding challenge last week, so I decided to make one this week. If anyone wants to post their own challenge, wait a few days and then post it. I'm running out of ideas, and I'd like to keep these challenges running on Tildes.


Everyone here knows data formats such as XML or JSON. The task is to make your own format. The format can be as compact as possible, as human-readable as possible, or something that's really unique. Bonus points for writing an encoder/decoder for your data format!

How do you handle long texts? Various Unicode characters? Complex objects? Cyclic references? It's up to you whether you make it fast and simple, or really complex.
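
As one possible answer to the first two questions, here is a minimal sketch of a length-prefixed field, where each value is written as its byte count, a colon, the raw bytes, and a comma. Because a decoder reads exactly that many bytes, long texts, newlines, and multi-byte Unicode characters need no escaping. The `encode_field` helper and the exact layout are assumptions made for illustration, not part of any existing format.

```shell
#!/bin/sh

# Hypothetical helper: write one value as <byte length>:<bytes>,
encode_field() {
  # Count bytes rather than characters, so multi-byte UTF-8 text stays unambiguous.
  # The arithmetic expansion drops the padding some wc implementations add.
  len=$(( $(printf '%s' "$1" | wc -c) ))
  printf '%d:%s,' "$len" "$1"
}

encode_field 'name'   # prints 4:name,
echo

# Commas, quotes, and newlines inside the value need no escaping
encode_field 'a "quoted" value, spread
over two lines'
echo
```

A decoder would read digits up to the colon, then copy that many bytes verbatim; complex objects and cyclic references would still need a convention layered on top of this.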

I'm looking forward to your data formats. I'm sure they will beat at least CSV. Good luck!

2 comments

  1. s4b3r6
    Unfortunately I don't quite have the time to approach this one this week, but it looks like amazing fun...

    So instead, as a little inspiration for others, what about a minimal INI parser... For sh?

    It's been published before here (about 7 weeks ago?), and well... I created it for a device that could only run busybox with no room for anything else.

    It's kinda awful, but translates INI into variables with a standard name mangling:

    #!/bin/sh
    
    VER='0.1'
    # Set DEBUG to 0 to print each parsed section and variable
    DEBUG=1
    
    hr() {
      # Echo a whole-width line
    
      # Get terminal width
      hr_width=$(tput cols)
    
      # Print a character, multiple times to make our horizontal line
      # We need to declare c, but it is unused.
      #shellcheck disable=SC2034
      for c in $(seq 1 "$hr_width")
      do
        printf '-'
      done
    }
    
    help() {
      # Our helpfile!
      echo "minini v$VER"
      hr
      echo
      echo 'Usage:'
      hr
      echo
      echo 'minini FILE - Read an INI file, and output the generated variables.'
    }
    
    # Choose what to do
    if [ -z "$1" ] || [ "$1" = '-h' ] || [ "$1" = '--help' ]; then
      help
      exit
    fi
    
    # Confirm file is accessible
    if [ ! -f "$1" ]; then
      # True if file cannot be found
    
      echo 'Cannot find file. Does it exist?' >&2
      echo "File: $1" >&2
      exit 1
    
    elif [ ! -r "$1" ]; then
      # True if the user doesn't have the 'read' bit on the file in question
    
      echo 'File is not readable. Do you have permission?' >&2
      echo "File: $1" >&2
      exit 1
    
    fi
    
    # Default section
    # define before loop, so it doesn't get reset
    section='__main__'
    
    # Buffer for final output
    buffer=' '
    
    # Read data line by line
    cat "$1" | while read -r line
    do
      if [ ! -z "$line" ]; then
      # Skip blank lines
    
      # Strip whitespace
      data=$(echo "$line" | sed -e 's/^[[:space:]]*//')
    
      # Get first non-whitespace character
      c=$(echo "$data" | cut -c1)
    
      if [ "$c" = '[' ]; then
        # Parse section title
    
        # sed grabs the text between the brackets, then strips leading and trailing whitespace
        # (plain sed rather than grep -P, which BusyBox grep does not provide)
        section=$(echo "$data" | sed -e 's/^\[\([^]]*\).*/\1/' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
    
        # make section shellsafe
        section=$(echo "ini_$section" | sed 's/[^a-zA-Z0-9]/_/g')
    
        if [ "$DEBUG" -eq 0 ]; then
          echo "Section: <$section>"
        fi
    
      elif echo "$data" | grep -q '='; then
        # Parse assignment
    
        # get varname and strip whitespace
        varname=$(echo "$data" | sed -e 's/=.*//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
    
        # get value to be assigned and strip whitespace
        value=$(echo "$data" | sed -e 's/.*=//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
    
        if [ "$DEBUG" -eq 0 ]; then
          echo "Variable: <$section::$varname> set to <$value>"
        fi
        buffer="${section}__$varname='$value'"
      fi
    
      # Echo built shell var, stripping blank lines, then clear the buffer
      # so a section header doesn't repeat the previous assignment
      echo "$buffer" | sed '/^[[:space:]]*$/d'
      buffer=' '
      fi
    done
    

    This will translate an INI file like:

    [ Main ]
    x = 34
    y = 3
    
    [Hello]
    z=3
    x=Hello world
    
    [ Hello World!]
    y=Boo!
    
    [ This has numbers 62 ]
    12=twelve
    

    Into this:

    ini_Main__x='34'
    ini_Main__y='3'
    ini_Hello__z='3'
    ini_Hello__x='Hello world'
    ini_Hello_World___y='Boo!'
    ini_This_has_numbers_62__12='twelve'
    

    Which your script can then eval, and you can check that the appropriate values exist.
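
    A minimal sketch of that eval step, assuming the INI file is trusted and no value contains a single quote (either would break or subvert the eval). The `generated` string stands in for the script's output, and `ini_Main__port` is a hypothetical key used only to show a fallback for a missing value.

    ```shell
    #!/bin/sh

    # Stand-in for the generated lines; normally this would be captured
    # from the script, e.g. generated=$(minini config.ini)
    generated=$(printf '%s\n' "ini_Main__x='34'" "ini_Main__y='3'" "ini_Hello__x='Hello world'")

    # Define the variables in the current shell.
    # eval runs whatever the text contains, so only do this for trusted input.
    eval "$generated"

    # Check that an expected value exists, falling back to a default if not
    port=${ini_Main__port:-8080}

    sum=$(($ini_Main__x + $ini_Main__y))
    echo "x + y = $sum, port = $port, greeting = $ini_Hello__x"
    ```

    Using `${name:-default}` keeps the check and the fallback in one place; `${name:?message}` would abort instead when a value is required.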

    It's slightly insane, but rather fun.
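
    For reference, the name mangling can be condensed into one function. This is a sketch of the same rule the script applies (trim the section title, replace every non-alphanumeric character with an underscore, prefix `ini_`, join with a double underscore); the `mangle` helper itself is hypothetical and not part of the script.

    ```shell
    #!/bin/sh

    # Hypothetical helper: $1 = section title as written between the brackets, $2 = key
    mangle() {
      # Trim surrounding whitespace, then make the title safe for a shell variable name
      section=$(printf '%s\n' "$1" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e 's/[^a-zA-Z0-9]/_/g')
      printf 'ini_%s__%s\n' "$section" "$2"
    }

    mangle ' Main ' x          # prints ini_Main__x
    mangle ' Hello World!' y   # prints ini_Hello_World___y
    ```

    Note that only the section title is sanitized; a key containing spaces or punctuation would still produce an invalid variable name.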

    3 votes
  2. super_james
    I'm sure they will beat at least csv. Good luck!

    Challenge accepted.

    At uni after a lecture extolling the virtue of the water language we came up with an xml image schema.

    It was a long time ago so I can only give you the gist of it but something like:

    <row number="1">
        <pixel number="1">
            <red value="255"/>
            <green value="30"/>
            <blue value="46"/>
            <hue value="356"/>
            <saturation value="88.2"/>
            <value value="100.0"/>
        </pixel>
        <pixel number="2">
            <red value="1"/>
            <green value="18"/>
            <blue value="46"/>
            <hue value="217"/>
            <saturation value="97.8"/>
            <value value="18.0"/>
        </pixel>
    </row>
    

    Ahh good times.

    2 votes