Showing only topics in ~comp with the tag "coding".
    1. Game Development Career Advice

      Hi,

      I'm curious if anyone in this group has achieved success in game development, whether that's carving out a career or earning any amount of income from it.

      I'm currently working as a software developer, but my passion lies in game development. I'm all too aware that achieving any measure of success in this field is next to impossible. Hence, I'm reaching out here, hoping to gather insights and advice from those who have walked this path in the past, or those who are currently walking alongside/behind me.

      One of my specific questions is about the types of games I should focus on creating. Specifically, I've heard differing opinions on whether it's more advantageous to develop a series of small games with advertisements for mobile platforms or to invest in larger, premium games for platforms like Steam. Can anyone share their insights or experiences regarding this dilemma? Is there a clear advantage to one approach over the other?

      Currently I am using Godot to make a larger scale game, but I am considering switching to Defold and making smaller scale games with ads.

      I saw some folks here discuss making games for the Playdate. How much should one consider targeting niche platforms like this? Some of the users I saw discussing this seem to have had good success.

      Some general questions: How did you break into game dev? What were you doing before? Do you see game dev as a viable career, only as a source of side income, or is it just a hobby?

      Any guidance or experiences you can share would be greatly appreciated.

      17 votes
    2. Is there a programming language that brings you joy?

      Just for a moment, forget all of the technical pros and cons, the static typing, just-in-time compilation, operator overloading, object orientation to the max...

      Is there a programming language that you've just found to be... fun?

      Is there one that you'd pick above all else for personal or company projects, if you had your druthers, because you would simply be so excited to use it?

      And then, is there something missing in that "fun" language that's preventing it from actually becoming a reality (i.e. small community, lack of libraries, maintenance ended in the 80s, etc.)?

      50 votes
    3. How would you structure an Open Collective with the objective of teaching programming to raise money for a cause?

      I am asking as I have just created one. I won't advertise it here, as it feels not in good faith and I don't think Tildes is the right audience (I imagine most of the techies here are probably fairly seasoned).

      I want to offer some kind of programming tuition to people at a good rate (read: affordable to those that might be on a low income but wish to learn). I am doing this to raise money for my local cardiology ward, who have just been told there isn't enough in the budget to cover their Christmas party this year. Morale is low there, and I'd like to help cover the deficit.

      How would you structure something like this?

      Initially, I have written that I have no set fee and am happy to offer services on a case-by-case basis (words to that effect). But in a discussion with a friend, they suggested I should do something like:

      • Small donation (£1 - £25): Access to a chatroom (Discord?) where someone can ask questions, and I'll strive to answer and help them as fast as possible.
      • Medium donation (£25 - £50): I will arrange a group session where I cover some basic programming concepts and host a Q&A at the end to help bridge any gaps in understanding.
      • Large donation (£50+): I will arrange a one-to-one session (via call, video or instant messaging) where I will help go more in-depth on a topic or help debug a specific problem.

      If anyone has any experience with this type of thing, I'd appreciate any advice. I have only been a professional software developer for three years, so I am reasonably experienced, but not exactly an industry veteran. I want to set realistic expectations for this service.

      I'm happy to share a link to the open collective via private message if anyone wants to have a look over it and offer any advice.

      9 votes
    4. Why do so many developers provide only 64-bit or x64 builds of their software these days?

      Doesn't it make sense to only provide 32-bit versions to cover maximum user base? Considering that 64-bit operating systems do support 32-bit apps whereas the inverse isn't true.

      If you release for 32-bit, you'd have covered the maximum number of users, whereas if you release for 64-bit, you'd have covered only that block. At least open source developers who intend maximum coverage or user base for their apps should support at least 32-bit (if not both).

      Below is a great answer post in this regard, credit to Ken Gregg on Quora:

      Yes, there are a lot of 32-bit programs still being developed/sold/distributed. No, not every program is 64-bit.

      64-bit operating systems on 64-bit hardware can run 32-bit applications. And there are lots of computers still running 32-bit operating systems (and will for quite some time). So, an application developer can release one 32-bit product and cover the 32-bit and 64-bit customers. Releasing and maintaining two products, one 32-bit and one 64-bit, incurs some costs. And providing only a 64-bit version leaves all the 32-bit customers in the dust. It is logical and cost-effective to supply only a 32-bit version to cover both groups, as long as the application doesn’t require 64-bit features.

      Of course, the embedded systems arena still has tons of new development of 32-bit, 16-bit, and even 8-bit software/firmware. The choice of an embedded microcontroller is based on cost, availability, and features required, so software developed in this realm runs the gamut of bitness.

      21 votes
    5. Request: Ideas and tips for creating a portfolio to get a web developer job

      Hi everyone — I am trying to get a job in web development after a decade in a mostly unrelated field.

      I am looking for ideas and tips to create a portfolio to send with applications. All of the websites I worked on ages ago have been taken offline or redesigned by someone else. I do have a website I created for my music, but it’s just vanilla HTML. I also have a personal website which is really the only thing I have to show.

      I know HTML/CSS quite well, but that’s basically it. I’ve worked with WordPress for years but only just recently began learning enough PHP to do anything custom. I don’t really know JavaScript much at all.

      I have quite a few paid courses through Udemy for all these different areas but even as I have completed them, I don’t feel confident in knowledge of the different languages. These courses nearly always come with projects that the students create with the instructor. Should I use these as part of my portfolio? For some reason I never felt right doing that, since I didn’t build it myself.

      So I guess I’m curious (if any of you are web developers) if you have suggestions for how to fill out a portfolio without any previous work examples.

      Side note: I wasn’t sure how to word the title or my question particularly well so please edit it more clearly, Those Who Can Edit.

      edit: thank you to everyone who took the time to reply to this. it’s all been very helpful and i appreciate everyone’s input immensely!

      23 votes
    6. Programming Challenge: Implementing bitwise operators.

      Background: Bitwise operators are operators that perform conditional operations at the binary level. These operators are bitwise AND &, bitwise OR |, bitwise XOR ^, and bitwise NOT ~, not to be confused with their logical operator counterparts, i.e. &&, ||, !=, and ! respectively.

      Specifically, these operations take the binary values of the left- and right-hand terms and perform the conditional operation on each matching bit position between both values.

      For instance, 2 | 4 takes the binary value 010 from 2 and 100 from 4. From left to right, we compare 0 || 1, 1 || 0, and 0 || 0 to get 110 as the resulting binary. This produces the integer value 6.

      Goal: Your challenge is to implement one or more of these bitwise operators without using the native operators provided to you by your language of choice. Logical operators are allowed, however. These operators should work on integer values. These operators will likely take on the form of a function or method that accepts two arguments.

      Bonus challenges:

      • Implement all of the operators.
      • Don't use any native binary conversion utilities.
      • Whether or not you implement all operators, write your code in such a way that allows for good code reusability.
      • For statically typed languages, handle integers of different types (e.g. int vs. uint).

      Edit: Minor correction for the sake of accuracy, courtesy of @teaearlgraycold.
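
      One possible approach to the challenge above, sketched in Python under a few assumptions: inputs are non-negative integers, bits are peeled off with `divmod` rather than any native bitwise operator or binary conversion utility, and NOT is defined over a fixed unsigned width (8 bits by default). The function names are illustrative only.

```python
def _combine(a, b, rule):
    """Apply a per-bit logical rule to two non-negative integers using only arithmetic."""
    if a < 0 or b < 0:
        raise ValueError("this sketch only handles non-negative integers")
    result = 0
    place = 1  # value of the current bit position: 1, 2, 4, ...
    while a > 0 or b > 0:
        a, bit_a = divmod(a, 2)  # lowest bit of a, then drop it
        b, bit_b = divmod(b, 2)  # lowest bit of b, then drop it
        if rule(bit_a == 1, bit_b == 1):
            result += place
        place *= 2
    return result


def bit_and(a, b):
    return _combine(a, b, lambda x, y: x and y)


def bit_or(a, b):
    return _combine(a, b, lambda x, y: x or y)


def bit_xor(a, b):
    return _combine(a, b, lambda x, y: x != y)


def bit_not(a, width=8):
    """Flip the low `width` bits of an unsigned value (assumed fixed width)."""
    largest = 2 ** width - 1
    if a < 0 or a > largest:
        raise ValueError("value does not fit in the given width")
    return largest - a


print(bit_or(2, 4))  # 6, matching the example in the post
```

      Passing the logical rule in as a function keeps the bit-walking loop in one place, which is one way to address the reusability bonus.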

      12 votes
    7. Do C programmers usually create and curate a personal library for their own use?

      I've been using mostly C at my current job for about half a year now, and I find myself reusing some little function that I've written for another code base in current projects. I'm relatively new to this, so I'm wondering if it makes sense to have a repertoire of general purpose utility functions and whatnot for future use.

      I mean, the language's pretty established and whatever I think of must have been written by somebody else already, so is there even a need for what I'm talking about? Are there well-known open source libraries that resemble what I am talking about? Should I just include them instead of writing my own?

      Sorry if this is a bit vague. General purpose as in string manipulation, debug output, buffer operations, implementations of data types not in C, etc., just to name a few examples.

      32 votes
    8. Why store code as text files?

      Code is usually version controlled nowadays in git or some other VCS. These typically operate on text files and record the changes applied to the files over their history. One drawback of this is that formatting of the code can introduce changes to the files that make no semantic difference, e.g. newlines are added/removed, indentation is altered etc.

      Consistent formatting makes the code easier to read, but the style used is an aesthetic preference. There might be objective reasons for readability in at least the extreme cases, but in many cases the formatting is purely a preferred style.

      If we instead version controlled code in the form of an abstract syntax tree (AST) (possibly even as just a series of transformations on that tree), we could have any formatting we'd like! When editing the code we would just be changing a projection of the AST and when we've made our changes the transformations could be made to the stored AST. If two languages shared the same AST the choice of language even becomes a choice for the programmer. Sadly this has some limitations since ASTs are usually language specific... But we could possibly take this a step further.

      Could we take a compiled binary and use that as the basis for generating an AST? This is essentially what decompilers do. For heavily optimized code this is severely limited, but for debug builds a lot of extra information is retained in the binary that can be utilized to construct a sensible representation. With this way of storing code, the language used becomes a style preference! Code compiled from one language might become alien when viewed in another language (thinking of lazy Haskell code viewed in C), but maybe that is a corner case?

      There are issues when considering binaries for different platforms. A binary for the JVM isn't the same as one for ARM64 or one compiled to run on an x86. So there are some limitations there...

      One (very) good thing about storing code as text files is the ubiquity of software capable of viewing and editing text. It would however be cool if we could make the programming language a stylistic preference that is compatible with other languages! At least the AST part should be perfectly achievable.
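
      The AST idea can be illustrated with Python's standard `ast` module. This is only a sketch of the concept, not a version control design: two differently formatted sources (both made up for this example) parse to the same tree, and `ast.unparse` (Python 3.9+) projects a tree back into one canonical layout.

```python
import ast

# Two made-up sources that differ only in formatting.
compact = "def add(a,b): return a+b"
spread_out = """
def add(
    a,
    b,
):
    return a + b
"""


def canonical(source: str) -> str:
    # ast.dump leaves out line and column attributes by default,
    # so layout differences do not show up in the result.
    return ast.dump(ast.parse(source))


same_tree = canonical(compact) == canonical(spread_out)
print(same_tree)  # True: the formatting change is invisible at the AST level

# Projecting the stored tree back to text yields one consistent style.
print(ast.unparse(ast.parse(compact)))
```

      Comments and exact whitespace are not part of this tree, which hints at one practical cost of storing only the AST.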

      16 votes
    9. Play Chess against GPT-2

      @theshawwn: I am preparing to release a notebook where you can play chess vs GPT-2. If anyone wants to help beta test it: 1. visit https://t.co/CpWrFvtnY2 2. open in playground mode 3. click Runtime -> Run All 4. Scroll to the bottommost cell and wait 6 minutes If you get stuck, tell me.

      5 votes
    10. Programming Challenge: Convert between units

      Hi everyone! It's been a long time since the last programming challenge, and here's a nice one I've encountered.

      If you search for something like 7km to AU, you'll get your answer. But how is it done? I don't think they hardcoded all 23 units of distance and every conversion factor between them.

      If you were programming a conversion system - how would you do it?

      First of all, you have input in a format that you can specify, for example something like this:

      meter kilometer 1000
      kilometer mile 1.609344
      second minute 60
      ...
      

      Then you should be able to answer queries. For example, 7 mile meter should convert 7 miles to meters, which is 11265.41.

      Can you design an algorithm that will convert any unit into any other unit?
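
      One way to approach this, sketched in Python: treat units as nodes in a graph, store each known factor as an edge in both directions, and walk the graph breadth-first while multiplying factors along the path. The rule format used here, `(unit, other, factor)` meaning "1 unit = factor other", is an assumption of this sketch, as are the function names.

```python
from collections import defaultdict, deque


def build_graph(rules):
    """Each rule (unit, other, factor) is read as: 1 unit = factor other."""
    graph = defaultdict(dict)
    for unit, other, factor in rules:
        graph[unit][other] = factor
        graph[other][unit] = 1.0 / factor  # the reverse conversion comes for free
    return graph


def convert(graph, amount, source, target):
    """Breadth-first search from source to target, multiplying factors on the way."""
    if source == target:
        return amount
    seen = {source}
    queue = deque([(source, 1.0)])
    while queue:
        unit, scale = queue.popleft()
        for neighbour, factor in graph[unit].items():
            if neighbour in seen:
                continue
            if neighbour == target:
                return amount * scale * factor
            seen.add(neighbour)
            queue.append((neighbour, scale * factor))
    return None  # no chain of conversions links the two units


rules = [
    ("kilometer", "meter", 1000.0),
    ("mile", "kilometer", 1.609344),
    ("minute", "second", 60.0),
]
graph = build_graph(rules)
print(round(convert(graph, 7, "mile", "meter"), 2))  # 11265.41
```

      Units with no connecting path, such as miles and seconds, return `None` instead of a number, so only one factor per unit needs to be supplied rather than every pairwise combination.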

      Edit: Some conversion rates I extracted from wikipedia:

      ångström
      0.1nm
      astronomical unit
      149597870700m
      attometre
      0.000000000000000001m
      barleycorn
      8.4m
      bohr
      0.00846
      cable length (imperial)
      185.3184m
      cable length
      185.2m
      cable length (US)
      219.456m
      chain (Gunters)
      20.11684m
      cubit
      0.5m
      ell
      1.143m
      fathom
      1.8288m
      femtometre
      0.000000000000001m
      fermi
      0.000000000000001m
      finger
      0.022225m
      finger (cloth)
      0.1143m
      foot (Benoit)
      0.304799735m
      foot (Cape) (H)
      0.314858m
      foot (Clarke's) (H)
      0.3047972654m
      foot (Indian) (H)
      0.304799514m
      foot,metric
      0.31622776602m
      foot,metric (long)
      0.3m
      foot,metric (short)
      0.30m
      foot (International)
      0.3048m
      foot (Sear's) (H)
      0.30479947m
      foot (US Survey)
      0.304800610
      french
      0.0003m
      furlong
      201.168m
      hand
      0.1016m
      inch
      0.0254m
      league
      4828m
      light-day
      25902068371200m
      light-hour
      107925284880m
      light-minute
      17987547480
      light-second
      299792458m
      light-year
      31557600light-second
      line
      0.002116m
      link (Gunter's)
      0.2011684m
      link (Ramsden's; Engineer's)
      0.3048m
      metre
      1m
      m
      1metre
      km
      1000m
      mickey
      0.000127
      micrometre
      0.000001
      mil; thou
      0.0000254
      mil
      10km
      mile (geographical)
      6082foot (International)
      quarter
      0.2286m
      rod
      5.0292m
      rope
      6.096m
      shaku
      0.303 0303m
      span (H)
      0.2286m
      stick (H)
      0.0508m
      toise
      1.949 0363m
      twip
      1.76310
      yard
      0.9144m
      
      17 votes
    11. Topic Requests: What subject would you like to see covered in more depth?

      For those who haven't seen my essay-length posts in the past, I occasionally like to delve into explaining different programming concepts, particularly with regards to making your code easier to manage. Sometimes this has to do with how you structure your code and projects, and at others it has to do with how you think about the problems you're solving. I've been in the mood to write up on yet another programming subject, but nothing in particular has stood out to me lately during the course of my work.

      With that in mind, I figured I would take a different approach and see if anyone here had some specific requests for content they would like to see. Requests from all levels of experience are welcome!

      (And for those who are itching to do a write-up on any of the requests that appear here, feel free to call dibs!)


      Edit

      For those who want to take a dive into my previous submissions, you can now find them in the new wiki entry created by @cfabbro or directly via the programming.code_quality_tips tag here.

      8 votes
    12. Code Quality Tip: The importance of understanding correctness vs. accuracy.

      Preface

      It's not uncommon for a written piece of code to be both brief and functionally correct, yet difficult to reason about. This is especially true of recursive algorithms, which can require some amount of simulating the algorithm mentally (or on a whiteboard) on smaller problems to try to understand the underlying logic. The more you have to perform these manual simulations, the more difficult it becomes to track what exactly is going on at any stage of computation. It's also not uncommon that these algorithms can be made easier to reason about with relatively small changes, particularly in the way you conceptualize the solution to the problem. Our goal will be to take a brief tour into what these changes might look like and why they are effective at reducing our mental overhead.


      Background

      We will consider the case of the subset sum problem, which is essentially a special case of the knapsack problem where you have a finite number of each item and each item's value is equal to its weight. In short, the problem is summarized as one of the following:

      • Given a set of numbers, is there a subset whose sum is exactly equal to some target value?

      • Given a set of numbers, what is the subset whose sum is the closest to some target value without exceeding it?

      For example, given the set of numbers {1, 3, 3, 5} and a target value of 9, the answer for both of those questions is {1, 3, 5} because the sum of those numbers is 9. For a target value of 10, however, the first question has no solution because no combination of numbers in the set {1, 3, 3, 5} produces a total of 10, but the second question produces a solution of {1, 3, 5} because 9 is the closest value to 10 that those numbers can produce without going over.


      A Greedy Example

      We'll stick to the much simpler case of finding an exact match to our target value so we don't have to track what the highest value found so far is. To make things even simpler, we'll consider the case where all numbers are positive, non-zero integers. This problem can be solved with some naive recursion--simply try all combinations until either a solution is found or all combinations have been exhausted. While more efficient solutions exist, naive recursion is the easiest to conceptualize.

      An initial assessment of the problem seems simple enough. Our solution is defined as the set of array elements whose total is equal to our target value. To achieve this, we loop through each of the elements in the array, try combinations with all of the remaining elements, and keep track of what the current total is so we can compare it to our target. If we find an exact match, we return an array containing the matching elements, otherwise we return nothing. This gives us something like the following:

      function subsetSum($target_sum, $values, $total = 0) {
          // Base case: a total exceeding our target sum is a failure.
          if($total > $target_sum) {
              return null;
          }
      
          // Base case: a total matching our target sum means we've found a match.
          if($total == $target_sum) {
              return array();
          }
      
          foreach($values as $index=>$value) {
              // Recursive case: try combining the current array element with the remaining elements.
              $result = subsetSum($target_sum, array_slice($values, $index + 1), $total + $value);
      
              if(!is_null($result)) {
                  return array_merge(array($value), $result);
              }
          }
      
          return null;
      }
      

      Your Scope is Leaking

      This solution works. It's functionally correct and will produce a valid result every single time. From a purely functional perspective, nothing is wrong with it at all; however, it's not easy to follow what's going on despite how short the code is. If we look closely, we can tell that there are a few major problems:

      • It's not obvious at first glance whether or not the programmer is expected to provide the third argument. While a default value is provided, it's not clear if this value is only a default that should be overridden or if the value should be left untouched. This ambiguity means relying on documentation to explain the intention of the third argument, which may still be ignored by an inattentive developer.

      • The base case where a failure occurs, i.e. when the accumulated total exceeds the target sum, occurs one stack frame further into the recursion than when the total has been incremented. This forces us to consider not only the current iteration of recursion, but one additional iteration deeper in order to track the flow of execution. Ideally an iteration of recursion should be conceptually isolated from any other, limiting our mental scope to only the current iteration.

      • We're propagating an accumulating total that starts from 0 and increments toward our target value, forcing us to track two different values simultaneously. Ideally we would only track one value if possible. If we can manage that, then the ambiguity of the third argument will be eliminated along with the argument itself.

      Overall, the amount of code that the programmer needs to look at and the amount of branching they need to follow manually is excessive. The function is only 22 lines long, including whitespace and comments, and yet the amount of effort it takes to ensure you're understanding the flow of execution correctly is pretty significant. This is a pretty good indicator that we probably did something wrong. Something so simple and short shouldn't take so much effort to understand.


      Patching the Leak

      Now that we've assessed the problems, we can see that our original solution isn't going to cut it. We have a couple of ways we could approach fixing our function: we can either attempt to translate the abstract problems into tangible solutions or we can modify the way we've conceptualized the solution. With that in mind, let's take a second crack at this problem by trying the latter.

      We've tried taking a look at this problem from a top-down perspective: "given a target value, are there any elements that produce a sum exactly equal to it?" Clearly this perspective failed us. Instead, let's try flipping the equation: "given an array element, can it be summed with others to produce the target value?"

      This fundamentally changes the way we can think about the problem. Previously we were hung up on the idea of keeping track of the current total sum of the elements we've encountered so far, but that approach is incompatible with the way we're thinking of this problem now. Rather than incrementing a total, we now find ourselves having to do something entirely different: if we want to know if a given array element is part of the solution, we need to first subtract the element from the problem and find out if the smaller problem has a solution. That is, to find if the element 3 is part of the solution for the target sum of 8, then we're really asking if 3 + solutionFor(5) is valid.

      The new solution therefore involves looping over our array elements just as before, but this time we check if there is a solution for the target sum minus the current array element:

      function subsetSum($target_sum, $values) {
          // Base case: the solution to the target sum of 0 is the empty set.
          if($target_sum === 0) {
              return array();
          }
      
          foreach($values as $index=>$value) {
              // Base case: any element larger than our target sum cannot be part of the solution.
              if($value > $target_sum) {
                  continue;
              }
      
              // Recursive case: do the remaining elements create a solution for the sub-problem?
              $result = subsetSum($target_sum - $value, array_slice($values, $index + 1));
      
              if(!is_null($result)) {
                  return array_merge(array($value), $result);
              }
          }
      
          return null;
      }
      

      A Brief Review

      With the changes now in place, let's compare our two functions and, more importantly, compare our new function to the problems we assessed with the original. A few brief points:

      • Both functions are the same exact length, being only 22 lines long with the same number of comments and an identical amount of whitespace.

      • Both functions touch the same number of elements and produce the same output given the same input. Apart from a change in execution order of a base case, functionality is nearly identical.

      • The new function no longer requires thinking about the scope of the next iteration of recursion to determine whether or not an array element is included in the result set. The base case for exceeding the target sum now occurs prior to recursion, keeping the scope of the value comparison nearest where those values are defined.

      • The new function no longer uses a third accumulator argument, reducing the number of values to be tracked and removing the issue of ambiguity with whether or not to include the third argument in top-level calls.

      • The new function is now defined in terms of finding the solutions to increasingly smaller target sums, making it easier to determine functional correctness.
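
      As a quick sanity check of the claim that both versions produce the same output, here is a Python port of the two PHP functions. The port and its function names are assumptions of this sketch, not part of the original code; the logic follows the PHP line for line.

```python
def subset_sum_accumulating(target, values, total=0):
    """Port of the first version: carries a running total toward the target."""
    if total > target:
        return None
    if total == target:
        return []
    for index, value in enumerate(values):
        result = subset_sum_accumulating(target, values[index + 1:], total + value)
        if result is not None:
            return [value] + result
    return None


def subset_sum_reducing(target, values):
    """Port of the second version: shrinks the target instead of tracking a total."""
    if target == 0:
        return []
    for index, value in enumerate(values):
        if value > target:
            continue  # this element alone already overshoots the remaining target
        result = subset_sum_reducing(target - value, values[index + 1:])
        if result is not None:
            return [value] + result
    return None


numbers = [1, 3, 3, 5]
print(subset_sum_reducing(9, numbers))   # [1, 3, 5]
print(subset_sum_reducing(10, numbers))  # None
```

      Comparing the two ports over a range of targets for the example set is an easy way to confirm that the refactoring changed the shape of the code and not its results.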

      Considering all of the above, we can confidently state that the second function is easier to follow, easier to verify functional correctness for, and less confusing for anyone who needs to use it. Although the two functions are nearly identical, the second version is clearly and objectively better than the original. This is because despite both being functionally correct, the first function does a poor job at accurately defining the problem it's solving while the second function is clear and accurate in its definition.

      Correct code isn't necessarily accurate code. Anyone can write code that works, but writing code that accurately defines a problem can mean the difference between understanding what you're looking at, and being completely bewildered at how, or even why, your code works in the first place.


      Final Thoughts

      Accurately defining a problem in code isn't easy. Sometimes you'll get it right, but more often than not you'll get it wrong on the first go, and it's only after you've had some distance from your original solution that you realize that you should've done things differently. Despite that, understanding the difference between functional correctness and accuracy gives you the opportunity to watch for obvious inaccuracies and keep them to a minimum.

      In the end, even functionally correct, inaccurate code is worth more than no code at all. No amount of theory is a replacement for practical experience. The only way to get better is to mess up, assess why you messed up, and make things just a little bit better the next time around. Theory just makes that a little easier.

      17 votes
    13. Game Frameworks: What are people using for game jams nowadays?

      Hi,

      I've been mulling ideas about a game for a while now, I'd like to hack out a prototype, and my default would be Love2D. (As an aside: one of the things I liked about Love2D was that you could make a basic 'game' in a couple of LoC, and it was 'efficient enough' for what you got. Perhaps the only gripe I had with it was that it didn't output compiled binaries. I mean, you could make it do that, but it seemed like a hack.) I think Polycode seemed to be a semi-serious contender, but last I checked (a year or two ago) it was pretty much as dead as a doornail. Some of the other alternatives I remember seeing (Godot? Unity?) felt too much like Blender.

      It's been a while since I kept tabs on the 'gamedev community', so I don't know if there have been any more recent developments in that space.

      So I guess my question is: What are people using for game jams nowadays? Preach to me (and everyone else) about your favorite framework and language :)

      15 votes
    14. Programming Challenge: Text compression

      In an effort to make these weekly, I present a new programming challenge.

      The challenge this week is to compress some text using a prefix code. Prefix codes associate each letter with a given bit string, such that no encoded bitstring is the prefix of any other. These bit strings are then concatenated into one long integer which is separated into bytes for ease of reading. These bytes can be represented as hex values as well. The provided prefix encoding is as follows:

      char value char value
      ' ' 11 'e' 101
      't' 1001 'o' 10001
      'n' 10000 'a' 011
      's' 0101 'i' 01001
      'r' 01000 'h' 0011
      'd' 00101 'l' 001001
      '~' 001000 'u' 00011
      'c' 000101 'f' 000100
      'm' 000011 'p' 0000101
      'g' 0000100 'w' 0000011
      'b' 0000010 'y' 0000001
      'v' 00000001 'j' 000000001
      'k' 0000000001 'x' 00000000001
      'q' 000000000001 'z' 000000000000

      Challenge

      Your program should accept a lowercase string (including the ~ character), and should output the formatted compressed bit string in binary and hex. Your final byte should be zero-padded on the right so that it has the required 8 bits. For your convenience, here is the above table in a text file for easy read-in.

      Example

      Here is an example:

      $> tildes ~comp
      10010100 10010010 01011010 10111001 00000010 11000100 00110000 10100000
      94 92 5A B9 02 C4 30 A0
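
One possible way to approach the encoding step is sketched below in Python. It is only an illustration, not a reference solution: the names `PREFIX_CODE`, `compress`, `to_hex`, and `compression_ratio` are made up for this sketch, and the table is copied from the challenge statement above. Padding is applied on the right, which is what the example output implies.

```python
# Prefix code table copied from the challenge statement.
PREFIX_CODE = {
    " ": "11", "e": "101", "t": "1001", "o": "10001",
    "n": "10000", "a": "011", "s": "0101", "i": "01001",
    "r": "01000", "h": "0011", "d": "00101", "l": "001001",
    "~": "001000", "u": "00011", "c": "000101", "f": "000100",
    "m": "000011", "p": "0000101", "g": "0000100", "w": "0000011",
    "b": "0000010", "y": "0000001", "v": "00000001", "j": "000000001",
    "k": "0000000001", "x": "00000000001", "q": "000000000001",
    "z": "000000000000",
}


def compress(text):
    """Return the encoded bytes of `text` as a list of 8-character bit strings."""
    bits = "".join(PREFIX_CODE[ch] for ch in text)
    # Right-pad with zeros so that the final byte has exactly 8 bits.
    bits += "0" * (-len(bits) % 8)
    return [bits[i:i + 8] for i in range(0, len(bits), 8)]


def to_hex(byte_strings):
    """Format each 8-bit string as a two-digit uppercase hex value."""
    return " ".join(f"{int(b, 2):02X}" for b in byte_strings)


def compression_ratio(text):
    """Bonus 1: uncompressed size over compressed size, at one byte per character."""
    return len(text) / len(compress(text))
```

Calling `to_hex(compress("tildes ~comp"))` reproduces the hex line from the example, and the 12 input characters shrink to 8 bytes, a ratio of 1.5.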
      

      Bonuses

      1. Print the data compression ratio for a given compression, assuming the original input was encoded in 8 bit ASCII (one byte per character).
      2. Output the string corresponding to the encoded byte string in addition to the above outputs. @onyxleopard points out that many bytes won't actually be valid ASCII, so do as they suggested: treat each byte as an ordinal value and print it as if encoded as UTF-8.
      3. An input prefixed by 'D' should be interpreted as an already compressed string using this encoding, and should be decompressed (by inverting the above procedure).

      Previous Challenges (I am aware of prior existing ones, but it is hard to collect them as they were irregular. Thus I list last week's challenge as 'Week 1')
      Week 1

      13 votes
    15. Programming Challenge: Dice Roller

      It's been a while since we did one of these, which is a shame.

      Create a program that takes an input of the type "d6 + 3" or "2d20 - 5" and returns a valid roll.
      The result should display both the actual rolls and the final result. The program should accept any valid roll of the type 'XdY'.
      Bonuses:

      • Multiplication "d6 * 3"
      • Division "d12 / 6"
      • Parenthesized (infix) expressions "4d6 * (5d4 - 3)"
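
For the base task (without the bonuses), a minimal Python sketch might look like the following. The names `ROLL_PATTERN` and `roll` are hypothetical, and the optional `rng` parameter is an assumption added so that rolls can be made reproducible with a seeded `random.Random`.

```python
import random
import re

# Matches rolls such as "d6 + 3" or "2d20 - 5"; the +/- modifier is optional.
ROLL_PATTERN = re.compile(r"^\s*(\d*)d(\d+)\s*(?:([+-])\s*(\d+))?\s*$")


def roll(expression, rng=random):
    """Return (individual_rolls, total) for a dice expression."""
    match = ROLL_PATTERN.match(expression)
    if match is None:
        raise ValueError(f"not a valid roll: {expression!r}")
    count_text, sides_text, sign, modifier_text = match.groups()
    count = int(count_text) if count_text else 1  # "d6" means a single die
    sides = int(sides_text)
    if count < 1 or sides < 1:
        raise ValueError("dice count and number of sides must be positive")
    rolls = [rng.randint(1, sides) for _ in range(count)]
    modifier = int(modifier_text) if modifier_text else 0
    if sign == "-":
        modifier = -modifier
    return rolls, sum(rolls) + modifier
```

`roll("2d20 - 5")` returns both the list of individual rolls and the final total, which covers the "display both" requirement. Multiplication, division, and nested expressions would need a real expression parser rather than a single regular expression.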

      As a side note, it would be really cool if weekly programming challenges became a thing

      33 votes
    16. Coding Challenge - Design network communication protocol

      Previous challenges

      It's time for another coding challenge!

      This challenge isn't mine, it's this challenge (year 5, season 3, challenge 3) by ČVUT FIKS.

      The task is to design a network communication protocol. You're sending a large number of bits over the network. The problem is that the network is not perfect and the message sometimes arrives corrupted. Design a network protocol that guarantees the decoded message will be exactly the same as the message that was encoded.

      MESSAGE => (encoding) => message corrupted => (decoding) => MESSAGE
      

      Corruption

      Transmitting the message might corrupt it and introduce errors. Each error in a message (there might be more than one error in a single message) will flip all following bits of the message.

      Example:

      011101 => 011|010
      

      (| marks the place where an error occurred).
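
To experiment with candidate protocols, it can help to simulate the channel first. The Python sketch below models only the corruption rule described above; the function name `corrupt` and the convention that an error position is the index of the first flipped bit are assumptions made for illustration. It does not enforce the spacing rules between errors.

```python
def corrupt(message, error_positions):
    """Simulate the channel: an error at index p flips every bit from p onward."""
    bits = [int(b) for b in message]
    for p in error_positions:
        for i in range(p, len(bits)):
            bits[i] ^= 1  # flip this bit
    return "".join(str(b) for b in bits)
```

With this convention, `corrupt("011101", [3])` gives `"011010"`, matching the example. Two errors flip the tail twice, so bits after the second error end up back at their original values.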

      There might be more than one error in a message, but there are some rules:

      • The minimum distance between two errors in a single message is k

      • The number of bits between two errors is always an odd number

      According to these rules, describe a communication protocol that will encode a message and later decode a message with errors.

      Bonus

      • Guarantee your protocol will always work, even when errors are as common as possible

      • Try to make the protocol as short as possible.

      8 votes
    17. Programming Challenge: Build an Interpreter

      Hello everyone! It has been a while since the last programming challenge, so it's time for another one!

      This week's goal would be to build your own interpreter.

      An interpreter is a program that receives input and executes it. For example, Python is an interpreted language, meaning you are actually writing instructions for the interpreter, which does the magic.

      Probably the easiest interpreter to write is a Brainfuck interpreter. If someone here doesn't know, Brainfuck is a programming language with the following instructions: ,.<>[]-+. Other characters are ignored. Its memory takes the form of an array of integers, and a pointer selects one specific memory cell, starting at cell 0. We can use < to move the pointer to the left (decrement) and > to move the pointer to the right (increment). . prints the value of the cell the pointer is currently pointing to (as ASCII). , reads one character from stdin and writes it to memory. [ is the beginning of a loop and ] is the end of a loop. Loops can be nested. A loop terminates when we reach the ] character and the current value in memory is equal to 0; if the value is already 0 at [, the loop body is skipped. - decrements the value in memory by 1 and + increments it by 1. Here's Hello World:

      ++++++++++[>+++++++>++++++++++>+++>+<<<<
      -]>++.>+.+++++++..+++.>++.<<++++++++++++
      +++.>.+++.------.--------.>+.>.
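
As a starting point, here is a rough Python sketch of such an interpreter. The function name `run_brainfuck` is hypothetical, and several details are assumptions rather than part of the challenge: cells wrap around at 256, the tape has 30000 cells, input comes from a string instead of stdin, and reading past the end of the input stores 0. The pointer is not bounds-checked.

```python
def run_brainfuck(program, input_text="", tape_size=30000):
    """Execute a Brainfuck program and return everything it printed."""
    # Pre-compute matching bracket positions so loops can jump directly.
    jumps, stack = {}, []
    for pos, ch in enumerate(program):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            if not stack:
                raise SyntaxError(f"unmatched ] at position {pos}")
            start = stack.pop()
            jumps[start], jumps[pos] = pos, start
    if stack:
        raise SyntaxError(f"unmatched [ at position {stack[-1]}")

    tape = [0] * tape_size
    pointer = pc = 0
    pending_input = iter(input_text)
    output = []
    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            pointer += 1
        elif ch == "<":
            pointer -= 1
        elif ch == "+":
            tape[pointer] = (tape[pointer] + 1) % 256
        elif ch == "-":
            tape[pointer] = (tape[pointer] - 1) % 256
        elif ch == ".":
            output.append(chr(tape[pointer]))
        elif ch == ",":
            # Assumption: end of input is stored as 0.
            tape[pointer] = ord(next(pending_input, "\0")) % 256
        elif ch == "[" and tape[pointer] == 0:
            pc = jumps[pc]  # skip the loop body
        elif ch == "]" and tape[pointer] != 0:
            pc = jumps[pc]  # go back to the start of the loop
        pc += 1  # any other character is ignored
    return "".join(output)
```

Passing the Hello World program above to `run_brainfuck` should return the greeting followed by a newline.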
      

      People with nothing to do today can attempt to make an interpreter for the Taxi programming language.

      You can even make your own language! There are no limits for this challenge.

      23 votes
    18. Conceptualizing Data: Simplifying the way we think about complex data structures.

      Preface

      Conceptual models in programming are essential for being able to reason about problems. We see this through code all the time, with implementation details hidden away behind abstractions like functions and objects so that we can ignore the cumbersome details and focus only on the details that matter. Without these abstractions and conceptual models, we might find ourselves overwhelmed by the size and complexity of the problem we’re facing. Of these conceptual models, one of the most easily neglected is that of data and object structure.


      Data Types Galore

      Possibly one of the most overwhelming aspects of conceptualizing data and object structure is the sheer breadth of data types available. Depending on the programming language you’re working with, you may find that you have several dozen object classes already defined as part of the language’s core; primitives like booleans, ints, unsigned ints, floats, doubles, longs, strings, chars, and possibly others; arrays that can contain any of the objects or primitives, and even other arrays; and several other data structures like queues, vectors, and mixed-type collections, among others.

      With so many types of data, it’s incredibly easy to lose track in a sea of type declarations and find yourself confused and unsure of where to go.


      Tree’s Company

      Let’s start by trying to make these data types a little less overwhelming. Rather than thinking strictly of types, let’s classify them. We can group all data types into one of three basic classifications:

      1. Objects, which contain key/value pairs. For example, an object property that stores a string.
      2. Arrays, which contain some arbitrary number of values.
      3. Primitives, which contain nothing. They’re simply a “flat” data value.

      We can also make a couple of additional notes. First, arrays and objects are very similar; both contain references to internal data, but the way that data is referenced differs. In particular, objects have named keys while arrays have numeric, zero-indexed keys. In a sense, arrays are a special case of objects where the keys are more strictly typed. From this, we can condense the classifications of objects and arrays into the more general “container” classification.

      With that in mind, we now have the following classifications:

      1. Containers.
      2. Primitives.

      We can now generally state that containers may contain other containers and primitives, and primitives may not contain anything. In other words, all data structures are a composition of containers and/or primitives, where containers may accept containers and/or primitives and primitives may not accept anything. More experienced programmers should notice something very familiar about this description--we’re basically describing a tree structure! Primitive types and empty containers act as the leaves in a tree, whereas non-empty objects and arrays act as the internal nodes.
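
To make the tree view concrete, here is a small Python sketch that walks a nested structure and lists every leaf together with the path of keys leading to it. The function name `leaf_paths` is made up for this illustration, and it assumes Python dicts stand in for objects and lists or tuples stand in for arrays.

```python
def leaf_paths(value, path=()):
    """Yield (path, leaf) pairs for every leaf in a nested structure."""
    if isinstance(value, dict):             # object: named keys
        children = list(value.items())
    elif isinstance(value, (list, tuple)):  # array: zero-indexed keys
        children = list(enumerate(value))
    else:                                   # primitive: always a leaf
        yield path, value
        return
    if not children:                        # empty container: also a leaf
        yield path, value
        return
    for key, child in children:
        yield from leaf_paths(child, path + (key,))
```

Note that the traversal never needs to know any concrete type beyond "container" versus "primitive", which is the point of the classification.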


      Trees Help You Breathe

      Okay, great. So what’s the big deal, anyway? We’ve now traded a bunch of concrete data types that we can actually think about and abstracted them away into this nebulous mess of containers and primitives. What do we get out of this?

      A common mistake many programmers make is planning their data types out from the very beginning. Rather than planning out an abstraction for their data and object architecture, it’s easy to immediately find yourself focusing too much on the concrete implementation details.

      Imagine, for example, modeling a user account for an online payment system. A common feature to include is the ability to store payment information for auto-pay, and payment methods typically take the form of some combination of credit/debit cards and bank accounts. If we focus on implementation details from the beginning, then we may find ourselves with something like this in a first iteration:

      UserAccount: {
          username: String,
          password: String,
          payment_methods: PaymentMethod[]
      }
      
      PaymentMethod: {
          account_name: String,
          account_type: Enum,
          account_holder: String,
          number: String,
          routing_number: String?,
          cvv: String?,
          expiration_date: DateString?
      }
      

      We then find ourselves realizing that PaymentMethod is an unnecessary mess of optional values and needing to refactor it. Odds are we would break it off immediately into separate account types and make a note that they both implement some interface. We may also find that, as a result, remodeling the PaymentMethod could result in the need to remodel the UserAccount. For more deeply nested data structures, a single change deeper within the structure could result in those changes cascading all the way to the top-level object. If we have multiple objects, then these changes could propagate to them as well. And what if we decide a type needs to be changed, like deciding that our expiration date needs to be some sort of date object? Or what if we decide that we want to modify our property names? We’re then stuck having to update these definitions as we go along. What if we decide that we don't want an interface for different payment method types after all and instead want separate collections for each type? Then including the interface consideration will have proven to be a waste of time. The end result is that before we’ve even touched a single line of code, we’ve already found ourselves stuck with a bunch of technical debt, and we’re only in our initial planning stages!

      To alleviate these kinds of problems, it’s far better to just ignore the implementation details. By doing so, we may find ourselves with something like this:

      UserAccount: {
          Username,
          Password,
          PaymentMethods
      }
      
      PaymentMethods: // TODO: Decide on this container’s structure.
      
      CardAccount: {
          AccountName,
          CardHolder,
          CardNumber,
          CVV,
          ExpirationDate,
          CardType
      }
      
      BankAccount: {
          AccountName,
          AccountNumber,
          RoutingNumber,
          AccountType
      }
      

      A few important notes about what we’ve just done here:

      1. We don’t specify any concrete data types.
      2. All fields within our models have the capacity to be either containers or primitives.
      3. We’re able to defer a model’s structural definition without affecting the pace of our planning.
      4. Any changes to a particular field type will automatically propagate in our structural definitions, making it trivial to create a definition like ExpirationDate: String and later change it to ExpirationDate: DateObject.
      5. The amount of information we need to think about is reduced down to the very bare minimum.
      6. By deferring the definition of the PaymentMethods structure, we find ourselves more inclined to focus on the more concrete payment method definitions from the very beginning, rather than trying to force them to be compatible through an interface.
      7. We focused only on data representation, ensuring that representation and implementation are both separate and can be handled differently if needed.

      SOLIDifying Our Conceptual Model

      In object-oriented programming (OOP), there’s a generally recommended set of principles to follow, represented by the acronym “SOLID”:

      • Single responsibility.
      • Open/closed.
      • Liskov substitution.
      • Interface segregation.
      • Dependency inversion.

      These “SOLID” principles were defined to help resolve common, recurring design problems and anti-patterns in OOP.

      Of particular note for us is the last one, the “dependency inversion” principle. The idea behind this principle is that implementation details should depend on abstractions, not the other way around. Our new conceptual model obeys the dependency inversion principle by prioritizing a focus on abstractions while leaving implementation details to the future class definitions that are based on our abstractions. By doing so, we limit the elements involved in our planning and problem-solving stages to only what is necessary.


      Final Thoughts

      The consequences of such a conceptual model extend well beyond simply planning out data and object structures. For example, if implemented as an actual programming or language construct, you could make the parsing of your data fairly simple. By implementing an object parser that performs reflection on some passed object, you can extract all of the publicly accessible object properties of the target object and the data contained therein. Thus, if your language doesn’t have a built-in JSON encoding function and no library yet exists, you could recursively traverse your data structure to generate the appropriate JSON with very little effort.
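
As a rough illustration of that recursive traversal, the Python sketch below serializes containers and primitives to JSON text by hand. It is only a sketch under stated assumptions: the names `to_json` and `_quote` are hypothetical, arbitrary objects are handled by reflecting over their public attributes with `vars`, and special floats such as NaN or infinity are not handled.

```python
def _quote(text):
    """Return `text` as a JSON string literal with the required escapes."""
    escaped = []
    for ch in text:
        if ch in '"\\':
            escaped.append("\\" + ch)
        elif ord(ch) < 0x20:
            escaped.append(f"\\u{ord(ch):04x}")  # control characters
        else:
            escaped.append(ch)
    return '"' + "".join(escaped) + '"'


def to_json(value):
    """Serialize containers and primitives to JSON by walking the tree."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        return _quote(value)
    if isinstance(value, (list, tuple)):
        return "[" + ",".join(to_json(item) for item in value) + "]"
    if isinstance(value, dict):
        pairs = (_quote(str(k)) + ":" + to_json(v) for k, v in value.items())
        return "{" + ",".join(pairs) + "}"
    # Fallback: reflect over the object's public attributes and treat it as a container.
    public = {k: v for k, v in vars(value).items() if not k.startswith("_")}
    return to_json(public)
```

Every branch is either a primitive (a leaf, emitted directly) or a container (recursed into), so the encoder mirrors the container/primitive classification one to one.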

      Many of the most fundamental programming concepts, like data structures ultimately being nothing more than trees at their most abstract representation, are things we tend to take for granted and think very little about. By making ourselves conscious of these fundamental concepts, however, we can more effectively take advantage of them.

      Additionally, successful programmers typically solve a programming problem before they’ve ever written a single line of code. Whether or not they’re conscious of it, the tools they use to solve these problems effectively consist largely of the myriad conceptual models they’ve collected and developed over time, and the experience they’ve accumulated to determine which conceptual models need to be utilized to solve a particular problem.

      Even when you have a solid grasp of your programming fundamentals, you should always revisit them every now and then. Sometimes there are details that you may have missed or just couldn’t fully appreciate when you learned about them. This is something that I’m continually reminded of as I continue on in my own career growth, and I hope that I can continue passing these lessons on to others.

      As always, I'm absolutely open to feedback and questions!

      15 votes
    19. Programming Challenge - Find path from city A to city B with least traffic controls in between.

      Previous challenges

      Hi, it's been a very long time since the last Programming Challenge, and I'd like to revive the tradition.

      The point of a programming challenge is to create your own solution and, if you're bored, even implement it in your favourite programming language. Today's challenge isn't mine. It was created by ČVUT FIKS (year 5, season 2, challenge #4).

      You need to transport plans for your quantum computer through Totalitatia. The problem is that Totalitatia's government would love to have the plans, and they know you're going to transport them through the country. You'll receive a number N, which denotes the number of cities on the map. Then you'll get M paths, each going from one city to another. Each path has k traffic controls. They're not very effective, but the fewer of them you have to pass, the better. Find a path from city A to city B such that the maximum number of traffic controls between any two consecutive cities is minimal. City A is always the first one (0) and city B is always the last one (N-1).

      Input format:

      N
      M
      A1 B1 K1
      A2 B2 K2
      ...
      

      On the first two lines, you'll get the numbers N (number of cities) and M (number of paths). Then, on the next M lines, you'll get the definition of each path. A definition looks like 1 2 6, where 1 is the id of the first city and 2 is the id of the second city (delimited by a space). You can go from city 1 to city 2, or from city 2 to city 1. The third number (6) is the number of traffic controls.

      Output format:

      A single number, which denotes the maximum number of traffic controls encountered on any one road of the chosen path.

      Hint: This means that a path via roads with 4 4 4 traffic controls is better than a path via roads with 1 5 1 traffic controls. The first example would have output 4, the second one would have output 5.

      Example:

      IN:

      4
      5
      0 1 3
      0 2 2
      1 2 1
      1 3 4
      2 3 5
      

      OUT:

      4
      

      Solution: The optimal path is either 0 2 1 3 or 0 1 3.
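
One way to solve this (not necessarily the intended one) is a variant of Dijkstra's algorithm in which the cost of a path is the maximum road value along it instead of the sum. The Python sketch below uses a binary heap as the priority queue; the function name `min_bottleneck` and the `(a, b, k)` tuple format are assumptions made for this illustration.

```python
import heapq


def min_bottleneck(n, roads):
    """Return the smallest possible maximum road value on a path from 0 to n-1.

    `roads` is a list of (a, b, k) tuples describing undirected roads.
    Returns None if city n-1 cannot be reached.
    """
    graph = [[] for _ in range(n)]
    for a, b, k in roads:
        graph[a].append((b, k))
        graph[b].append((a, k))

    best = [None] * n  # best[c]: smallest known bottleneck for reaching city c
    best[0] = 0
    queue = [(0, 0)]   # entries are (bottleneck so far, city)
    while queue:
        bottleneck, city = heapq.heappop(queue)
        if city == n - 1:
            return bottleneck
        if bottleneck > best[city]:
            continue   # stale entry: a better route to this city was found
        for neighbour, k in graph[city]:
            candidate = max(bottleneck, k)
            if best[neighbour] is None or candidate < best[neighbour]:
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return None
```

With the heap, each road is pushed at most a bounded number of times per improvement, giving roughly O(M log M) time. Sorting the roads and merging cities with a union-find structure until 0 and N-1 are connected would be another option.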

      Bonus

      • Describe the time complexity of your algorithm.
      • If multiple optimal paths exist, find the shortest one.
      • Does your algorithm work without changing the core logic if the source city and the target city are not known beforehand (they change on each input)?
      • Do you use a special collection to speed up the minimum value search?

      Hints

      Special collection to speed up algorithm

      13 votes
    20. Programming Challenge: Anagram checking.

      It's been over a week since the last programming challenge and the previous one was a bit more difficult, so let's do something easier and more accessible to newer programmers in particular. Write a function that takes two strings as input and returns true if they're anagrams of each other, or false if they're not.

      Extra credit tasks:

      • Don't consider the strings anagrams if they're the same barring punctuation.
      • Write an efficient implementation (in terms of time and/or space complexity).
      • Minimize your use of built-in functions and methods to bare essentials.
      • Write the worst--but still working--implementation that you can conceive of.
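
For reference, a compact Python sketch of the base task plus the first extra credit item could look like this. The names `are_anagrams` and `_letters` are hypothetical, and the sketch assumes that only letters and digits count and that comparison is case-insensitive, which the challenge does not specify.

```python
from collections import Counter


def _letters(text):
    # Assumption: only letters and digits count, compared case-insensitively.
    return [ch for ch in text.casefold() if ch.isalnum()]


def are_anagrams(first, second):
    """Return True if the two strings are anagrams of each other."""
    a, b = _letters(first), _letters(second)
    if a == b:
        return False  # same text barring punctuation: not counted as an anagram
    return Counter(a) == Counter(b)
```

Counting characters runs in linear time, compared with the O(n log n) of sorting both strings. One consequence of the first extra credit rule as implemented here is that a string is not considered an anagram of itself.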
      24 votes
    21. Code Quality Tip: Cyclomatic complexity in depth.

      Preface

      Recently I briefly touched on the subject of cyclomatic complexity. This is an important concept for any programmer to understand and think about as they write their code. In order to provide a more solid understanding of the subject, however, I feel that I need to address the topic more thoroughly with a more practical example.


      What is cyclomatic complexity?

      The concept of "cyclomatic complexity" is simple: the more conditional branching and looping in your code, the more complex--and therefore the more difficult to maintain--that code is. We can visualize this complexity by drawing a diagram that illustrates the flow of logic in our program. For example, let's take the following toy example of a user login attempt:

      <?php
      
      $login_data = getLoginCredentialsFromInput();
      
      $login_succeeded = false;
      $error = '';
      if(usernameExists($login_data['username'])) {
          $user = getUser($login_data['username']);
          
          if(!isDeleted($user)) {
              if(!isBanned($user)) {
                  if(!loginRateLimitReached($user)) {
                      if(passwordMatches($user, $login_data['password'])) {
                          loginUser($user);
                          $login_succeeded = true;
                      } else {
                          $error = getBadPasswordError();
                          logBadLoginAttempt($user);
                      }
                  } else {
                      $error = getLoginRateLimitError($user);
                  }
              } else {
                  $error = getUserBannedError($user);
              }
          } else {
              $error = getUserDeletedError($user);
          }
      } else {
          $error = getBadUsernameError($login_data['username']);
      }
      
      if($login_succeeded) {
          sendSuccessResponse();
      } else {
          sendErrorResponse($error);
      }
      
      ?>
      

      A diagram for this logic might look something like this:

      +-----------------+
      |                 |
      |  Program Start  |
      |                 |
      +--------+--------+
               |
               |
               v
      +--------+--------+    +-----------------+
      |                 |    |                 |
      |    Username     +--->+    Set Error    +--+
      |    Exists?      | No |                 |  |
      |                 |    +-----------------+  |
      +--------+--------+                         |
               |                                  |
           Yes |                                  |
               v                                  |
      +--------+--------+    +-----------------+  |
      |                 |    |                 |  |
      |  User Deleted?  +--->+    Set Error    +->+
      |                 | Yes|                 |  |
      +--------+--------+    +-----------------+  |
               |                                  |
            No |                                  |
               v                                  |
      +--------+--------+    +-----------------+  |
      |                 |    |                 |  |
      |  User Banned?   +--->+    Set Error    +->+
      |                 | Yes|                 |  |
      +--------+--------+    +-----------------+  |
               |                                  |
            No |                                  |
               v                                  |
      +--------+--------+    +-----------------+  |
      |                 |    |                 |  |
      |   Login Rate    +--->+    Set Error    +->+
      | Limit Reached?  | Yes|                 |  |
      |                 |    +-----------------+  |
      +--------+--------+                         |
               |                                  |
            No |                                  |
               v                                  |
      +--------+--------+    +-----------------+  |
      |                 |    |                 |  |
      |Password Matches?+--->+    Set Error    +->+
      |                 | No |                 |  |
      +--------+--------+    +-----------------+  |
               |                                  |
           Yes |                                  |
               v                                  |
      +--------+--------+    +----------+         |
      |                 |    |          |         |
      |   Login User    +--->+ Converge +<--------+
      |                 |    |          |
      +-----------------+    +---+------+
                                 |
                                 |
               +-----------------+
               |
               v
      +--------+--------+
      |                 |
      |   Succeeded?    +-------------+
      |                 | No          |
      +--------+--------+             |
               |                      |
           Yes |                      |
               v                      v
      +--------+--------+    +--------+--------+
      |                 |    |                 |
      |  Send Success   |    |   Send Error    |
      |    Message      |    |    Message      |
      |                 |    |                 |
      +-----------------+    +-----------------+
      

      It's important to note that between nodes in this directed graph, you can find certain enclosed regions being formed. Specifically, each conditional branch that converges back into the main line of execution generates an additional region. The number of these distinct enclosed regions tracks the cyclomatic complexity of the system: every additional region adds another independent path through the code, so more regions means more complicated code.
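
The usual formal definition, due to McCabe, is M = E - N + 2P, where E is the number of edges in the flow graph, N the number of nodes, and P the number of connected components (1 for a single routine). The short Python sketch below computes it from an edge list; the function name `cyclomatic_complexity` and the edge-list representation are assumptions made for this illustration, not something taken from the example code above.

```python
def cyclomatic_complexity(edges, components=1):
    """McCabe's M = E - N + 2P for a flow graph given as (source, target) edges."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


# A single if/else whose branches converge again forms one enclosed region.
if_else = [("check", "then"), ("check", "else"), ("then", "join"), ("else", "join")]
straight_line = [("start", "work"), ("work", "end")]
```

Here `cyclomatic_complexity(straight_line)` is 1 and `cyclomatic_complexity(if_else)` is 2, so the one enclosed region formed by the converging branches raises the value by one.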


      Clocking out early.

      There's an important piece of information I noted when describing the above example:

      . . . each conditional branch that converges back into the main line of execution generates an additional region.

      The above example is made complex largely due to an attempt to create a single exit point at the end of the program logic, causing these conditional branches to converge and thus generate the additional enclosed regions within our diagram.

      But what if we stopped trying to converge back into the main line of execution? What if, instead, we decided to interrupt the program execution as soon as we encountered an error? Our code might look something like this:

      <?php
      
      $login_data = getLoginCredentialsFromInput();
      
      if(!usernameExists($login_data['username'])) {
          sendErrorResponse(getBadUsernameError($login_data['username']));
          return;
      }
      
      $user = getUser($login_data['username']);
      if(isDeleted($user)) {
          sendErrorResponse(getUserDeletedError($user));
          return;
      }
      
      if(isBanned($user)) {
          sendErrorResponse(getUserBannedError($user));
          return;
      }
      
      if(loginRateLimitReached($user)) {
          logBadLoginAttempt($user);
          sendErrorResponse(getLoginRateLimitError($user));
          return;
      }
      
      if(!passwordMatches($user, $login_data['password'])) {
          logBadLoginAttempt($user);
          sendErrorResponse(getBadPasswordError());
          return;
      }
      
      loginUser($user);
      sendSuccessResponse();
      
      ?>
      

      Before we've even constructed a diagram for this logic, we can already see just how much simpler this logic is. We don't need to traverse a tree of if statements to determine which error message has priority to be sent out, we don't need to attempt to follow indentation levels, and our behavior on success is right at the very end and at the lowest level of indentation, where it's easily and obviously located at a glance.

      Now, however, let's verify this reduction in complexity by examining the associated diagram:

      +-----------------+
      |                 |
      |  Program Start  |
      |                 |
      +--------+--------+
               |
               |
               v
      +--------+--------+    +-----------------+
      |                 |    |                 |
      |    Username     +--->+   Send Error    |
      |    Exists?      | No |    Message      |
      |                 |    |                 |
      +--------+--------+    +-----------------+
               |
           Yes |
               v
      +--------+--------+    +-----------------+
      |                 |    |                 |
      |  User Deleted?  +--->+   Send Error    |
      |                 | Yes|    Message      |
      +--------+--------+    |                 |
               |             +-----------------+
            No |
               v
      +--------+--------+    +-----------------+
      |                 |    |                 |
      |  User Banned?   +--->+   Send Error    |
      |                 | Yes|    Message      |
      +--------+--------+    |                 |
               |             +-----------------+
            No |
               v
      +--------+--------+    +-----------------+
      |                 |    |                 |
      |   Login Rate    +--->+   Send Error    |
      | Limit Reached?  | Yes|    Message      |
      |                 |    |                 |
      +--------+--------+    +-----------------+
               |
            No |
               v
      +--------+--------+    +-----------------+
      |                 |    |                 |
      |Password Matches?+--->+   Send Error    |
      |                 | No |    Message      |
      +--------+--------+    |                 |
               |             +-----------------+
           Yes |
               v
      +--------+--------+
      |                 |
      |   Login User    |
      |                 |
      +--------+--------+
               |
               |
               v
      +--------+--------+
      |                 |
      |  Send Success   |
      |    Message      |
      |                 |
      +-----------------+
      

      Something should immediately stand out here: there are no enclosed regions in this diagram! Furthermore, even our new diagram is much simpler to follow than the old one was.


      Reality is rarely simple.

      The above is a really forgiving example. It has no loops, and loops are going to create enclosed regions that can't be broken apart so easily; it has no conditional branches that are so tightly coupled with the main path of execution that they can't be broken up; and the scope of functionality and side effects are minimal. Sometimes you can't break those regions up. So what do we do when we inevitably encounter these cases?

      High cyclomatic complexity in your program as a whole is inevitable for sufficiently large projects, especially in a production environment, and your efforts to reduce it can only go so far. In fact, I don't recommend trying to remove all or even most instances of cyclomatic complexity at all--instead, you should just be keeping the concept in mind to determine whether or not a function, method, class, module, or other component of your system is accumulating technical debt and therefore in need of refactoring.

      At this point, astute readers might ask, "How does refactoring help if the cyclomatic complexity doesn't actually go away?", and this is a valid concern. The answer to that is simple, however: we're hiding complexity behind abstractions.

      To test this, let's forget about cyclomatic complexity for a moment and instead focus on simplifying the refactored version of our toy example using abstraction:

      <?php
      
      function handleLoginAttempt($login_data) {
          if(!usernameExists($login_data['username'])) {
              sendErrorResponse(getBadUsernameError($login_data['username']));
              return;
          }
      
          $user = getUser($login_data['username']);
          if(isDeleted($user)) {
              sendErrorResponse(getUserDeletedError($user));
              return;
          }
      
          if(isBanned($user)) {
              sendErrorResponse(getUserBannedError($user));
              return;
          }
      
          if(loginRateLimitReached($user)) {
              logBadLoginAttempt($user);
              sendErrorResponse(getLoginRateLimitError($user));
              return;
          }
      
          if(!passwordMatches($user, $login_data['password'])) {
              logBadLoginAttempt($user);
              sendErrorResponse(getBadPasswordError());
              return;
          }
      
          loginUser($user);
          sendSuccessResponse();
      }
      
      $login_data = getLoginCredentialsFromInput();
      
      handleLoginAttempt($login_data);
      
      ?>
      

      The code above is functionally identical to our refactored example from earlier, but has an additional abstraction via a function. Now we can diagram this higher-level abstraction as follows:

      +-----------------+
      |                 |
      |  Program Start  |
      |                 |
      +--------+--------+
               |
               |
               v
      +--------+--------+
      |                 |
      |  Attempt Login  |
      |                 |
      +-----------------+
      

      This is, of course, a pretty extreme example, but this is how we handle thinking about complex program logic. We abstract it down to the barest basics so that we can visualize, in its simplest form, what the program is supposed to do. We don't actually care about the implementation unless we're digging into that specific part of the system, because otherwise we would be so bogged down by the details that we wouldn't be able to reason about what our program is supposed to do.

      Likewise, we can use these abstractions to hide away the cyclomatic complexity underlying different components of our software. This keeps everything clean and clutter-free in our head. And the more we do to keep our smaller components simple and easy to think about, the easier the larger components are to deal with, no matter how much cyclomatic complexity all of those components share as a collective.


      Final Thoughts

      Cyclomatic complexity isn't a bad thing to have in your code. The concept itself is only intended to be used as one of many tools to assess when your code is accumulating too much technical debt. It's a warning sign that you may need to change something, nothing more. But it's an incredibly useful tool to have available to you and you should get comfortable using it.

      As a general rule of thumb, you can usually just take a glance at your code and assess whether or not there's too much cyclomatic complexity in a component by looking for either of the following:

      • Too many loops and/or conditional statements nested within each other, i.e. you have a lot of indentation.
      • Many loops in the same function/method.

      It's not a perfect rule of thumb, but it covers the large majority of everyday cases. There will inevitably be cases where you prefer to accept some greater cyclomatic complexity because there is some benefit that makes it a better trade-off. Making that judgment is up to you as a developer.
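
      The two warning signs listed above can also be checked mechanically. Here's a minimal sketch in Python using the standard ast module. The function names and the choice of which node types count as "nesting" are my own assumptions for illustration, and it measures nesting depth and loop count rather than cyclomatic complexity itself:

```python
import ast

# Which statements count as a nesting level is an assumption of this sketch.
# Note that an "elif" is parsed as an If inside an else, so it counts as nesting here.
NESTING_NODES = (ast.If, ast.For, ast.While)
LOOP_NODES = (ast.For, ast.While)


def max_nesting(node, depth=0):
    """Return the deepest level of nested if/for/while blocks under node."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


def loop_count(node):
    """Return how many for/while loops appear anywhere under node."""
    return sum(isinstance(n, LOOP_NODES) for n in ast.walk(node))


# The helper names inside these strings are placeholders; the source is only
# parsed, never executed, so they don't need to exist.
NESTED = '''
def handle(data):
    if a(data):
        if b(data):
            for item in data:
                if c(item):
                    return item
    return None
'''

FLAT = '''
def handle(data):
    if not a(data):
        return None
    if not b(data):
        return None
    return data
'''

print(max_nesting(ast.parse(NESTED)), loop_count(ast.parse(NESTED)))  # 4 1
print(max_nesting(ast.parse(FLAT)), loop_count(ast.parse(FLAT)))      # 1 0
```

      A deep nesting count or a high loop count doesn't prove anything on its own; it just tells you which function deserves a closer look first.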

      As always, I'm more than willing to listen to feedback and answer any questions!

      25 votes
    22. A Brief Look at Webhook Security

      Preface

      Software security is one of those subjects that often gets overlooked, both in academia and in professional projects, unless you're specifically working with some existing security-related element (e.g. you're taking a course on security basics, or updating your password hashing algorithm). As a result, we frequently see stories of rather catastrophic data leaks from otherwise reputable businesses, leaks which should have been entirely preventable with even the most basic of safeguards in place.

      With that in mind, I thought I would switch things up and discuss something security-related this time.


      Background

      It's commonplace for complex software systems to avoid building every capability in-house. Developing entire systems for e.g. geolocation or financial transactions carries large up-front capital costs and a lot of long-term technical debt. Instead of reinventing the wheel and effectively building a parallel business, we integrate with existing third-party systems, typically by using an API.

      The problem, however, is that sometimes these third-party systems process requests over a long period of time, potentially on the order of minutes, hours, days, or even longer. If, for example, you have users who want to purchase something using your online platform, then it's not a particularly good idea to have potentially thousands of open connections to that third-party system all sitting there waiting multiple business days for funds to clear. That would just be stupid. So, how do we handle this in a way that isn't incredibly stupid?

      There are two commonly accepted methods to avoid having to wait around:

      1. We can periodically contact the third-party system and ask for the current status of a request, or
      2. We can give the third-party system a way to contact us and let us know when they're finished with a request.

      Both of these methods work, but obviously there will be a potentially significant delay in #1 between when a request finishes and when we know that it has finished (with a maximum delay of the wait time between status updates), whereas in #2 that delay is practically non-existent. Using #1 is also incredibly inefficient due to the number of wasted status update requests, whereas #2 allows us to avoid that kind of waste. Clearly #2 seems like the ideal option.
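
      To put rough numbers on that trade-off, here's a minimal sketch in Python. The function name polling_cost and the example timings are made up for illustration, and it assumes polls fire at a fixed interval starting one interval after the request is submitted:

```python
import math


def polling_cost(finish_time, interval):
    """Return (wasted_polls, detection_delay) for fixed-interval polling.

    finish_time: when the third-party system actually finishes (> 0).
    interval:    time between status checks, in the same unit.
    """
    # The first poll at or after finish_time is the one that sees the result.
    polls_needed = math.ceil(finish_time / interval)
    detected_at = polls_needed * interval
    return polls_needed - 1, detected_at - finish_time


# A request that finishes after 95 seconds, polled every 30 seconds:
print(polling_cost(95, 30))  # (3, 25)
```

      Three status checks came back empty and we still found out 25 seconds late. A shorter interval shrinks the delay but multiplies the wasted checks, which is exactly the inefficiency described above.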

      Method #2 is what we call a webhook.


      May I see your ID?

      The problem with webhooks is that when you're implementing one, it's far too easy to forget that you need to restrict access to it. After all, that third-party system isn't a user, right? They're not a human. They can't just give us a username and password like we want them to. They don't understand the specific requirements for our individual, custom-designed system.

      But what happens if some malicious actor figures out what the webhook endpoint is? Let's say that all we do is log webhook requests somewhere in a non-capped file or database table/collection. Barring all other possible attack vectors, we suddenly find ourselves susceptible to that malicious actor sending us thousands, possibly millions of fraudulent data payloads in a small amount of time thanks to a botnet, and now our server's I/O utilization is spiking and the entire system is grinding to a halt--we're experiencing a DDoS!

      We don't want just anyone to be able to talk to our webhook. We want to make sure that anyone who does is verified and trusted. But we can't require a username and password, because we can't guarantee that the third-party system will even know how to make use of them. So what can we do?

      The answer is to use some form of token-based authentication--we generate a unique token, kind of like an ID card, and we attach it to our webhook endpoint (e.g. https://example.com/my_webhook/{unique_token}). We can then check that token for validity every time someone touches our webhook, ensuring that only someone we trust can get in.
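
      As a rough illustration of that idea, here's a minimal sketch in Python using only the standard library. The function names are hypothetical, and where the expected token is stored (database, config, etc.) is left out entirely:

```python
import hmac
import secrets


def new_webhook_token():
    """Generate a cryptographically secure, URL-safe token (32 random bytes)."""
    return secrets.token_urlsafe(32)


def token_is_valid(presented, expected):
    """Compare tokens in constant time so response timing doesn't leak how close a guess was."""
    return hmac.compare_digest(presented.encode(), expected.encode())


token = new_webhook_token()
# Hypothetical endpoint, following the URL shape used in the text above.
webhook_url = "https://example.com/my_webhook/" + token

print(token_is_valid(token, token))      # True
print(token_is_valid("guess", token))    # False
```

      The important parts are that the token comes from a cryptographically secure source rather than something guessable like an incrementing ID, and that the comparison doesn't bail out early on the first mismatched character.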


      Class is in Session

      Just as there are two commonly accepted models for how to handle receiving updates from third-party systems, there are also two common models for how to assign a webhook to those systems:

      1. Hard-coding the webhook in your account settings, or
      2. Passing a webhook as part of the request payload.

      Model #1 is, in my experience, the more common of the two. In this model, our authentication token is typically directly linked to some user or user-like object in our system. This token is intended to be persisted and reused indefinitely, only scrapped in the event of a breach or a termination of integration with the service that uses it. Unfortunately, if the token is present within the URL, it's possible for your token to be viewed in plaintext in your logs.

      In model #2, it's perfectly feasible to mirror the behavior of model #1 by simply passing the same webhook endpoint with the same token in every new request; however, there is a far better solution. We can, instead, generate a brand new token for each new request to the third-party system, and each new token can be associated with the request itself on our own system. Rather than only validating the token itself, we then validate that the token and the request it's supposed to be associated with are both valid. This ensures that even in the event of a breach, a leaked authentication token's extent of damage is limited only to the domain of the request it's associated with! In addition, we can automatically expire these tokens after receiving a certain number of requests, ensuring that a DDoS using a single valid token and request payload isn't possible. As with model #1, however, we still run into problems of token exposure if the token is present in the URL.

      Model #2 treats each individual authentication token not as a session for an entire third-party system, but as a session for a single request on that system. These per-request session tokens require greater effort to implement, but are inherently safer due to the increased granularity of our authentication and our flexibility in allowing ourselves to expire the tokens at will.
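
      Here's a minimal sketch of what per-request session tokens might look like in Python. Everything here is an assumption for illustration: the class and method names are made up, the in-memory dict stands in for whatever persistent storage you'd really use, and only successful callbacks count toward the usage limit:

```python
import hmac
import secrets
from dataclasses import dataclass


@dataclass
class PendingCallback:
    token: str
    remaining_uses: int


class CallbackRegistry:
    """Tracks one token per outgoing request and expires it after max_uses callbacks."""

    def __init__(self, max_uses=3):
        self._max_uses = max_uses
        self._pending = {}  # request_id -> PendingCallback

    def issue(self, request_id):
        """Create a fresh token tied to a single request on our side."""
        token = secrets.token_urlsafe(32)
        self._pending[request_id] = PendingCallback(token, self._max_uses)
        return token

    def accept(self, request_id, token):
        """Validate the request and its token together, then spend one use."""
        entry = self._pending.get(request_id)
        if entry is None:
            return False  # unknown or already-expired request
        if not hmac.compare_digest(entry.token.encode(), token.encode()):
            return False  # token doesn't belong to this request
        entry.remaining_uses -= 1
        if entry.remaining_uses <= 0:
            del self._pending[request_id]  # expire the token
        return True


registry = CallbackRegistry(max_uses=2)
token = registry.issue("order-1")
print(registry.accept("order-1", token))  # True
print(registry.accept("order-2", token))  # False
print(registry.accept("order-1", token))  # True
print(registry.accept("order-1", token))  # False
```

      A leaked token is only good for the one request it was issued for, and only until its uses run out, which is the limited blast radius described above.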


      Final Thoughts

      Security is hard. Even with per-request session tokens, webhooks still aren't as secure as we might like them to be. Some systems allow us to define tokens that will be inserted into the request payload, but more often than not you'll find that you can only specify a webhook URL. Ideally we would stuff those tokens right into the POST request payload for all of our third-party systems so they would never be so easily exposed in plaintext in log files, but legacy systems tend to be slow to catch up and newer systems often don't have developers with the security background to consider it.

      Still, as far as securing webhooks goes, having some sort of cryptographically secure authentication token is far better than leaving the door wide open for any script kiddie having a bad day to waltz right in and set the whole place on fire. If you're integrating with any third-party system, your job isn't to make it impossible for them to get their hands on a key, but to make it really difficult and to make sure you don't leave any gasoline lying around in case they do.

      8 votes
    23. Programming Challenge - It's raining!

      Hi everyone, it's been 12 days since the last programming challenge, so here's another one. The task is to write an algorithm that calculates how long it would take to fill a system of lakes with water.

      It's raining in the forest. The forest is full of lakes, which are close to each other. Every lake is below the previous one (so the 1st lake is higher than the 2nd lake, which is higher than the 3rd lake). The lakes are empty at the beginning, and rain fills each of them at a rate of 1 l/h. Once a lake is full, all water that would normally end up in it (rain plus any overflow from the lake above) flows to the next lake.

      For example, you have lakes A, B, and C. Lake A can hold 1 l of water, lake B can hold 3 l of water and lake C can hold 5 l of water. How long would it take to fill all the lakes?
      After one hour, the lakes would be: A (1/1), B (1/3), C (1/5). After two hours, the lakes would be: A (1/1), B (3/3), C (2/5) (because during this hour, B received 2 l/h: 1 l/h from the rain and 1 l/h from lake A). After three hours, the lakes would be: A (1/1), B (3/3), C (5/5) (because during this hour, C received 3 l/h: 1 l/h from the rain and 2 l/h overflowing from lake B). So the answer is 3. Please note that the answer can be any rational number. For example, if lake C could hold only 4 l instead of 5, the answer would be 2.66666....

      Hour 0:

      
      \            /
        ----(A)----
                             \                /
                              \              /
                               \            /
                                ----(B)----
                                                   \           /
                                                    \         /
                                                     \       /
                                                     |       |
                                                     |       |
                                                      --(C)--
      

      Hour 1:

      
      \============/
        ----(A)----
                             \                /
                              \              /
                               \============/
                                ----(B)----
                                                   \           /
                                                    \         /
                                                     \       /
                                                     |       |
                                                     |=======|
                                                      --(C)--
      

      Hour 2:

                  ==============
      \============/           |
        ----(A)----            |
                             \================/
                              \==============/
                               \============/
                                ----(B)----
                                                   \           /
                                                    \         /
                                                     \       /
                                                     |=======|
                                                     |=======|
                                                      --(C)--
      

      Hour 3:

                  ==============
      \============/           |
        ----(A)----            |             ========
                             \================/       |
                              \==============/        |
                               \============/         |
                                ----(B)----           |
                                                   \===========/
                                                    \=========/
                                                     \=======/
                                                     |=======|
                                                     |=======|
                                                      --(C)--
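
      If you want to sanity-check your reading of the rules, here's one possible interpretation written as a small event-driven simulation in Python. It's a sketch, not a reference solution: the function name hours_to_fill is made up, and it assumes that overflow from the last lake is simply lost and that a full lake passes on everything it receives, as in the A/B/C example.

```python
from fractions import Fraction


def hours_to_fill(capacities):
    """Return the exact time (in hours) until every lake is full."""
    caps = [Fraction(c) for c in capacities]
    levels = [Fraction(0)] * len(caps)
    elapsed = Fraction(0)

    while any(level < cap for level, cap in zip(levels, caps)):
        # Inflow of each lake: 1 l/h of rain plus whatever a full lake above passes on.
        rates = []
        carried = Fraction(0)
        for level, cap in zip(levels, caps):
            inflow = 1 + carried
            rates.append(inflow)
            carried = inflow if level >= cap else Fraction(0)

        # Rates stay constant until the next lake fills, so jump straight to that moment.
        step = min(
            (cap - level) / rate
            for level, cap, rate in zip(levels, caps, rates)
            if level < cap
        )
        levels = [
            min(cap, level + rate * step)
            for level, cap, rate in zip(levels, caps, rates)
        ]
        elapsed += step

    return elapsed


print(hours_to_fill([1, 3, 5]))  # 3
print(hours_to_fill([1, 3, 4]))  # 8/3
```

      Using Fraction keeps answers like 2.66666... exact instead of accumulating floating-point error.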
      

      Good luck everyone! Tell me if you need clarification or a hint. I already have a solution, but it sometimes doesn't work, so I'm really interested in seeing yours :-)

      21 votes
    24. An Alternative Approach to Configuration Management

      Preface

      Different projects have different use cases that can ultimately result in common solutions not suiting your particular needs. Today I'm going to diverge a bit from my more abstract, generalized topics on code quality and instead focus on a specific project structure example that I encountered.


      Background

      For a while now, I've found myself being continually frustrated with the state of my project configuration management. I had a single configuration file that would contain all of the configuration options for the various tools I've been using--database, API credentials, etc.--and I kept running into the problem of wanting to test these tools locally while not inadvertently committing and pushing sensitive credentials upstream. For me, part of my security process is ensuring that sensitive access credentials never make it into the repository and limiting access to these credentials to only the people who need them.


      Monolithic Files Cause Monolithic Pain

      The first thing I realized was that having a single monolithic configuration file was just terrible practice. There are going to be common configuration options that I want to have in there with default values, such as local database configuration pointing to a database instance running on the same VM as the application. These should always be in the repo, otherwise any dev who spins up an instance of the VM will need to manually read documentation and copy-paste the missing options into the configuration. This would be incredibly time-consuming, inefficient, and stupid.

      I also use different tools which have different configuration options associated with them. Having to dig through a single file containing configuration options for all of these tools to find the ones I need to modify is cumbersome at best. On top of that, having those common configuration options living in the same place that sensitive access credentials do is just asking for a rogue git commit -a to violate the aforementioned security protocol.


      Same Problem, Different Structure

      My first approach to resolving this problem was breaking the configuration out into separate files, one for each distinct tool. In each file, a "skeleton" config was generated, i.e. each option was given a default empty value. The main config would then only contain config options that are common and shared across the application. To avoid having the sensitive credentials leaked, I then created rules in the .gitignore to exclude these files.

      This is where I ran into problem #2: this just doesn't work. Git's ignore rules only apply to untracked files, so a file that has already been committed keeps being tracked no matter what .gitignore says. You can either have a file in your repo and have all changes to that file tracked, have the file in your repo and make a local-only change to prevent changes from being tracked (e.g. with git update-index --skip-worktree), or leave the file out of the repo completely. In my use case, I wanted to be able to leave the file in the repo, treat it as ignored by everyone, and only commit changes to that file when there was a new configuration option I wanted added to it. Git doesn't support this use case whatsoever.

      This problem turned out to be really common, but the solution suggested is to have two separate versions of your configuration--one for dev, and one for production--and to have a flag to switch between the two. Given the breaking up of my configuration, I would then need twice as many files to do this, and given my security practices, this would violate the no-upstream rule for sensitive credentials. Worse still, if I had several different kinds of environments with different configuration--local dev, staging, beta, production--then for m such environments and n configuration files, I would need to maintain n*m separate files for configuration alone. Finally, I would need to remember to include a prefix or postfix to each file name any time I needed to retrieve values from a new config file, which is itself an error-prone requirement. Overall, there would be a substantial increase in technical debt. In other words, this approach would not only not help, it would make matters worse!


      Borrowing From Linux

      After a lot of thought, an idea occurred to me: within Linux systems, there's an /etc/skel/ directory that contains common files that are copied into a new user's home directory when that user is created, e.g. .bashrc and .profile. You can make changes to these files and have them propagate to new users, or you can modify your own personal copy and leave all other new users unaffected. This sounds exactly like the kind of behavior I want to emulate!

      Following their example, I took my $APPHOME/config/ directory and placed a skel/ subdirectory inside, which then contained all of the config files with the empty default values within. My .gitignore then looked something like this:

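      # Paths are relative to this .gitignore, assumed to sit in $APPHOME
      # (Git does not expand environment variables in ignore patterns).
      /config/*
      !/config/main.php
      !/config/skel/
      !/config/skel/*
      # This last one might not be necessary, but I don't care enough to test it without.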
      

      Finally, on deploying my local environment, I simply include a snippet in my script that enters the new skel/ directory and copies any files inside into config/, as long as they don't already exist there:

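      # Copy each skeleton config into config/ unless a local copy already exists.
      cd "$APPHOME/config/skel/" || exit 1
      for filename in *; do
          # Skip the literal "*" left behind when skel/ is empty.
          [ -e "$filename" ] || continue
          if [ ! -f "$APPHOME/config/$filename" ]; then
              cp "$filename" "$APPHOME/config/$filename"
          fi
      done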
      

      (Note: production environments have a slightly different deployment procedure, as local copies of these config files are saved within a shared directory for all releases to point to via symlink.)

      All of these changes ensure that only config/main.php and the files contained within config/skel/ are whitelisted, while all others are ignored, i.e. our local copies that get stored within config/ won't be inadvertently committed and pushed upstream!
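
      For deploy scripts that aren't shell-based, the same copy-if-missing step can be sketched in Python. This is just an illustration under the same assumptions as above (a config/ directory with a skel/ subdirectory); the function name seed_config is made up:

```python
import shutil
from pathlib import Path


def seed_config(config_dir):
    """Copy files from config_dir/skel into config_dir unless a local copy exists.

    Returns the names of the files that were copied.
    """
    config_dir = Path(config_dir)
    copied = []
    for skeleton in sorted((config_dir / "skel").iterdir()):
        target = config_dir / skeleton.name
        # Never overwrite a local copy: it may hold real credentials.
        if skeleton.is_file() and not target.exists():
            shutil.copy2(skeleton, target)
            copied.append(skeleton.name)
    return copied
```

      Running it twice in a row is harmless: the second run finds every local copy already in place and returns an empty list.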


      Final Thoughts

      Common solutions to problems are typically common for a good reason. They're tested, proven, and predictable. But sometimes you find yourself running into cases where the common, well-accepted solution to the problem doesn't work for you. Standards exist to solve a certain class of problems, and sometimes your problem is just different enough for it to matter and for those standards to not apply. Standards are created to address most cases, but edge cases will always exist. In other words, standards are guidelines, not concrete rules.

      Sometimes you need to stop thinking about the problem in terms of the standard approach to solving it, and instead break it down into its most abstract, basic form and look for parallels in other solved problems for inspiration. Odds are the problem you're trying to solve isn't as novel as you think it is, and that someone has probably already solved a similar problem before. Parallels, in my experience, are usually a pretty good indicator that you're on the right track.

      More importantly, there's a delicate line to tread between needing to use a different approach to solving an edge case problem you have, and needing to restructure your project to eliminate the edge case and allow the standard solution to work. Being able to decide which is more appropriate can have long-lasting repercussions on your ability to manage technical debt.

      16 votes