  • Showing only topics in ~comp with the tag "programming".
    1. Web developers - What is your stack?

      As someone who is not mainly a web developer, I can barely grasp the immensity of options when it comes to writing a web application.

      So far everything I've written has been using PHP and the Slim microframework. PHP because I don't use languages like Python/Ruby/JS that much so I didn't have any prior knowledge of those, and I've found myself to be fairly productive with it. Slim because I didn't want a full-blown framework with 200 files to configure.

      I've tried Go because I've used it in the past, but I don't find it a very good fit for websites. I think it's fine for small microservices, but doing MVC was a chore; maybe there's a framework out there that solves this.

      As for the frontend, I've been trying to use as little JavaScript as possible, always vanilla. As for HTML and CSS, I'm no designer, so I kind of get by copying code and tweaking things here and there.

      However, I've started a slightly bigger project and I don't fancy writing everything from scratch (especially security); besides, ORMs can be useful. Symfony 4 is what I've been using for a couple of days, but I've had trouble setting up debugging, and the community/docs don't seem that great since this version is fairly new, so I'm considering trying out something more popular like Django.

      So this is why I created the post. I know this will differ greatly depending on the use case, but I would like to do a quick survey and hear some of your recommendations, both on the backend and frontend. Besides, I think it's a good topic for discussion.

      Cheers!

      20 votes
    2. Ian Lance Taylor: “Go intentionally has a weak type system, (…)”

      Recently, Ian Lance Taylor, one of the most productive contributors to Go and, IIRC, the original author of gccgo, has written a very interesting comment on his view of the language:

      (…) Go intentionally has a weak type system, and there are many restrictions that can be expressed in other languages but cannot be expressed in Go. Go in general encourages programming by writing code rather than programming by writing types. (…)

      I found this distinction, writing code vs. writing types, very insightful. In my experience, in a language like Rust or (modern fancy) C++ the programmer is constantly forced to think about types, while when I program in Go or C, I almost never think about them. Types are, in fact, almost always obvious. It is also interesting that languages like Haskell and Idris explicitly expect the programmer to program with types.

      What do you think?

      9 votes
    3. A Brief Look at Webhook Security

      Preface

      Software security is one of those subjects that often gets overlooked, both in academia and in professional projects, unless you're specifically working with some existing security-related element (e.g. you're taking a course on security basics, or updating your password hashing algorithm). As a result, we frequently see stories of rather catastrophic data leaks from otherwise reputable businesses, leaks which should have been entirely preventable with even the most basic of safeguards in place.

      With that in mind, I thought I would switch things up and discuss something security-related this time.


      Background

      It's commonplace for complex software systems to avoid unnecessarily large expenses, especially in terms of technical debt and the capital involved in the initial development costs of building entire systems for e.g. geolocation or financial transactions. Instead of reinventing the wheel and effectively building a parallel business, we instead integrate with existing third-party systems, typically by using an API.

      The problem, however, is that sometimes these third-party systems process requests over a long period of time, potentially on the order of minutes, hours, days, or even longer. If, for example, you have users who want to purchase something using your online platform, then it's not a particularly good idea to have potentially thousands of open connections to that third-party system all sitting there waiting multiple business days for funds to clear. That would just be stupid. So, how do we handle this in a way that isn't incredibly stupid?

      There are two commonly accepted methods to avoid having to wait around:

      1. We can periodically contact the third-party system and ask for the current status of a request, or
      2. We can give the third-party system a way to contact us and let us know when they're finished with a request.

      Both of these methods work, but obviously there will be a potentially significant delay in #1 between when a request finishes and when we know that it has finished (with a maximum delay of the wait time between status updates), whereas in #2 that delay is practically non-existent. Using #1 is also incredibly inefficient due to the number of wasted status update requests, whereas #2 allows us to avoid that kind of waste. Clearly #2 seems like the ideal option.

      Method #2 is what we call a webhook.


      May I see your ID?

      The problem with webhooks is that when you're implementing one, it's far too easy to forget that you need to restrict access to it. After all, that third-party system isn't a user, right? They're not a human. They can't just give us a username and password like we want them to. They don't understand the specific requirements for our individual, custom-designed system.

      But what happens if some malicious actor figures out what the webhook endpoint is? Let's say that all we do is log webhook requests somewhere in a non-capped file or database table/collection. Barring all other possible attack vectors, we suddenly find ourselves susceptible to that malicious actor sending us thousands, possibly millions of fraudulent data payloads in a small amount of time thanks to a botnet, and now our server's I/O utilization is spiking and the entire system is grinding to a halt--we're experiencing a DDoS!

      We don't want just anyone to be able to talk to our webhook. We want to make sure that anyone who does is verified and trusted. But since we can't require a username and password, since we can't guarantee that the third-party system will even know how to make use of them, what can we do?

      The answer is to use some form of token-based authentication--we generate a unique token, kind of like an ID card, and we attach it to our webhook endpoint (e.g. https://example.com/my_webhook/{unique_token}). We can then check that token for validity every time someone touches our webhook, ensuring that only someone we trust can get in.
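
A minimal sketch of that idea in Python, using only the standard library. The function names and the in-memory handling are assumptions made for illustration; the post does not prescribe an implementation.

```python
import hmac
import secrets


def new_webhook_token() -> str:
    """Generate an unguessable, URL-safe token (hypothetical helper)."""
    return secrets.token_urlsafe(32)


def webhook_url(base_url: str, token: str) -> str:
    """Attach the token to the webhook endpoint as a path segment."""
    return f"{base_url.rstrip('/')}/{token}"


def is_valid_token(presented: str, expected: str) -> bool:
    """Compare tokens in constant time to avoid leaking timing information."""
    return hmac.compare_digest(presented.encode(), expected.encode())
```

The URL returned by `webhook_url` is what would be registered with the third-party system, and `is_valid_token` would run on every incoming webhook call before any other work is done.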


      Class is in Session

      Just as there are two commonly accepted models for how to handle receiving updates from third-party systems, there are also two common models for how to assign a webhook to those systems:

      1. Hard-coding the webhook in your account settings, or
      2. Passing a webhook as part of the request payload.

      Model #1 is, in my experience, the most common of the two. In this model, our authentication token is typically directly linked to some user or user-like object in our system. This token is intended to be persisted and reused indefinitely, only scrapped in the event of a breach or a termination of integration with the service that uses it. Unfortunately, if the token is present within the URL, it's possible for your token to be viewed in plaintext in your logs.

      In model #2, it's perfectly feasible to mirror the behavior of model #1 by simply passing the same webhook endpoint with the same token in every new request; however, there is a far better solution. We can, instead, generate a brand new token for each new request to the third-party system, and each new token can be associated with the request itself on our own system. Rather than only validating the token itself, we then validate that the token and the request it's supposed to be associated with are both valid. This ensures that even in the event of a breach, a leaked authentication token's extent of damage is limited only to the domain of the request it's associated with! In addition, we can automatically expire these tokens after receiving a certain number of requests, ensuring that a DDoS using a single valid token and request payload isn't possible. As with model #1, however, we still run into problems of token exposure if the token is present in the URL.

      Model #2 treats each individual authentication token not as a session for an entire third-party system, but as a session for a single request on that system. These per-request session tokens require greater effort to implement, but are inherently safer due to the increased granularity of our authentication and our flexibility in allowing ourselves to expire the tokens at will.
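
The per-request model could be sketched roughly as follows in Python. Everything here is an assumption for illustration: the class name, the in-memory dictionary standing in for a database, and the choice to count every call that presents a known token toward its usage limit.

```python
import secrets
from dataclasses import dataclass


@dataclass
class TokenRecord:
    request_id: str      # the request on our side that this token belongs to
    remaining_uses: int  # how many more webhook calls the token may serve


class PerRequestTokens:
    """In-memory stand-in for wherever tokens would really be stored."""

    def __init__(self, max_uses=3):
        self._max_uses = max_uses
        self._records = {}

    def issue(self, request_id):
        """Create a fresh token tied to a single outgoing request."""
        token = secrets.token_urlsafe(32)
        self._records[token] = TokenRecord(request_id, self._max_uses)
        return token

    def check(self, token, request_id):
        """Accept a call only if the token is known and matches the request."""
        record = self._records.get(token)
        if record is None:
            return False  # unknown or already expired token
        record.remaining_uses -= 1
        if record.remaining_uses <= 0:
            del self._records[token]  # expire after the allowed number of calls
        return record.request_id == request_id
```

Because each token maps to exactly one request, a leaked token only exposes that request, and the usage counter caps how many times it can be replayed.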


      Final Thoughts

      Security is hard. Even with per-request session tokens, webhooks still aren't as secure as we might like them to be. Some systems allow us to define tokens that will be inserted into the request payload, but more often than not you'll find that only a webhook URL is possible to specify. Ideally we would stuff those tokens right into the POST request payload for all of our third-party systems so they would never be so easily exposed in plaintext in log files, but legacy systems tend to be slow to catch up and newer systems often don't have developers with the security background to consider it.

      Still, as far as securing webhooks goes, having some sort of cryptographically secure authentication token is far better than leaving the door wide open for any script kiddie having a bad day to waltz right in and set the whole place on fire. If you're integrating with any third-party system, your job isn't to make it impossible for them to get their hands on a key, but to make it really difficult and to make sure you don't leave any gasoline lying around in case they do.

      8 votes
    4. Anyone here using Flutter?

      In the rare chance you haven't heard of Flutter, here's the link: https://flutter.io

      Flutter officially left beta with v1.0 on December 4 last year. The code is written in Dart and deploys on Android and iOS (and will run natively on the rumored Fuchsia OS).

      So, for those of you who have used Flutter or are currently using it:

      • What are you working on?
      • Why'd you choose Flutter?
      • What do you like about Flutter?
      • And what do you dislike about Flutter?

       

      I'll start:

      I'm working on a niche art app. I myself do not do that type of art, but knowing people who do, I wanted to create a tool to fill in the lackluster market for Chromebooks and Android.
      I chose Flutter because:

      • I wanted to try something new, and what newer than something that was (at the time) in beta?
      • Custom Views in Android are a hassle.
      • I will be able to release on both Android and iOS (semi-)natively without having to code it twice.

      Here's what I like about Flutter:

      • Layouts are really simple.
        (though you can easily let it get cluttered if you don't think too much about it.)
      • Design isn't an afterthought.
        Animations are built in (and simple), themes aren't hard-coded, and Material Components get more attention here. (Still waiting for Shapes on Android)
      • It's fast by design.
        Flutter uses its own custom rendering engine (Skia). I've never experienced any stutter with the built-in components, and when I caused lag (with heavy I/O) Flutter/Dart had tools in place for me to narrow down exactly what was causing it.

      What I don't like about Flutter:

      • It has poor mouse/trackpad support.
        Right clicks are not a thing. I can work around this with a double-click/long-click, but for a desktop OS, this isn't optimal. Scrolling is treated as panning, and the two should be differentiated: there's a difference between using a scroll wheel and moving a finger around on the screen, but according to Flutter there is not. There's also currently no support for mouse hovers, which I have needed very much.
        There is a pull-request for adding support for all of these, but the developer hasn't done anything since code review.
      • Keyboard support, while there, is lackluster.
        Ctrl, Shift, Alt. These have to be gotten with the meta code. There's no built-in function for checking those. Text fields don't support the tab key to navigate. And text formatting (bold, italic, etc.) isn't possible with text fields without the use of a library (or making it yourself).

      I was trying to think of a third dislike, but I can't. My complaints are on missing APIs for Chromebooks. That's it. I really like Flutter, I plan on using it more, and if they won't add support for mouse/keyboard, maybe I'll have to contribute.

      I'd love to hear what your thoughts about it are.

      12 votes
    5. Programming Challenge - It's raining!

      Hi everyone, it's been 12 days since the last programming challenge, so here's another one. The task is to write an algorithm that computes how long it would take to fill a system of lakes with water.

      It's raining in the forest. The forest is full of lakes, which are close to each other. Every lake is below the previous one (so the 1st lake is higher than the 2nd lake, which is higher than the 3rd lake). The lakes are empty at the beginning, and each one fills at a rate of 1 l/h. Once a lake is full, all water that would normally fall into it flows to the next lake.

      For example, you have lakes A, B, and C. Lake A can hold 1 l of water, lake B can hold 3 l of water and lake C can hold 5 l of water. How long would it take to fill all the lakes?
      After one hour, the lakes would be: A(1/1), B(1/3), C(1/5). After two hours, the lakes would be: A(1/1), B(3/3), C(2/5) (because during this hour, B received 2 l/h: 1 l/h from the rain and 1 l/h from lake A). After three hours, the lakes would be: A(1/1), B(3/3), C(5/5). So the answer is 3. Please note that the answer can be any rational number. For example, if lake C could hold only 4 l instead of 5, the answer would be 2.66666....

      Hour 0:

      
      \            /
        ----(A)----
                             \                /
                              \              /
                               \            /
                                ----(B)----
                                                   \           /
                                                    \         /
                                                     \       /
                                                     |       |
                                                     |       |
                                                      --(C)--
      

      Hour 1:

      
      \============/
        ----(A)----
                             \                /
                              \              /
                               \============/
                                ----(B)----
                                                   \           /
                                                    \         /
                                                     \       /
                                                     |       |
                                                     |=======|
                                                      --(C)--
      

      Hour 2:

                  ==============
      \============/           |
        ----(A)----            |
                             \================/
                              \==============/
                               \============/
                                ----(B)----
                                                   \           /
                                                    \         /
                                                     \       /
                                                     |=======|
                                                     |=======|
                                                      --(C)--
      

      Hour 3:

                  ==============
      \============/           |
        ----(A)----            |             ========
                             \================/       |
                              \==============/        |
                               \============/         |
                                ----(B)----           |
                                                   \===========/
                                                    \=========/
                                                     \=======/
                                                     |=======|
                                                     |=======|
                                                      --(C)--
      

      Good luck everyone! Tell me if you need clarification or a hint. I already have a solution, but it sometimes doesn't work, so I'm really interested in seeing yours :-)
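
One possible approach, sketched in Python under a few assumptions the post does not spell out: capacities are non-negative numbers listed from the highest lake to the lowest, a full lake passes all of its inflow (rain plus upstream overflow) to the next lake, and overflow from the last lake is simply lost. The function name `fill_time` is made up for illustration. The sketch jumps from one "a lake just became full" event to the next, and uses `Fraction` so that rational answers stay exact.

```python
from fractions import Fraction


def fill_time(capacities):
    """Return the hours needed to fill every lake, as an exact Fraction."""
    caps = [Fraction(c) for c in capacities]
    levels = [Fraction(0)] * len(caps)
    elapsed = Fraction(0)

    while any(level < cap for level, cap in zip(levels, caps)):
        # Work out each lake's current inflow: 1 l/h of rain plus whatever
        # the full lakes directly above it are passing down.
        rates = []
        carried = Fraction(0)
        for level, cap in zip(levels, caps):
            inflow = 1 + carried
            if level < cap:
                rates.append(inflow)
                carried = Fraction(0)
            else:
                rates.append(Fraction(0))
                carried = inflow

        # Jump straight to the moment the next lake becomes full.
        step = min(
            (cap - level) / rate
            for level, cap, rate in zip(levels, caps, rates)
            if level < cap
        )
        levels = [level + rate * step for level, rate in zip(levels, rates)]
        elapsed += step

    return elapsed
```

For the example in the post, `fill_time([1, 3, 5])` gives 3, and with lake C holding only 4 l, `fill_time([1, 3, 4])` gives 8/3. Each pass through the loop fills at least one lake, so the loop runs at most once per lake.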

      21 votes
    6. An Alternative Approach to Configuration Management

      Preface

      Different projects have different use cases that can ultimately result in common solutions not suiting your particular needs. Today I'm going to be diverging a bit from my more abstract, generalized topics on code quality and instead focus on a specific project structure example that I encountered.


      Background

      For a while now, I've found myself being continually frustrated with the state of my project configuration management. I had a single configuration file that would contain all of the configuration options for the various tools I've been using--database, API credentials, etc.--and I kept running into the problem of wanting to test these tools locally while not inadvertently committing and pushing sensitive credentials upstream. For me, part of my security process is ensuring that sensitive access credentials never make it into the repository and to limit access to these credentials to only people who need to be able to access them.


      Monolithic Files Cause Monolithic Pain

      The first thing I realized was that having a single monolithic configuration file was just terrible practice. There are going to be common configuration options that I want to have in there with default values, such as local database configuration pointing to a database instance running on the same VM as the application. These should always be in the repo, otherwise any dev who spins up an instance of the VM will need to manually read through documentation and copy-paste the missing options into the configuration. This would be incredibly time-consuming, inefficient, and stupid.

      I also use different tools which have different configuration options associated with them. Having to dig through a single file containing configuration options for all of these tools to find the ones I need to modify is cumbersome at best. On top of that, having those common configuration options living in the same place that sensitive access credentials do is just asking for a rogue git commit -a to violate the aforementioned security protocol.


      Same Problem, Different Structure

      My first approach to resolving this problem was breaking the configuration out into separate files, one for each distinct tool. In each file, a "skeleton" config was generated, i.e. each option was given a default empty value. The main config would then only contain config options that are common and shared across the application. To avoid having the sensitive credentials leaked, I then created rules in the .gitignore to exclude these files.

      This is where I ran into problem #2. I learned that this just doesn't work. You can either have a file in your repo and have all changes to that file tracked, have the file in your repo and make a local-only change to prevent changes from being tracked, or leave the file out of the repo completely. In my use case, I wanted to be able to leave the file in the repo, treat it as ignored by everyone, and only commit changes to that file when there was a new configuration option I wanted added to it. Git doesn't support this use case whatsoever.

      This problem turned out to be really common, but the solution suggested is to have two separate versions of your configuration--one for dev, and one for production--and to have a flag to switch between the two. Given the breaking up of my configuration, I would then need twice as many files to do this, and given my security practices, this would violate the no-upstream rule for sensitive credentials. Worse still, if I had several different kinds of environments with different configuration--local dev, staging, beta, production--then for m such environments and n configuration files, I would need to maintain n*m separate files for configuration alone. Finally, I would need to remember to include a prefix or postfix to each file name any time I needed to retrieve values from a new config file, which is itself an error-prone requirement. Overall, there would be a substantial increase in technical debt. In other words, this approach would not only not help, it would make matters worse!


      Borrowing From Linux

      After a lot of thought, an idea occurred to me: within Linux systems, there's an /etc/skel/ directory that contains common files that are copied into a new user's home directory when that user is created, e.g. .bashrc and .profile. You can make changes to these files and have them propagate to new users, or you can modify your own personal copy and leave all other new users unaffected. This sounds exactly like the kind of behavior I want to emulate!

      Following their example, I took my $APPHOME/config/ directory and placed a skel/ subdirectory inside, which then contained all of the config files with the empty default values within. My .gitignore then looked something like this:

      $APPHOME/config/*
      !$APPHOME/config/main.php
      !$APPHOME/config/skel/
      !$APPHOME/config/skel/*
      # This last one might not be necessary, but I don't care enough to test it without.
      

      Finally, on deploying my local environment, I simply include a snippet in my script that enters the new skel/ directory and copies any files inside into config/, as long as they don't already exist there:

      # Copy each skeleton config into config/ unless a local copy already exists.
      cd "$APPHOME/config/skel/" || exit 1
      for filename in *; do
          if [ ! -f "$APPHOME/config/$filename" ]; then
              cp "$filename" "$APPHOME/config/$filename"
          fi
      done
      

      (Note: production environments have a slightly different deployment procedure, as local copies of these config files are saved within a shared directory for all releases to point to via symlink.)

      All of these changes ensure that only config/main.php and the files contained within config/skel/ are whitelisted, while all others are ignored, i.e. our local copies that get stored within config/ won't be inadvertently committed and pushed upstream!


      Final Thoughts

      Common solutions to problems are typically common for a good reason. They're tested, proven, and predictable. But sometimes you find yourself running into cases where the common, well-accepted solution to the problem doesn't work for you. Standards exist to solve a certain class of problems, and sometimes your problem is just different enough for it to matter and for those standards to not apply. Standards are created to address most cases, but edge cases will always exist. In other words, standards are guidelines, not concrete rules.

      Sometimes you need to stop thinking about the problem in terms of the standard approach to solving it, and instead break it down into its most abstract, basic form and look for parallels in other solved problems for inspiration. Odds are the problem you're trying to solve isn't as novel as you think it is, and that someone has probably already solved a similar problem before. Parallels, in my experience, are usually a pretty good indicator that you're on the right track.

      More importantly, there's a delicate line to tread between needing to use a different approach to solving an edge case problem you have, and needing to restructure your project to eliminate the edge case and allow the standard solution to work. Being able to decide which is more appropriate can have long-lasting repercussions on your ability to manage technical debt.

      16 votes
    7. As someone with ADHD, I hate the "RTFM" motto

      I'm a student of software engineering. I'm not a programmer yet, but I use software that is common among this crowd, like i3wm, Neovim and Emacs. I know how to find and read documentation. I've read the obnoxious How To Ask Questions the Smart Way. Every time I encounter an issue, I do my due diligence. I go through the manuals, I google, I read the docs. My main editor, Emacs, has an extensive manual, with plenty of accurate details. I get that it's a huge program (more like a platform, really), but let's just say that a black-and-white 650-page PDF is not the most ADHD-friendly thing in the world.

      I'm aware that I chose a career that requires plenty of reading, but I happen to like it a lot and it seems like I have some aptitude for it. I had similar issues in my previous activities anyway. But it's discouraging trying to understand programming and complex software, only to be repelled by people who think everyone shares their ability to concentrate. Sometimes I completely lose track of time. I can sit at my computer and hyperfocus for up to 48 hours non-stop with 20 Chrome tabs open and Netflix in the background. I may seem productive, but I'm not reading anything. Maybe I read one paragraph or two, and 30 seconds later I can't remember what I was doing. But I still have tasks to accomplish, and sometimes I need help to find useful information in a 700-page manual.

      Luckily I have great support and determination and have accomplished a lot, but my peers have no idea what I went through to get to where I am. What I lack in natural-born skill I compensate for with a lot of raw effort. Everyone has their difficulties and I'm not seeking compassion, but I'd like to suggest people think twice before dismissing as "lazy" someone they know nothing about. That person might have a mental disorder, a reading disorder or even an intellectual disability. Do you wanna be the guy who told a dyslexic to just read the fucking manual?

      EDIT: of course I get that time and energy are limited commodities... my point is: don't be an asshole about it. Do what you can and want to do, but there's no need to use hostile buzzwords when you communicate with less knowledgeable people. You're not even forced to answer... I much prefer not getting an answer to getting a hostile one.

      26 votes
    8. Meta Discussion: Is there interest in topics concerning code quality?

      I've posted a few lengthy topics here outside of programming challenges, and I've noticed that the ones that seem to have spurred the most interest and generated some discussion were ones that were directly related to code quality. To avoid falling for confirmation bias, though, I thought I would ask directly.

      Is there generally a greater interest in code quality discussions? If so, then what kind of things are you interested in seeing in those discussions? What do you prefer not to see? If not, then what kinds of programming-related discussions would you prefer to see more of? What about non-programming discussions?

      Also, is there any interest in an informal series of topics much like the programming challenges or the "a layperson's introduction to..." series (i.e. decentralized and available for anyone to participate whenever)? Personally, I'd be interested in seeing more on the subject from others!

      17 votes
    9. Code Quality Tip: Wrapping external libraries.

      Preface

      Occasionally I feel the need to touch on the subject of code quality, particularly because of the importance of its impact on technical debt, especially as I continue to encounter the effects of technical debt in my own work and do my best to manage it. It's a subject that is unfortunately not emphasized nearly enough in academia.


      Background

      As a refresher, technical debt is the long-term cost of the design decisions in your code. These costs can manifest in different ways, such as greater difficulty in understanding what your code is doing or making non-breaking changes to it. More generally, these costs manifest as additional time and resources being spent to make some kind of change.

      Sometimes these costs aren't things you think to consider. One such consideration is how difficult it might be to upgrade a specific technology in your stack. For example, what if you've built a back-end system that integrates with AWS and you suddenly need to upgrade your SDK? In a small project this might be easy, but what if you've built a system that you've been maintaining for years and it relies heavily on AWS integrations? If the method names, namespaces, argument orders, or anything else has changed between versions, then suddenly you'll need to update every single reference to an AWS-related tool in your code to reflect those changes. In larger software projects, this could be a daunting and incredibly expensive task, spanning potentially weeks or even months of work and testing.

      That is, unless you keep those references to a minimum.


      A Toy Example

      This is where "wrapping" your external libraries comes into play. The concept of "wrapping" basically means to create some other function or object that takes care of operating the functions or object methods that you really want to target. One example might look like this:

      <?php
      
      class ImportedClass {
          public function methodThatMightBecomeModified($arg1, $arg2) {
              // Do something.
          }
      }
      
      class ImportedClassWrapper {
          private $class_instance = null;
      
          private function getInstance() {
              if(is_null($this->class_instance)) {
                  $this->class_instance = new ImportedClass();
              }
      
              return $this->class_instance;
          }
      
          public function wrappedMethod($arg1, $arg2) {
              return $this->getInstance()->methodThatMightBecomeModified($arg1, $arg2);
          }
      }
      
      ?>
      

      Updating Tools Doesn't Have to Suck

      Imagine that our ImportedClass has some important new features that we need to make use of that are only available in the most recent version, and we're several versions behind. The problem, of course, is that there were a lot of changes that ended up being made between our current version and the new version. For example, ImportedClass is now called NewImportedClass. On top of that, methodThatMightBecomeModified is now called methodThatWasModified, and the argument order ended up getting switched around!

      Now imagine that we were directly calling new ImportedClass() in many different places in our code, as well as directly invoking methodThatMightBecomeModified:

      <?php
      
      $imported_class_instance = new ImportedClass();
      $imported_class_instance->methodThatMightBecomeModified($val1, $val2);
      
      ?>
      

      For every single instance in our code, we need to perform a replacement. The work grows linearly with the number of call sites, or O(n) in Big-O terms. If we assume that we only ever used this one method, and we used it 100 times, then there are 100 instances of new ImportedClass() to update and another 100 instances of the method invocation, equaling 200 lines of code to change. Furthermore, we need to remember each of the replacements that need to be made and carefully avoid making any errors in the process. This is clearly non-ideal.

      Now imagine that we chose instead to use the wrapper object:

      <?php
      
      $imported_class_wrapper = new ImportedClassWrapper();
      $imported_class_wrapper->wrappedMethod($val1, $val2);
      
      ?>
      

      Our updates are now limited only to the wrapper class:

      <?php
      
      class ImportedClassWrapper {
          private $class_instance = null;
      
          private function getInstance() {
              if(is_null($this->class_instance)) {
                  $this->class_instance = new NewImportedClass();
              }
      
              return $this->class_instance;
          }
      
          public function wrappedMethod($arg1, $arg2) {
              return $this->getInstance()->methodThatWasModified($arg2, $arg1);
          }
      }
      
      ?>
      

      Rather than making changes to 200 lines of code, we've now made changes to only 2. What was once an O(n) complexity change has now turned into an O(1) complexity change to make this upgrade. Not bad for a few extra lines of code!


      A Practical Example

      Toy problems are all well and good, but how does this translate to reality?

      Well, I ran into such a problem myself once. Running MongoDB with PHP requires the use of an external driver, and this driver provides an object representing a MongoDB ObjectId. I needed to perform a migration from one hosting provider to a new cloud hosting provider, and the application and database services, which were originally hosted on the same physical machine, would now be hosted on separate servers. For security reasons, this required an upgrade to a newer version of MongoDB, which in turn required an upgrade to a newer version of the driver.

      This upgrade resulted in many of the calls to new MongoId() failing, because the old version of the driver would accept empty strings and other invalid ID strings and default to generating a new ObjectId, whereas the new version of the driver treated invalid ID strings as failing errors. And there were many, many cases where invalid strings were being passed into the constructor.

      Even after spending hours replacing the (literally) several dozen instances of the constructor calls, there were still some places in the code where invalid strings managed to get passed in. This made for a very costly upgrade.

      The bugs were easy to fix after the initial replacements, though. After wrapping new MongoId() inside of a wrapper function, a few additional conditional statements inside of the new function resolved the bugs without having to dig around the rest of the code base.


      Final Thoughts

      This is one of those lessons that you don't fully appreciate until you've experienced the technical debt of an unwrapped external library first-hand. Code quality is an active effort, but a worthwhile one. It requires you to be willing to throw away potentially hours or even days of work when you realize that something needs to change, because you're thinking about how to keep yourself from banging your head against a wall later down the line instead of thinking only about how to finish up your current task.

      "Work smarter, not harder" means putting in some hard work upfront to keep your technical debt under control.

      That's all for now, and remember: don't be fools, wrap your external tools.

      23 votes
    10. Programming Challenge: Shape detection.

      The programming challenges have kind of come to a grinding halt recently. I think it's time to get another challenge started!

      Given a grid of symbols, representing a simple binary state of "filled" or "unfilled", determine whether or not a square is present on the grid. Squares must be 2x2 in size or larger, must be completely solid (i.e. all symbols in the NxN space are "filled"), and must not be directly adjacent to any other filled spaces.

      Example, where 0 is "empty" and 1 is "filled":

      000000
      011100
      011100
      011100
      000010
      
      // Returns true.
      
      000000
      011100
      011100
      011110
      000000
      
      // Returns false.
      
      000000
      011100
      010100
      011100
      000000
      
      // Returns false.
      

      For those who want a greater challenge, try any of the following:

      1. Get a count of all squares.
      2. Detect squares that are touching (but not as a rectangle).
      3. Detect other specific shapes like triangles or circles (you will need to be creative).
      4. If doing (1) and (3), count shapes separately based on type.
      5. Detect shapes within unfilled space as well (a checkerboard pattern is a great use case).
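
      For the base challenge, one possible approach is sketched below in Python. It rests on an assumption about the rules: "directly adjacent" is read as sharing an edge, which is the reading that makes the first example return true despite the diagonal neighbour. The helper names parse_grid and has_isolated_square are made up for this illustration. Each edge-connected blob of filled cells is collected with a flood fill, and a blob counts as a square when its bounding box is NxN (N >= 2) and every cell inside that box belongs to the blob.

```python
from collections import deque


def parse_grid(text):
    """Turn a block of 0/1 characters into a list of rows of booleans."""
    return [[ch == "1" for ch in line.strip()] for line in text.strip().splitlines()]


def has_isolated_square(grid):
    """Return True if some edge-connected filled blob is a solid NxN square, N >= 2."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or seen[r][c]:
                continue
            # Flood-fill one blob of filled cells joined by shared edges.
            queue = deque([(r, c)])
            seen[r][c] = True
            cells = []
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and grid[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            height = max(y for y, _ in cells) - min(y for y, _ in cells) + 1
            width = max(x for _, x in cells) - min(x for _, x in cells) + 1
            # A blob is a valid square when its bounding box is NxN and fully filled.
            if height == width and height >= 2 and len(cells) == height * width:
                return True
    return False
```

      Because any extra filled cell that shares an edge with a square joins the same blob, the "not adjacent to other filled spaces" rule falls out of the bounding-box check without a separate border scan.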
      13 votes
    11. Light Analysis of a Recent Code Refactor

      Preface

      In a previous topic, I'd covered the subject of a few small lessons regarding code quality. Especially important was the impact on technical debt, which can bog down developer productivity, and the need to pay down on that debt. Today I would like to touch on a practical example that I'd encountered in a production environment.


      Background

      Before we can discuss the refactor itself, it's important to be on the same page regarding the technologies being used. In my case, I work with PHP utilizing a proprietary back-end framework and MongoDB as our database.

      PHP is a server-side scripting language. Like many scripting languages, it's loosely typed. This has some benefits and drawbacks.

      MongoDB is a document-oriented database. By default it's schema-less, allowing you to make any changes at will without an update to the schema. This can blend pretty well with the loose typing of PHP. Each document is represented using a JSON-like structure and is stored in something called a "collection". For those of you accustomed to using relational databases, a "collection" is analogous to a table, each document is a row, and each field in the document is a column. A typical query in the MongoDB shell would look something like this:

      db.users.findOne({
          username: "Emerald_Knight"
      });
      

      The framework itself has some framework-specific objects that are held in global references. This makes them easily accessible, but naturally littering your code with a bunch of globals is both error-prone and an eyesore.


      Unexpected Spaghetti

      In my code base are a number of different objects that are designed to handle basic CRUD-like operations on their associated database entries. Some of these objects hold references to other objects, so naturally there is some data validation that occurs to ensure that the references are both valid and authorized. Pretty typical stuff.

      What I noticed, however, is that the collection names for these database entries were littered throughout my code. This isn't necessarily a bad thing, except there were some use cases that came to mind: what if it turned out that my naming for one or more of these collections wasn't ideal? What if I wanted to change a collection name for the sake of easier management on the database end? What if I have a tendency to forget the name of a database collection and constantly have to look it up? What if I make a typo of all things? On top of that, the framework's database object was stored in a global variable.

      These seemingly minor sources of technical debt end up adding up over time and could cause some serious problems in the worst case. I've had breaking bugs make their way past QA in the past, after all.


      Exchanging Spaghetti for Some Light Lasagna

      The problem could be characterized simply: there were scoping problems and too many references to what were essentially magic strings. The solution, then, was to move the database object reference from global to local scope within the application code and to eliminate the problem of magic strings. Additionally, it's a good idea to avoid polluting the namespace with an over-reliance on constants, and using those constants for database calls can also become unsightly and difficult to follow as those constants could end up being generally disconnected from the objects they're associated with.

      There turned out to be a nice, object-oriented, very PHP-like solution to this problem: a so-called "magic method" named "__call". This method is invoked whenever an "inaccessible" method is called on the object. Using this method, a database command executed on a non-database object could pass the command to the database object itself. If this logic were placed within an abstract class, the collection could then be specified simply as a configuration option in the inheriting class.

      This is what such a solution could look like:

      <?php
      
      abstract class MyBaseObject {
      
          protected $db = null;
          protected $collection_name = null;
      
          public function __construct() {
              global $db;
              
              $this->db = $db;
          }
      
          public function __call($method_name, $args) {
              if(method_exists($this->db, $method_name)) {
                  return $this->executeDatabaseCommand($method_name, $args);
              }
      
              throw new Exception(__CLASS__ . ': Method "' . $method_name . '" does not exist.');
          }
      
          public function executeDatabaseCommand($command, $args) {
              $collection = $this->collection_name;
              $db_collection = $this->db->$collection;
      
              return call_user_func_array(array($db_collection, $command), $args);
          }
      }
      
      class UserManager extends MyBaseObject {
          protected $collection_name = 'users';
      
          public function __construct() {
              parent::__construct();
          }
      }
      
      $user_manager = new UserManager();
      $my_user = $user_manager->findOne(array('username'=>'Emerald_Knight'));
      
      ?>
      

      This solution utilizes a single parent object which transforms a global database object reference into a local one, eliminating the scope issue. The collection name is specified as a class property of the inheriting object and only used in a single place in the parent object, eliminating the magic string and namespace polluting issues. Any time you perform queries on users, you do so by using the UserManager class, which guarantees that you will always know that your queries are being performed on the objects that you intend. And finally, if the collection name for an object class ever needs to be updated, it's a simple matter of modifying the single instance of the class property $collection_name, rather than tracking down some disconnected constant.


      Limitations

      This, of course, doesn't solve all of the existing problems. After all, executing the database queries for one object directly from another is still pretty bad practice, violating the principle of separation of concerns. Instead, those queries should generally be encapsulated within object methods and the objects themselves given primary responsibility in handling associated data. It's also incredibly easy to inadvertently override a database method, e.g. defining a findOne() method on UserManager, so there's still some mindfulness required on the part of the programmer.

      Still, given the previous alternative, this is a pretty major improvement, especially for an initial refactor.


      Final Thoughts

      As always, technical debt is both necessary and inevitable. After all, in exchange for not taking the excess time and considering structuring my code this way in the beginning, I had greater initial velocity to get the project off of the ground. What's important is continually reviewing your code as you're building on top of it so that you can identify bottlenecks as they begin to strain your efficiency, and getting those bottlenecks out of the way.

      In other words, even though technical debt is often necessary and is certainly inevitable, it's important to pay down on some of that debt once it starts getting expensive!

      7 votes
    12. Getting Started as a Developer from Scratch

      I have been interested in making the gradual career change to software development from my current humanities field. This stems from a handful of different places. Of course the pay and flexibility are strong drivers but I like the idea of a field that is somewhat of a creative expression; one where you can manifest your knowledge and experience into something tangible.

      I have no experience with programming other than SQL use in ArcGIS and am hoping to gain some knowledge about the field, so anything would be helpful: what to expect from this line of work, where someone with no experience should look to get started, personal journeys, etc.

      Cheers!

      14 votes
    13. XML Data Munging Problem

      Here’s a problem I had to solve at work this week that I enjoyed solving. I think it’s a good programming challenge that will test if you really grok XML.

      Your input is some XML such as this:

      <DOC>
      <TEXT PARTNO="000">
      <TAG ID="3">This</TAG> is <TAG ID="0">some *JUNK* data</TAG> .
      </TEXT>
      <TEXT PARTNO="001">
      *FOO* Sometimes <TAG ID="1">tags in <TAG ID="0">the data</TAG> are nested</TAG> .
      </TEXT>
      <TEXT PARTNO="002">
      In addition to <TAG ID="1">nested tags</TAG> , sometimes there is also <TAG ID="2">junk</TAG> we need to ignore .
      </TEXT>
      <TEXT PARTNO="003">*BAR*-1
      <TAG ID="2">Junk</TAG> is marked by uppercase characters between asterisks and can also optionally be followed by a dash and then one or more digits . *JUNK*-123
      </TEXT>
      <TEXT PARTNO="004">
      Note that <TAG ID="4">*this*</TAG> is just emphasized . It's not <TAG ID="2">junk</TAG> !
      </TEXT>
      </DOC>
      

      The above XML has so-called in-line textual annotations because the XML <TAG> elements are embedded within the document text itself.

      Your goal is to convert the in-line XML annotations to so-called stand-off annotations where the text is separated from the annotations and the annotations refer to the text via slicing into the text as a character array with starting and ending character offsets. While in-line annotations are more human-readable, stand-off annotations are equally machine-readable, and stand-off annotations can be modified without changing the document content itself (the text is immutable).

      The challenge, then, is to convert to a stand-off JSON format that includes the plain-text of the document and the XML tag annotations grouped by their tag element IDs. In order to preserve the annotation information from the original XML, you must keep track of each <TAG>’s starting and ending character offset within the plain-text of the document. The plain-text is defined as the character data in the XML document ignoring any junk. We’ll define junk as one or more uppercase ASCII characters [A-Z]+ between two *, optionally followed by a trailing dash - and one or more digits [0-9]+.
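
      The junk definition above maps fairly directly onto a regular expression. The Python sketch below covers only that one piece of the challenge; JUNK_PATTERN and strip_junk are names invented for this illustration, and how whitespace around a removed token should be treated is not addressed here.

```python
import re

# Junk: "*", one or more uppercase ASCII letters, "*", then optionally "-" and digits.
JUNK_PATTERN = re.compile(r"\*[A-Z]+\*(?:-[0-9]+)?")


def strip_junk(text):
    """Remove every junk token from text, leaving all other characters untouched."""
    return JUNK_PATTERN.sub("", text)
```

      Lowercase emphasis such as *this* does not match [A-Z]+, so it survives as ordinary text, which is the behaviour the last TEXT element in the sample input is probing.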

      Here is the desired JSON output for the above example to test your solution:

      {
        "data": "\nThis is some data .\n\n\nSometimes tags in the data are nested .\n\n\nIn addition to nested tags , sometimes there is also junk we need to ignore .\n\nJunk is marked by uppercase characters between asterisks and can also optionally be followed by a dash and then one or more digits . \n\nNote that *this* is just emphasized . It's not junk !\n\n",
        "entities": [
          {
            "id": 0,
            "mentions": [
              {
                "start": 9,
                "end": 18,
                "id": 0,
                "text": "some data"
              },
              {
                "start": 41,
                "end": 49,
                "id": 0,
                "text": "the data"
              }
            ]
          },
          {
            "id": 1,
            "mentions": [
              {
                "start": 33,
                "end": 60,
                "id": 1,
                "text": "tags in the data are nested"
              },
              {
                "start": 80,
                "end": 91,
                "id": 1,
                "text": "nested tags"
              }
            ]
          },
          {
            "id": 2,
            "mentions": [
              {
                "start": 118,
                "end": 122,
                "id": 2,
                "text": "junk"
              },
              {
                "start": 144,
                "end": 148,
                "id": 2,
                "text": "Junk"
              },
              {
                "start": 326,
                "end": 330,
                "id": 2,
                "text": "junk"
              }
            ]
          },
          {
            "id": 3,
            "mentions": [
              {
                "start": 1,
                "end": 5,
                "id": 3,
                "text": "This"
              }
            ]
          },
          {
            "id": 4,
            "mentions": [
              {
                "start": 289,
                "end": 295,
                "id": 4,
                "text": "*this*"
              }
            ]
          }
        ]
      }
      

      Python 3 solution here.

      If you need a hint, see if you can find an event-based XML parser (or if you’re feeling really motivated, write your own).

      4 votes
    14. Programming Challenge: Polygon analysis.

      It's time for another programming challenge!

      Given a list of coordinate pairs on a 2D plane that describe the vertices of a polygon, determine whether the polygon is concave or convex.

      Since a polygon could potentially be any shape if we don't specify which vertices connect to which, we'll assume that the coordinates are given in strict order such that adjacent coordinates in the list are connected. Specifically, if we call the list V[1, n] and say that V[i] <-> V[j] means "vertex i and vertex j are connected", then for each arbitrary V[i] we have V[i-1] <-> V[i] <-> V[i+1]. Moreover, since V[1] and V[n] are at the ends of the list, V[1] <-> V[n] holds (i.e. the list "wraps around").

      Finally, for simplicity we can assume that all coordinates are unique, that all polygon descriptions generate valid polygons with 3 or more non-overlapping sides, and that, yes, we're working with coordinates that exist in the set of real numbers only. Don't over-complicate it :)
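
      One common way to approach the 2D case is to look at the direction of the turn at every vertex: in a convex polygon, every turn goes the same way. The Python sketch below relies on the assumptions stated above (vertices in connection order, non-overlapping sides) and uses the sign of the 2D cross product of consecutive edges. The function name is_convex is made up for this illustration.

```python
def is_convex(vertices):
    """Return True if the polygon given by ordered (x, y) vertices is convex."""
    n = len(vertices)
    turn_sign = 0
    for i in range(n):
        ax, ay = vertices[i]
        bx, by = vertices[(i + 1) % n]  # the modulo makes the list "wrap around"
        cx, cy = vertices[(i + 2) % n]
        # z-component of the cross product of edge a->b with edge b->c.
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross == 0:
            continue  # collinear vertices do not count as a turn
        sign = 1 if cross > 0 else -1
        if turn_sign == 0:
            turn_sign = sign
        elif sign != turn_sign:
            return False  # turned left at one vertex and right at another
    return True
```

      The check works for both clockwise and counter-clockwise vertex orders, since it only requires that the sign never changes.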

      For those who want an even greater challenge, extend this out to work with 3D space!

      8 votes
    15. Programming Challenge: Reverse Polish Notation Calculator

      It's been nearly a week, so it's time for another programming challenge!

      This time, let's create a calculator that accepts reverse Polish notation (RPN), also known as postfix notation.

      For a bit of background, RPN is where you take your two operands in an expression and place the operator after them. For example, the expression 3 + 5 would be written as 3 5 +. A more complicated expression like (5 - 3) x 8 would be written as 5 3 - 8 x, or 8 5 3 - x.

      All your program has to do is accept a valid RPN string and apply the operations in the correct order to produce the expected result.
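
      A stack is the usual tool for this: operands are pushed as they are read, and each operator pops its two operands and pushes the result. The Python sketch below assumes tokens are separated by whitespace and that x, *, +, - and / are the only operators. The names OPERATORS and evaluate_rpn are made up for this illustration.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "x": operator.mul,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate_rpn(expression):
    """Evaluate a whitespace-separated RPN string and return the result as a float."""
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            right = stack.pop()  # the most recent operand is the right-hand side
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]
```

      Popping the right-hand operand first matters for subtraction and division, where the operand order changes the result.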

      18 votes
    16. Programming Challenge: Counting isolated regions.

      Another week, another challenge!

      This time, assume you're given a grid where each . represents an empty space and each # represents a "wall". We'll call any contiguous space of .s a "region". You can also think of a grid with no walls as the "base" region. The walls may subdivide the base region into any number of isolated sub-regions of any shape or size.

      Write a program that will, given a grid description, compute the total number of isolated regions.

      For example, the following grid has 5 isolated regions:

      ....#....#
      ....#.###.
      ....#.#.#.
      #...#..#..
      .#..#...#.
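
      A flood fill is one straightforward way to count the regions. The Python sketch below assumes that "contiguous" means sharing an edge (up, down, left, right), which is the reading that yields 5 for the example grid. The names count_regions and EXAMPLE_GRID are made up for this illustration.

```python
EXAMPLE_GRID = "\n".join([
    "....#....#",
    "....#.###.",
    "....#.#.#.",
    "#...#..#..",
    ".#..#...#.",
])


def count_regions(grid_text):
    """Count groups of '.' cells that are connected through shared edges."""
    grid = [line.strip() for line in grid_text.strip().splitlines()]
    rows, cols = len(grid), len(grid[0])
    seen = set()
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != "." or (r, c) in seen:
                continue
            regions += 1  # found a cell belonging to a not-yet-visited region
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == "." and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return regions


print(count_regions(EXAMPLE_GRID))  # prints 5
```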
      
      16 votes
    17. Programming Challenge: Merge an arbitrary number of arrays in sorted order.

      It looks like it's been over a week and a half since our last coding challenge, so let's get one going. This challenge is a relatively simple one, but it's complex enough that you can take a variety of different approaches to it.

      As the title suggests, write a program that accepts an arbitrary number of arrays, in whatever form or manner you see fit (if you want to e.g. parse a potentially massive CSV file, then go nuts!), and returns a single array containing all of the elements of the other arrays in sorted order. That's it!

      Bonus points for creative, efficient, or generalized solutions!
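
      As one possible starting point, here is a minimal Python sketch built on the standard library's heapq.merge, which performs a k-way merge of inputs that are already sorted. Since the challenge doesn't say whether the input arrays arrive sorted, the sketch sorts each one first. The name merge_sorted is made up for this illustration.

```python
import heapq


def merge_sorted(*arrays):
    """Return one sorted list containing every element of the given arrays."""
    # Sort each input first so the k-way merge below is valid for unsorted inputs too.
    sorted_inputs = [sorted(array) for array in arrays]
    # heapq.merge lazily yields the smallest remaining head among all inputs.
    return list(heapq.merge(*sorted_inputs))
```

      If the inputs are known to be sorted already, the sorted() step can be dropped and the merge can be consumed lazily instead of being collected into a list.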

      24 votes
    18. Programming Challenge: Compute the shortest path to visit all target spots on a grid.

      Let's do something a little more challenging this time.

      Given an MxN grid of arbitrary size, and given a random starting place on that grid and a list of points to visit, find the shortest path such that you visit all of them. Path lengths will be computed using taxicab distances rather than strict coordinate distance calculations.

      There are no restrictions on expected input this time. Output should be the total distance traveled between points.


      Example

      Assume that we use the character # to denote a spot on the grid, the character @ to denote your starting point, and the character * to denote a place on the grid that you're required to visit. One such grid may look something like this:

      ######
      ######
      **####
      #*####
      #*#*##
      #@####
      ######
      

      In this case, let's say that the bottom-left point on the grid is point (0, 0) and we're starting on point (1, 1). One valid solution would be to move to point (3, 2), then (1, 2), then (1, 3), then (1, 4), and finally (0, 4). The shortest path available is thus 8. Note that it's not enough just to visit the next nearest point on the grid!
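
      For small inputs like this example, a brute-force search over every visiting order is enough to confirm the answer. The Python sketch below takes the start and targets as (x, y) tuples rather than parsing the grid, and the names taxicab and shortest_tour are made up for this illustration. It tries all orderings, so its cost grows factorially with the number of targets.

```python
from itertools import permutations


def taxicab(a, b):
    """Taxicab (Manhattan) distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def shortest_tour(start, targets):
    """Length of the shortest path from start that visits every target once."""
    best = None
    for order in permutations(targets):
        total = 0
        position = start
        for point in order:
            total += taxicab(position, point)
            position = point
        if best is None or total < best:
            best = total
    return best


targets = [(1, 2), (3, 2), (1, 3), (1, 4), (0, 4)]
print(shortest_tour((1, 1), targets))  # prints 8
```

      Visiting the nearest point first, (1, 2), and continuing up the column before coming back for (3, 2) costs 9, which is why the greedy approach falls short here.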

      15 votes
    19. An informal look at the concept of reduction (alternatively: problem-solving for beginners).

      Preface

      One of the most common questions I see from prospective programmers and computer scientists is "where should I start?". My answer to that is a pretty consistent one: learn how to solve problems effectively. But that's vague and not really all that helpful, so I figured that I should actually tackle this in a little more depth by touching on something more specific.

      Specifically, I want to touch on the subject of how to think about complex problems.


      The Rationale Behind Learning

      Before we can better understand how to effectively solve problems, it's important to consider how it is that we learn. With any subject, the standard approach is to begin with the bare basics. For programming, that's writing a Hello, World! program in the new language you're working with. For foreign languages, you learn basic common words and sentence structure. For math, you learn your basic arithmetic operations like addition and multiplication.

      From there, we add more complexity and string together everything we've learned. For a foreign language, this looks like learning about new words, stringing them together in your own sentences, then learning about verb tenses and throwing them into the mix as well. With math, you take your normal number crunching and suddenly throw the concept of order of operations into the mix, then variables and how to solve for them.

      As a general rule, we first get comfortable with solving a simple problem and gradually build up toward solving increasingly more difficult ones.


      The Missing Piece

      Odds are that we've all sat in a math class at one point, and when the teacher asked a student how to solve a problem, they received an immediate "I don't know". You may or may not have been that kid yourself. I have no intention of shaming the kids who struggled (or those who still struggle) with math. Rather, I want to point to what I believe is the fundamental cause of that mental barrier that has frustrated students for generations.

      Learning is not simply a matter of adding more complexity to problems. A key part of learning, and one that I don't recall ever having emphasized during my grade school studies, is your ability to break problems down into the steps that you know how to complete and combine the different, simpler skills you've already learned to arrive at a solution. Instead, you were expected to solve many of those complex problems and learn through practice, or through pure rote memorization.

      What determined whether or not you could solve those problems was then a question of whether or not you could intuit or memorize how to solve those specific problems, and brand new problems that still made use of the same skill sets but had completely different forms would throw a wrench in that. Those who could solve any of those problems--those who, I would argue, were often mistakenly referred to as "geniuses" or "talented"--were really just those who knew how to break a problem down into simpler pieces.

      This isn't a failing on the students, but on the way they've been taught to think about problems.


      Reducing Problems

      What does it mean to "break down" a problem, though? The few times I recall a teacher ever touching on the subject, "break down the problem" and "use the skills you've already learned" were the kinds of pieces of advice passed around, completely vague and devoid of meaning for anyone who didn't already understand. How can we better grasp this important step?

      There's a term in complexity theory known as "reduction". The general idea is that if you have problems A and B, and you already know how to solve B, then transforming problem A so that it looks like problem B lets you use your solution for B to solve at least part of A.

      In other words, finding the solution to a more complex problem is just a matter of finding a way to make it look like a problem you already know how to solve.

      The advice to "break down" a problem really means to perform this process of "reduction", of transforming your more complicated problem A into your simpler, known problem B.


      In Practice

      We're still discussing a vague concept, but now that we have more specific language to work with, we can more easily see how it works in practice (a reduction of its own!).

      Let's consider a conceptually simple problem: grabbing the kth largest (or smallest) item from a list. How do we solve this problem? Probably the most obvious and straightforward answer is to sort the list then grab the kth item, right?

      Notice that we gave two high-level descriptions of the steps we need to solve this problem: sorting, then grabbing the appropriate item. We can therefore state that the problem of "grab the kth largest/smallest item from a list" can be reduced to the two problems "sort a list" and "grab the kth item from a list".

      Now, let's say we're given the problem "take this list of competitor times from the race and tell me what the top 10 race times were". What do we know about this problem? We know that we're being given a list, and we know that we need the 10 smallest items from that list. We also know that "10 smallest items" is just shorthand for "the 1st smallest item, the 2nd smallest item, ..., and the 10th smallest item". We can therefore reduce this problem to the previous one we solved by transforming it into "grab the kth smallest item from a list" and "repeat for values 1-10 for k".
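
      The two reductions above can be written out almost word for word. The Python sketch below is deliberately literal rather than efficient (it re-sorts the list for every k), and the names kth_smallest and top_times are made up for this illustration.

```python
def kth_smallest(items, k):
    """Reduce "kth smallest" to two known problems: sort, then index (k is 1-based)."""
    return sorted(items)[k - 1]


def top_times(times, count=10):
    """Reduce "top N race times" to repeated "kth smallest" for k = 1..N."""
    count = min(count, len(times))
    return [kth_smallest(times, k) for k in range(1, count + 1)]
```

      Each function only calls something that was already solved one level down, which is the whole point of the reduction.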


      Practical Advice

      In the end, my explanation may not have helped much at all in actually grasping the concept of reduction. My intent isn't necessarily to help you understand it immediately, but to provide you a framework for a way of thinking. Even if you do grasp the general concept, you may even wonder how you're supposed to recognize these kinds of reductions out in the wild in non-academic environments. The answer, perhaps annoying, is practice. Much like an appraiser can only become good at discerning details through experience, a programmer or computer scientist can only recognize these patterns through repeated exposure.

      In general, if I had to narrow it down to a small list of tips for improving your problem solving skills, this would be it:

      • Work on grasping the concept of reduction itself.
      • Expose yourself to lots of new problems.
      • Don't shy away from difficult problems. Reduce them as much as you can and solve the pieces you're able to. Try to research the pieces you're struggling with. Return to the problem later when you have more experience if you have to, but take a crack at it first.
      • Don't accept "I don't know" as an answer in itself. Ask yourself why you don't know how to solve a problem. Narrow down which pieces you're able to solve and which pieces you're not.
      • Just solve problems. Any problems. Easy ones, hard ones, and anything in between. Solving problems is a skill, and practicing it will make you better at solving problems in general, and better at recognizing the simpler problems inside of more complicated ones.
      • Don't just come up with a solution to a problem. Ensure that you understand how each piece of it works and why it works. Copy-pasting from StackOverflow can be a valid tool at your disposal, but doing so mindlessly isn't nearly as valuable as reviewing the solution, being able to determine whether or not it works before ever executing the code, and being able to discard anything unnecessary from it.

      Final Thoughts

      I'm not an authoritative voice on this subject. I'm not an educator. More than anything, I'm a life-long student and an enthusiast. There's seldom a day when I don't have to research something new in order to solve a problem I'm not familiar with, or remind myself of the syntax for a function I've used several times in the past. I don't know anything about teaching others, but I do know plenty about learning, and if there's anything that has stood out to me over the years, it's the fact that I find it easier to learn about something or to solve a problem if I can transform the concept into something that's easier for me to grasp.

      Moreover, I'm human and thus prone to mistakes. Call me out on them if you notice them. I'll take any of my mistakes as learning opportunities :)

      11 votes
    20. Programming Mini-Challenge: KnightBot

      Another programming mini-challenge for you. It's been a month since the first one and that seemed to be rather successful. (I appreciate that there are other challenges on here but trying to sync with them seems tricky!)

      A reminder:
      I'm certain that many of you might find these pretty straightforward, but I still think there's merit in sharing different approaches to simple problems, including weird-and-wonderful ones.


      KnightBot


      Info

      You will be writing a small part of a Chess program, specifically focusing on the Knight, on an 8 x 8 board.


      Input

      The top-left square of the board will have index 0, and the bottom-right square will have index 63.

      • The first input is the starting square of the knight.
      • The second input is the requested finishing square of the knight.
      • The third input is the number of maximum moves allowed.

      Output

      The expected output is either True or False, determined by whether or not the Knight can reach the requested finishing square within the number of allowed moves when starting on the starting square.

      e.g. The expected output for the input 16, 21, 4 is True since the Knight can move 16->33->27->21, which is 3 moves.
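
      One possible approach, sketched in Python, is a breadth-first search over squares that stops expanding once the move limit is reached. The function name `can_reach` and the `size` parameter are choices made for this sketch, not part of the challenge; it assumes squares are numbered row by row, so that index 8 sits directly below index 0.

```python
from collections import deque

# The eight (row, column) offsets of a traditional Knight's move.
KNIGHT_STEPS = [(-2, -1), (-2, 1), (-1, -2), (-1, 2),
                (1, -2), (1, 2), (2, -1), (2, 1)]


def can_reach(start, finish, max_moves, size=8):
    """Return True if a Knight can get from start to finish in at most max_moves."""
    if start == finish:
        return True
    seen = {start}
    queue = deque([(start, 0)])  # (square index, moves used so far)
    while queue:
        square, moves = queue.popleft()
        if moves == max_moves:
            continue  # out of moves along this path
        row, col = divmod(square, size)
        for d_row, d_col in KNIGHT_STEPS:
            r, c = row + d_row, col + d_col
            if 0 <= r < size and 0 <= c < size:
                nxt = r * size + c
                if nxt == finish:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, moves + 1))
    return False


print(can_reach(16, 21, 4))  # True, matching the example above
print(can_reach(16, 21, 2))  # False
```

      Because breadth-first search visits squares in order of distance, each square only needs to be queued once, which keeps the search small even for generous move limits.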
      

      Extensions

      Some additional ideas for extending this challenge...

      1. Instead of an 8x8, what if the board was nxn?
      2. Instead of "within x moves", what if it was "with exactly x moves"?
      3. Instead of a traditional Knight's move (2 long, 1 short), what if it was n long and m short?
      4. What if the board was infinite?
      5. What if the board looped back around when crossing the edges? (e.g. the square to the right of 7 is 0)
      17 votes
    21. Reflections on past lessons regarding code quality.

      Preface

      Over the last couple of years, I've had the opportunity to learn from the mistakes of my predecessors and put those lessons into practice. Among those lessons, three have stood out to me in particular:

      1. Consistency is king.
      2. Try not to be too clever for your own good.
      3. Good code takes time.

      I know that there are a lot of new and aspiring programmers here (and I'm admittedly far from being a guru myself), so I thought it would be good to touch on these three lessons, what they mean, and why they're so important.


      Consistency is King

      This is something that I had drilled into my head over nearly two years working on the code base at my previous job. Not by my fellow programmers (who did not exist), nor by my boss, but by the code itself.

      Consistency can mean a number of things, but there are two primary points that matter:

      1. Syntactic consistency.
      2. Architectural consistency.

      Syntactic consistency concerns standards in what your code looks like. For example, the choice between snake_case or camelCase or PascalCase for naming; function parameter order; or even something as benign as what kind of indentation and how much of it you use.

      Architectural consistency concerns standards in how you structure your code. Making sure that you either use public class properties or getter and setter methods; using multiple booleans or using bitmasks; using or not using objects for encapsulating data to be passed around; validating data within the primary object or relegating that responsibility to a validator class; and other seemingly minor decisions about how you handle certain behavior make a big difference.

      The code base I maintained had no such consistency. You could never remember whether the method you needed to call was named using snake_case or camelCase and had to perform several searches just to find it. Worse still, some methods defined to handle Ajax calls were prefixed with ajax while many weren't. Argument ordering seemed to be determined by a coin flip, and indentation seemed to vary between 2-space, 3-space, 4-space, and even 5-space indentation depending on what mood my predecessor was in at the time. You often could not tell where a function's body began and where it ended. Writing code was an exercise both in problem solving and in deciphering ancient religious texts.

      Architecturally it was no better. There was no standardization in how data was validated or sanitized, how class members were accessed or modified, how functionality was inherited, whether the functionality was encapsulated in an object method or in a function, or which objects were responsible for which behavior.

      That lack of consistency makes introducing or modifying a small feature, a task which should ordinarily be a breeze, an engineering feat of its own. Often you end up implementing that feature, after dancing around the tangled mess of spaghetti, only to find that the functionality that you implemented already existed somewhere else in the code base but was hiding out in a deep, dark corner that you never even knew was there until you had to fix some other broken feature months later and happened to stumble across it.

      Consistency means predictability, and predictability means discoverability and, more importantly, easier changes and higher confidence in those changes.


      Cleverness is a Fallacy

      In any given project, it can be tempting to do something that saves you extra lines of code, or saves on CPU cycles, or just looks awesome and does something nobody would have thought of before. As human beings and especially as craftsmen, we like to leave our mark and take pride in breaking the status quo by taking a novel and interesting approach to a problem. It can make us feel fulfilled in our work, that we've done something unique, a trademark of sorts.

      The problem with that is that it directly conflicts with the aforementioned consistency and predictability. What ends up being an engineering wonder to you ends up being an engineering nightmare to someone else. While you're enjoying the houses you build with wall studs arranged in the shape of a spider's web, the home remodelers who come along later aren't even sure if they can change part of the structure without causing the entire wall to collapse, and they're not even sure which walls are load-bearing and which aren't, so they're basically playing Jenga while blindfolded.

      The code base I maintained had a few such gems, with what looked like load-bearing walls but were actually made of papier-mâché and were only decorative in nature, and the occasional spider's web wall studs. One spider's web comes to mind in particular. It's been a while since I've worked on that piece of code, so I can't recall what exactly it did, but two query-constructing pieces of logic had overlapping query structure with the difference being the operators and data. Rather than being smart and allowing those two constructs to be different, however, my predecessor decided to be clever and the query construction was abstracted into a separate method so that the same general query structure could be used in other places (note: it never was, and was only ever used in those two instances). It was abstracted so that all original context was lost and no comments existed to explain any of it. On top of that, the method was being called from the most critical piece of the system which, unfortunately, was already a convoluted mess and desperately required a rewrite and thus required me to understand what the hell that method was even doing (incidentally, I fell in love with whiteboards as a result).

      When you feel like you're being clever, you should always stop what you're doing and make sure that what you're doing isn't actually a really terrible idea. Cleverness doesn't exist. Knowledge and intelligence do. Write intelligent code, not clever code.


      Good Code Takes Time

      Bad code more often than not is the result of impatience. We don't like to plan out the solution before we get to writing code. We like to use variables like x and temp in order to quickly achieve functional correctness of our code because stopping to think about how to name them is just additional overhead getting in the way. We don't like to scrap our bad work if we can salvage it in some way instead, because then we have to start from scratch and that's daunting. We continually work against ourselves and gradually increase our mental overhead because we try to decrease our mental overhead. As a result we find ourselves too exhausted by the end of our initial implementations to concern ourselves with fixing obvious problems. Obviously bad but functional code is preferable because we just want the task to be done and over with.

      The more you get exposed to bad code and the more you try to avoid pushing that hell onto yourself and your successors, the more you realize that you need to spend less time coding and more time researching and planning. Whereas you may have been spending upwards of 50% of your time coding previously, suddenly you find yourself spending as little as 10% of your time writing any code at all.

      Professionals from just about any field can tell you that you can either do something right or you can do it twice. You might recognize this most easily in the age-old piece of woodworking wisdom, "measure twice, cut once". The same is true of code, and doing something right means planning how to do it right in the first place before you've even started on the job.


      Putting into Practice

      I've been fortunate over the last couple of months to be able to start on a brand new project and architect it in a way that I see fit. Changes which would ordinarily take days or weeks in the old code base now take me half a day at most, and a matter of minutes at best. I remember where to find a piece of code that I need because I'm consistent and predictable about where I place things; I don't struggle to tell where something begins and where it ends because I'm consistent about structure; I don't continually hate myself when I need to make changes to my code because I don't do anything wildly out of the ordinary; and most importantly, I take my time to figure out what it is that I need to do and how I want to do it before I've written a single line of code.

      When I needed to add a web portal interface for uploading a media asset to associate with a database object, the initial implementation took me a week, due to the need for planning, adding the interface, and supporting and debugging the asset management. When I needed to extend that interface to allow for uploading the same kinds of assets for a completely different object type, it took me only half an hour, with most of that time being dedicated toward updating a Vue.js component to accept configuration via props rather than working for only the single hard-coded object type. If I need to add a case for any additional object type, it will take me only five minutes.

      That initial week of work for the web interface provided me with cost savings that would not have been feasible otherwise, and that initial week of work would have taken as many as three weeks had I not structured the API to be as consistent as it is now. Every initial lag in implementation is offset heavily by the long-term cost savings of writing good code.


      Technical Debt

      Technical debt is the cost of your code over time. The messier and worse your code gets, the more it costs you to try to change, and those costs only build up. Even good code can accumulate technical debt if the needs for your software have changed and its current architecture isn't compatible with those changes.

      No project is without technical debt. Even my own code, that I've been painstakingly working on for the last couple of months, has technical debt. Odds are a programmer far more experienced than I am will come along and want to scrap everything I've done, and will do a far better job rewriting it.

      That's okay, though. In fact, a certain amount of technical debt is good. If we try to never write any bad code whatsoever, then we could never possibly get to writing any code at all, because there are far too many unknowns for a new project.

      What's important is knowing when to pay down on that technical debt, which could mean anything from paying it up front (i.e. through planning ahead of time) to paying it down when it starts to get too expensive (e.g. refactoring a complicated section of code when changes become sufficiently difficult). That's not something you can learn through a StackOverflow post or a college lecture, and certainly not from some unknown stranger on some relatively unknown website in a long, informal blog-like post.


      Final Thoughts

      I'm far from being a great programmer. There's a lot that I don't know and I still have quite a bit to learn. I love programming, though, and more than that I enjoy sharing the lessons I've learned with others. Especially the ones that I wish I'd learned back in college.

      Please feel free to share your own experiences, learned lessons, and (if you have it) feedback here. I'd love to read up on some other thoughts on this subject!

      21 votes
    22. Coding Noob Needs Help/Guidance on Small Project

      Hi,

      There's a certain site which hosts media files and has a player that depends on a lot of third-party resources to play, while browsers have native support for those file types. Those 3rd-party resources are often blocked by ad blockers and I have no desire to white-list them. I would like to extract the direct link to the media file and make it playable on my custom web page.

      The link to the media file is present in the page source of each page, always on the same line. It's not anchored in HTML but present in the JavaScript for the player, like so:

          $(document).ready(function(){
            $("#jquery_jplayer_1").jPlayer({
              ready: function () {
                $(this).jPlayer("setMedia", {
                  [ext]: "https://[domain]/[filename.ext]"
                });
              },
      

      In this example it's on line #5. [ext] = the file extension.

      I want to build the following:

      • A web page with a form with a single input field meant to receive links from that specific file host
      • [Something] that extracts the file link from the source of the host's page
      • Present the linked file as playable in an embedded native player

      So far I've managed to create a form with an input box and a submit button, but it doesn't do anything yet. What is the best way to build the actual functionality? I know HTML/CSS. I have some rudimentary understanding of JavaScript/jQuery and Python3, so those would be my preferred tools.
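
      For the extraction step, a minimal Python 3 sketch could look like the following. It assumes the page source has already been fetched into a string and that the link always appears in a `"setMedia"` call shaped like the snippet above; the names `MEDIA_PATTERN` and `extract_media_link` and the example.com URL are placeholders invented for the sketch.

```python
import re

# Matches: "setMedia", { <ext>: "<url>"   (whitespace and newlines allowed)
MEDIA_PATTERN = re.compile(r'"setMedia"\s*,\s*\{\s*(\w+)\s*:\s*"([^"]+)"')


def extract_media_link(page_source):
    """Return the first media URL found in the player setup, or None."""
    match = MEDIA_PATTERN.search(page_source)
    if match is None:
        return None
    return match.group(2)


# Made-up page source imitating the structure shown above.
sample = '''
$(this).jPlayer("setMedia", {
  mp3: "https://example.com/track.mp3"
});
'''
print(extract_media_link(sample))  # https://example.com/track.mp3
```

      Matching on the `"setMedia"` call rather than on a fixed line number should keep working if the host adds or removes lines above the player code.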

      For those worried about piracy: The files in question are not copyrighted and I'm not looking to make copies. I just want to make them playable. This is for personal use.

      Thank you for reading this far. Any and all advice is welcome!

      10 votes
    23. What are your unsolved programming problems?

      I thought it could be fun to discuss problems that we've encountered in our programming or programming-related work and have never found a solution for. I figure that at worst we can have a lot of fun venting about and scratching our heads at things that just don't make any sense to anyone, and at best we might be able to help each other find answers and, more importantly, some closure.

      16 votes
    24. Inexperienced Programming Question

      TLDR: What programming language would be useful for taking info in an excel file and producing a text file (that is organized and arranged in a particular way) containing that info? Which would be useful for this problem but also helpful in general? And also, are there any recommended online courses where I could learn it?


      I have no real experience coding or anything but have always wanted to learn. Recently at work we've encountered a problem. My boss had created a matlab program in order to take text/numbers from an excel document and transfer them to a text file, but in an organized way.

      Say you have something you call "Pancakes" and the cell next to it has the number "3", as in there are three pancakes. I want to be able to create a text file that would read something like this:

      NUMBER OF PANCAKES

      • Pancakes: 3

      We recently have changed around the format of the excel document for a different item, for example "French Toast". I've tried to mess with matlab briefly but was unable to change the program to compensate, and I no longer easily have access to matlab.

      I'm seeing this as an opportunity to learn some programming and also fix some stuff at work. So what programming language would be useful for fixing this problem? Which would be useful for this problem, but also helpful in general? And also, are there any recommended online courses where I could learn it?
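
      To give a sense of the size of the task, here is a rough Python sketch that produces the Pancakes layout shown above. It rests on assumptions that are not part of the real setup: the sheet has been saved as CSV, the item name is in the first column and the count is in the second. The function name `format_report` is made up for the example.

```python
import csv
import io


def format_report(csv_text):
    """Turn rows of "name,count" into the heading-plus-bullet layout."""
    lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank or incomplete rows.
        if len(row) < 2 or not row[0].strip():
            continue
        name, count = row[0].strip(), row[1].strip()
        lines.append(f"NUMBER OF {name.upper()}")
        lines.append("")
        lines.append(f"• {name}: {count}")
        lines.append("")
    return "\n".join(lines)


print(format_report("Pancakes,3\nFrench Toast,2\n"))
```

      Reading the .xlsx file directly would need a third-party library, so this sketch sticks to the standard library's `csv` module.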

      Thanks for any help, I appreciate it.

      16 votes
    25. Programming Challenge: Make a game in 1 hour!

      Background

      There's been some talk on ~ before, and it seems like there are quite a few people who are either interested in, learning, or working in game development, so I thought this could be a fun programming challenge.

      This one is fairly open-ended: make a game in 1 hour. Any game, any engine, don't worry about art or sound or anything.

      Doing is the best way to learn. Most people's first project is something overly ambitious, and when they find that it's more difficult than they thought, they can get discouraged, or even give up entirely. This is why the 1 hour limit is important: it forces you to finish something, even if it's small. When you're done, you can come out of it saying you made a game, and you learned from it.

      Chances are the game might not be fun, might look bad, might be buggy, etc. But don't worry about that: everyone's game will have problems, and if you do create something really fun or innovative, congratulations, you have a prototype that you can expand on later!

      "Rules"

      Like I said before, these "rules" are pretty simple: make a game in (approximately) 1 hour. You can use any tools you want. If you use external assets (art, sound), it's probably best you use something you have the rights to (see resources). If you're completely new to game development/programming, your goal could even be to finish a tutorial.

      If you're the kind of person who tends to get carried away with these things, you might want to post a comment saying you're starting, then another one once you've finished your game.

      Please share your finished game; I'm sure everyone would love to try it! If your game is web-based, it can be hosted for free on GitHub Pages or Itch.io. If downloadable, it can be hosted for free on Google Drive, Mega, Dropbox, Itch.io, etc.

      Resources

      Engines

      If you're a beginner, a good engine to start with is LÖVE. It's very simple, and uses Lua, which is very easy to learn.

      If you're familiar with another language, you could use a library to make it in that language. Some examples:

      C++: SFML, SDL, Allegro

      JavaScript: kontra, Phaser, pixi.js

      Python: pygame

      Rust: Piston, ggez, Amethyst

      If you want something more complex, consider Godot, Unity, or Unreal.

      You can also try something visual like Construct, Clickteam Fusion, or GDevelop.

      Art

      For such a short time constraint, I'd suggest you use your own "programmer art": just use some basic shapes. Your primary focus should be gameplay.

      If you think you have time to find something, try looking on OpenGameArt.

      Sound

      You can make simple sound effects very quickly with sfxr (or in this case, a web port of sfxr called jsfxr).

      27 votes
    26. Learning to Program

      Hi folks,

      I figured this would be a good place to ask a rather simple question.

      Where do I start to learn to code?

      I'm in high school, so I have (some) time to dedicate to it, and it seems there are a plethora of websites/resources out there, so I ask: what do you recommend, and why has it worked for you? I have no prior experience. I believe that this would really help out in the long run, as I will graduate high school with an Associate's Degree in Business. Thank you!

      EDIT: Thank you for all your responses! I'll start with Python and move on from there. You guys have been a great help, and I'll vote you up or reply.

      26 votes
    27. Programming Mini-Challenge: TicTacToeBot

      I've seen the programming challenges on ~comp as well as quite a few users who are interested in getting started with programming. I thought it would be interesting to post some 'mini-challenges' that all could have a go at. I'm certain that many of you might find these pretty straightforward, but I still think there's merit in sharing different approaches to simple problems, including weird-and-wonderful ones.

      This is my first post and I'm a maths-guy who dabbles in programming, so I'm not promising anything mind-blowing. If these gain any sort of traction I'll post some more.

      Starting off with...


      TicTacToeBot


      Info

      You will be writing code for a programme that will check to see if a player has won a game of tic-tac-toe.


      Input

      The input will be 9 characters that denote the situation of each square on the grid.

      • 'X' represents that the X-player has moved on that square.
      • 'O' represents that the O-player has moved on that square.
      • '#' represents that the square is empty.

      Example:

      |O| |X|
      |X|X|O|    The input for this grid will be O#XXXOO##
      |O| | |
      

      Output

      The expected output is the character representing the winning player, or "#" if the game is not won.

      (e.g. The expected output for the example above is '#' since no player has won)
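
      One straightforward approach, sketched in Python, is to list the eight winning lines as index triples and check each in turn. The names `WIN_LINES` and `winner` are choices made for this sketch; it assumes the 9 input characters are given row by row, as in the example above.

```python
# Index triples for the 3 rows, 3 columns and 2 diagonals of the grid.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
]


def winner(board):
    """Return 'X' or 'O' if that player has three in a line, otherwise '#'."""
    for a, b, c in WIN_LINES:
        if board[a] != '#' and board[a] == board[b] == board[c]:
            return board[a]
    return '#'


print(winner("O#XXXOO##"))  # '#', matching the example above
print(winner("XXXOO####"))  # 'X'
```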


      29 votes
    28. Programming Challenge: Two Wizards algorithm challenge

      I'm running out of ideas, if you have any, please make your own programming challenge.


      This challenge is about designing an algorithm to solve this problem.

      Let's have a game field of size x, y (like in chess). There are two wizards standing at [ 0, 0 ] who teleport themselves using spells. The goal is to not be the one who teleports them outside of the map. Each spell teleports the wizards by at least +1 tile. Given the map size and a collection of spells, who wins (assuming neither makes any mistakes)?

      Here are a few examples:


      Example 1

      x:4,y:5

      Spells: { 0, 2 }

      Output: false

      Description: Wizard A starts, teleporting both of them to 0, 2. Wizard B teleports them to 0, 4. Wizard A has to teleport them to 0, 6, which is outside of the map, so he loses the game. Because the starting wizard (wizard A) loses, the output is false.

      Example 2

      x:4,y:4

      Spells: { 1,1 }

      Output: true

      Example 3

      x:4,y:5

      Spells: { 1,1 },{ 3,2 },{ 1,4 },{ 0,2 },{ 6,5 },{ 3,1 }

      Output: true

      Example 4

      x:400,y:400

      Spells: {9,2},{15,1},{1,4},{7,20},{3,100},{6,4},{9,0},{7,0},{8,3},{8,44}

      Output: true
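
      One possible reading of the rules can be sketched as a table of winning and losing positions, filled in from the far corner of the map back to the start. The sketch below rests on assumptions inferred from Examples 1 and 2 rather than stated in the challenge: valid tiles run from 0 to x-1 and 0 to y-1, spell components are never negative, a wizard must always cast a spell, and a wizard whose every spell leaves the map loses. The function name `first_wizard_wins` is made up for the sketch.

```python
def first_wizard_wins(x, y, spells):
    """Return True if the wizard who moves first from (0, 0) can force a win."""
    # win[a][b] is True when the wizard about to move from (a, b) can force a win.
    win = [[False] * y for _ in range(x)]
    # Spells only move towards larger coordinates, so every target tile
    # has already been decided when we walk the map backwards.
    for a in range(x - 1, -1, -1):
        for b in range(y - 1, -1, -1):
            for dx, dy in spells:
                na, nb = a + dx, b + dy
                # A spell is a winning choice if it stays on the map and
                # leaves the opponent on a losing tile.
                if na < x and nb < y and not win[na][nb]:
                    win[a][b] = True
                    break
    return win[0][0]


print(first_wizard_wins(4, 5, [(0, 2)]))  # False (Example 1)
print(first_wizard_wins(4, 4, [(1, 1)]))  # True (Example 2)
```

      Under these assumptions the sketch also returns true for Example 3; Example 4 has not been checked by hand here.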


      Good luck! I'll comment here my solution in about a day.

      Note: This challenge comes from fiks, a programming competition run by the Czech university ČVUT (CTU).

      15 votes