-
7 votes
-
Dissecting A Dweet ~ Spirograph Design Generator
6 votes -
The Recurse Center is accepting applications for Fellowships of up to $10,000 for women, trans, and non-binary programmers who want to pursue ambitious projects this fall.
9 votes -
Catching use-after-move bugs with Clang's consumed annotations
5 votes -
How To Build An App: Everything You Didn't Know You Needed To Know | Tom Scott
8 votes -
"Perl 6 is Cursed! I hate it!"
7 votes -
Eleven great mechanical keyboards for coders — updated for 2019
9 votes -
Code Quality Tip: The importance of understanding correctness vs. accuracy.
Preface
It's not uncommon for a written piece of code to be both brief and functionally correct, yet difficult to reason about. This is especially true of recursive algorithms, which can require some amount of simulating the algorithm mentally (or on a whiteboard) on smaller problems to try to understand the underlying logic. The more you have to perform these manual simulations, the more difficult it becomes to track what exactly is going on at any stage of computation. It's also not uncommon that these algorithms can be made easier to reason about with relatively small changes, particularly in the way you conceptualize the solution to the problem. Our goal will be to take a brief tour into what these changes might look like and why they are effective at reducing our mental overhead.
Background
We will consider the case of the subset sum problem, which is essentially a special case of the knapsack problem where you have a finite number of each item and each item's value is equal to its weight. In short, the problem is summarized as one of the following:
- Given a set of numbers, is there a subset whose sum is exactly equal to some target value?
- Given a set of numbers, what is the subset whose sum is the closest to some target value without exceeding it?
For example, given the set of numbers {1, 3, 3, 5} and a target value of 9, the answer for both of those questions is {1, 3, 5} because the sum of those numbers is 9. For a target value of 10, however, the first question has no solution because no combination of numbers in the set {1, 3, 3, 5} produces a total of 10, but the second question produces a solution of {1, 3, 5} because 9 is the closest value to 10 that those numbers can produce without going over.
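The worked example above can be checked by brute force. The following is only a minimal Python sketch, not the approach the article goes on to develop; the function names best_subset and exact_subset are made up for illustration, and it simply tries every combination.

```python
from itertools import combinations


def best_subset(values, target):
    """Return the subset with the largest sum that does not exceed target."""
    best = ()
    for size in range(len(values) + 1):
        for combo in combinations(values, size):
            # Keep the combination only if it strictly improves on the best so far.
            if sum(best) < sum(combo) <= target:
                best = combo
    return best


def exact_subset(values, target):
    """Return a subset summing exactly to target, or None if none exists."""
    best = best_subset(values, target)
    return best if sum(best) == target else None


numbers = [1, 3, 3, 5]
print(exact_subset(numbers, 9))       # a subset whose sum is exactly 9
print(exact_subset(numbers, 10))      # None: no subset sums to 10
print(sum(best_subset(numbers, 10)))  # closest sum that does not exceed 10
```

Trying every combination is exponential in the size of the set, which is fine for a four-element example but not for large inputs.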
A Greedy Example
We'll stick to the much simpler case of finding an exact match to our target value so we don't have to track what the highest value found so far is. To make things even simpler, we'll consider the case where all numbers are positive, non-zero integers. This problem can be solved with some naive recursion--simply try all combinations until either a solution is found or all combinations have been exhausted. While more efficient solutions exist, naive recursion is the easiest to conceptualize.
An initial assessment of the problem seems simple enough. Our solution is defined as the set of array elements whose total is equal to our target value. To achieve this, we loop through each of the elements in the array, try combinations with all of the remaining elements, and keep track of what the current total is so we can compare it to our target. If we find an exact match, we return an array containing the matching elements, otherwise we return nothing. This gives us something like the following:
    function subsetSum($target_sum, $values, $total = 0) {
        // Base case: a total exceeding our target sum is a failure.
        if($total > $target_sum) {
            return null;
        }

        // Base case: a total matching our target sum means we've found a match.
        if($total == $target_sum) {
            return array();
        }

        foreach($values as $index=>$value) {
            // Recursive case: try combining the current array element with the remaining elements.
            $result = subsetSum($target_sum, array_slice($values, $index + 1), $total + $value);

            if(!is_null($result)) {
                return array_merge(array($value), $result);
            }
        }

        return null;
    }
Your Scope is Leaking
This solution works. It's functionally correct and will produce a valid result every single time. From a purely functional perspective, nothing is wrong with it at all; however, it's not easy to follow what's going on despite how short the code is. If we look closely, we can tell that there are a few major problems:
- It's not obvious at first glance whether or not the programmer is expected to provide the third argument. While a default value is provided, it's not clear if this value is only a default that should be overridden or if the value should be left untouched. This ambiguity means relying on documentation to explain the intention of the third argument, which may still be ignored by an inattentive developer.
- The base case where a failure occurs, i.e. when the accumulated total exceeds the target sum, occurs one stack frame further into the recursion than when the total has been incremented. This forces us to consider not only the current iteration of recursion, but one additional iteration deeper in order to track the flow of execution. Ideally an iteration of recursion should be conceptually isolated from any other, limiting our mental scope to only the current iteration.
- We're propagating an accumulating total that starts from 0 and increments toward our target value, forcing us to track two different values simultaneously. Ideally we would only track one value if possible. If we can manage that, then the ambiguity of the third argument will be eliminated along with the argument itself.
Overall, the amount of code that the programmer needs to look at and the amount of branching they need to follow manually is excessive. The function is only 22 lines long, including whitespace and comments, and yet the amount of effort it takes to ensure you're understanding the flow of execution correctly is pretty significant. This is a pretty good indicator that we probably did something wrong. Something so simple and short shouldn't take so much effort to understand.
Patching the Leak
Now that we've assessed the problems, we can see that our original solution isn't going to cut it. We have a couple of ways we could approach fixing our function: we can either attempt to translate the abstract problems into tangible solutions or we can modify the way we've conceptualized the solution. With that in mind, let's take a second crack at this problem by trying the latter.
We've tried taking a look at this problem from a top-down perspective: "given a target value, are there any elements that produce a sum exactly equal to it?" Clearly this perspective failed us. Instead, let's try flipping the equation: "given an array element, can it be summed with others to produce the target value?"
This fundamentally changes the way we can think about the problem. Previously we were hung up on the idea of keeping track of the current total sum of the elements we've encountered so far, but that approach is incompatible with the way we're thinking of this problem now. Rather than incrementing a total, we now find ourselves having to do something entirely different: if we want to know if a given array element is part of the solution, we need to first subtract the element from the problem and find out if the smaller problem has a solution. That is, to find out if the element 3 is part of the solution for the target sum of 8, we're really asking if 3 + solutionFor(5) is valid.
The new solution therefore involves looping over our array elements just as before, but this time we check if there is a solution for the target sum minus the current array element:
    function subsetSum($target_sum, $values) {
        // Base case: the solution to the target sum of 0 is the empty set.
        if($target_sum === 0) {
            return array();
        }

        foreach($values as $index=>$value) {
            // Base case: any element larger than our target sum cannot be part of the solution.
            if($value > $target_sum) {
                continue;
            }

            // Recursive case: do the remaining elements create a solution for the sub-problem?
            $result = subsetSum($target_sum - $value, array_slice($values, $index + 1));

            if(!is_null($result)) {
                return array_merge(array($value), $result);
            }
        }

        return null;
    }
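For readers without a PHP runtime at hand, the two approaches can be compared side by side in a rough Python port. This is a sketch rather than the article's own code: the names subset_sum_accumulating and subset_sum_subtracting are invented here, and list slicing stands in for array_slice.

```python
def subset_sum_accumulating(target_sum, values, total=0):
    """Port of the first version: carries a running total down the recursion."""
    if total > target_sum:
        return None
    if total == target_sum:
        return []
    for index, value in enumerate(values):
        result = subset_sum_accumulating(target_sum, values[index + 1:], total + value)
        if result is not None:
            return [value] + result
    return None


def subset_sum_subtracting(target_sum, values):
    """Port of the second version: solves a smaller target at each step."""
    if target_sum == 0:
        return []
    for index, value in enumerate(values):
        if value > target_sum:
            continue  # this element alone already overshoots the target
        result = subset_sum_subtracting(target_sum - value, values[index + 1:])
        if result is not None:
            return [value] + result
    return None


for target in (8, 9, 10):
    print(target,
          subset_sum_accumulating(target, [1, 3, 3, 5]),
          subset_sum_subtracting(target, [1, 3, 3, 5]))
```

Both ports assume positive, non-zero integers, as the article does. For each target in the loop the two functions print the same result, which is the point: the change is in how the problem is framed, not in what is computed.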
A Brief Review
With the changes now in place, let's compare our two functions and, more importantly, compare our new function to the problems we assessed with the original. A few brief points:
- Both functions are the same exact length, being only 22 lines long with the same number of comments and an identical amount of whitespace.
- Both functions touch the same number of elements and produce the same output given the same input. Apart from a change in execution order of a base case, functionality is nearly identical.
- The new function no longer requires thinking about the scope of the next iteration of recursion to determine whether or not an array element is included in the result set. The base case for exceeding the target sum now occurs prior to recursion, keeping the scope of the value comparison nearest where those values are defined.
- The new function no longer uses a third accumulator argument, reducing the number of values to be tracked and removing the issue of ambiguity with whether or not to include the third argument in top-level calls.
- The new function is now defined in terms of finding the solutions to increasingly smaller target sums, making it easier to determine functional correctness.
Considering all of the above, we can confidently state that the second function is easier to follow, easier to verify functional correctness for, and less confusing for anyone who needs to use it. Although the two functions are nearly identical, the second version is clearly and objectively better than the original. This is because despite both being functionally correct, the first function does a poor job at accurately defining the problem it's solving while the second function is clear and accurate in its definition.
Correct code isn't necessarily accurate code. Anyone can write code that works, but writing code that accurately defines a problem can mean the difference between understanding what you're looking at, and being completely bewildered at how, or even why, your code works in the first place.
Final Thoughts
Accurately defining a problem in code isn't easy. Sometimes you'll get it right, but more often than not you'll get it wrong on the first go, and it's only after you've had some distance from your original solution that you realize that you should've done things differently. Despite that, understanding the difference between functional correctness and accuracy gives you the opportunity to watch for obvious inaccuracies and keep them to a minimum.
In the end, even functionally correct, inaccurate code is worth more than no code at all. No amount of theory is a replacement for practical experience. The only way to get better is to mess up, assess why you messed up, and make things just a little bit better the next time around. Theory just makes that a little easier.
17 votes -
-
Challenge: defuse this fork bomb
On lobste.rs I found a link to an article from Vidar Holen, the author of shellcheck. He made a fork bomb that is really interesting. Here's the bomb:
DO NOT RUN THIS.
eval $(echo "I<RA('1E<W3t`rYWdl&r()(Y29j&r{,3Rl7Ig}&r{,T31wo});r`26<F]F;==" | uudecode)
This may look pretty obvious, but it's harder than you think. I fell for it. Twice. Can you find out how this bomb works?
Warning: executing the bomb will slow down your computer and will force you to restart.
You can limit the impact of the fork bomb by setting FUNCNEST:
export FUNCNEST=3
Have fun!
12 votes -
Tildes from the command line
How many people would be interested in browsing Tildes from a TUI, ala rtv?
21 votes -
Prisons are banning books that teach prisoners how to code
8 votes -
What are the minimal features every good blog should have?
I've been learning Laravel, and familiarizing myself with the framework by coding up a blogging website. Right now, it's minimally functional, and I'd like to add some more features to it. Since this is my first project with Laravel the code is a mess, and it's just about time for me to rewrite the whole thing. Before starting that, I'd like to have a better idea of what my final product should be. I don't want to recreate WordPress in Laravel, but I do want to have something I wouldn't spit at. Basically a project that would be good as a resume builder if I ever needed one.
So far, my website allows users to...
- register for an account, log in/out, update their email address and display name
- create posts with a WYSIWYG editor
- upload files
- create profiles
- and manipulate everything through CRUD.
What do you think are the minimal features a blogging platform needs to have to be "complete" and usable as a stand-alone system?
13 votes -
Things I Learnt The Hard Way (in 30 Years of Software Development)
5 votes -
Starbase - YOLOL Programming
4 votes -
From design patterns to Category theory
8 votes -
The Expression Problem and its solutions
4 votes -
Programming sucks
25 votes -
The hidden heroines of chaos
5 votes -
Choosing the right coding summer camp for your kid: nine questions to ask
3 votes -
Programming Challenge: Text compression
In an effort to make these weekly, I present a new programming challenge.
The challenge this week is to compress some text using a prefix code. Prefix codes associate each letter with a given bit string, such that no encoded bitstring is the prefix of any other. These bit strings are then concatenated into one long integer which is separated into bytes for ease of reading. These bytes can be represented as hex values as well. The provided prefix encoding is as follows:
char  value         char  value
' '   11            'e'   101
't'   1001          'o'   10001
'n'   10000         'a'   011
's'   0101          'i'   01001
'r'   01000         'h'   0011
'd'   00101         'l'   001001
'~'   001000        'u'   00011
'c'   000101        'f'   000100
'm'   000011        'p'   0000101
'g'   0000100       'w'   0000011
'b'   0000010       'y'   0000001
'v'   00000001      'j'   000000001
'k'   0000000001    'x'   00000000001
'q'   000000000001  'z'   000000000000

Challenge
Your program should accept a lowercase string (including the ~ character), and should output the formatted compressed bit string in binary and hex. Your final byte should be 0 padded so that it has 8 bits as required. For your convenience, here is the above table in a text file for easy read-in.
Example
Here is an example:
$> tildes ~comp
10010100 10010010 01011010 10111001 00000010 11000100 00110000 10100000
94 92 5A B9 02 C4 30 A0
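The example above can be reproduced with a short Python sketch. This is one possible approach rather than a reference solution: the table is typed in by hand instead of being read from the linked text file, and the names PREFIX_CODE, compress and to_hex are made up for illustration.

```python
# Prefix code from the table above, keyed by character.
PREFIX_CODE = {
    " ": "11", "e": "101", "t": "1001", "o": "10001", "n": "10000",
    "a": "011", "s": "0101", "i": "01001", "r": "01000", "h": "0011",
    "d": "00101", "l": "001001", "~": "001000", "u": "00011",
    "c": "000101", "f": "000100", "m": "000011", "p": "0000101",
    "g": "0000100", "w": "0000011", "b": "0000010", "y": "0000001",
    "v": "00000001", "j": "000000001", "k": "0000000001",
    "x": "00000000001", "q": "000000000001", "z": "000000000000",
}


def compress(text):
    """Encode text and return the result as a list of 8-bit strings."""
    bits = "".join(PREFIX_CODE[ch] for ch in text)
    bits += "0" * (-len(bits) % 8)  # zero-pad the final byte to 8 bits
    return [bits[i:i + 8] for i in range(0, len(bits), 8)]


def to_hex(byte_strings):
    """Render each 8-bit string as a two-digit uppercase hex value."""
    return [format(int(b, 2), "02X") for b in byte_strings]


encoded = compress("tildes ~comp")
print(" ".join(encoded))
print(" ".join(to_hex(encoded)))
# 10010100 10010010 01011010 10111001 00000010 11000100 00110000 10100000
# 94 92 5A B9 02 C4 30 A0
```

Characters outside the table raise a KeyError here; a fuller solution would probably want to validate the input first.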
Bonuses
1. Print the data compression ratio for a given compression, assuming the original input was encoded in 8 bit ASCII (one byte per character).
2. Output the ASCII string corresponding to the encoded byte string in addition to the above outputs.
   - @onyxleopard points out that many bytes won't actually be valid ASCII. Instead, do as they suggested and treat each byte as an ordinal value and print it as if encoded as UTF-8.
3. An input prefixed by 'D' should be interpreted as an already compressed string using this encoding, and should be decompressed (by inverting the above procedure).
Previous Challenges (I am aware of prior existing ones, but it is hard to collect them as they were irregular. Thus I list last week's challenge as 'Week 1')
Week 1
13 votes -
-
Why Precompiled Headers do (not) Improve C++ Compile Times
4 votes -
Programming Challenge: Dice Roller
It's been a while since we did one of these, which is a shame.
Create a program that takes an input of the type "d6 + 3" or "2d20 - 5" and returns a valid roll.
The result should display both the actual rolls as well as the final result. The program should accept any valid roll of the type 'xdx'
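As a starting point, the base challenge (without the bonuses) might look something like the Python sketch below. It is only one possible reading of the task: the accepted grammar, the ROLL_PATTERN regular expression and the roll function are assumptions made for illustration, and a missing dice count is treated as 1.

```python
import random
import re

# Optional count, 'd', number of sides, then an optional "+ n" or "- n" modifier.
ROLL_PATTERN = re.compile(r"^\s*(\d*)d(\d+)\s*(?:([+-])\s*(\d+))?\s*$")


def roll(expression, rng=random):
    """Return (individual rolls, final result) for input like 'd6 + 3'."""
    match = ROLL_PATTERN.match(expression)
    if match is None:
        raise ValueError(f"not a valid roll: {expression!r}")
    count = int(match.group(1) or 1)
    sides = int(match.group(2))
    if sides < 1:
        raise ValueError("a die needs at least one side")
    modifier = int(match.group(4) or 0)
    if match.group(3) == "-":
        modifier = -modifier
    rolls = [rng.randint(1, sides) for _ in range(count)]
    return rolls, sum(rolls) + modifier


rolls, total = roll("2d20 - 5")
print(rolls, total)
```

Passing a seeded random.Random instance as rng makes the rolls repeatable, which helps when testing.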
Bonuses:
- Multiplication "d6 * 3"
- Division "d12 / 6"
- Polish notation "4d6 * (5d4 - 3)"
As a side note, it would be really cool if weekly programming challenges became a thing
33 votes -
Falsehoods programmers believe about Unix time
7 votes -
For Better Computing, Liberate CPUs from Garbage Collection
15 votes -
The little printf
15 votes -
What is the most creative app or website you know of?
HELLO TILDES USERS. IT IS I, FELLOW HUMAN, BISHOP.
As you may have read in an earlier post of mine (ok probably not it was a one-off comment, not like I reinforced the thought anywhere.)
I do indeed hold the belief that code can be, itself, art, in the right context.
Or, rather, that code can be used for artistic purposes.
I dunno.
That's why I'm posting.
What would you say is the most artistic or, at least, creatively designed website or mobile app that you've seen?
I've got some creativity a-stewin' away in my head, and I need a new excuse to kill some time on frontend.
So, fellow humans, hit me with your best shot duh-nuh-nuh-nuh fire away.
What ya got?
(@mods fix my tags please. Not sure what to put, but you might have a good idea. Ya boy's had a few.)
18 votes -
How do you structure larger projects?
I'll be writing a relatively large piece of scientific code for the first time, and before I begin I would at least like to outline how the project will be structured so that I don't run into headaches later on. The problem is, I don't have much experience structuring large projects. Up until now most of the code I have written has been in the form of python scripts that I string together to form an ad-hoc pipeline for analysis, or else C++ programs that are relatively self-contained. My current project is much larger in scope. It will consist of four main 'modules' (I'm not sure if this is the correct term, apologies if not), each of which consists of a handful of .cpp and .h files. The schematic I have in mind for how it should look is something like:
src
├── Module1 (Initializer)
│   ├ file1.cpp
│   ├ file1.h
│   │ ...
│   └ Makefile
├── Module2 (solver)
│   ├ file1.cpp
│   ├ file1.h
│   │ ...
│   └ Makefile
├── Module3 (Distribute)
│   ├ file1.cpp
│   └ Makefile
└ Makefile
Basically, I build each self-contained 'module', and use the object files produced there to build my main program. Is there anything I should keep in mind here, or is this basically how such a project should be structured?
I imagine the particular structure will be dependent on my project, but I am more interested in general principles to keep in mind.
14 votes -
Modern SQL Window Function Questions
7 votes -
Ligatures in programming fonts: hell no
9 votes -
Rust is not a good C replacement
27 votes -
Swift 5 Released
12 votes -
Nine APIs for the geekiest of programmers
7 votes -
Do you enjoy programming outside of work?
I have found this to be a semi-controversial topic. It's almost becoming a required point for getting a new job to have open source work that you can show. Some people just enjoy working on programming side projects and others don't want to do any more after they leave the office.
What's your opinion on this? Do you work on any side projects? Do you think it's reasonable for interviewers to look for open source work when hiring?
16 votes -
Looking for assistance for professional or personal development? There is an opportunity to receive a coding scholarship through Lesbians Who Tech!
7 votes -
Coding Challenge - Design network communication protocol
It's time for another coding challenge!
This challenge isn't mine, it's this challenge (year 5, season 3, challenge 3) by ČVUT FIKS.
The task is to design a network communication protocol. You're sending a large amount of bits over the network. The problem is that the network is not perfect and the message sometimes arrives corrupted. Design a network protocol that will guarantee that the decoded message will be exactly the same as the message that was encoded.
MESSAGE => (encoding) => message corrupted => (decoding) => MESSAGE
Corruption
Transmitting the message might corrupt it and introduce errors. Each error in a message (there might be more than one error in a single message) will flip all following bits of the message.
Example:
011101 => 011|010 (| is the place where an error occurred).
There might be more than one error in a message, but there are some rules:
- Minimum distance between two errors in a single message is k
- Number of bits between two errors is always an odd number
According to these rules, describe a communication protocol, that will encode a message, and later decode message with errors.
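Before designing a protocol, it may help to simulate the corruption itself. The Python sketch below models only the channel described above, not a solution to the challenge. The function name corrupt and the convention that an error position is the index of the first flipped bit are assumptions made for illustration; a second error flips the remaining bits back, since each error flips everything that follows it.

```python
def corrupt(bits, error_positions):
    """Flip every bit from each error position onward.

    bits is a string of '0' and '1'; error_positions holds the indices at
    which an error occurs (the bit at that index is the first one flipped).
    """
    errors = set(error_positions)
    flipped = False
    out = []
    for index, bit in enumerate(bits):
        if index in errors:
            flipped = not flipped  # each error toggles the state of all later bits
        if flipped:
            bit = "1" if bit == "0" else "0"
        out.append(bit)
    return "".join(out)


print(corrupt("011101", [3]))     # 011010, matching the example above
print(corrupt("011101", [1, 4]))  # 000001, with three bits between the two errors
```

The sketch does not enforce the minimum distance k or the odd-spacing rule; a test harness for a real protocol would need to generate error positions that respect both.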
Bonus
- Guarantee your protocol will always work, even when errors are as common as possible
- Try to make the protocol as short as possible.
8 votes -
-
Programming Challenge: Build an Interpreter
Hello everyone! It has been a while since the last programming challenge, so it's time for another one!
This week's goal would be to build your own interpreter.
An interpreter is a program that receives input and executes it. For example, Python is an interpreted language, meaning you are actually writing instructions for the interpreter, which does the magic.
Probably the easiest interpreter to write is a Brainfuck interpreter. If someone here doesn't know, Brainfuck is a programming language which contains the following instructions: ,.<>[]-+. Other characters are ignored. It has memory in the form of an array of integers. At the start, the pointer that points to one specific memory cell points to cell 0. We can use < to move the pointer to the left (decrement) and > to move the pointer to the right (increment). . can be used to print the value of the cell the pointer is currently pointing to (ASCII). , can be used to read one character from stdin and write it to memory. [ is the beginning of a loop and ] is the end of a loop. Loops can be nested. A loop is terminated when we reach the ] character and the current value in memory is equal to 0. - can be used to decrement the value in memory by 1 and + can be used to increment the value in memory by 1. Here's Hello World:
++++++++++[>+++++++>++++++++++>+++>+<<<< -]>++.>+.+++++++..+++.>++.<<++++++++++++ +++.>.+++.------.--------.>+.>.
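To give a feel for the size of the task, here is a minimal Brainfuck interpreter sketched in Python. It is one possible implementation, not a reference one, and several details are assumptions the challenge text leaves open: the tape has 30000 cells, cells wrap around at 256, reading past the end of input stores 0, and a loop is skipped entirely if the current cell is already 0 when [ is reached (the usual convention). The function name run_brainfuck is made up for illustration.

```python
def run_brainfuck(program, input_text=""):
    """Run a Brainfuck program and return everything it prints as a string."""
    # Match up brackets first so loops can jump directly to their partner.
    jumps, stack = {}, []
    for pos, op in enumerate(program):
        if op == "[":
            stack.append(pos)
        elif op == "]":
            if not stack:
                raise ValueError("unmatched ]")
            start = stack.pop()
            jumps[start], jumps[pos] = pos, start
    if stack:
        raise ValueError("unmatched [")

    tape = [0] * 30000          # assumed tape size
    pointer = pc = 0
    inputs = iter(input_text)
    output = []
    while pc < len(program):
        op = program[pc]
        if op == ">":
            pointer += 1
        elif op == "<":
            pointer -= 1
        elif op == "+":
            tape[pointer] = (tape[pointer] + 1) % 256
        elif op == "-":
            tape[pointer] = (tape[pointer] - 1) % 256
        elif op == ".":
            output.append(chr(tape[pointer]))
        elif op == ",":
            tape[pointer] = ord(next(inputs, "\0")) % 256
        elif op == "[" and tape[pointer] == 0:
            pc = jumps[pc]      # skip the loop body
        elif op == "]" and tape[pointer] != 0:
            pc = jumps[pc]      # go back to the start of the loop
        pc += 1                 # any other character is ignored
    return "".join(output)


HELLO = ("++++++++++[>+++++++>++++++++++>+++>+<<<< -]>++.>+.+++++++..+++.>++."
         "<<++++++++++++ +++.>.+++.------.--------.>+.>.")
print(run_brainfuck(HELLO), end="")  # prints "Hello World!" followed by a newline
```

Collecting the output in a list and returning it, rather than printing as it goes, is a design choice that keeps the function easy to test.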
People with nothing to do today can attempt to make an interpreter for the Taxi programming language.
You can even make your own language! There are no limits for this challenge.
23 votes -
Super Mario Bros. 3 - Extended 1up Sound | Retro Game Mechanics Explained
7 votes -
The culture war at the heart of open source
14 votes -
Announcing my first business card size C++ game: Tiny Ski
14 votes -
Conceptualizing Data: Simplifying the way we think about complex data structures.
Preface
Conceptual models in programming are essential for being able to reason about problems. We see this through code all the time, with implementation details hidden away behind abstractions like functions and objects so that we can ignore the cumbersome details and focus only on the details that matter. Without these abstractions and conceptual models, we might find ourselves overwhelmed by the size and complexity of the problem we’re facing. Of these conceptual models, one of the most easily neglected is that of data and object structure.
Data Types Galore
Possibly one of the most overwhelming aspects of conceptualizing data and object structure is the sheer breadth of data types available. Depending on the programming language you’re working with, you may find that you have several dozen object classes already defined as part of the language’s core; primitives like booleans, ints, unsigned ints, floats, doubles, longs, strings, chars, and possibly others; arrays that can contain any of the objects or primitives, and even other arrays; and several other data structures like queues, vectors, and mixed-type collections, among others.
With so many types of data, it’s incredibly easy to lose track in a sea of type declarations and find yourself confused and unsure of where to go.
Tree’s Company
Let’s start by trying to make these data types a little less overwhelming. Rather than thinking strictly of types, let’s classify them. We can group all data types into one of three basic classifications:
- Objects, which contain key/value pairs. For example, an object property that stores a string.
- Arrays, which contain some arbitrary number of values.
- Primitives, which contain nothing. They’re simply a “flat” data value.
We can also make a couple of additional notes. First, arrays and objects are very similar; both contain references to internal data, but the way that data is referenced differs. In particular, objects have named keys while arrays have numeric, zero-indexed keys. In a sense, arrays are a special case of objects where the keys are more strictly typed. From this, we can condense the classifications of objects and arrays into the more general “container” classification.
With that in mind, we now have the following classifications:
- Containers.
- Primitives.
We can now generally state that containers may contain other containers and primitives, and primitives may not contain anything. In other words, all data structures are a composition of containers and/or primitives, where containers may accept containers and/or primitives and primitives may not accept anything. More experienced programmers should notice something very familiar about this description--we’re basically describing a tree structure! Primitive types and empty containers act as the leaves in a tree, whereas objects and arrays act as the nodes.
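To make the tree analogy concrete, here is a small Python sketch that treats dicts, lists and tuples as containers and everything else as a primitive. The helper names (children, count_leaves, depth) and the sample account are hypothetical and exist only to illustrate the idea.

```python
def children(value):
    """Return the values held by a container, or an empty list for a primitive."""
    if isinstance(value, dict):
        return list(value.values())
    if isinstance(value, (list, tuple)):
        return list(value)
    return []  # primitives contain nothing


def count_leaves(value):
    """Count leaves: primitives and empty containers."""
    kids = children(value)
    if not kids:
        return 1
    return sum(count_leaves(kid) for kid in kids)


def depth(value):
    """Number of levels from this node down to its deepest leaf."""
    return 1 + max((depth(kid) for kid in children(value)), default=0)


account = {
    "username": "alice",
    "payment_methods": [{"number": "4111", "cvv": None}, []],
}
print(count_leaves(account))  # 4: "alice", "4111", None and the empty list
print(depth(account))         # 4: account -> list -> dict -> "4111"
```

Note that the code never needs to know whether a node is an object or an array beyond how to list its children, which is the point of folding both into a single "container" classification.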
Trees Help You Breathe
Okay, great. So what’s the big deal, anyway? We’ve now traded a bunch of concrete data types that we can actually think about and abstracted them away into this nebulous mess of containers and primitives. What do we get out of this?
A common mistake many programmers make is planning their data types out from the very beginning. Rather than planning out an abstraction for their data and object architecture, it’s easy to immediately find yourself focusing too much on the concrete implementation details.
Imagine, for example, modeling a user account for an online payment system. A common feature to include is the ability to store payment information for auto-pay, and payment methods typically take the form of some combination of credit/debit cards and bank accounts. If we focus on implementation details from the beginning, then we may find ourselves with something like this in a first iteration:
    UserAccount: {
        username: String,
        password: String,
        payment_methods: PaymentMethod[]
    }

    PaymentMethod: {
        account_name: String,
        account_type: Enum,
        account_holder: String,
        number: String,
        routing_number: String?,
        cvv: String?,
        expiration_date: DateString?
    }
We then find ourselves realizing that PaymentMethod is an unnecessary mess of optional values and needing to refactor it. Odds are we would break it off immediately into separate account types and make a note that they both implement some interface. We may also find that, as a result, remodeling the PaymentMethod could result in the need to remodel the UserAccount. For more deeply nested data structures, a single change deeper within the structure could result in those changes cascading all the way to the top-level object. If we have multiple objects, then these changes could propagate to them as well. And what if we decide a type needs to be changed, like deciding that our expiration date needs to be some sort of date object? Or what if we decide that we want to modify our property names? We’re then stuck having to update these definitions as we go along. What if we decide that we don't want an interface for different payment method types after all and instead want separate collections for each type? Then including the interface consideration will have proven to be a waste of time. The end result is that before we’ve even touched a single line of code, we’ve already found ourselves stuck with a bunch of technical debt, and we’re only in our initial planning stages!
To alleviate these kinds of problems, it’s far better to just ignore the implementation details. By doing so, we may find ourselves with something like this:
    UserAccount: {
        Username,
        Password,
        PaymentMethods
    }

    PaymentMethods: // TODO: Decide on this container’s structure.

    CardAccount: {
        AccountName,
        CardHolder,
        CardNumber,
        CVV,
        ExpirationDate,
        CardType
    }

    BankAccount: {
        AccountName,
        AccountNumber,
        RoutingNumber,
        AccountType
    }
A few important notes about what we’ve just done here:
- We don’t specify any concrete data types.
- All fields within our models have the capacity to be either containers or primitives.
- We’re able to defer a model’s structural definition without affecting the pace of our planning.
- Any changes to a particular field type will automatically propagate in our structural definitions, making it trivial to create a definition like ExpirationDate: String and later change it to ExpirationDate: DateObject.
- The amount of information we need to think about is reduced down to the very bare minimum.
- By deferring the definition of the PaymentMethods structure, we find ourselves more inclined to focus on the more concrete payment method definitions from the very beginning, rather than trying to force them to be compatible through an interface.
- We focused only on data representation, ensuring that representation and implementation are both separate and can be handled differently if needed.
SOLIDifying Our Conceptual Model
In object-oriented programming (OOP), there’s a generally recommended set of principles to follow, represented by the acronym “SOLID”:
- Single responsibility.
- Open/closed.
- Liskov substitution.
- Interface segregation.
- Dependency inversion.
These “SOLID” principles were defined to help resolve common, recurring design problems and anti-patterns in OOP.
Of particular note for us is the last one, the “dependency inversion” principle. The idea behind this principle is that implementation details should depend on abstractions, not the other way around. Our new conceptual model obeys the dependency inversion principle by prioritizing a focus on abstractions while leaving implementation details to the future class definitions that are based on our abstractions. By doing so, we limit the elements involved in our planning and problem-solving stages to only what is necessary.
Final Thoughts
The consequences of such a conceptual model extend well beyond simply planning out data and object structures. For example, if implemented as an actual programming or language construct, you could make the parsing of your data fairly simple. By implementing an object parser that performs reflection on some passed object, you can extract all of the publicly accessible object properties of the target object and the data contained therein. Thus, if your language doesn’t have a built-in JSON encoding function and no library yet exists, you could recursively traverse your data structure to generate the appropriate JSON with very little effort.
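As a rough illustration of the reflective traversal described above, here is a minimal Python sketch. The `to_json` helper and the two small classes are hypothetical names made up for this example, and the encoder is deliberately simplified: it treats public instance attributes as the model's fields, and it does not escape control characters or handle special floats such as NaN.

```python
def to_json(value):
    """Recursively encode a value as a JSON string using reflection."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int because bool subclasses int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(to_json(item) for item in value) + "]"
    if isinstance(value, dict):
        fields = value
    else:
        # Reflection step: public instance attributes become the object's fields.
        fields = {k: v for k, v in vars(value).items() if not k.startswith("_")}
    body = ", ".join(f"{to_json(str(k))}: {to_json(v)}" for k, v in fields.items())
    return "{" + body + "}"


# Hypothetical models, loosely following the conceptual model above.
class CardAccount:
    def __init__(self, account_name, card_holder):
        self.AccountName = account_name
        self.CardHolder = card_holder


class UserAccount:
    def __init__(self, username, payment_methods):
        self.Username = username
        self.PaymentMethods = payment_methods


account = UserAccount("alice", [CardAccount("Main", "Alice A")])
print(to_json(account))
```

Because every container is handled by the same recursive call, adding a new model class requires no changes to the encoder, which is the tree-like property the paragraph above relies on.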
Many of the most fundamental programming concepts, like data structures ultimately being nothing more than trees at their most abstract representation, are things we tend to take for granted and think very little about. By making ourselves conscious of these fundamental concepts, however, we can more effectively take advantage of them.
Additionally, successful programmers typically solve a programming problem before they’ve ever written a single line of code. Whether or not they’re conscious of it, the tools they use to solve these problems effectively consist largely of the myriad conceptual models they’ve collected and developed over time, and the experience they’ve accumulated to determine which conceptual models need to be utilized to solve a particular problem.
Even when you have a solid grasp of your programming fundamentals, you should always revisit them every now and then. Sometimes there are details that you may have missed or just couldn’t fully appreciate when you learned about them. This is something that I’m continually reminded of as I continue on in my own career growth, and I hope that I can continue passing these lessons on to others.
As always, I'm absolutely open to feedback and questions!
15 votes -
Neural Networks, Types and Functional Programming
4 votes -
Steve Klabnik - Learning Ada
6 votes -
Better x86 Assembly Generation with Go
4 votes -
Rust: undefined behaviour in numeric conversions
6 votes -
The RedMonk Programming Language Rankings: January 2019
4 votes -
How designers engineer luck into video games
9 votes -
Coding for the Parallel Sega Saturn DSP
5 votes -
Programming Challenge - Find path from city A to city B with least traffic controls inbetween.
Hi, it's been a very long time since the last Programming Challenge, and I'd like to revive the tradition.
The point of a programming challenge is to come up with your own solution and, if you're bored, even to program it in your favourite programming language. Today's challenge isn't mine. It was created by ČVUT FIKS (year 5, season 2, challenge #4).
You need to transport plans for your quantum computer through Totalitatia. The problem is that Totalitatia's government would love to have the plans, and they know you're going to transport the computer through the country. You'll receive a number N, which denotes the number of cities on the map. Then you'll get M paths, each going from one city to another. Each path has k traffic controls. They're not very effective, but the fewer of them you have to pass, the better. Find a path from city A to city B such that the maximum number of traffic controls between any two consecutive cities on it is minimal. City A is always the first one (0) and city B is always the last one (N-1).
Input format:
N
M
A1 B1 K1
A2 B2 K2
...
The first two lines contain the numbers N (number of cities) and M (number of paths). Then, on each of the next M lines, you'll get the definition of a path. A definition looks like 1 2 6, where 1 is the id of the first city and 2 is the id of the second city (delimited by a space). You can go from city 1 to city 2, or from city 2 to city 1. The third number (6) is the number of traffic controls.
Output format:
A single number denoting the maximum number of traffic controls encountered on any one path (road) along your route.
Hint: This means that a route via roads with 4 4 4 traffic controls is better than a route via roads with 1 5 1 traffic controls. The first example would have output 4, the second one would have output 5.
Example:
IN:
4
5
0 1 3
0 2 2
1 2 1
1 3 4
2 3 5
OUT:
4
Solution: The optimal path is either 0 2 1 3 or 0 1 3.
Bonus
- Describe the time complexity of your algorithm.
- If multiple optimal paths exist, find the shortest one.
- Does your algorithm work without changing the core logic if the source city and the target city are not known beforehand (they change on each input)?
- Do you use a special collection to speed up the minimum value search?
Hints
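One possible approach, sketched here in Python, is a variant of Dijkstra's algorithm in which the cost of a route is the maximum road value along it instead of the sum. This is only one way to solve the challenge, not the official solution. The names `min_bottleneck` and `parse_input` are made up for this sketch, and the parser assumes the input is whitespace-separated as in the example above.

```python
import heapq


def min_bottleneck(n, roads, source=0, target=None):
    """Return the smallest achievable 'worst road' value from source to target.

    roads is a list of (city_a, city_b, controls) tuples; roads are two-way.
    Returns None if the target cannot be reached.
    """
    if target is None:
        target = n - 1
    graph = [[] for _ in range(n)]
    for a, b, k in roads:
        graph[a].append((b, k))
        graph[b].append((a, k))

    best = [float("inf")] * n  # best known bottleneck value per city
    best[source] = 0
    heap = [(0, source)]       # binary heap keyed on the bottleneck value
    while heap:
        worst, city = heapq.heappop(heap)
        if city == target:
            return worst
        if worst > best[city]:
            continue  # stale heap entry, a better route was already found
        for neighbour, k in graph[city]:
            candidate = max(worst, k)  # route cost is the maximum, not the sum
            if candidate < best[neighbour]:
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return None


def parse_input(text):
    """Parse 'N M A1 B1 K1 ...' given as whitespace-separated integers."""
    tokens = [int(t) for t in text.split()]
    n, m = tokens[0], tokens[1]
    roads = [tuple(tokens[2 + 3 * i: 5 + 3 * i]) for i in range(m)]
    return n, roads


n, roads = parse_input("4 5  0 1 3  0 2 2  1 2 1  1 3 4  2 3 5")
print(min_bottleneck(n, roads))
```

With a binary heap and lazy deletion of stale entries, this runs in roughly O(M log M) time. The source and target are plain parameters, so changing them does not touch the core loop.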
13 votes -
Programming Challenge: Anagram checking.
It's been over a week since the last programming challenge and the previous one was a bit more difficult, so let's do something easier and more accessible to newer programmers in particular. Write a function that takes two strings as input and returns true if they're anagrams of each other, or false if they're not.
Extra credit tasks:
- Don't consider the strings anagrams if they're the same barring punctuation.
- Write an efficient implementation (in terms of time and/or space complexity).
- Minimize your use of built-in functions and methods to bare essentials.
- Write the worst--but still working--implementation that you can conceive of.
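For reference, a minimal Python sketch of one possible answer follows. It covers the first two extra credit tasks under assumptions the challenge does not spell out: comparison is case-insensitive, and only letters and digits count. The names `normalize` and `is_anagram` are made up for this example.

```python
from collections import Counter


def normalize(text):
    """Lowercase the text and keep only letters and digits (an assumed rule)."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def is_anagram(first, second):
    """Return True if the two strings are anagrams of each other."""
    a, b = normalize(first), normalize(second)
    if a == b:
        # Same text barring punctuation and case: not counted as an anagram.
        return False
    # Counting characters takes linear time, versus O(n log n) for sorting.
    return Counter(a) == Counter(b)


print(is_anagram("listen", "silent"))
print(is_anagram("hello!", "hello"))
```

Comparing character counts keeps the running time linear in the length of the inputs. Sorting both strings and comparing them is a simpler alternative at a slightly higher cost.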
24 votes -
Code Quality Tip: Cyclomatic complexity in depth.
Preface
Recently I briefly touched on the subject of cyclomatic complexity. This is an important concept for any programmer to understand and think about as they write their code. In order to provide a more solid understanding of the subject, however, I feel that I need to address the topic more thoroughly with a more practical example.
What is cyclomatic complexity?
The concept of "cyclomatic complexity" is simple: the more conditional branching and looping in your code, the more complex--and therefore the more difficult to maintain--that code is. We can visualize this complexity by drawing a diagram that illustrates the flow of logic in our program. For example, let's take the following toy example of a user login attempt:
<?php
$login_data = getLoginCredentialsFromInput();

$login_succeeded = false;
$error = '';

if(usernameExists($login_data['username'])) {
    $user = getUser($login_data['username']);

    if(!isDeleted($user)) {
        if(!isBanned($user)) {
            if(!loginRateLimitReached($user)) {
                if(passwordMatches($user, $login_data['password'])) {
                    loginUser($user);
                    $login_succeeded = true;
                } else {
                    $error = getBadPasswordError();
                    logBadLoginAttempt();
                }
            } else {
                $error = getLoginRateLimitError($user);
            }
        } else {
            $error = getUserBannedError($user);
        }
    } else {
        $error = getUserDeletedError($user);
    }
} else {
    $error = getBadUsernameError($login_data['username']);
}

if($login_succeeded) {
    sendSuccessResponse();
} else {
    sendErrorResponse($error);
}
?>
A diagram for this logic might look something like this:
+-----------------+
|                 |
|  Program Start  |
|                 |
+--------+--------+
         |
         |
         v
+--------+--------+    +-----------------+
|                 |    |                 |
|    Username     +--->+    Set Error    +--+
|     Exists?     | No |                 |  |
|                 |    +-----------------+  |
+--------+--------+                         |
         |                                  |
     Yes |                                  |
         v                                  |
+--------+--------+    +-----------------+  |
|                 |    |                 |  |
|  User Deleted?  +--->+    Set Error    +->+
|                 | Yes|                 |  |
+--------+--------+    +-----------------+  |
         |                                  |
      No |                                  |
         v                                  |
+--------+--------+    +-----------------+  |
|                 |    |                 |  |
|  User Banned?   +--->+    Set Error    +->+
|                 | Yes|                 |  |
+--------+--------+    +-----------------+  |
         |                                  |
      No |                                  |
         v                                  |
+--------+--------+    +-----------------+  |
|                 |    |                 |  |
|   Login Rate    +--->+    Set Error    +->+
| Limit Reached?  | Yes|                 |  |
|                 |    +-----------------+  |
+--------+--------+                         |
         |                                  |
      No |                                  |
         v                                  |
+--------+--------+    +-----------------+  |
|                 |    |                 |  |
|Password Matches?+--->+    Set Error    +->+
|                 | No |                 |  |
+--------+--------+    +-----------------+  |
         |                                  |
     Yes |                                  |
         v                                  |
+--------+--------+    +----------+         |
|                 |    |          |         |
|   Login User    +--->+ Converge +<--------+
|                 |    |          |
+-----------------+    +---+------+
                           |
                           |
         +-----------------+
         |
         v
+--------+--------+
|                 |
|   Succeeded?    +-------------+
|                 | No          |
+--------+--------+             |
         |                      |
     Yes |                      |
         v                      v
+--------+--------+    +--------+--------+
|                 |    |                 |
|  Send Success   |    |   Send Error    |
|     Message     |    |     Message     |
|                 |    |                 |
+-----------------+    +-----------------+
It's important to note that between nodes in this directed graph, you can find certain enclosed regions being formed. Specifically, each conditional branch that converges back into the main line of execution generates an additional region. The number of these distinct enclosed regions is directly proportional to the level of cyclomatic complexity of the system--that is, more regions means more complicated code.
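For readers who want a number rather than a picture, one common formalization is McCabe's formula M = E - N + 2P, where E is the number of edges, N the number of nodes, and P the number of connected components of the control-flow graph. For a connected planar graph with a single entry and a single exit, this works out to the number of enclosed regions plus one. The Python sketch below applies the formula to graphs written as edge lists. The function name and the tiny example graphs are made up for illustration and are not taken from the diagram above.

```python
def cyclomatic_complexity(edges, components=1):
    """McCabe's metric M = E - N + 2P for a graph given as (from, to) pairs."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


# Straight-line code: no branches, no enclosed regions.
straight = [("start", "work"), ("work", "end")]

# A single if/else whose branches converge again: one enclosed region.
branch = [
    ("start", "check"),
    ("check", "then"),
    ("check", "else"),
    ("then", "end"),
    ("else", "end"),
]

# A single loop: the back edge also encloses one region.
loop = [("start", "cond"), ("cond", "body"), ("body", "cond"), ("cond", "end")]

print(cyclomatic_complexity(straight))
print(cyclomatic_complexity(branch))
print(cyclomatic_complexity(loop))
```

Each additional converging branch or loop adds one edge more than it adds nodes, which is why the count grows with the number of enclosed regions.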
Clocking out early.
There's an important piece of information I noted when describing the above example:
. . . each conditional branch that converges back into the main line of execution generates an additional region.
The above example is made complex largely due to an attempt to create a single exit point at the end of the program logic, causing these conditional branches to converge and thus generate the additional enclosed regions within our diagram.
But what if we stopped trying to converge back into the main line of execution? What if, instead, we decided to interrupt the program execution as soon as we encountered an error? Our code might look something like this:
<?php
$login_data = getLoginCredentialsFromInput();

if(!usernameExists($login_data['username'])) {
    sendErrorResponse(getBadUsernameError($login_data['username']));
    return;
}

$user = getUser($login_data['username']);

if(isDeleted($user)) {
    sendErrorResponse(getUserDeletedError($user));
    return;
}

if(isBanned($user)) {
    sendErrorResponse(getUserBannedError($user));
    return;
}

if(loginRateLimitReached($user)) {
    logBadLoginAttempt($user);
    sendErrorResponse(getLoginRateLimitError($user));
    return;
}

if(!passwordMatches($user, $login_data['password'])) {
    logBadLoginAttempt($user);
    sendErrorResponse(getBadPasswordError());
    return;
}

loginUser($user);
sendSuccessResponse();
?>
Before we've even constructed a diagram for this logic, we can already see just how much simpler this logic is. We don't need to traverse a tree of if statements to determine which error message has priority to be sent out, we don't need to attempt to follow indentation levels, and our behavior on success is right at the very end and at the lowest level of indentation, where it's easily and obviously located at a glance.
Now, however, let's verify this reduction in complexity by examining the associated diagram:
+-----------------+
|                 |
|  Program Start  |
|                 |
+--------+--------+
         |
         |
         v
+--------+--------+    +-----------------+
|                 |    |                 |
|    Username     +--->+   Send Error    |
|     Exists?     | No |     Message     |
|                 |    |                 |
+--------+--------+    +-----------------+
         |
     Yes |
         v
+--------+--------+    +-----------------+
|                 |    |                 |
|  User Deleted?  +--->+   Send Error    |
|                 | Yes|     Message     |
+--------+--------+    |                 |
         |             +-----------------+
      No |
         v
+--------+--------+    +-----------------+
|                 |    |                 |
|  User Banned?   +--->+   Send Error    |
|                 | Yes|     Message     |
+--------+--------+    |                 |
         |             +-----------------+
      No |
         v
+--------+--------+    +-----------------+
|                 |    |                 |
|   Login Rate    +--->+   Send Error    |
| Limit Reached?  | Yes|     Message     |
|                 |    |                 |
+--------+--------+    +-----------------+
         |
      No |
         v
+--------+--------+    +-----------------+
|                 |    |                 |
|Password Matches?+--->+   Send Error    |
|                 | No |     Message     |
+--------+--------+    |                 |
         |             +-----------------+
     Yes |
         v
+--------+--------+
|                 |
|   Login User    |
|                 |
+--------+--------+
         |
         |
         v
+--------+--------+
|                 |
|  Send Success   |
|     Message     |
|                 |
+-----------------+
Something should immediately stand out here: there are no enclosed regions in this diagram! Furthermore, even our new diagram is much simpler to follow than the old one was.
Reality is rarely simple.
The above is a really forgiving example. It has no loops, and loops are going to create enclosed regions that can't be broken apart so easily; it has no conditional branches that are so tightly coupled with the main path of execution that they can't be broken up; and the scope of functionality and side effects are minimal. Sometimes you can't break those regions up. So what do we do when we inevitably encounter these cases?
High cyclomatic complexity in your program as a whole is inevitable for sufficiently large projects, especially in a production environment, and your efforts to reduce it can only go so far. In fact, I don't recommend trying to remove all or even most instances of cyclomatic complexity at all--instead, you should just be keeping the concept in mind to determine whether or not a function, method, class, module, or other component of your system is accumulating technical debt and therefore in need of refactoring.
At this point, astute readers might ask, "How does refactoring help if the cyclomatic complexity doesn't actually go away?", and this is a valid concern. The answer to that is simple, however: we're hiding complexity behind abstractions.
To test this, let's forget about cyclomatic complexity for a moment and instead focus on simplifying the refactored version of our toy example using abstraction:
<?php
function handleLoginAttempt($login_data) {
    if(!usernameExists($login_data['username'])) {
        sendErrorResponse(getBadUsernameError($login_data['username']));
        return;
    }

    $user = getUser($login_data['username']);

    if(isDeleted($user)) {
        sendErrorResponse(getUserDeletedError($user));
        return;
    }

    if(isBanned($user)) {
        sendErrorResponse(getUserBannedError($user));
        return;
    }

    if(loginRateLimitReached($user)) {
        logBadLoginAttempt($user);
        sendErrorResponse(getLoginRateLimitError($user));
        return;
    }

    if(!passwordMatches($user, $login_data['password'])) {
        logBadLoginAttempt($user);
        sendErrorResponse(getBadPasswordError());
        return;
    }

    loginUser($user);
    sendSuccessResponse();
}

$login_data = getLoginCredentialsFromInput();

handleLoginAttempt($login_data);
?>
The code above is functionally identical to our refactored example from earlier, but has an additional abstraction via a function. Now we can diagram this higher-level abstraction as follows:
+-----------------+
|                 |
|  Program Start  |
|                 |
+--------+--------+
         |
         |
         v
+--------+--------+
|                 |
|  Attempt Login  |
|                 |
+-----------------+
This is, of course, a pretty extreme example, but this is how we handle thinking about complex program logic. We abstract it down to the barest basics so that we can visualize, in its simplest form, what the program is supposed to do. We don't actually care about the implementation unless we're digging into that specific part of the system, because otherwise we would be so bogged down by the details that we wouldn't be able to reason about what our program is supposed to do.
Likewise, we can use these abstractions to hide away the cyclomatic complexity underlying different components of our software. This keeps everything clean and clutter-free in our head. And the more we do to keep our smaller components simple and easy to think about, the easier the larger components are to deal with, no matter how much cyclomatic complexity all of those components share as a collective.
Final Thoughts
Cyclomatic complexity isn't a bad thing to have in your code. The concept itself is only intended to be used as one of many tools to assess when your code is accumulating too much technical debt. It's a warning sign that you may need to change something, nothing more. But it's an incredibly useful tool to have available to you and you should get comfortable using it.
As a general rule of thumb, you can usually just take a glance at your code and assess whether or not there's too much cyclomatic complexity in a component by looking for either of the following:
- Too many loops and/or conditional statements nested within each other, i.e. you have a lot of indentation.
- Many loops in the same function/method.
It's not a perfect rule of thumb, but it's useful for at least 90% of your development needs, and there will inevitably be cases where you will prefer to accept some greater cyclomatic complexity because there is some benefit that makes it a better trade-off. Making that judgment is up to you as a developer.
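The two warning signs listed above can also be checked mechanically. The sketch below uses Python's standard `ast` module on Python source, purely as an illustration, since the examples in this post are PHP. The helper names are made up, and what counts as "too deep" or "too many" is left to the reader. One known quirk: Python represents `elif` as an `if` nested inside the `else` branch, so long `elif` chains are reported as deeper than they look.

```python
import ast

BRANCH_NODES = (ast.If, ast.For, ast.While)
LOOP_NODES = (ast.For, ast.While)


def max_nesting(node, depth=0):
    """Return the deepest level of nested if/for/while below the given node."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, BRANCH_NODES) else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


def count_loops(node):
    """Count every for/while loop anywhere below the given node."""
    return sum(isinstance(n, LOOP_NODES) for n in ast.walk(node))


source = """
for x in xs:
    if x:
        while x:
            x -= 1
for y in ys:
    pass
"""

tree = ast.parse(source)
print(max_nesting(tree), count_loops(tree))
```

A check like this is only a prompt to take a closer look at a function. It does not replace the judgment call described above.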
As always, I'm more than willing to listen to feedback and answer any questions!
25 votes