- ~comp

[13]

anahata

December 1, 2019

Link

This is the kind of absolutionist programming guidance that seems to come from the programmer (autistic?) personality type of trying to simplify a complex thing (here, an entire technical field)...

This is the kind of absolutionist programming guidance that seems to come from the programmer (autistic?) personality type of trying to simplify a complex thing (here, an entire technical field) into simple, easy to follow rules. It's not that simple. I really don't see the value in making the developer read some other function / method, losing the mental context they've established in the middle of the code they're already reading, because some wotsit is an exceptional case. These suggestions don't make sense to me and I can't work out how they make code better.

A lot of this post seems to come from the perspective of someone working with extremely junior developers (for whom absolutist rules like this have a small amount of currency) and given that he works at Uber as an engineering manager, it seems like that may be the case.

15 votes

[8]
NaraVara
December 1, 2019 (edited December 1, 2019)
Link Parent
Maybe this is because I’m completely self-taught and only ever worked on solo projects where my managers hadn’t the foggiest notion about what kind of witchcraft I was up to. But am I the only one...

Maybe this is because I’m completely self-taught and only ever worked on solo projects where my managers hadn’t the foggiest notion about what kind of witchcraft I was up to.

But am I the only one who learned to comment my code such that once you strip the code out it should read like a technical manual explaining exactly what each thing is doing? It is often useful for me to look back on something I did forever ago and remember what I was thinking each step of the way since I was often figuring it out as I went along. That certainly is an invitation to refactor, but it also just seems like good hygiene for documentation.

Especially once you start including references to other things, I don’t see how you get away with not having legible comments in the code explaining what all you’re pointing to. I noticed a similar thing in a question posed to some John Siracusa podcast where someone asked him whether comments are a “code smell” and Siracusa was completely baffled, as was I. LACK of comments, to me, implies a rushed job by someone who doesn’t care about long-term maintenance. That’s the real code smell isn’t it? Maybe this comes from my background in social science research where you bring the mentality of a lab notebook to your code.

9 votes
1. [5]
  skybrian
  December 1, 2019
  Link Parent
  The way I approach this is to think about the expected reader. For example, if you're writing a programming tutorial for someone who is learning a programming language, you might comment on...
  
  The way I approach this is to think about the expected reader. For example, if you're writing a programming tutorial for someone who is learning a programming language, you might comment on interesting language constructs or programming idioms that they're probably not familiar with.
  
  That gets seen as clutter by people who are fluent in the programming language. We tend to see the comments as supplementing the code, not duplicating it. But comments giving context about why the code is the way it is, giving references to algorithms, and so on, are always welcome. In particular, you want to comment on things that look like they might just be a mistake, but were done that way for a reason. Otherwise, someone might try to fix the "mistake" and break something.
  
  The same is true of domain knowledge. A domain expert might consider certain things obvious, but if you assume that the reader knows the programming language but nothing about the domain, then you'd want to explain them. This helps programmers that are new to the project get started.
  
  I'm not sure the idea of reading the comments without the code makes much sense to the profressional programmer, since we are usually reading the comments in-context and we want to avoid duplication. But this doesn't apply to API comments like JavaDoc or Python's docstrings, where often the idea is that the reader of the API documentation for a function shouldn't have to look at how the function is implemented.
  
  11 votes
  1. [4]
    9000
    December 2, 2019
    Link Parent
    I think it also depends on the kind of churn you expect your project to have. Because, like you said, domain experts don't need comments. So, if you're working with the same code day in and day...
    
    I think it also depends on the kind of churn you expect your project to have. Because, like you said, domain experts don't need comments. So, if you're working with the same code day in and day out such that you become a domain expert, you might find all comments to be unnecessary. But, even another competent programmer (or, your future self) might not have the context that makes it clear. Sure, you can have variable and function names that are a sentence long (especially if you use an IDE), but sometimes it's just more clear and ergonomic to use shorter names and have a comment that explains their relationship. Or, keep more lines of code in the same function and leave an inline comment to group them without as much boilerplate as a new function (and thus, without forcing the reader to CTRL-f back and forth to understand what's going on).
    
    For instance, I don't like the first example. I'd rather just leave the context in the function. Otherwise, if I wanted to change something, I have to search through a deep callstack to understand what's going on. Why create a function if you're not planning on reusing the code? If it's just to name what's going on, that seems wasteful of line space and mental real estate.
    
    I've seen codebases where there were three login*() functions in a class, one calling the next, before the logic was implemented in a different class! To discover that, I found the first login call, and had to keep searching through it's callstack, understanding a new context each time, until I found the implemented logic. Self-documenting code is nice, but it's a tool, not a Platonic form.
    
    6 votes
    
    [3]
    teaearlgraycold
    December 2, 2019
    Link Parent
    At work we still have all of the original engineers that built our major applications. The team is fairly small and people don't context-switch much. Probably 85% of my time is spent in just one...
    
    At work we still have all of the original engineers that built our major applications. The team is fairly small and people don't context-switch much. Probably 85% of my time is spent in just one of our projects.
    
    With one exception our codebases are very light on comments. So far it's been fine, but probably by next year that won't hold up anymore. The CEO's goal is to triple the team size in the next year. That kind of growth is necessary from the business side of things but will be a large test on our team.
    
    2 votes
    
    [2]
    skybrian
    December 2, 2019
    Link Parent
    When you hire someone, maybe have them write the getting started document they wish they had? (With review by people who have been there.)
    
    When you hire someone, maybe have them write the getting started document they wish they had? (With review by people who have been there.)
    
    5 votes
    
    teaearlgraycold
    December 2, 2019
    Link Parent
    We've actually started on the engineering handbook. It's pretty helpful for your first week.
    
    We've actually started on the engineering handbook. It's pretty helpful for your first week.
    
    3 votes
2. [2]
  anahata
  December 2, 2019
  Link Parent
  You're not at all the only one, and you have Don Knuth as company, which is about the best company you can hope to have. I should point out that both you and Knuth have something in common: you're...
  
  You're not at all the only one, and you have Don Knuth as company, which is about the best company you can hope to have.
  
  I should point out that both you and Knuth have something in common: you're both academic programmers. Sadly, those of us writing code in industry don't generally have the time to spend writing all that documentation--generally because manglement and marketing are awful and overpromise.
  
  On an unrelated note, John Siracusa... heh. He isn't someone you should listen to about code. I've seen some real horror stories in things he's written. Something like a 2,500 line method. One method. 2,500 lines. I don't drink but that's probably something that would've pushed me in that direction.
  
  1 vote
  1. [2]
    
    Comment deleted by author
    Link Parent
    
    anahata
    December 3, 2019
    Link Parent
    I tend to be rather quick to judge people, something I should work on addressing; the above was an example of that. It's hard for me to trust someone if they do something... weird... like that.
    
    I tend to be rather quick to judge people, something I should work on addressing; the above was an example of that. It's hard for me to trust someone if they do something... weird... like that.
    
    1 vote
Sahasrahla
December 1, 2019
Link Parent
I feel like this sort of thing comes from the idea of the True Programmer that a lot of people internalize. A True Programmer, among other things, is 1. so good at coding that their code is...

I feel like this sort of thing comes from the idea of the True Programmer that a lot of people internalize. A True Programmer, among other things, is 1. so good at coding that their code is self-explanatory, and 2. so good at reading code that comments never help them, unlike the mere mortals they hate having to work with. And, to question either of these things is an admission of failure and a statement that you are not a True Programmer yourself. The practicality of these rules matters less than the ideal because supporting that ideal is a statement of basic competency. To say otherwise is to admit that maybe your code or ability to read code isn't perfect, because in the perfect world of the True Programmer comments are superfluous and only serve to discredit those who use them.

4 votes
[3]
mrbig
December 2, 2019
Link Parent
Mentioning a developmental disorder in this manner does not help your argument.

This is the kind of absolutionist programming guidance that seems to come from the programmer (autistic?) personality type

Mentioning a developmental disorder in this manner does not help your argument.

10 votes
1. [2]
  anahata
  December 3, 2019
  Link Parent
  So I'm not allowed to talk about it because it's a developmental disorder (that I happen to have)? How are we to destigmatize mental health if we're not allowed to talk about it openly and honestly?
  
  So I'm not allowed to talk about it because it's a developmental disorder (that I happen to have)? How are we to destigmatize mental health if we're not allowed to talk about it openly and honestly?
  1. [2]
    
    Comment deleted by author
    Link Parent
    
    [2]
    
    Comment deleted by author
    Link Parent
    
    mrbig
    December 3, 2019
    Link Parent
    Sorry! I could have been nicer. I'll correct my behavior @anahata
    
    Sorry! I could have been nicer. I'll correct my behavior @anahata
    
    3 votes

[8]

escher

December 1, 2019

Link

The worst, most unreadable code I've ever seen comes from people who say "comments are friction!". Comment your code, people. "Self-documenting code" isn't.

The worst, most unreadable code I've ever seen comes from people who say "comments are friction!".

Comment your code, people. "Self-documenting code" isn't.

9 votes

[7]
unknown user
December 1, 2019
Link Parent
What I'm seeing a lot of people ignore is that comments isn't just an explanation of how code works, realistically, it should also be about why code is the way it is. How does this satisfy...

What I'm seeing a lot of people ignore is that comments isn't just an explanation of how code works, realistically, it should also be about why code is the way it is. How does this satisfy business requirements? Is there a particular way this query is written to deliver something important or useful?

Often, more lines of my methods are dedicated to explanations around why I've written something in a particular way—often reaching paragraph lengths of detail. Coding is a layer on top of writing, at least to me.

12 votes
1. [6]
 anahata
 December 2, 2019
 Link Parent
 How and why are generally obvious (if you're doing the obvious thing) to anyone who isn't a junior. I find those comments to be noise. I've seen people defend comments like these: import requests...
 
 How and why are generally obvious (if you're doing the obvious thing) to anyone who isn't a junior. I find those comments to be noise. I've seen people defend comments like these:
 
 import requests from bs4 import BeautifulSoup BUSINESS_LOGIC_URL = 'https://google.com/index.xml' # get the page index res = requests.get(BUSINESS_LOGIC_URL) # check for errors res.raise_for_status() # parse the contents soup = BeautifulSoup(res.text)
 
 and I just cannot at all understand what value comments like that have. Presumably you're commenting why you're not doing the obvious thing, which is why I will always advocate for commenting why not rather than just why.
 1. [3]
 scrambo
 December 2, 2019
 Link Parent
 And water is wet *shrug*. Now I agree with you that in the above code examples, the comments are useless. They don't provide any information that isn't already available from reading the code...
 
 How and why are generally obvious (if you're doing the obvious thing) to anyone who isn't a junior.
 
 And water is wet *shrug*. Now I agree with you that in the above code examples, the comments are useless. They don't provide any information that isn't already available from reading the code itself. I even addressed this issue with an intern that I was in a code review with a week or so ago, he was doing the same thing.
 
 To juxtapose the previous example with one of my own, here's a function I wrote for an application.
 
 private int getIndex(int i, int j) { return (i * 20) + j; }
 
 It's pretty dang simple right? Just does some math operations with two provided numbers. This is the documentation that I wrote for that simple method (edited to remove any potentially sensitive information):
 
 /** * This calculates the index of the current <thing> in the * List, using what we know about several things. * <ol> * <li>The MAX_SIZE is set dynamically by dividing the number of accounts by * 20. (& adding 1 in some cases. See <other function> </li> * <li>The number of <things> that can fit in a request is ALWAYS 20. </li> * <li>The two lists [queryList & responseList] are the same size and are parallel. * (same objects at the same index) </li> * </ol> * * From the above we can extrapolate this: our current index relative to * queryList can be found in by multiplying the outer index by 20, and then adding * the inner index. * * Ex: If we have 37 requests to batch up this is what the data looks like * * <pre> * batch = [req1,req2] * req1 = [thing1,thing2,...,thing20] (size = 20) * req2 = [thing21,thing22,...,thing37] (size = 17) * * for request : batch { // i * for <thing>: request { // j * // do something * } * } * </pre> * * <thing> #29 has i = 1 and j = 9 * * <pre>(1 * 20) + 9 = 29.</pre> * * @param i the index of the outer for loop (the one looping over * responseList) * @param j the index of the inner loop (looping over thingList) * @return the index of the current <thing> in the List */
 
 I consider this comment to be crucial to understand how the application is running as it provides both why the function was written, how the function works, and some domain specific knowledge to provide context to the previous two points. With all three of those, a simple but context free function can be quickly understood and applied across the code base (at least, I hope).
 
 I think this comment started out with a tone of disagreement, but ended with me agreeing with your main points lol. I just thought an inverse example would provide some more information with what you were saying as well.
 
 8 votes
 
 skybrian
 December 2, 2019
 Link Parent
 In a code review I'd ask why 20 isn't a constant like items_per_request and i and j aren't named better, something like request_index and item_index? Then you get: private int getIndex(int...
 
 In a code review I'd ask why 20 isn't a constant like items_per_request and i and j aren't named better, something like request_index and item_index? Then you get:
 
 private int getIndex(int request_index, int item_index) { return (request_index * items_per_request) + item_index; }
 
 This seems like a good example of when paying a little more attention to naming can replace a lot of comments. (It's a better example than the ones in the original article, actually.)
 
 You do get the annoyance of the constant possibly being declared far away, though. I tend not to use constants for string literals, but I try to use them for magic numbers, particularly when they're used in more than one place. If you ever want to experiment with changing the batch size, you'll want to be able to do that by changing one constant.
 
 4 votes
 
 anahata
 December 3, 2019
 Link Parent
 That's all business logic that you're describing in your (extensive!) documentation comment. This is the kind of thing that's non-obvious and that I like to see explained. FWIW, your reply didn't...
 
 That's all business logic that you're describing in your (extensive!) documentation comment. This is the kind of thing that's non-obvious and that I like to see explained. FWIW, your reply didn't come across with a tone of disagreement, and indeed read as an inverse example.
 2. [2]
 unknown user
 December 2, 2019
 Link Parent
 I think you’ve misunderstood my point because those sorts of comments are not the ones I’m discussing.
 
 I think you’ve misunderstood my point because those sorts of comments are not the ones I’m discussing.
 
 2 votes
 
 anahata
 December 3, 2019
 Link Parent
 Can you provide an example to clarify, then?
 
 Can you provide an example to clarify, then?

[4]

Comment deleted by author

Link

anahata
December 1, 2019
Link Parent
It is definitely possible to factor too much. I've worked with some codebases like that and it's quite difficult to get your head around a codebase when it's comprised of three line methods /...

It is definitely possible to factor too much. I've worked with some codebases like that and it's quite difficult to get your head around a codebase when it's comprised of three line methods / functions and is quite large. I would seriously consider discarding the advice in this article as the opinionated pontifications of someone who has a very specific way of doing things that he clearly believes is best but is presenting as objectively, rather than subjectively, valuable.

6 votes
[2]
skybrian
December 1, 2019 (edited December 1, 2019)
Link Parent
I agree with the author that some comments may indicate an opportunity for simplification, but I tend to wait until I have a good idea what to do about it. I don't know what language you're in,...

I agree with the author that some comments may indicate an opportunity for simplification, but I tend to wait until I have a good idea what to do about it.

I don't know what language you're in, but classes and methods do provide a way to group functions logically. When writing a function, sometimes I look at it and decide it should be a method on one of its arguments. Also, keeping functions private until you need to make them public is a good default.

There are other tricks depending on the language.

It's hard to learn this stuff in a vacuum. Pair programming with someone more experienced helps a lot and code review helps somewhat.

(Also, the function names in the examples are way too long.)

5 votes
1. [2]
  
  Comment deleted by author
  Link Parent
  1. skybrian
    December 1, 2019
    Link Parent
    Yeah, research code is different. A lot of software engineering practices are geared towards long-running teams maintaining codebases that are meant to last. If you've got a fixed deadline and...
    
    Yeah, research code is different. A lot of software engineering practices are geared towards long-running teams maintaining codebases that are meant to last. If you've got a fixed deadline and then you stop working on the code, and you're not going to reuse it for another project, a lot of that isn't going to pay off.
    
    You have to think about what's going to pay off for you, rather than for anyone else. A unit test in the right place can save a lot of debugging, but comprehensive unit tests probably don't make sense. I usually write a lot of tests for parsing code, for example, but it probably doesn't matter to you whether you can handle every possible video so long as the ones you're using don't mysteriously stop working and the data being sent to the next stage looks good.
    
    Also, there are a lot of software engineering skills that might be worth taking to the extreme as an experiment, to see how you like it and what the downsides are. I think I learned a lot from giving rigorous test-first development a serious try even if I don't do it that way normally; it gives you intuition about what sort of designs are flexible enough to be testable. Such experiments might not be a good idea when you have a hard deadline.
    
    3 votes

[2]

Comment deleted by author

Link

skybrian
December 2, 2019 (edited December 2, 2019)
Link Parent
Literate programming is a wonderful but somewhat specialized technique. It's most suitable when you want to write a program and a comprehensive book about the program's internals at the same time....

Literate programming is a wonderful but somewhat specialized technique. It's most suitable when you want to write a program and a comprehensive book about the program's internals at the same time. It's not that often that a program's internals are so interesting that we want to write a book about them! More often, we want to write a program and generate its API documentation at the same time, and things like javadoc (etc) are better for that.

Crafting Interpreters is an in-progress book that might not technically be literate programming, but it's pretty close, and maybe even better.

[4]

Comment deleted by author

Link

[3]
skybrian
December 2, 2019
Link Parent
First impression is that it's nicely done, this is fine for research code, and if you weren't using it as an example I'd say just move on to other things. For the top-level comment I might word it...

First impression is that it's nicely done, this is fine for research code, and if you weren't using it as an example I'd say just move on to other things.

For the top-level comment I might word it as:

Pops up a window showing the first frame of a video file and prompts the user to select two corners. Returns a list containing the two points. Exits Python if the user closes the window.

It looks like it you couldn't really test it automatically but given your situation, that seems fine.

There's a slight inconsistency in that the function is named select_two_corners which is generic, but the function's comments correspond more to select_chimney_corners. The only places in the code that aren't generic are the comments and the message printed to stderr if the user cancels.

Also, I wouldn't normally call sys.exit() anywhere other than a top-level function, preferring to return None and let the caller decide what to do if the user cancels. If you did that, the code would be as generic as select_two_corners implies. Or you could resolve the ambiguity the other way and say this function really is specialized for selecting chimney corners, and rename it select_chimney_corners.

But if the function is in practice only used for chimney corners, that's purely a theoretical distinction.

4 votes
1. [3]
  
  Comment deleted by author
  Link Parent
  1. [2]
    skybrian
    December 2, 2019
    Link Parent
    Splitting out functions seems like a fine way to go, certainly better than spending time writing code you're not sure you will use. I tend to write generic code, building a library of useful...
    
    Splitting out functions seems like a fine way to go, certainly better than spending time writing code you're not sure you will use. I tend to write generic code, building a library of useful functions and classes as I go, but there is a cost to it since it can result in functions and classes that take a lot of parameters to configure them. That can be more difficult to read than code that just does one thing, and the flexibility might not be used.
    
    I'm wondering how you test stuff? Do you call select_two_corners from the Python repl to try it out?
    
    [2]
    
    Comment deleted by author
    Link Parent
    
    skybrian
    December 2, 2019
    Link Parent
    Interactive testing is still testing. In industry, it's often avoided because it doesn't scale; if you have a complicated app, you don't want to spend weeks manually doing a bunch of tests before...
    
    Interactive testing is still testing.
    
    In industry, it's often avoided because it doesn't scale; if you have a complicated app, you don't want to spend weeks manually doing a bunch of tests before doing a release, just to make sure nothing is broken. In the old days, companies would commonly have an army of QA people just to do all this tedious work. These days, people go through a lot of trouble to fully automate tests so they don't have to do that.
    
    Learning to write automated tests is a skill that needs practice, and the results are often mixed for UI; it's good for catching some mistakes but you can't write an automatic test for "does the UI look pretty" or "is the game still fun to play."
    
    It's a whole lot easier to test anything with a UI interactively, so for something like select_two_corners, maybe just come up with an easy way to run it and print the results? You could make a collection of "test harnesses" that let you run each stage of the process in isolation. I don't know PyCharm but it sounds like that's what you're doing?
    
    Automatic testing is better for functions doing calculations where you want to make sure that given a certain input, the output is what you expect. Verifying the results by hand is tedious, so you just have the program do the comparison and tell you if something is wrong. You don't really need special software to do this; it could just print "ok" if everything worked and an error message if something is wrong. If the tests don't take long to run, you could even have the program do some automatic tests at startup, so you remember to run them.
    
    3 votes

izik1

December 2, 2019

Link

I find that I tend to write a few types of comments documentation comments I actually don't write these nearly as much as I should, so I'm bad at it, which makes me not want to write them, since...

I find that I tend to write a few types of comments

documentation comments
I actually don't write these nearly as much as I should, so I'm bad at it, which makes me not want to write them, since I'm bad at it. Nonetheless, documentation is important, and decreases the domain specific knowledge required to understand how something works.
Note that here I'm defining "documentation" as something that lowers domain specificy.
bugfix comments
While I don't always write these, when I do it's because whichever problem I had was very much non-trivial to figure out. Sometimes I'll split the bugfix- or maybe even code surrounding it- out into a function, but that's only if it even makes sense to split a function anyway.
todo/fixme comments
While I do use issue trackers, I find it much easier to figure out where to do future work if there's already a todo comment there, typically I use both.

I also find that I wouldn't recommend doing exactly what I'd do unless you're me. I tend to write my code for frequent re-reading as I often have to do that.
I'm not worried about ruining my reading context by having to read another function because I've already read it. The name tells me what it does, I don't need to read it again (unless it's related to my problem.

kinda offtopic, but I seem to initially read my code differently than a lot of people on this thread. I read the shape of a function, and then fill in the blur. I also don't really seem to have problems with "losing my reading context". Normally when I'm reading code for the first time in order to understand something, it's in order to understand a specific thing, so I read lazily (which works great even when I want to read in detail, since it often allows me to look at the full function before looking at what it calls)

1 vote

unknown user

December 1, 2019

Link

I couldn't read half of that blog post without an ad modal popping up. The title should be taken at face value. Not all invitations should be agreed to: most parties are not that good anyway.

I couldn't read half of that blog post without an ad modal popping up.

The title should be taken at face value. Not all invitations should be agreed to: most parties are not that good anyway.