16 votes

MATLAB learning resources for software engineers

I'm starting grad school in neuroscience/biomedical engineering soon, and the part I dread most is inevitably having to write Matlab code. I understand why people use it: it's arguably best in class at a lot of engineering tasks, and the matrix-first approach of the language makes it very fast to prototype things if you think like a mathematician/engineer.

However, the language also seems to actively discourage good software practices, and many widely used scientific projects have atrocious code. Think Python dependency management is bad? How about NO DEPENDENCY MANAGEMENT? Yes, that's right, the way you share code in Matlab is by importing collections of loose files from GitHub or the MATLAB File Exchange. The Matlab neuroimaging code that I have worked with has also frequently abused the workspace to share state implicitly between scripts, which makes the code virtually incomprehensible. Instead of using packages to create namespaces, common practice is to give function names a prefix and import them into the global namespace.

I know there are multiple large companies that rely on Matlab for their products, so it must be doable; I just haven't seen it for myself yet.

Do you guys have any experience developing in Matlab, and if so, are there any good resources to learn how to build robust software in it? What are some open source projects that have good Matlab code?

8 comments

  1. Arminius
    I have been working on Matlab code for years in different companies. There are examples of terrible code like you mentioned. In my opinion this is because engineers and researchers are not software developers, usually not having the skills or interest in building good code.

    But it can be done well too. You can create classes, and even place them in folders that start with a + to have a package, see https://www.mathworks.com/help/matlab/matlab_oop/scoping-classes-with-packages.html
    This was used to build an extensive project I worked on (something like 50 man-years of work), which is still going. It also had unit tests to keep bugs out, and everything lived under version control. It can be neat, but it takes effort on your part to convince the people you work with to get it right.

    6 votes
  2. VoidSage
    Anecdotal, but I have friends in Mechanical and Electrical engineering. Both of those fields seem to be trending towards Python instead of Matlab.

    My ME friend is currently working on his master's degree and is using primarily Python for his classes.

    2 votes
  3. [6]
    lbr
    Can you say a bit more about sharing state between scripts? I think I might be guilty of that. The problem is that I have numerical simulations which run for a long time. So I can’t do everything in one go. What’s the best way of doing that - if save workspace isn’t it?

    Speaking as an economist, Matlab is absolutely ubiquitous and unavoidable for me. The vast majority of literature in my field (empirical macro/time series) uses Matlab for the reasons that you mentioned. It’s so easy to implement new estimators by just writing down the algebra.

    However, these are typically rather small-scale projects with just a couple of scripts. Examples of bigger pieces of software that I’m aware of are the ECB’s BEAR toolbox (though the underlying code looks pretty terrible tbh) and the Dynare package, which is very popular for a certain class of macroeconomic models.

    The problem is that there really is no alternative. To me, R and Python just aren’t good substitutes, though Julia is making inroads. But that of course doesn’t solve the coordination problem.

    1. [5]
      nomadpenguin
      The workspace itself isn't inherently evil, and doing some sketchy stuff from a software standpoint is understandable if you're doing exploratory analysis and just trying to iterate fast.

      The worst (and most common) way I see it abused is script A creates variable X, and then script B accesses X without a save/load step. So if I just look at script B, it looks like you're referencing a variable that was never created.

      Saving and loading workspaces can be implemented in a responsible way though. You can explicitly define which variables are saved/loaded, which helps a lot. You can also wrap up the matlab scripts with a workflow manager like Snakemake, which can then maintain some version control over the saved workspaces and auto-rerun/alert you if you're using a stale version of the saved workspace.
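
      A minimal sketch of the explicit save/load idea, assuming a two-script workflow. The file name `simulation_results.mat` and the variables `theta` and `nDraws` are made up for illustration, and the random walk is just a stand-in for a long-running simulation:

      ```matlab
      % --- Script A (hypothetical): run the expensive step, save only named outputs ---
      resultsFile = fullfile(tempdir, 'simulation_results.mat');  % illustrative path
      theta  = cumsum(randn(100, 1));   % stand-in for a long-running simulation result
      nDraws = numel(theta);
      save(resultsFile, 'theta', 'nDraws');   % writes only these two variables

      % --- Script B (hypothetical): load by name into a struct ---
      clear theta nDraws                          % simulate starting from a fresh session
      S = load(resultsFile, 'theta', 'nDraws');   % nothing is added to the workspace implicitly
      assert(isfield(S, 'theta') && isfield(S, 'nDraws'), ...
          'Saved results are missing expected variables');
      fprintf('Mean of %d draws: %.3f\n', S.nDraws, mean(S.theta));
      ```

      Loading into a struct means every variable script B uses is visibly `S.something`, so a reader can tell where it came from without running script A first.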

      2 votes
      1. [3]
        PopNFresh
        I think a lot of grad school falls under quick exploratory analysis. The example of running script A then B depends a lot on the reason for having two scripts. Is it due to length, as mentioned above? If it’s only a few variables being manipulated, my first thought would be to make script B into function B or place script B back into script A.

        It's probably bad software practice, but since all the scripts share the same workspace, when some math doesn’t work out how you expect, it’s quicker to pull the variables out of the workspace than to debug a function.

        As an engineer, another thing I commonly see is global variables abused to get parameters into a dynamic equation fed into a function like ode45, but this can be achieved in better ways with anonymous and inline functions.
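
        A rough sketch of the anonymous-function version, assuming a damped oscillator as the example system. The names `rhs`, `odefun`, `omega`, and `zeta` are illustrative and not from any particular project:

        ```matlab
        % Damped oscillator x'' + 2*zeta*omega*x' + omega^2*x = 0,
        % written as a first-order system with state y = [x; x'].
        omega = 2*pi;   % natural frequency (illustrative value)
        zeta  = 0.1;    % damping ratio (illustrative value)

        % Right-hand side that takes its parameters as explicit arguments.
        rhs = @(t, y, omega, zeta) [y(2); -2*zeta*omega*y(2) - omega^2*y(1)];

        % ode45 expects a function of (t, y) only, so wrap rhs in an anonymous
        % function that captures the current parameter values. No globals needed.
        odefun = @(t, y) rhs(t, y, omega, zeta);

        [t, y] = ode45(odefun, [0 5], [1; 0]);   % integrate from x = 1, x' = 0
        fprintf('Final displacement at t = %.1f: %.4f\n', t(end), y(end, 1));
        ```

        The captured values are fixed when `odefun` is created, so changing `zeta` afterwards means re-creating the handle, which is arguably a feature compared to a global that can change underneath you.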

        1. [2]
          nomadpenguin
          Yeah, bad exploratory code is understandable to an extent. What is much, much more annoying is when these get packaged up and shipped when the paper is published, and then other groups have to build on top of it.

          For example, my lab made heavy use of a Matlab program called PALM. It's brilliant when it works, but if there's a bug or if you want to do anything that would involve modifying the source code, it quickly becomes unusable for anyone except the author. E.g., the PALM core loop function is completely incomprehensible.

          2 votes
          1. ebonGavia
            [plm.X{y}{m}{c}{o},plm.Z{y}{m}{c}{o},plm.eCm{y}{m}{c}{o},plm.eCx{y}{m}{c}{o}] = ...

            Woof. 😂

            2 votes
      2. lbr
        Thank you, that’s reassuring. I’ll continue to try and be responsible when using save/load. And I’ll have a look at Snakemake, sounds useful.