How do you structure larger projects?
I'll be writing a relatively large piece of scientific code for the first time, and before I begin I would at least like to outline how the project will be structured so that I don't run into headaches later on. The problem is, I don't have much experience structuring large projects. Up until now, most of the code I have written has been in the form of Python scripts that I string together to form an ad-hoc analysis pipeline, or else C++ programs that are relatively self-contained. My current project is much larger in scope. It will consist of four main 'modules' (I'm not sure if this is the correct term, apologies if not), each of which consists of a handful of .cpp and .h files. The schematic I have in mind for how it should look is something like:
src
├── Module1 (Initializer)
│   ├── file1.cpp
│   ├── file1.h
│   ├── ...
│   └── Makefile
├── Module2 (solver)
│   ├── file1.cpp
│   ├── file1.h
│   ├── ...
│   └── Makefile
├── Module3 (Distribute)
│   ├── file1.cpp
│   └── Makefile
└── Makefile
Basically, I build each self-contained 'module', and use the object files produced there to build my main program. Is there anything I should keep in mind here, or is this basically how such a project should be structured?
I imagine the particular structure will depend on my project, but I am more interested in general principles to keep in mind.
There are tons of books on this, but this is really where years of experience with large systems come in. I doubt anyone can tell you exactly how to do it, but there are principles... which, again, everyone values differently.
Still, searching for and reading about how to architect enterprise software should set you up for this, and you should be just in time for those Amazon orders to arrive before the weekend. :)
Also have a look at design patterns. These are common architectural solutions to common problems. Even if you don't implement them 1:1, it's good to have them in the back of your head.
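To make "design patterns" a bit more concrete, here is a minimal sketch of one of them (Strategy) in C++. Everything in it is made up for illustration: the `Integrator` interface, the two implementations, and the toy equation dy/dt = -y are assumptions, not anything from your project. The point is only that the calling code depends on an interface, so an algorithm can be swapped without touching the caller.

```cpp
#include <cassert>

// Hypothetical interface: one time step of dy/dt = -y.
class Integrator {
public:
    virtual ~Integrator() = default;
    virtual double step(double y, double dt) const = 0;
};

// First interchangeable strategy: forward Euler.
class ForwardEuler : public Integrator {
public:
    double step(double y, double dt) const override {
        return y + dt * (-y);
    }
};

// Second interchangeable strategy: explicit midpoint rule.
class Midpoint : public Integrator {
public:
    double step(double y, double dt) const override {
        const double y_half = y + 0.5 * dt * (-y);
        return y + dt * (-y_half);
    }
};

// The caller only knows about the interface, not the concrete method.
double integrate(const Integrator& method, double y0, double dt, int steps) {
    assert(steps >= 0);
    double y = y0;
    for (int i = 0; i < steps; ++i) {
        y = method.step(y, dt);
    }
    return y;
}
```

Calling `integrate(ForwardEuler{}, 1.0, 0.5, 1)` and `integrate(Midpoint{}, 1.0, 0.5, 1)` runs the same loop with two different methods.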
Also, naming things is hard.
Thanks so much, I didn't expect to get an explicit road map since I assumed these things are a whole topic in and of themselves :) Would you happen to have recommendations for reading that you found particularly useful, or should I just browse and see what's popular?
My pleasure. The problem is really that it's such a vast subject and if you really master it you're pretty much "there", raking in the big bucks.
I personally enjoyed "Patterns of Enterprise Application Architecture", and I've heard good things about "Code Complete: A Practical Handbook of Software Construction". I started my career reading "Object-Oriented Analysis and Design with Applications", which is like the ancient bible.
Other than that, I don't read many blogs or similar material on the subject anymore, so I can't help you there.
Good luck!
Could you tell us a bit more about the outcome these modules are trying to achieve? Are they doing large amounts of ETL of data? Data modeling? Etc?
Most starter templates for various application types will give you a decent starting organization for your source and header files. But in my experience it’s ultimately dependent on your use case.
The entire program is a simulation of a physical system. The first module sets up a grid and initializes a system on it (this essentially amounts to looping over an array and assigning values so as to reproduce the statistics of my initial state). Another module advances this initial state in time through a split-operator PDE solver. I'm supposing another module will handle I/O to write these states to disk, and another will be a driver for parallelization so this whole thing can take place across multiple nodes. I have some of these modules written either partially or completely.
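For what it's worth, here is a rough sketch of how those pieces might be kept apart in code, with one namespace per module and a small driver that ties them together. All names (`Grid`, `init::make_initial_state`, `solver::advance`, `io::write`, `run`) are hypothetical. The Gaussian initial condition and the exponential-decay step are placeholders, not the real statistics or a split-operator scheme, and the parallel driver is left out entirely.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <ostream>
#include <vector>

// Shared data type that every module agrees on (hypothetical).
struct Grid {
    std::size_t n;          // number of grid points
    double dx;              // grid spacing
    std::vector<double> u;  // field values, one per grid point
};

namespace init {
// Placeholder initial state: a Gaussian bump centred in the domain.
Grid make_initial_state(std::size_t n, double length) {
    assert(n > 0);
    Grid g{n, length / static_cast<double>(n), std::vector<double>(n)};
    for (std::size_t i = 0; i < n; ++i) {
        const double x = (static_cast<double>(i) + 0.5) * g.dx - 0.5 * length;
        g.u[i] = std::exp(-x * x);
    }
    return g;
}
}  // namespace init

namespace solver {
// Placeholder time step: pure exponential decay, u <- u * exp(-rate * dt).
// A real split-operator step would replace the body of this function.
void advance(Grid& g, double dt, double rate) {
    const double factor = std::exp(-rate * dt);
    for (double& v : g.u) {
        v *= factor;
    }
}
}  // namespace solver

namespace io {
// Write "x u(x)" pairs to any output stream (a file, std::cout, ...).
void write(const Grid& g, std::ostream& out) {
    for (std::size_t i = 0; i < g.n; ++i) {
        out << static_cast<double>(i) * g.dx << ' ' << g.u[i] << '\n';
    }
}
}  // namespace io

// Driver: the only place that knows the order in which modules are used.
Grid run(std::size_t n, double length, double dt, int steps, double rate) {
    Grid g = init::make_initial_state(n, length);
    for (int s = 0; s < steps; ++s) {
        solver::advance(g, dt, rate);
    }
    return g;
}
```

With this split, each namespace could live in its own directory with its own header, and only the driver would need to include all of them. Writing a state out would then look like `io::write(run(64, 10.0, 0.01, 100, 1.0), std::cout);` with `<iostream>` included.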
In any case, as per other suggestions I'm definitely going to try and find a book or template that is similar to my use case here.