Which computational notebooks do you like?
Here's a nice website for comparing a lot of different computational notebooks, such as Jupyter, Google Colab, and Observable. (The format was popularized by Mathematica, though spreadsheets sort of count.)
I'm wondering what computational notebook software other people have used and how you like it? I will make top-level posts for some that I used.
I find that running Jupyter notebooks in VS Code does everything I need it to. What are the advantages of some of these other options? One obvious one is the ability to share notebooks easily in Google Colab and possibly others, but are there other features I am missing out on?
I'll second VS Code.
Personally I almost always use python in an object-oriented manner, defining classes in separate files and then working with the objects in a notebook. This feels a little more natural in a full-fledged IDE compared to something like Jupyter.
Plus the marketplace is massive -- there are extensions for markdown, latex, hdf5, etc.
That is precisely my workflow as well, which is perhaps not surprising as I believe we are both in academic settings, if I recall correctly.
Other people being able to run your code seems like it might be important? (That is, avoiding "it works on my machine" problems.) But if you load your notebook into Google Colab and it works, and then save the file to Github, maybe that's good enough?
Github renders notebook files so you can see the code and images, and Colab optionally puts a URL in an exported notebook file so that anyone who sees it can click to open it in Colab.
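For anyone who wants to build that "open in Colab" link by hand, here is a minimal sketch. It assumes the notebook lives in a public GitHub repo and uses the colab.research.google.com/github/... URL pattern that Colab's badge links follow. The function name and the user, repo, and path values are hypothetical placeholders.

```python
from urllib.parse import quote


def colab_url(user: str, repo: str, branch: str, path: str) -> str:
    """Build a link that opens a GitHub-hosted notebook in Colab.

    Assumes the repo is public and `path` points at an .ipynb file.
    """
    # quote() escapes spaces and similar characters but keeps "/" intact.
    return (
        "https://colab.research.google.com/github/"
        f"{user}/{repo}/blob/{branch}/{quote(path)}"
    )


# Hypothetical repository and notebook path, for illustration only.
link = colab_url("example-user", "example-repo", "main", "notebooks/demo.ipynb")
print(link)
```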
I switched over to Emacs at the start of the year to give Org Mode a serious try.
I now use it for all of the READMEs in my projects, a wiki via Org-roam, and my todo lists.
Everywhere that I use org files I have access to any programming language I've set up with Babel.
So in a project readme I can have examples that are runnable and in my todo lists I can have a block for exploratory programming underneath simple tasks that don't warrant a full project setup.
It's not the same as the other notebooks in that list in that it requires Emacs instead of just a web browser and it doesn't have collaborative editing.
You sort of have to go all in on Emacs to get the most out of it, but it's a really nice experience if you do.
Can you create images and plot graphs easily?
You can use gnuplot through org-plot or org-babel-gnuplot (both are built in, but you also need the gnuplot package), but I haven't needed to graph anything yet (just tables are good enough for my needs and editing tables is really nice in org-mode) so I can't speak to how easy it is in day-to-day use.
It looks like org-plot works by adding an option to tables with configuration values and org-babel-gnuplot works by having a named table of data used by a block of gnuplot code.
Here's an example of using org-babel-gnuplot (what it looks like on my screen).
The table could be from data entered manually or generated by another org-babel block.
Here are some relevant links I found.
org-plot entry of org-mode manual
org-plot tutorial
org-babel-gnuplot tutorial
post that I got the above example from
I’ve always thought of org mode more as personal note-taking software. Using it in project README’s suggests it’s at least somewhat reasonable for publishing too? What do those files look like?
It sounds like if you go beyond the basics, the notebook is going to depend on the emacs packages you have installed locally. Since notebooks contain source code, they tend to depend on package management, resulting in similar dependency issues as maintaining software.
For publishing on the web, ideally you want to be able to lock dependency versions, have them automatically download, and run the notebook in a sandbox. Also, cold start times matter.
GitHub and GitLab render org-mode READMEs no problem. For other publishing, e.g. blogging, I think Hugo is the most popular (though I haven’t experimented with it myself).
Here's a repo that tests how some org mode features work on GitHub. I guess it's spotty? I don't know which features are important, though.
(One thing I'm learning is that computational notebooks are programs and people use them for a wide variety of purposes. This is similar to how people write scripts for many different reasons.)
Regarding the dependencies part, I see this as a progressive enhancement thing.
If you are looking at a project that uses an org file for the readme with examples and you don't have Emacs, your experience is the same as if a markdown file was used.
You set up the project dependencies (all of the languages I use have their own systems for this, such as cargo, mix, and npm, that make it as simple as running a single command) and then copy and paste the examples into files and run them.
If you do have Emacs then you can open the org file and run the examples directly from that.
As for the Emacs packages required, it's usually just a small addition to your Emacs config (use-package the relevant ob-* package (if support for it isn't built-in) and add the language to org-babel-load-languages).
For publishing, org mode has many export options built-in and more available as packages.
You can include the output of the code blocks in these exports.
Yeah, fair. I do like Github’s rendering of Jupyter notebooks better because it includes images, and seeing plots is nice.
I’ve read that Jupyter format doesn’t play well with version control, though.
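The usual reason given is that an .ipynb file is JSON with cell outputs (including embedded images) and execution counts stored next to the code, so rerunning a notebook changes the file even when the code didn't change. Here is a minimal sketch of the common workaround, which is to strip outputs before committing. It assumes the nbformat 4 layout (a top-level "cells" list), and the `strip_outputs` helper and the example notebook are made up for illustration. Existing tools such as nbstripout do this more thoroughly.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with outputs and execution counts removed."""
    cleaned = json.loads(json.dumps(notebook))  # simple deep copy of JSON data
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Tiny hand-written notebook standing in for a real .ipynb file.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 7,
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        },
    ],
}

stripped = strip_outputs(example)
# Sorted keys and fixed indentation keep the serialized form stable for diffs.
print(json.dumps(stripped, indent=1, sort_keys=True))
```

The trade-off is that the committed file no longer shows plots when rendered on GitHub, which is exactly the feature mentioned above.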
Until recently my favorite has been Observable, which lets you write notebooks in a variant of JavaScript. They are published online by default and are reactive, like a spreadsheet. You can put cells in any order and after modifying one cell, the cells that depend on it automatically recalculate. You can also do things like make a slider that controls how a graph is rendered.
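To make "reactive, like a spreadsheet" concrete, here is a rough sketch in Python (not Observable's JavaScript, and not how its runtime is actually implemented). It assumes each cell declares the names it depends on. The `ReactiveNotebook` class is hypothetical, and it naively recomputes every cell on each change, where a real reactive runtime would only rerun the affected cells.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class ReactiveNotebook:
    """Toy model of reactive cells: values follow dependencies, not page order."""

    def __init__(self):
        self.cells = {}   # cell name -> (dependency names, function)
        self.values = {}  # cell name -> most recently computed value

    def define(self, name, deps, fn):
        """Add or redefine a cell, then bring all values up to date."""
        self.cells[name] = (tuple(deps), fn)
        self.recompute()

    def recompute(self):
        # Naive strategy: throw everything away and recompute in dependency order.
        self.values = {}
        graph = {name: set(deps) for name, (deps, _) in self.cells.items()}
        for name in TopologicalSorter(graph).static_order():
            if name not in self.cells:
                continue  # referenced by another cell but not defined yet
            deps, fn = self.cells[name]
            if all(dep in self.values for dep in deps):
                self.values[name] = fn(*(self.values[dep] for dep in deps))


nb = ReactiveNotebook()
# Cells can be defined in any order, even before their inputs exist.
nb.define("area", ["width", "height"], lambda w, h: w * h)
nb.define("width", [], lambda: 3)
nb.define("height", [], lambda: 4)
first_area = nb.values["area"]

# Redefining an input (think: dragging a slider) updates everything downstream.
nb.define("width", [], lambda: 10)
second_area = nb.values["area"]
print(first_area, second_area)
```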
The code runs in your browser. A big advantage of this is that notebooks start up fast, like any other web page. Interactive widgets are responsive. It's not a good idea to do any computations that are too heavy, though, because they will be rerun when you reload the page.
Reproducibility seems fairly good, at least in the short term. It’s a web page and any JavaScript libraries a notebook depends on will be downloaded from a CDN, the same way they might for someone visiting an ordinary web page. Web pages do break, though, and you do want to be careful about adding dependencies and pinning versions for what you use.
Observable depends on browser sandboxing for security. Behind the scenes, each Observable user gets a separate subdomain where the code for their notebooks runs. Any browser permissions you grant are just for that subdomain.
Longer term, the Observable website is proprietary software that hosts your notebook, so it depends on how long the company lasts. The code for running a notebook has been open sourced, but not the editor or other website code.
A downside is that there's no good way of uploading source code for a notebook. You need to copy code into each cell one at a time. Downloading is easier, and I've seen a plugin for VS Code that will run an Observable notebook.
Pluto.jl is a notebook interface for Julia that's inspired by Observable. Since it's server-side, starting up an interactive notebook online is slower than I like. It requires launching a virtual machine using Binder or JuliaHub. Running it locally might be better, but I haven't tried it.
I've used it locally, it works well and it's fast. I prefer it to Jupyter.
It's Pluto.jl by the way.
Oops, fixed!
What do you use it for?
Julia has had issues with long startup times, but I understand that they have improved recently. How is time to first plot for you?
It has improved steadily yes. I agree it gives a worse first impression about Julia, but it hasn’t really been an issue for me. I don’t do anything very intensive though.
Juno is a self-contained (client/server) notebook app for iOS that works a lot like Jupyter Lab. I use an iPad Pro with a keyboard case as a laptop replacement and it lets me quickly get into my work for ten seconds or an entire day. It’s got support for all the big libraries I use and can even render interactive charts with Bokeh and things like that. Best of all it is a one time purchase with absolutely no extra junk about it. 10/10.
Interesting! What options are there for sharing notebooks with others after you've built them?
So you can share .ipynb notebook files directly from the app. If you’ve got linked files you’ll need to package them into an archive yourself. If there are dependencies then no, the app doesn’t have a feature I’m aware of to autograb them on first run, but it does have an in-app package management system that gives you version control. If you are a user with many advanced needs you may find that there are some libraries missing, but this is due to limitations imposed on the app through iOS, so you’ll find the same problems wherever you turn within Apple’s walled garden. For everything I need I’m happy to say it functions really well and I am unlikely to switch back to a laptop.
I work in Mathematica, but also use git for version management, and out of the box the two don't really go together. For my projects, I write Wolfram .wl scripts within the notebook interface. I do it this way because, unlike Wolfram .nb notebook files, which contain a ton of overhead and aren't easily parsable, scripts are super lightweight: they're just text files of raw uncompiled code, which makes them git friendly.
I guess one thing I use that's not yet mentioned is Maple. It's IMO the best available symbolic computation system, at least for what I do with it. I have less experience with Mathematica, but I like the language in Maple more.
I want to use Sage, but I don't think it's quite there yet.
It's not quite as data-science focused (though I believe it can do it), but I absolutely love Elixir Livebook.
Interesting. What do you use it for?
Looks like they have database plugins. Using a notebook as a reporting tool for a database is a use case I hadn’t considered since I didn’t have a database, though it explains why notebooks tend to have secret management.
I’m more interested in using a notebook as a publishing tool (like for an article on a blog), and then storing data as a file attachment makes more sense to me, to make it self-contained. That doesn’t make sense for a live dashboard, though.
I use it mostly in place of the usual cli REPL, as well as for one-off scripts for a bit of data transformation or generating visuals and such. I also did 2022's Advent of Code in it, which was a very pleasant experience.
I have yet to find anything even remotely close to as good as JupyterLab.
I’ve played with JetBrains DataSpell though, and while it wasn’t quite a fit for me it’s pretty good. People who are already regular PyCharm users might really enjoy it.
What features of JupyterLab do you miss with other notebook software?
Most alternatives aren’t Python, which is the biggest issue. Julia and JavaScript are ok languages but they don’t have the depth in terms of data science libraries. Definitely nothing with the maturity of sklearn, PyTorch, and seaborn for instance.
I also need to be able to work remotely but on my own hardware as I have local GPUs. This means anything that isn’t self hosted won’t work for me. I’ve already invested in all this hardware — running equivalent cloud instances would be too expensive.
One other thing is the ease of setting up Jupyter on other platforms. If I do end up needing to be on much bigger hardware, it’s just a couple minutes before I’m running on an 8xA100 on Lambda Labs or AWS, running the exact same code.
It might not be complete enough to do everything you need yet, but the Elixir ecosystem is very deliberately trying to make itself useful as a replacement here, so that might be worth taking a look at for you? There's a little overview and explanation at https://github.com/elixir-nx.
Google Colab provides free hosted Jupyter-compatible notebooks. (The free version has limits on CPU time, but you can buy more.) You can store private notebook files in Google Drive or export them to a file in a Github repo.
Since it's Python (by default), I was able to get GPT4 with Code Interpreter to export some code it wrote as a Jupyter notebook, load it into Google Colab, polish it up a bit and then export to a file in a GitHub repo. (Here's the result.) It's a bit finicky, but this seems like a nicer way to publish than using ChatGPT itself. (I haven't come up with a good way to preserve the prompts yet, though.)
It's easy enough to make graphs (or at least, easy enough to ask GPT4 to do it). This lacks the interactive widgets I'm used to from Observable, though.
Unfortunately, it doesn't work like a makefile or a spreadsheet. Conceptually, the cells should be run in order from top to bottom, but you can run them in any order. It's a good idea to 'Restart and run all' every so often to make sure you didn't break the build. However, this can be slow if some tasks take a long time to run.
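Here is a small sketch of how running cells out of order leaves stale state behind. It only simulates a notebook: each "cell" is a string run with `exec` against a shared namespace, and the cell contents and the `run` helper are made up for illustration.

```python
# Three "cells", each depending on the one before it.
cells = [
    "rate = 0.5",
    "total = 100 * rate",
    "report = f'total is {total}'",
]


def run(cell_indexes, namespace):
    """Execute the chosen cells, in the given order, in one shared namespace."""
    for i in cell_indexes:
        exec(cells[i], namespace)
    return namespace


# Interactive session: run everything once, then edit the first cell
# and rerun only the first and last cells, skipping the middle one.
session = run([0, 1, 2], {})
cells[0] = "rate = 0.25"
run([0, 2], session)
print("interactive:", session["report"])

# 'Restart and run all': a fresh namespace, cells run top to bottom.
fresh = run(range(len(cells)), {})
print("restarted:  ", fresh["report"])
```

The interactive session still reports the total computed from the old rate, because nothing reran the middle cell. The restarted run picks up the edit. A reactive notebook would have rerun the middle cell automatically.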