What programming/technical projects have you been working on?
This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?
I've written about this in past threads, but I am writing my own language/runtime/shell/OS after spending the past decade forcing other languages into my weird format with minimal success. It's my dream language. There is no need for manual memory management, GC, or reference counters (all instantiated memory is dropped at the end of its functional use), and it comes with enhanced debugging (all variables and functions that are touched are automatically logged and can be 'replayed' on demand) with an integrated VFS database.
It's a compiled language with the host language written in Fortran, and should be about as fast as C (+/- 20% for overhead and linking). There is also no need for a lexer or parser in the most conventional sense, so compilation time should be about equal to Fortran[1], which is a hair faster than C in most cases. Because its host language is modern Fortran 2018, it will feature parallel computing by default[2]. Syntax is heavily inspired by PowerShell. The plan is for it to be available as a single portable executable no larger than 10 MB, with a few hundred keywords, by the time we reach 1.0. Verbosity over the implicit.
I have the core of the language working, a shell/repl implementation kind of working, and the debugging and compilation architecture design finished (as much as it can be until I can optimize and rework it).
It's progressing really well, but life gets in the way. Hoping for a mature alpha release by spring.
This is my grand "twenty-year project". I want to still be working on this in twenty years, rather happily. It's a new type of language design and implementation.
For those that may ask, it's certainly not object-oriented, and not quite functional either, so I've just started explaining it as an 'automation language'. It is general purpose, but both high level and low level at the same time. I want it to be written the same way on embedded devices such as the ESP32 and on large clustered mainframes and servers, outdated or modern, with relative ease and without sacrificing the language itself.
I haven't really had anyone to talk to about this endeavor, thus the overly long response. If you're interested, let me know by using the rather simple hint I've included in this comment - we all should have some whimsy in our life, y'know? Progress will be slow, but that is how I want it to be.
1. The 'compilation' model is similar to Forth with something I've been calling precognitive direct interpretation, similar to AOT/JIT designs. The closest similarity in other languages may be Smalltalk.
2. Fortran's parallel-by-default design is just monumentally better than the ad hoc async/await additions in other languages. Modern Fortran is crazy fast and fantastic, a totally slept-on language. There is a reason it's used in the HPC space at the top of the programming and data-intensive world. It's a really easy language to write and read as well.
Edit: Added more details on compilation and design, added twenty year project aside.
🧙‍♂️
Interesting! Inventing a new language is difficult, popularizing it even more so. I recommend writing a paper about what you did, so your ideas can have some influence even if people don’t actually use it. Actually trying it out is the next step for people who find it intriguing.
Not worried about that! I am more just designing this for my own fancy.
I'll be open-sourcing it[1], but I am more focused on developing a language for my own projects. I work on anything from global network security and infrastructure (cyphy, cyber & physical) for $MegaCorp1, to incredibly small embedded devices, and autonomous terran rovers and robots. I just want something that I can use in my day-to-day and only have myself to be angry at for my tech debt decisions. As most of my work is now hard security, I can't rely on dependency hell to be secure, and I don't have the time to go over every release and detail to ensure my clients and I are safe on their time. So fewer headaches, a general language I can use everywhere, and one that can easily run and execute other languages' code without a C ABI being needed.
As said, this is my "twenty-year project" or "twenty-year trip". If it's just me using it, that's wizard, no problem. If others jump on board? That'll probably bring some chaos, but I'll evolve and vibe with it as it comes. Of course, it would be cool if people got into it, but that's not the point. It's more like a long, slow painting, you know? Something I'll spend years pouring into, not for anyone else, but just for the pure joy of creation. So one day, I can look back and see the art in the machine, the chaotic harmony in its evolution.
If I were a painter, I'd probably be one of those artists who go totally avant-garde. I'd strip it down to the real essentials (grind my own pigments, carve my own brushes), crafting from scratch. I really just enjoy learning new things, not particularly for the result of the thing itself, but for the production. Creating and building things are the only ways you really learn. It's how the mind connects with the universe, the way thoughts turn into something real, become tangible, become something that binds to your helix.
I am sure I'll write a lot about it over the years. Perhaps not in formal papers, something I believe is an antiquated standard, but I will certainly devote time to writing on the design of the language as it progresses.
1. I'll likely be following what I wrote here. The more I think about it, the more I am fond of 'open-core' with more socialist characteristics.
I'm definitely with you on not wanting many users. More users, more problems is my motto.
I think collaborating on an open source project might be nice, though, where the other developers are also using it for their own purposes.
I pushed another release of repeatTest on Friday and continued working on performance, adding a couple new benchmarks for things that were particularly slow when doing searches. I started out fixing whatever looks slow in a profile, but the big improvements were from making the brute-force search algorithm somewhat smarter in cases I care about. Here are my notes with benchmark results.
And, finally, started writing documentation. Here's the page on Getting Started.
(Skimming it now, it seems rather terse.)
I have written so many tests that are an ad hoc version of this. Something like: an array of inputs with expected outputs, plus a for loop that invokes the test over the array. Having control over which individual tests run in debug is a nice addition.
Have you considered adding command line range control? It seems like a natural extension of the usual option "run only the tests that match this regex" to say "run only the tests that match this regex and only with these inputs". (Maybe you already have, I only read the getting started).
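For readers who haven't seen the ad hoc pattern described above, here is a minimal Python sketch of it: a table of inputs with expected outputs, driven by a loop, plus an optional filter for which cases to run. The `clamp` function, the `CASES` table, and the `only` parameter are all made up for illustration and are not part of repeat-test.

```python
# Hypothetical function under test; not taken from any library in this thread.
def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(value, high))


# Each case pairs a tuple of arguments with the expected result.
CASES = [
    ((5, 0, 10), 5),    # already inside the range
    ((-3, 0, 10), 0),   # below the lower bound
    ((42, 0, 10), 10),  # above the upper bound
    ((0, 0, 0), 0),     # degenerate range
]


def run_cases(cases, only=None):
    """Run every case, or only the indices in `only`; return the failures."""
    failures = []
    for index, (args, expected) in enumerate(cases):
        if only is not None and index not in only:
            continue  # skipped by the caller's selection
        actual = clamp(*args)
        if actual != expected:
            failures.append((index, args, expected, actual))
    return failures
```

Calling `run_cases(CASES, only={1})` re-runs just the second case, which is roughly the kind of control a command-line range option would expose.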
Since repeat-test is a library, not a test framework, it doesn't have any command-line stuff. Usually, I'm working on one failed test at a time, so I select a test using the test framework (which chooses which repeatTest call to run) and then edit the code inside that test to adjust which examples it runs.
I suppose it could be done with environment variables, though?
I haven't written any other documentation yet. Next, I need to explain the different ways of generating test data. (That's most of the library, actually.)
Looks like Deno test has a way of passing in additional arguments as globals. If you put in a hook that controls the selection of tests that repeat-test runs, maybe the default behavior is to read an environment variable, but people can add their own hook logic to connect with the test framework they are using. If it becomes heavily used, the test framework maintainers will probably add first-class support to their CLIs. Pytest (more my bag than JS/TS) supports CLI integration for tons of plugins.
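To make the hook idea concrete, here is a rough Python sketch, assuming a made-up environment variable `REPEAT_EXAMPLES` that holds a spec like `0,3-5`. Neither the variable nor the function names exist in repeat-test, Deno, or pytest; the point is only that the default selection hook reads the environment and a caller can swap in a different one.

```python
import os


def select_from_env(total, var="REPEAT_EXAMPLES", environ=None):
    """Default hook: parse a spec such as "0,3-5" from an environment variable.

    Returns the sorted example indices to run. With the variable unset or
    empty, every index from 0 to total - 1 is selected.
    """
    environ = os.environ if environ is None else environ
    spec = environ.get(var, "").strip()
    if not spec:
        return list(range(total))
    chosen = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            chosen.update(range(int(start), int(end) + 1))  # inclusive range
        else:
            chosen.add(int(part))
    # Drop anything outside the available examples.
    return sorted(i for i in chosen if 0 <= i < total)


def run_examples(examples, check, select=select_from_env):
    """Run `check` on the examples picked by the hook; return the indices run."""
    indices = select(len(examples))
    for i in indices:
        check(examples[i])
    return indices
```

A test framework integration would then pass its own `select` function instead of relying on the environment variable.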
Since you mentioned generating inputs, I wanted to mention some stress testing literature that might be of interest. This gets away from functional testing, but it may have some ideas that are useful.
The short version is that rather than fuzzing with random values, you fuzz with a dictionary of "interesting" values which are a mixture of possibly good and bad values. How many tests would you run with random inputs before you got 0, -1, +1? You mix good and bad inputs because if you have a test that takes two inputs, and you always give a bad first input, it may mask failure in the bounds checking logic of the second input.
The pass condition becomes an invariant: something that's supposed to be true regardless of the inputs. "Doesn't crash" is often the most basic invariant. But it could be "output is always between -1 and 1". The invariant you choose depends on the requirements you are testing.
Ballista – a good overview of the seminal work. This is where the idea of basing inputs on types comes from
ASTAA – deals with systems that have stateful behavior, more about invariant checking and generating test inputs.
HPSL – searching input spaces for sets of inputs that trigger failures
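A small Python sketch of the dictionary approach described above, under stated assumptions: the value dictionary, the function under test (`scale_to_unit`), and the invariant are invented for illustration and are not taken from Ballista, ASTAA, or HPSL. It tries every combination of "interesting" values, so good and bad inputs get mixed across both arguments, and it checks the "doesn't crash" and "output is between -1 and 1" invariants.

```python
import itertools

# Dictionary of "interesting" integers: ordinary values plus boundary cases.
INTERESTING_INTS = [0, -1, 1, 2, -2**31, 2**31 - 1]


def scale_to_unit(value, limit):
    """Hypothetical function under test: meant to map value into [-1, 1]."""
    if limit == 0:
        return 0.0
    return max(-1.0, min(1.0, value / limit))


def find_violations(func, dictionary, arity=2):
    """Call func on every combination of dictionary values.

    Returns a list of (args, problem) pairs where the call either raised
    an exception or produced a result outside [-1, 1].
    """
    violations = []
    for args in itertools.product(dictionary, repeat=arity):
        try:
            result = func(*args)
        except Exception as exc:  # "doesn't crash" is the basic invariant
            violations.append((args, repr(exc)))
            continue
        if not -1.0 <= result <= 1.0:
            violations.append((args, result))
    return violations
```

The number of combinations grows quickly with the dictionary size and the number of arguments, so a real harness would probably sample from the product rather than enumerate it.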
Thanks! Yeah, it certainly could be done. But I'm not sure how command-line arguments like that would be used? I often use print-based debugging, so I'm pretty comfortable editing code temporarily and I'm not sure when I would use it. What do people do with it in Python?
Since you're using Python, I'll mention that the shrinking part of repeat-test is inspired by Hypothesis, which is certainly far more advanced than my library. Apparently Hypothesis also keeps a local database of failed test examples, which means that when you rerun a test, it automatically re-runs the failed example first. And it keeps recent failures around to guard against regressions.
Maybe I'll add that someday, but I'd like to finish documenting what I have first and try it out testing things other than repeat-test itself. There are also very basic features I haven't added yet, like support for generating floating point numbers.
Regarding probability distributions, a simple, crude approach is just to do it on a case-by-case basis. I am doing a little bit of that: every Arbitrary has a "default value" and that's always run before going into random mode.
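As a rough illustration of "default value first, then random mode", here is a Python sketch. The `IntArbitrary` class and its `examples` method are hypothetical stand-ins, not repeat-test's actual API (which is TypeScript); the choice of 0 as the default when it is in range is also an assumption.

```python
import random


class IntArbitrary:
    """Hypothetical integer generator with a designated default value."""

    def __init__(self, low, high, default=None):
        self.low = low
        self.high = high
        if default is None:
            # Assumed rule: prefer 0 when it is in range, else the lower bound.
            default = 0 if low <= 0 <= high else low
        self.default = default

    def examples(self, count, seed=0):
        """Yield the default first, then count - 1 pseudo-random values."""
        rng = random.Random(seed)  # seeded so a failing run can be repeated
        yield self.default
        for _ in range(count - 1):
            yield rng.randint(self.low, self.high)
```

Running the default before anything random means the simplest example is always covered, no matter how unlucky the random draws are.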
I modified my fish shell prompt config to generate a random color based on the hostname when SSH is active:
I was struggling to write the following code inside of fish. But since this only runs once at startup I'm okay with it running in python: