9 votes

What programming/technical projects have you been working on?

This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?

14 comments

  1. tesseractcat
    Link
    I've been working on a markdown editor like Typora, but with vim bindings. It's mostly been a way to practice Rust, but I think it's turning out pretty decent. I started working on it because I wanted an editor with inline LaTeX math/markdown rendering, but I couldn't find anything that did that well and had vim bindings. I've also been working on adding multi-cursor support.

    9 votes
  2. [4]
    joplin
    Link
    A few weeks ago, I made a cheesy Perlin Noise-based terrain generator. I noticed that there was some z-fighting on the farthest-away parts of the terrain, so I looked up ways to deal with that and learned about reverse Z buffers. Initially, I was able to find lots of information describing the problem, but no actual descriptions of how to implement a reverse Z buffer. I tried a few things and eventually figured it out, though. It's a pretty interesting technique. Once that was implemented, I added fog rendering, too. Nothing fancy, just simple uniform fog because it's easy. It requires calculating the depth twice when using a reverse depth buffer, though, because you need the forward depth to calculate the amount of fog for a given point.
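    The reverse-Z idea described above can be sketched numerically. This is not joplin's actual code, just a toy illustration of the depth mappings and a simple uniform fog factor; the near/far/fog values are made up for the example.

```python
# Toy illustration of a reversed depth mapping and simple uniform fog.
# The near/far planes and fog distances here are hypothetical.

def standard_depth(z, near, far):
    # Conventional [0, 1] depth: near plane -> 0, far plane -> 1.
    return far * (z - near) / (z * (far - near))

def reversed_depth(z, near, far):
    # Reversed Z: near plane -> 1, far plane -> 0. Combined with a
    # floating-point depth buffer, this spends precision where the
    # standard mapping wastes it, which reduces distant z-fighting.
    return near * (far - z) / (z * (far - near))

def linear_fog(view_dist, fog_start, fog_end):
    # Uniform fog factor from the *forward* view-space distance
    # (1 = no fog, 0 = fully fogged). This forward distance is why the
    # depth is effectively computed twice with a reversed depth buffer.
    t = (fog_end - view_dist) / (fog_end - fog_start)
    return max(0.0, min(1.0, t))

near, far = 0.1, 1000.0
print(standard_depth(near, near, far))   # 0.0
print(reversed_depth(near, near, far))   # 1.0
print(linear_fog(50.0, 10.0, 90.0))      # 0.5
```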

    7 votes
    1. [3]
      thorondir
      Link Parent
      Have you considered writing a blog post about it, or something along those lines, to make the things you figured out public? Make it easier for the next person?

      2 votes
      1. [2]
        joplin
        Link Parent
        I have, but so far I hate every blogging platform I've tried. I also am not sure I want to spend the time to maintain a blog. I go back and forth on it.

        2 votes
        1. thorondir
          Link Parent
          That's very fair.

          If you do decide to publish it somewhere, do post it here on Tildes, I'd be curious. :)

          3 votes
  3. Edgeworth
    Link
    I'm making a site that is a mix of Twitch, Rabbit, and Netflix, made for small group media parties. You have the option to queue up movies stored on the server. Or you can screenshare via WebRTC, stream RTMP like Twitch, or paste a YouTube link (or anything youtube-dl can handle) and it will be queued up. There's a Twitch-style chat on the side, a row of webcams on the bottom, and even video reactions that show a reaction gif to everyone when you click a button. There's a voting system for what goes next in the queue. Since it accepts RTMP, anyone who has OBS can get their own room and stream through it once I give them a stream key. The whole thing works great with the 30-person Discord family I run it with; there's a 5-10 person party most nights. But it's not quite ready to show the world.

    A node.js server with socket.io powers a lot of it: the chat, the reactions, most of the other network stuff. When someone starts some content, it spawns an instance of ffmpeg which streams the media file via RTMP. An nginx server then serves the actual page, accepts the RTMP, and produces an HLS stream. The HLS stream is then used as the video.js source. A big challenge that I just tried and failed at was displaying embedded subtitles from an mkv movie as WebVTT subtitles in video.js, so that each individual could decide whether to have subtitles on or off. Ultimately everything that appears on screen goes through the ffmpeg RTMP stream, so dynamic stuff like that is difficult. Sooo now I'm considering writing a synced subtitle renderer from scratch.
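    The media-start step of a pipeline like this can be sketched as building an ffmpeg command that pushes a file to the RTMP ingest. This isn't Edgeworth's code (theirs is node.js); the URL, stream key, path, and encoding settings here are hypothetical, just to show the shape of the ffmpeg-to-RTMP handoff.

```python
# Sketch: build an ffmpeg command that streams a server-side media file
# to an nginx RTMP ingest, which then repackages it as HLS for video.js.
# All names (URL, key, path) are hypothetical.

def build_ffmpeg_cmd(media_path, stream_key, rtmp_base="rtmp://localhost/live"):
    return [
        "ffmpeg",
        "-re",                 # read input at its native frame rate
        "-i", media_path,      # source media file on the server
        "-c:v", "libx264",     # H.264 video for broad player support
        "-c:a", "aac",         # AAC audio
        "-f", "flv",           # RTMP carries an FLV container
        f"{rtmp_base}/{stream_key}",
    ]

cmd = build_ffmpeg_cmd("/media/movie.mkv", "room42")
print(cmd[-1])  # rtmp://localhost/live/room42
```

    In a real server this list would be handed to something like `subprocess.Popen` (or node's `child_process.spawn`), one process per active room.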

    I doubt I'll ever release it beyond my private Discord, but for our purposes it's basically a far superior successor to Rabbit. The whole thing came about because Discord's Go Live was just really bad and you can't stream through OBS. With this I can tweak every tiny parameter if I wish.

    6 votes
  4. moocow1452
    Link
    Android 10 and up lets you load developer drivers for certain apps, so I'm trying to kitbash together a driver that will let full-fat PC games run on phones, since the current Android driver is lacking in those areas to keep performance under control. I don't know the first thing about OpenGL and its derivatives, so if it's something I seriously want to look into, I'm going to have to learn to drive.

    5 votes
  5. [3]
    schwartz
    Link
    I'm working on (yet another) static site generator, but this one uses Gmail as an ingestion service for content (and comments! And responses to comments!)

    5 votes
    1. [2]
      Apos
      Link Parent
      That's quite interesting. Is it a bit like a mailing list? Does the site get regenerated when a new email comes in?

      3 votes
      1. schwartz
        Link Parent
        I am still fleshing out a lot of functionality, but it takes advantage of the fact that Gmail lets you put an arbitrary string in your email address by using a +. Each post or response gets a unique ID, and this feature lets you include that ID in your reply address: reply+SOMEID@yourdomain.com.

        At build time, I download new emails, validate that they are responding to valid posts, and then the site regenerates to include them. Email addresses are hashed to allow me to maintain a user identity without actually keeping any useful information around.
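        The ingestion step described above can be sketched in a few lines. This isn't schwartz's actual code; the function names and the domain are hypothetical, but the plus-addressing parse and the hashed-identity idea are as described.

```python
# Sketch: extract the post ID from a plus-addressed recipient and derive
# a pseudonymous author token from the sender. Names are hypothetical.
import hashlib
import re

def parse_post_id(recipient):
    # "reply+SOMEID@yourdomain.com" -> "SOMEID"; None if not plus-addressed.
    m = re.match(r"[^+@]+\+([^@]+)@", recipient)
    return m.group(1) if m else None

def author_token(sender_email):
    # Stable identity for threading responses, without keeping the
    # address itself around.
    return hashlib.sha256(sender_email.lower().encode()).hexdigest()[:12]

print(parse_post_id("reply+abc123@yourdomain.com"))  # abc123
```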

        I haven't decided if I'll be constantly polling for new email, but I kind of like the idea of running the build on a cron job. Maybe once an hour.

        3 votes
  6. [3]
    psi
    Link
    I'm currently fitting correlation functions [1]. In theory, correlation functions are extremely useful since they can be used to, for example, extract the energy spectrum of a particle created on a lattice [2]. In practice, it's the most tedious part of my PhD research.

    Essentially my task is to fit a sum of exponentials

    C(t) = A_0 e^{-E_0 t} + A_1 e^{-E_1 t} + ...
    

    in order to determine the wave function overlaps A_i and energies E_i given data vectors C(t), t and a covariance matrix for C(t) (examples: [3, 4]).

    I've written some code to perform a Bayesian non-linear least squares fit and output some pretty plots (the latter being basically the most important part of my research). In some sense, everything up to this point is "easy": there are a lot of concepts to wrap your head around, but at the end of the day you're just writing a fitter.
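    A heavily stripped-down toy version of the fitting step: recover E_0 from noiseless single-exponential "correlator" data by a linear least-squares fit in log space. The real fit is a Bayesian non-linear fit of several terms with a full covariance matrix; this only shows the basic extraction.

```python
# Toy fit: C(t) = A_0 exp(-E_0 t), so log C(t) = log A_0 - E_0 t and a
# straight-line fit to (t, log C) recovers the parameters.
import math

def fit_single_exponential(ts, cs):
    ys = [math.log(c) for c in cs]
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    # Ordinary least-squares slope and intercept.
    slope = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys)) \
        / sum((t - t_bar) ** 2 for t in ts)
    return math.exp(y_bar - slope * t_bar), -slope  # (A_0, E_0)

ts = list(range(1, 11))
cs = [2.0 * math.exp(-0.5 * t) for t in ts]
A0, E0 = fit_single_exponential(ts, cs)
print(round(A0, 6), round(E0, 6))  # 2.0 0.5
```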

    However, the tricky part is determining what constitutes a "good" fit. For each set of correlator data (there are more than 100 in total), I have to determine:

    1. How many terms to include in the expansion
    2. Which subset of the data to include in the fit

    That means balancing the following considerations:

    1. Obviously it's impossible to fit a function to infinitely many terms in an expansion, so I have to cut off the sum at some point. Therefore I need to make a choice for how many terms to fit.
    2. Consequently, the fit is sensitive to the time range I fit over. If I only include a single term in the expansion, for example, then it wouldn't make sense to include data from the earliest times, when the signal is almost certainly contaminated by truncated terms. I need to choose the earliest time that is minimally impacted by terms not included in my expansion.
    3. On the other hand, the noise dominates the signal at late times, so I also have to make a decision on the latest times to include.
    4. Occasionally there are random, correlated fluctuations in the data, which means I have to suss out whether the signal I'm fitting is actually real.

    For all these reasons, fitting correlation functions feels more like an art than a science. Fortunately we have cross-checks, and there is even some work on automating this process. But unfortunately nobody will trust your automated techniques unless you fit all the data by hand too.


    [1] https://en.wikipedia.org/wiki/Correlation_function_(quantum_field_theory)
    [2] https://en.wikipedia.org/wiki/Lattice_QCD#Fermions_on_the_lattice
    [3] Quick example (just imagine adding noise): https://www.wolframalpha.com/input/?i=plot+y+%3D+1+e%5E%28-+0.5x+%29+-++2+e%5E%28-x%29+%2B+0.5+e%5E%28-3x%29+from+x+%3D+0+to+x+%3D+20
    [4] More realistic example (fig 1 & the last few pages): https://arxiv.org/abs/2011.12166

    4 votes
    1. [2]
      skybrian
      Link Parent
      I don't understand this but I'll just throw out a question: would there be reasonable ways to decide this in a more application-specific setting?

      2 votes
      1. psi
        Link Parent
        Sorry, I'm sure my explanation wasn't perfect. As a bit of a tangent, one of the hardest skills to learn in grad school is how to effectively communicate technical results to any audience (of course, this is an important skill outside academia too). Grad school is somewhat paradoxical in that, compared to your average layperson, grad students know orders of magnitude more about their field; but grad students don't work with laypersons -- grad students work with other researchers, who themselves know orders of magnitude more than the grad students.

        So graduate school is basically the awkward teenage phase of academia. We know enough to understand our own research, but we don't know enough of the technical details to communicate effectively with our seniors, and we generally don't have the intuition to explain our research to a non-technical audience.

        Anyway, I guess my point here is that I wasn't purposefully trying to obfuscate my explanation with professional verbiage. I'm just using this post to practice explaining my work to others, which is something I'm admittedly not great at.


        So now to actually answer your question: fitting correlators basically amounts to staring at a few relevant plots and trying to use your best judgement to determine where the signal starts. For example, if you plot the effective mass

        m_eff(t) = log( C(t) / C(t+1) )
        

        you can work out that for large t, this quantity asymptotically approaches E0 (ie, the mass of the particle on the lattice, which is usually the quantity we're most interested in). But as I wrote before, the noise dominates the signal for late times, so the effective mass might not plateau until after the signal disappears.
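        On idealized data the effective-mass plateau is easy to see. A quick sanity check (not psi's code, just a toy with made-up A and E0 values): with a single noiseless exponential, log(C(t)/C(t+1)) equals E0 at every t, so the plateau is immediate; real multi-exponential, noisy data only approaches it at large t.

```python
# Effective-mass check on noiseless single-exponential "correlator" data.
import math

E0 = 0.7
C = [3.0 * math.exp(-E0 * t) for t in range(20)]

def m_eff(C, t):
    # log of the ratio of neighboring times; plateaus at the ground-state
    # energy once excited-state contamination has died off.
    return math.log(C[t] / C[t + 1])

print(round(m_eff(C, 0), 6), round(m_eff(C, 15), 6))  # 0.7 0.7
```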

        Nevertheless, with enough practice, you start to intuit what a good fit looks like (which you augment with statistics such as the p-value [1] and reduced chi-square [2] to check for goodness of fit). In that manner, some people have thought to use machine learning to determine these fits, as you can essentially train a machine to think like a human. Of course, machine learning algorithms can be a bit of a black box, so when a fit appears to go awry, it's hard to know whether the fault is with your judgement or the algorithm's.


        More generally, fitting correlators can be thought of as a data selection problem. To give a dumb example: suppose you're taking an introductory physics course and your instructor asks you to determine the radius of three balls on your desk. You and your lab partners make the following measurements of the three balls:

        Ball A    Ball B    Ball C
        10 cm     15 cm     12 m

        See the mistake here? A 12 m ball probably isn't going to fit on your desk. In principle you shouldn't exclude data from your analysis, but you might be justified in doing that if you have a physical reason for thinking your data is inaccurate (in this case, your lab partner probably made a typo).

        But if you start removing outliers willy-nilly, you risk biasing your data (sometimes that blip in your data is actually exciting new physics). In general, data selection problems tend to be about understanding the limitations of your instruments and less so about calculating statistics. To be slightly more technical, the easiest way to compare models in a Bayesian scheme is to compare the Bayes factor [3] of model A to model B, but Bayes factors can only be compared if the models share a common data set. In a data selection problem, the models do not.


        At least, that was more or less my thinking until I came across this paper [4]. The authors managed to convert the correlator data selection problem ("which range of times should I fit") into a model selection problem. Unlike data selection problems, model selection problems are easily amenable to statistical techniques (such as computing Bayes factors). Using the algorithm outlined in the paper, I can automatically select the best time range and the number of terms to include in my sum. And... it works pretty well! Sometimes the algorithm makes some bold choices, but even then they are usually pretty reasonable. And while the automated fits aren't quite convincing enough for us to use this algorithm exclusively, I can at least use this automated technique as a cross-check against the fits I pick by hand.


        [1] https://en.wikipedia.org/wiki/P-value
        [2] https://en.wikipedia.org/wiki/Reduced_chi-squared_statistic
        [3] https://en.wikipedia.org/wiki/Bayes_factor
        [4] https://arxiv.org/abs/2008.01069

        5 votes
  7. Apos
    (edited )
    Link
    I've been rewriting my Apos.Gui library. It's a UI library for MonoGame.

    A while ago I discovered IMGUI. It's a way to build user interfaces where, instead of setting up everything when the program loads, you set up the UI directly in the game loop. Every frame, you "rebuild" the UI from scratch (in reality, there's some magic behind the scenes to make things efficient). This way of building UIs is really convenient: it requires much less boilerplate code, and it's much easier to have dynamic UIs that change depending on various game states. Their main example is pretty good:

    ImGui::Text("Hello, world %d", 123);
    if (ImGui::Button("Save"))
        MySaveFunction();
    ImGui::InputText("string", buf, IM_ARRAYSIZE(buf));
    ImGui::SliderFloat("float", &f, 0.0f, 1.0f);
    

    This code gets executed every frame. As you can see, they put the "Save" button right into an if statement. When the button gets pressed, the if evaluates to true and you can call "MySaveFunction()". No need to use events and callbacks. If you want to do something you just write the code inline. You can show or hide the UI at will, add or remove anything between frames. From a developer perspective, this is really convenient.

    I didn't do a release for my library, but I managed to get my API pretty close to IMGUI:

    Panel.Put();
    Label.Put("Hello, world");
    if (Button.Put("Save").Clicked)
        MySaveFunction();
    TextBox.Put(ref superText);
    Slider.Put(ref f, 0.1f, 1.0f);
    Panel.Pop();
    

    Usually, IMGUI is best used for making debug tools. My UI library on the other hand is designed to do actual in-game UIs. I can have textures, animations, etc.

    This next week I'll be beta testing the API and creating more components as I need them.

    Edit: Cleaned up the API some more.

    3 votes