Is there a sane way to use Git as a glorified sync tool?
I am not a programmer, nor am I in IT, but I like to use some of the same tools they use. I use Emacs for writing fiction and I like it a lot. One of the packages I use with Emacs is git-timemachine, which allows me to visualize all the previously committed versions of the file I am currently working on. It serves as a very good and very reliable undo system. All my writing is in a private repo on GitHub. My usage is so simple and basic that Git/GitHub only serves as a kind of backup and undo (I know Git is not a backup, so I regularly download my repos as zips and send them to OneDrive as an extra. They are also always available offline on the machines I work on, of course).
The problem is, sometimes I work on different machines, and sometimes on different operating systems on the same machine (via dual boot). So I would like to know if there's an easy way to always "sync" the local mirror I am currently working on with the latest changes (also making sure that all changes are pushed). Essentially, I am asking if I can make Git work like Dropbox or OneDrive by automatically accepting changes as long as they are the most recent version of a file. I do not wish to go through diffs approving every single change.
I understand I could use something like rclone for that, but their bisync feature is still very new and not considered reliable. Also, I already use Git and it is good for me. So I would prefer not adding an extra piece to the puzzle.
I am familiar with cron, have an elementary understanding of shell scripts, and can follow instructions.
So, can Git do the job?
You should only be seeing diffs you didn't ask for if you have conflicts or you're doing a pull request sort of workflow. If you just pull + make changes + commit + push to the same branch every time, and don't mess with it from another computer between those steps, you shouldn't ever need to see them since your history can't ever diverge into multiple timelines.
As for automatically committing and pushing, not that I'm aware of. It's designed for use cases where you want the programmer explicitly saying "yes, I am done with this part and now wish to inflict it upon other people".
Aliases to the rescue. Do not do this for things that involve other people. The worst thing is needing to remember to pull before making changes.
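alias lou-edit='git pull --rebase && emacs'
# Single quotes delay $(date) until the alias runs; committing first avoids "pull --rebase" refusing a dirty working tree.
alias lou-push='git add . && git commit -m "$(date)" && git pull --rebase && git push'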
Also @lou, you can set up each machine to be able to push/pull to and from each other with SSH with no third party. It just requires a bit more setup.
That is essentially what ChatGPT suggested to me. But I didn't feel confident enough to just use a suggestion from an AI, since this repo is quite important to me. My machines are all set up for GitHub with SSH already, so no problem there. Thanks ;)
Have you looked at git-annex?
Interesting tool, thanks!
Looking at their website, I am having a little trouble understanding how I should use it. What do you have in mind?
Did you read the walkthrough? The basic idea is to set up each of your machines as repos and just call git-annex sync. Basically git-annex is an extension to git that aims to use git for file sync. The other major difference is that git is only used for metadata tracking, not the file content itself, which obviates the need for something like git-lfs if you have pictures or other large binary assets that you want to frequently edit.
I don't use it, because I have never needed what it offers (I'm just a regular boring software dev that uses restic for backups), but it seemed to fit your use case (git-like workflow for file sync across a set of decentralised remotes).
I'd strongly recommend against git-annex for your use cases. From what your needs are, it sounds like specifically and dangerously what you don't want. It is designed for very large directories with large files, potentially too large to be on a single machine at once. It is specifically not designed for files to be modified in most circumstances, in the way git would normally be used.
I use it, but in very different situations to what you use git for.
The simplest way to move to a new machine is to commit your latest changes, push them to Github, and then pull them on the new machine. As a software developer, I wouldn’t automate this process because I don’t switch machines all that often and I want my commit history to be clean. It wouldn’t do to commit unfinished work that doesn’t even compile, and then publish it by pushing to Github. Also, I’d want to write a commit message describing each change.
As a writer, you probably don’t care about that, so you’re off the beaten path a bit, but setting up syncing should be doable. One way to partially automate it might be to have your editor update GitHub when you save a file?
I did a quick search and found an emacs package that will automatically commit and (optionally) push on save. It has settings to customize how it works. I haven’t tried it, but it seems suitable?
https://github.com/ryuslash/git-auto-commit-mode
I don’t know how familiar you are with customizing emacs. Please ask if their directions are unclear. (There is an INSTALL file explaining how to install it.)
That script will ensure that the Github repo is always up to date, but the next step would be to do a pull before starting work on your new machine. Since it’s just one command, I would do that manually.
If it is something along these lines you're looking for, I also found Git Auto Sync. It would automate both pushing and pulling, but works on a 10 minute interval instead of immediately when saving.
This is the purpose of GitHub and remote repositories. In the middle of something but if someone else doesn’t answer by the time I’m done I can help
So obviously you've gotten a lot of help since I posted this. That said, I think the main thing I'd add is maybe looking into VS Code plus its GitHub extension + something depending on the limitations of your setup. It seems you prefer Emacs, which is fine, but I'm unclear if you always have access to it.
It's very trivial to configure git in vs code to just commit, push, and pull on a button click once you check some "yes i'm really sure I want to do this" boxes. I run dendron for taking office notes and found it to be excellent for that, and it helps abstract away the coder centric nightmare git can quickly become.
Of course that being said if VS code can do it for you then yeah it's not too hard to just do it yourself. For awhile mine was borked because "??" so I just did
or something roughly along those lines.
That basically breaks down to:
For pulling them it's either just cloning the repo to a new machine, or for a machine that's already got the repo and out of date:
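A minimal sketch of the kind of manual commands described here, written as two shell functions. The function names are made up for illustration, the remote is assumed to be called origin, and the commit message is just a timestamp. It is one reasonable way to do it, not necessarily the exact commands the poster ran.

```shell
# Commit everything in the current repo and push the current branch.
# Assumes the remote is named "origin"; the message is only a timestamp.
save_and_push() {
    git add -A &&
    git commit -m "Snapshot $(date '+%Y-%m-%d %H:%M:%S')" &&
    git push origin HEAD
}

# Bring an existing, out-of-date clone up to date.
# --ff-only refuses to merge, so it stops instead of asking you to resolve anything.
catch_up() {
    git pull --ff-only origin "$(git rev-parse --abbrev-ref HEAD)"
}
```

Run save_and_push when you finish on one machine and catch_up when you sit down at another. If catch_up fails, that machine has commits the remote does not, which is the situation worth looking at by hand.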
All that said I'm actually still a little unclear on the majority of these other solutions with rebase. I'm not a great git user but a friend is, and his advice was avoid things like rebase unless you have to. I'm a little curious still as to WHY git is asking you to do merges (which are conceptually simple things that are just hidden behind some awkward git workflow if you'd like to learn how to handle them)?
That's a concern from my POV because, as others pointed out, you shouldn't be seeing those depending on your workflow, and if you are I just want to be sure you're not somehow losing data. It's most likely I'm misunderstanding and you just assumed you'd have merges and didn't, or the merges are for mild formatting things like the removal of an indent or something, but as always with "oh just do this" in git it can quickly lead to unintended consequences.
That said, git is rarely truly destructive, so if you DO suddenly have that sinking "oh shit" moment and realize you have somehow gotten yourself in trouble, i'd highly recommend making another topic here or somewhere to get help navigating back to a satisfactory state with some help. The issue with the stack overflow and maaaaybe AI help (it does seem better) is you can sometimes walk yourself off a cliff.
Oh also you might want to check out gitjournal on your phone, as it allows you to do some light text editing of git, as if it was a journal, from mobile. A lifesaver for those "damn i'm not at a computer/have nothing to write with" moments so you can get the idea down and flesh it out later.
Incidentally, just from a quick glance, I think Magit can do everything this does (and much more).
Thanks!
Yes, of course. I don't need something as comprehensive as Magit, so I use git-timemachine.
So what is the actual problem? Every time you try to commit and/or push changes you have made, you have to review a diff of those changes and approve them?
That sounds like you're trying to fix a merge conflict.
Generally, there are two ways to handle merge conflicts:
Option #1 (usually) works fine if you are the only person working on a repository. Option #2 (usually) is preferred for collaborating with other people.
I'll leave it up to you to figure out what branches are, how to use them, etc.
For option #1 though, here is what I mean:
Every time you clone a git repo to your computer, you get a local working copy of that repo. Your working copy is the same as the remote copy hosted on GitHub. It is "clean" and "in-sync".
When you make changes to the (local) repo on your computer, you are making the working copy "dirty", until you successfully commit and push your changes to the (remote) repo on GitHub. After you push your changes, your repos (local and remote) are in-sync again. They once again have the same state, you once again have a clean working copy on your computer.
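To make "clean", "dirty" and "in-sync" concrete, here is a small sketch of a check you could run on any machine. The function name repo_state is made up for illustration, and it assumes the current branch tracks a remote branch (which is the case after a normal git clone).

```shell
# Report whether the working copy is clean or dirty, and how many commits
# the local branch is ahead of / behind the branch it tracks on the remote.
repo_state() {
    git fetch --quiet || return 1
    if [ -n "$(git status --porcelain)" ]; then
        echo "dirty"   # uncommitted changes exist
    else
        echo "clean"
    fi
    # rev-list prints "<ahead> <behind>" for HEAD versus its upstream.
    set -- $(git rev-list --left-right --count 'HEAD...@{upstream}')
    echo "ahead=$1 behind=$2"
}
```

A machine is fully in sync when this prints clean with ahead=0 behind=0. A nonzero ahead means there are commits still to push, and a nonzero behind means there are commits still to pull.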
If I had to guess, you are encountering merge conflicts because you are working on the repo on Computer1, but not committing and pushing your changes when you are done working (probably because the changes themselves aren't done; for example: not having fully rewritten a paragraph yet).
So Computer1's (local) working copy is dirty. Its changes have not been committed and pushed to the remote repo yet, so it is "out-of-sync".
Then later on, you work on the repo some more from Computer2. This time you actually do commit and push your changes, such that Computer2's (local) working copy is clean, and in-sync with the remote repo.
Now, Computer2 and the remote repo are in-sync with each other, but out-of-sync with Computer1's repo.
So you go back to Computer1, finish your work there, try to commit and push the changes, and experience a merge conflict.
The key to avoiding this situation is to always commit and push your changes when you are done working, not when the changes themselves are finished.
Finished rewriting a paragraph? Commit-push. Started rewriting a paragraph, but now you have to go change a diaper? Commit-push.
Always commit and push your work when your work session is done. Religiously do that, and you will always maintain a clean working copy in every instance of that repo, on every machine.
Again, this way of working with git does not universally work well. It generally is not recommended to collaborate with other people this way.
Also, a concern with committing and pushing unfinished work is that it makes it difficult to find and pick up where you left off. Personally, I litter TODO: comments throughout my files (prose and code) to help me get back to work later on. For prose, I also sometimes wrap sentences and paragraphs with curly braces ({}) to signify to myself that I want to rewrite whatever is contained within the braces.
This workflow might be good for you, or you might prefer branches, or you might prefer to use something other than git entirely. This is what works for me.
I sync my home folder with git (between many machines) and syncthing (between two machines). Syncthing might work well for you because it has automatic conflict resolution and backs up what would be "lost" under git (hidden behind automatic merge conflict resolution) as a *.sync-conflict.* file which may be easier for you to recover.
Here is something I run frequently:
The clean_home function syncs git automatically by doing something like this:
Using this technique of staging changes, pulling, and resetting the stage eliminates a lot of handholding that git wants when it sees incoming changes modifying an unstaged file.
Related files which may be important:
Here is how I use git in a "centralized" sync context (similar to rsync--though bidirectional sync works unhindered... it's just controlled by the client).
It's useful for if you want to push commits directly to a web server or something like that. I don't know if it's exactly what you want but reading your title reminded me of it and some of your words make me think this might be helpful:
On the "server"
On the "client"
But if you work with multiple people this type of configuration will lead to merge conflicts on the server HEAD which is why bare repos are used for collaborative contexts 99% of the time
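One way to get a setup like this is Git's receive.denyCurrentBranch setting with the value updateInstead, which lets a non-bare repository accept pushes to its checked-out branch and updates its files to match. This is an assumption about the kind of configuration meant above, not necessarily the exact commands used, and the function names and the remote name are made up for illustration.

```shell
# On the "server": accept pushes to the checked-out branch and update the
# working tree to match. Git refuses the push if the server's tree is dirty.
setup_server() {    # usage: setup_server /path/to/repo
    git -C "$1" config receive.denyCurrentBranch updateInstead
}

# On the "client": register the server as a remote (once) and push the
# current branch to it.
push_to_server() {  # usage: push_to_server <remote-name> <url-or-path>
    git remote get-url "$1" >/dev/null 2>&1 || git remote add "$1" "$2"
    git push "$1" HEAD
}
```

For example, after setup_server has been run once on the server, running push_to_server web user@host:/srv/site from the client (with a placeholder address) would make the server's checked-out files match the client's latest commit.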
Syncthing is awesome for asynchronous / non-concurrent cross-device document editing. It even works well across more than two devices. The only caveat is that synchronization is purely peer-to-peer, so either you need both devices turned on at the same time for a bit to let the changes sync, or you could set up a server/VPS that is always on and propagates any changes made on each of your end devices.
I have a Raspberry Pi connected to my router for exactly this purpose, it works great.
Personally I do not recommend trying to automate or hide any of the complexities of git with cron jobs or layering on additional tools--this just adds more potential surface area for things to go wrong, and if you're not actively involved in the sync process it could stop working and you might not even realize it until it's too late.
I use git for the exact purpose that you state for my own writing and I think once you engrain the necessary routine to muscle memory it's not too bad (admittedly I use git daily for my job too so it comes very naturally to me). You need to git pull changes from the server before you start writing to sync up your local machine, git add + git commit whenever you want to save a snapshot of your current state, followed by a final git push to sync everything you've done in this writing session back to the server when you're done (or click whatever buttons do the equivalents of those commands if you're using a graphical git application). It's really just those three things to ensure you're always in sync everywhere and don't lose any work. The only situation where you'd ever potentially need to look at diffs and approve things is if you forgot one of those steps and had conflicting work on two machines simultaneously, but once you've got the routine down to muscle memory that should hopefully not happen very often or at all.
I have no answer for your question, just a small anecdote. A few years ago I read in a few sources that git does not work well with big files (mostly from a performance perspective, as I recall). I do not have such experience, and maybe it's not an issue anymore. So please correct me if I'm wrong.
This is especially true for large binary files or anything that doesn’t merge well. But if it’s for writing text files, even pretty large ones, it’s unlikely to be a problem.
There is git-lfs for that. Otherwise you can run into a problem if you are regularly changing a large binary asset as you'll end up bloating the history for everyone.
Thanks!
I don't know, I don't work with big files. Just fairly small text files.
The problem is not big files per se, but non-text files. Git stores each committed version of a file as a snapshot and then compresses similar versions against each other, which works very well for text because only a few lines change between versions. For binary or already-compressed files this compression tends to work poorly, so any small change effectively makes git store a whole new copy of the changed file in its history. And especially for bigger files, this quickly adds up.
Yeah, for truly large files there are tools like git-lfs anyway. And for non-technical work, you'd be very unlikely to run into this with the types of files git handles well. It's gonna be the binary files that cause difficulty.
Thanks for the explanation, makes sense. Following this logic, we should avoid putting archives into git, or anything that is not clearly structured text.
It depends what it's an archive of, but yeah, git isn't really a suitable tool if you want to save a bunch of binary files.
Since I don't think anyone has said this yet, and I think it's the simplest solution:
(This is assuming your branch is master and your remote is called origin.)
This will reset your local branch to be exactly the same as it exists on GitHub. No merging, no rebasing, it just overwrites your local repository with the head of your branch from GitHub.
You will lose any local changes, so make sure you don't want them and push them first if you do.
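A sketch of what this looks like as commands, using git fetch followed by git reset --hard. The function name is made up for illustration, and the defaults follow the assumptions stated above (remote origin, branch master).

```shell
# Throw away local commits and uncommitted changes to tracked files, and make
# the local branch identical to the remote one. Untracked files are left alone.
reset_to_remote() {  # usage: reset_to_remote [remote] [branch]
    remote="${1:-origin}"
    branch="${2:-master}"
    git fetch "$remote" &&
    git reset --hard "$remote/$branch"
}
```

Because this discards work without asking, it is worth running git status first to see what would be lost.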