Dealing with external repos: submodules or what


If you have a project, you are inevitably going to depend on some external software. Most folks just copy a file into their own project. This works and is easy, but it means that if there is an update, you won’t get it.
The other option is to have buckets of random GitHub repos on your machine. It is nice that they are source controlled now, but on the other hand, you can’t get a reproducible build if you are incorporating various parts and they update under you.
Also, if you are copying things from GitHub or some other source, there is nothing to prevent someone from just deleting or renaming the repo entirely. All of these things have happened to me.
So what’s a person to do? Well, there are a couple of different alternatives, but here are the things that you should do:

  1. Fork all repos you are using. That is, if you are, say, using richtong/src because you love it, then you should fork it. This means that you will always have the source even if the other guy deletes his. It does mean that every so often you have to update your fork with the latest fixes, but this is way better because you know exactly what you are building.
  2. Get upstream fixes and sync forks. First you need to tell git where the upstream is, so clone your fork to your local machine and then do a git remote add upstream _upstream repo URL_. When you want fixes, do a git fetch upstream, which brings in those fixes, and then merge them into your fork. Assuming that you don’t make changes to that repo, a git merge --ff-only upstream/master should work (substitute the upstream’s branch name if it isn’t master). You do a fast-forward only as a check to make sure you didn’t accidentally mess with something.
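
The fork-and-sync steps above can be sketched roughly like this. It is only a sketch under assumptions: the directories upstream, fork.git and local are throwaway stand-ins created in a temp directory (not real GitHub repos), the demo identity is made up, and the branch is assumed to be called master. With a real fork you would clone from your hosting service and pass the upstream's URL to git remote add instead.

```shell
#!/bin/sh
# Sketch: keep a fork in sync with its upstream, using throwaway local repos.
set -eu

work=$(mktemp -d)
cd "$work"

# Identity for the demo commits only (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "upstream" stands in for the other person's repo.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo v1 > upstream/lib.txt
git -C upstream add lib.txt
git -C upstream commit -q -m "initial release"

# "fork.git" stands in for your fork on the hosting service.
git clone -q --bare upstream fork.git

# Clone your fork, then tell git where the upstream lives.
git clone -q fork.git local
cd local
git remote add upstream "$work/upstream"

# Upstream publishes a fix after you forked.
echo v2 > "$work/upstream/lib.txt"
git -C "$work/upstream" commit -q -am "fix a bug"

# Bring the fix in; --ff-only refuses to merge if your fork has diverged.
git fetch -q upstream
git merge -q --ff-only upstream/master

# Publish the updated branch back to your fork.
git push -q origin master

cat lib.txt   # prints v2
```

If the merge step ever fails, that is the check doing its job: it means the fork has commits the upstream doesn't, and you need to decide what to do with them before syncing.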

Now this gives you control of the repos you have, but how do you make sure that all the repos are synchronized properly? Here you have a big decision: there are basically three different mainstream methods for doing this.

Git Submodules: The most common, the strangest

This is the easiest to start with, but it can lead to all kinds of problems, mainly because the semantics are a little confusing. If you only ever pull from sub repos, though, this works pretty well.

  1. This is the most commonly used and the most disliked approach, mainly because it is pretty unintuitive what is going on. Submodules are built into git, but are a little difficult to understand.
  2. What happens is that if you do a git submodule add you will find that you have their repo inside of yours. What happens now is a little weird. When you are in the master repo, you have one commit history, but when you dive down into one of the sub-repos, you are actually living in a different commit history. The only thing that binds things together is a magic file called .gitmodules, which tells git which directories are actually sub repos and where to fetch them from. The exact commit to check out for each one is recorded in the master repo’s own commits.
  3. So when you clone a repo like this, you *don’t* get any of the lower level repos. Instead you have to remember to git clone --recursive richtong/src, which gets all the submodules if they exist.
  4. Now before you actually compile and run, you need to make sure your sub repos are checked out at the right commits. It’s a little involved, but you have to do a git submodule sync --recursive && git submodule update --init --recursive. The first command does a sync, so if any of the repos have new URLs, you pick them up. The second makes sure you have the right commit for each submodule.
  5. If you decide to reorganize where you put them, then you can just do a simple git mv _old location_  _new location_ and this just works. Git knows how to adjust the .gitmodules so that it all points correctly.
  6. The biggest gotcha is that git status doesn’t show you what is happening inside submodules by default, so you should do a git config --global status.submoduleSummary true so you can see what’s going on down there.
  7. When you want to remove a submodule, git rm does that properly in recent versions of git.
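
To make the list above concrete, here is a rough end-to-end sketch of the submodule lifecycle: add, clone, sync and update, move, remove. Everything named here is an assumption for the demo: lib, app and app-clone are throwaway local repos in a temp directory, extern/lib and vendor/lib are made-up paths, and the identity is fake. Because the demo uses local paths as submodule URLs, recent git versions need protocol.file.allow=always passed per command; with ordinary https or ssh URLs you would not need that. The status setting is applied to the one repo here rather than with --global so the demo doesn't touch your own config.

```shell
#!/bin/sh
# Sketch: the submodule lifecycle on throwaway local repos.
set -eu

work=$(mktemp -d)
cd "$work"

# Identity for the demo commits only (hypothetical values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Demo-only: allow local-path submodule URLs (not needed for https/ssh URLs).
allow="-c protocol.file.allow=always"

# "lib" stands in for the external repo (ideally your fork of it).
git init -q lib
git -C lib symbolic-ref HEAD refs/heads/master
echo v1 > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m "lib v1"

# "app" is the parent project that embeds lib as a submodule.
git init -q app
git -C app symbolic-ref HEAD refs/heads/master
cd app
echo app > README
git add README
git commit -q -m "start app"
git $allow submodule --quiet add "$work/lib" extern/lib
git commit -q -m "add lib as a submodule"

# .gitmodules holds the path and URL; the pinned commit is in the parent's tree.
cat .gitmodules
git ls-tree HEAD extern/lib   # mode 160000 marks a submodule entry

# Clone the parent together with its submodules.
cd "$work"
git $allow clone -q --recursive app app-clone
cd app-clone

# Re-sync URLs, then check out the commit the parent has recorded.
git submodule --quiet sync --recursive
git $allow submodule --quiet update --init --recursive

# Show submodule changes in git status (repo-local for this demo).
git config status.submoduleSummary true

# Move the submodule; git rewrites .gitmodules for you.
mkdir vendor
git mv extern/lib vendor/lib
git commit -q -m "move lib to vendor/"

# Remove the submodule again.
git rm -q vendor/lib
git commit -q -m "drop lib"
```

The moves and removal happen only in app-clone, so the original app repo still has lib at extern/lib if you want to poke around afterwards.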

Other approaches and exotica

I actually haven’t tried these yet, but it is supposed to be better to use git subtree. There is also Google’s repo tool, which is tuned for Android.
