Fun programming challenge: figure out which sets of passports grant visa-free access to the whole world
Hey there,
I wanted to know which sets of passports grant together visa-free access to every country in the world, but I could not easily find that info online. So I figured that I could try to write a small program to determine these sets of passports myself, and then it occurred to me that it would probably be a fun programming challenge to organize, so here we go.
Here's the challenge.
- Scrape the data you need for instance from The Henley Passport Index.
- Design a clever algorithm to efficiently find out which are the smallest sets of passports that will grant you visa-free access to every country in the world.
- Optional. Allow the user to specify which passports they already hold and find out which sets of passports would complement their passports well.
- Optional. Rank the sets of passports by how easy it is to acquire citizenship in those countries.
The choice of the programming language is yours, bonus points if you write it in assembly 😂
Feel free to collaborate and share your solutions (the algorithms and the actual results) in the comments, and feel free to share your own twists to the challenge that could make it even more fun & interesting.
The person with the most clever, efficient and elegant algorithm wins!
Happy coding folks!
Some countries are jealous and won't let you keep your old passport. For example Japan. EU countries tend to tolerate multiple nationalities as US does.
Since United Nations has 193 members and both EU and Japanese passports will give you about 181-194 visa-free accesses, the most practical approach would be to just get one of those.
The top EU and Japanese passports grant you access to 194 countries, which means you get the whole Earth + one country on a different planet of your choice[citation needed].
Having two citizenships is pretty restricted in Germany too. There are plans to make it a bit easier though.
A friend once told me, that there are ways around that, at least for them. Don't know how accurate that is.
In case of Germany, you are allowed to have dual citizenship if the other country is in EU. Also if your country of birth does not allow you to renounce the citizenship. I'm sure there are some other deals or restrictions that I am not aware of.
They actually recently passed a law changing this in Germany, now allowing dual citizenship, which goes into effect in April.
Though as an immigrant I had heard stories of people pulling similar tricks to what you describe in the past. Not sure how well they'd work nowadays, but ig no more need anyway.
Even the most powerful passports won't let you travel in the whole world visa-free unfortunately.
UN diplomatic passport can be pretty handy https://i.imgur.com/ZGagXdy.jpeg but I suppose that doesn't count?
Well you can include it in your solution if you want, even though most people would not be able to get a hold of one.
That was a joke. :-)
Oops, my bad! (:
I definitely won't have time to write anything for this but I did get curious about one issue related to this.
Easy example, Afghanistan. Top of the list. If seems you can't travel to Afghanistan without a visa at all. In this hypothetical, would your solution include an Afghani passport then, since that's the only way to travel without a visa?
Good observation! I think that I'll just let participants decide how they want to handle such edge cases.
I won't have time until tonight, but this could be written as a linear program. Or, more specifically, a (0,1) integer linear program. The objective function would be a summation of Boolean variables that represent each passport, with the goal of minimizing the function. The constraints would be the trickier part, but I think I could get it going with a little tweaking.
Alternately, I'd probably load the data into a database or some other tool that allows me to use set-based operations.
@hydravion
As promised, I wrote an integer linear program to solve this problem. It's not exactly what you envisioned, I suspect, but it's what seemed to me the ideal way to solve the problem, and linear programming is an underused tool in the IT world.
Step 1: I downloaded a csv of the passport data, re-ordered a few columns so the matrix was symetric along the diagonal, and transposed the values so as to prepare the data for Octave's expected format.
Step 2: I collapsed all the values on visas status to either 1 or 0; 1 if no visa is required (or granted automatically on entry), and 0 if a visa is required.
Step 3: I used some Excel and Notepad++ trickery to prepare the text of the linear program.
Before I show my linear program, which is large, let's look at a sample one I wrote. Below is a linear program to solve a related set covering problem (scroll down to numerical example) that is easier to conceptualize.
Basically, we sum the number of cameras needed to cover all the parts of the stadium. You should be able to see the correlation between the variables I've defined above, and the mathematical representation that I linked.
Now, in contrast, my LP will have a 199x199 constraint matrix, and a bunch of 199x1 and 1x199 other variables. In all of its glory, here it is:
code was too long for the post, so moved to pastebin
Step 4: We save the above to a file, and run it in Octave.
After running the program, which takes fractions of a second, we get a solution array with an entry of 1 in the spot corresponding to each country whose passport is part of the solution set. Those are:
To understand the results, we look at our data, which for certain restrictive countries which only allow themselves or one or two others, the minimal solution set to visit every country in the world would require a passport from that country or one of the few it accepts.
Additionally, we know this is an optimal solution, but we don't know if it is unique. I.e., there could be other sets of 13 passports that grant access to every country in the world.
Honestly, after futzing with the data, the actual LP writing was just a few minutes. So this is probably the most efficient way in programmer time to solve these sorts of optimization problems.
I also hand wave away all the questions such as whether the ILP is in a convex space, but I'm not going to get that deep into it.
Have a great night!
I made a quick adjustment based on comments from @xk3 about the years to acquire the passports. I quickly went through the CIA factbook and updated my spreadsheet to include years to naturalization. When a country prohibits naturalization without proving ancestry, I gave them a penalty years of 50, which will effectively exclude them from the solution. When the requirement was a range of years, I went with the lower end to see what the best case would be if you were clever. When different years were provided based on ancestry, I went with the value that didn't discriminate based on ethnicity.
I then changed the objective function in the ILP from 1 for each country's variable to the years for naturalization, which is basically a 199x1 array of years. After running it, the initial results are:
Years required: 116* (actually 61, see below)
Solution:
Afghanistan, 5 yearsNorth Korea, 50 yearsNow, the elephant here is North Korea, which recognizes no passport for visa-free entry, so let's exclude it as an infeasible constraint. Additionally, Afghanistan is likely only included for entry to itself as well, although technically I think it recognizes one or two others. So let's give it a penalty value of 50 and see what we get.
After running, I found that Afghanistan was still included, meaning it doesn't accept any of the other solution members' passports, and no other passport inclusion was more optimal.
So, removing Afghanistan and North Korea, the actual optimal year is 61 years to be able to travel visa-free to almost every country in the world.
A fun extension would be to make a tool that prompted people to rate countries in the order they would like to live, rank them again as interested in visiting, and build a program that would optimize for that user's preferences.
Presumeably one could force the program to use passports one already owns by putting them with a naturalization weight of 0? It could be interesting to see how it would change if you already had a starting point of an EU option or US as neither of those are included.
Surprisingly, that wouldn't guarantee it being selected. Because the underlying algorithms often traverse the boundary of the solution space, assuming linearity or convexity, it isn't a guarantee that a freebie would be encountered before another optimal solution. The introduction of a freebie would possibly introduce the existence of multiple optimal solutions.
So force a specific country in the solution, we would make it 0 cost like you said, and add a constraint forcing its inclusion.
Great comment!
I'm wondering how we could easily tell which countries each passport adds to the set of countries one will be able to visit visa-free to know which passports we can choose to forgo to reduce the number of years we need to dedicate to acquire them.
I suppose that if you wanted to minimize the number of years dedicated to acquiring passports while still being able to access most of the world visa-free, you would need to go with a combination like Canada + Australia + Singapore, but this is just me guessing, and I don't really know how many countries (and which ones) would still be left inaccessible visa-free. According to the data given by @xk3 this would probably take only around 9 years? This seems much more manageable, although it may not be the most optimal combination if you already own a powerful passport, like the French or German ones?
The best way to do that is probably a filterable spreadsheet with some conditional highlighting logic that highlights duplicates in one color, and unique value in a second. So filter to your optional solution, and visually inspect.
We could do it programmatically, but it might be overkill.
Your second part is exactly what I envisioned worth reformulating the problem to once where we maximize countries visited subject to the constraints that limit the time spent, and force inclusion of passports we already have, or countries we want to live in.
Great questions!
Also, it would be interesting to see in comparison how many days it would take / how much money you'd need to spend to get a set of visas rather than another passport. Of course getting a visa and being able to use it to enter a country is not a "given" but neither is getting a passport a "given" for many countries
For anyone interested - there's a great SoME video on linear programming that might explain what a linear program actually does: The Art of Linear Programming. The chapter Integer Linear Programming - at 14:01 - is particularly relevant, but the rest of the video leading up to that is probably required reading if you aren't already familiar.
Also see the docs on
glpk
to get a sense of the format of these constraints. I'm going through to try to understand exactly how this formulation works, since I've never written a linear program before. To really spell things out (and as an invitation for corrections), here's my understanding of the representation:Each row of
A
represents a destination, and each column represents a passport. The variablex
will represent the number of a given passport held, and then the matrix multiplicationA x
represents the number of held passports that grant access to a given destination.ctype=L, b=1
indicates a lower bound only on the result the equation is of the formA x >= 1
- we want to hold at least one passport that grants access to each country. We could discard a given destination or require holding more than one passport that grants access by changing the value ofb
for that country.lb = 0, ub = inf
constrains the variables - we can't have less than 0 or more than infinite passports. We could indicate we already hold a particular passport by changing the value oflb
on that country.Aside - it doesn't matter here since this is a minimization problem and taking a given passport more than once doesn't affect the cover, but I wonder if it is better to set
ub=1
? I'm not familiar with Octave to know the impact on performance that would have.vtype=I
indicates this is an integer problem, we can't hold fractional passports.sense=1
indicates this is a minimization problem, we want to hold as few passports as possible.Then the linear program runs and finds a vector
x
that lists the count of each passport that we need to hold.I made
a fewa bunch of edits here for clarity as I'm understanding the problem better. @krellor could you please let me know if I've misunderstood something?Hey krellor, that rocks! I like your unusual angle, I actually didn't even know about such methods!
Is there a way to list all the optimal solutions that exist using this method?
PS - Interesting to see that many powerful passports fail to grant you visa-free access to India, but the Maldives' passport does, and also that Singapore's passport grants you visa-free access to China.
There are a few obscure algorithms that could be used to enumerate all optimal solutions in polynomial through the use of lattices in polyhedra. We could also determine the family of optimal solutions by calculating the convex hull (shape or box) around all optimal solutions. There are probably lots of visualization tools that would make it easy to inspect.
But in this case I suspect that there are only a small number of optimal solutions, and we might find them by perturbing our initial position or objective function. For example, if we know the optimal is 61 years and we remove Afghanistan and North Korea, then we could change the problem to maximize the number of countries visited, while futzing with the constraints to selectively include our exclude countries in our original optimal. Or simply force the inclusion of one of the strong passports and see what happens.
We could also use repeated iterations of things like simulated annealing whose initial condition is situated near the optional solution.
Finally, we could perform a constraint reduction algorithm that would dramatically decrease the number of constraints, and may allow us to explore the solutions better.
Sorry to give a complicated answer. The gist is that yes, there are ways to do it, but they aren't trivial.
I wrote a website that helps plan multiple passport trips: https://unli.xyz/city/honeymoon/ I wrote it to help plan honeymoons with my wife but it might also be useful for people planning conferences.
For people that own multiple passports I have another website which lets you choose "solo travel" in the passport-visa country filter: https://unli.xyz/city/calc/
krellor answered the programming challenge aspect of this question pretty well, though you will probably want to consider how difficult it is to get a passport. If they are ordinary passports most countries require you to be present in the country for a certain length of time--and you might not live long enough:
So there's a few ways we could incorporate this into my ILP. Since the objective function that we seek to minimize is a 199x1 array of boolean variables, simple changing the coefficients from 1 to the number of years it takes to get a visa from that country would then minimize the problem of the least years it takes to collect them.
Additionally, for the spirit of the problem, certain countries like North Korea should just be left off, along with any country that doesn't have any visa free travel.
We could also reformulate the problem into a maximization problem where we try and maximize the number of countries we can visit visa free subject to a years waiting constraint, such as 20. So you could set a number of years and be given a solution that maximizes what you get for it.
We could also introduce weights to indicate countries of preference.
Have a great day!
Thanks for linking your websites! They look like interesting tools.
Scraping the data feels like it's the biggest of the 4 challeneges. From a brief glance at the website I think going through all the country codes and visiting the URL -> https://cdn.henleyglobal.com/storage/app/media/HPI/AD_visa_full.pdf and then somehow parsing the PDF is the best idea I can think of.
But once you have the data, say as a 2d array indexed on the countries, it seemed to be that this was a known problem. Using chatGPT I found out that this problem is the optimization version of the set cover problem. wiki link
Good news, everyone! The same data is also available as JSON: https://api.henleypassportindex.com/api/v3/visa-single/AD
Nice! A quick Google search also yielded this and this. Thanks again!
Sharing my scraping and problem setup Python script - there is no solution here. I try to be a good internet citizen, so it rate-limits outgoing requests and caches all responses in a shelf so it never requests the same country data more than once. Depends on the requests package.
cc @ra314 in case you find this helpful.
Scraping and problem setup
Note the API provides more than just a list of countries with visa-free access; there are also fields
visa_on_arrival
,visa_required
, andelectronic_travel_authorisation
on most (but not all) entries. Since my script above caches only the responses, you can still fetch those after-the-fact from the cache. See how I fetch thevisa_free_access
for reference.Some general thoughts about the challenge:
This is the dominating set minimization problem. After a simple reframing, it is the set cover minimization problem. There is a straightforward approximate greedy solution, but it does not always find the smallest cover. I don't know if that's an issue on this input. A strict solution is NP complete, but these problems are well-known so I'd like to do a bit of research on what the more sophisticated algorithms look like.