32 votes

Fun programming challenge: figure out which sets of passports grant visa-free access to the whole world

Hey there,

I wanted to know which sets of passports grant together visa-free access to every country in the world, but I could not easily find that info online. So I figured that I could try to write a small program to determine these sets of passports myself, and then it occurred to me that it would probably be a fun programming challenge to organize, so here we go.


Here's the challenge.

  1. Scrape the data you need for instance from The Henley Passport Index.
  2. Design a clever algorithm to efficiently find out which are the smallest sets of passports that will grant you visa-free access to every country in the world.
  3. Optional. Allow the user to specify which passports they already hold and find out which sets of passports would complement their passports well.
  4. Optional. Rank the sets of passports by how easy it is to acquire citizenship in those countries.

The choice of the programming language is yours, bonus points if you write it in assembly 😂

Feel free to collaborate and share your solutions (the algorithms and the actual results) in the comments, and feel free to share your own twists to the challenge that could make it even more fun & interesting.

The person with the most clever, efficient and elegant algorithm wins!

Happy coding folks!

29 comments

  1. [9]
    mordae
    Link
    Some countries are jealous and won't let you keep your old passport. For example Japan. EU countries tend to tolerate multiple nationalities as US does. Since United Nations has 193 members and...

    Some countries are jealous and won't let you keep your old passport. For example Japan. EU countries tend to tolerate multiple nationalities as US does.

    Since United Nations has 193 members and both EU and Japanese passports will give you about 181-194 visa-free accesses, the most practical approach would be to just get one of those.

    The top EU and Japanese passports grant you access to 194 countries, which means you get the whole Earth + one country on a different planet of your choice[citation needed].

    14 votes
    1. [3]
      hagi
      Link Parent
      Having two citizenships is pretty restricted in Germany too. There are plans to make it a bit easier though. A friend once told me, that there are ways around that, at least for them. Don't know...

      Having two citizenships is pretty restricted in Germany too. There are plans to make it a bit easier though.

      A friend once told me, that there are ways around that, at least for them. Don't know how accurate that is.

      • Send your passport and a renounciation of citizenship to the spanish embassy
      • Spanish embassy confirms the arrival of the renounciation
      • Use confirmation to get German citizenship
      • The Spanish embassy returns your passport to you like "We're sorry, but you are not allowed to renounce citizenship"
      • have two passports
      5 votes
      1. OrangeCorvus
        Link Parent
        In case of Germany, you are allowed to have dual citizenship if the other country is in EU. Also if your country of birth does not allow you to renounce the citizenship. I'm sure there are some...

        In case of Germany, you are allowed to have dual citizenship if the other country is in EU. Also if your country of birth does not allow you to renounce the citizenship. I'm sure there are some other deals or restrictions that I am not aware of.

        1 vote
      2. sparksbet
        Link Parent
        They actually recently passed a law changing this in Germany, now allowing dual citizenship, which goes into effect in April. Though as an immigrant I had heard stories of people pulling similar...

        They actually recently passed a law changing this in Germany, now allowing dual citizenship, which goes into effect in April.

        Though as an immigrant I had heard stories of people pulling similar tricks to what you describe in the past. Not sure how well they'd work nowadays, but ig no more need anyway.

        1 vote
    2. [5]
      hydravion
      Link Parent
      Even the most powerful passports won't let you travel in the whole world visa-free unfortunately.

      Even the most powerful passports won't let you travel in the whole world visa-free unfortunately.

      4 votes
      1. [2]
        Minty
        Link Parent
        UN diplomatic passport can be pretty handy https://i.imgur.com/ZGagXdy.jpeg but I suppose that doesn't count?

        UN diplomatic passport can be pretty handy https://i.imgur.com/ZGagXdy.jpeg but I suppose that doesn't count?

        5 votes
        1. hydravion
          Link Parent
          Well you can include it in your solution if you want, even though most people would not be able to get a hold of one.

          Well you can include it in your solution if you want, even though most people would not be able to get a hold of one.

          7 votes
  2. [2]
    Wolf_359
    Link
    I definitely won't have time to write anything for this but I did get curious about one issue related to this. Easy example, Afghanistan. Top of the list. If seems you can't travel to Afghanistan...

    I definitely won't have time to write anything for this but I did get curious about one issue related to this.

    Easy example, Afghanistan. Top of the list. If seems you can't travel to Afghanistan without a visa at all. In this hypothetical, would your solution include an Afghani passport then, since that's the only way to travel without a visa?

    9 votes
    1. hydravion
      Link Parent
      Good observation! I think that I'll just let participants decide how they want to handle such edge cases.

      Good observation! I think that I'll just let participants decide how they want to handle such edge cases.

      5 votes
  3. [11]
    krellor
    (edited )
    Link
    I won't have time until tonight, but this could be written as a linear program. Or, more specifically, a (0,1) integer linear program. The objective function would be a summation of Boolean...

    I won't have time until tonight, but this could be written as a linear program. Or, more specifically, a (0,1) integer linear program. The objective function would be a summation of Boolean variables that represent each passport, with the goal of minimizing the function. The constraints would be the trickier part, but I think I could get it going with a little tweaking.

    Alternately, I'd probably load the data into a database or some other tool that allows me to use set-based operations.

    6 votes
    1. [10]
      krellor
      Link Parent
      @hydravion As promised, I wrote an integer linear program to solve this problem. It's not exactly what you envisioned, I suspect, but it's what seemed to me the ideal way to solve the problem, and...

      @hydravion

      As promised, I wrote an integer linear program to solve this problem. It's not exactly what you envisioned, I suspect, but it's what seemed to me the ideal way to solve the problem, and linear programming is an underused tool in the IT world.

      Step 1: I downloaded a csv of the passport data, re-ordered a few columns so the matrix was symetric along the diagonal, and transposed the values so as to prepare the data for Octave's expected format.

      Step 2: I collapsed all the values on visas status to either 1 or 0; 1 if no visa is required (or granted automatically on entry), and 0 if a visa is required.

      Step 3: I used some Excel and Notepad++ trickery to prepare the text of the linear program.

      Before I show my linear program, which is large, let's look at a sample one I wrote. Below is a linear program to solve a related set covering problem (scroll down to numerical example) that is easier to conceptualize.

      # c is the objective function
      c = [ 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ];
      
      # A is the constraint matrix, i.e., the set of linear constraints that form our solution space.
      A = [ 1 0 0 1 0 0 1 1 ;
            0 0 1 1 0 0 1 1 ;
            1 0 0 0 1 0 0 0 ;
            1 1 0 0 0 0 0 1 ;
            0 0 1 0 0 0 0 0 ;
            1 0 0 0 1 0 1 1 ;
            1 1 0 0 0 0 0 0 ;
            0 1 0 0 0 1 0 1 ;
            0 0 1 0 0 0 0 0 ;
            0 0 0 0 1 0 0 0 ;
            0 0 1 0 0 0 1 0 ;
            0 1 0 0 1 0 0 1 ;
            0 0 1 0 0 0 0 0 ;
            0 0 0 1 1 1 0 0 ;
            0 0 0 1 0 1 0 0 ;
            ];
      
      # b is the RHS of the constraint equations.
      b = [ 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ; 1 ];
      
      # The rest of this isn't conceptually important, just book keeping for Octave.
      lb = [ 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ; 0 ];
      ub = [ inf ; inf ; inf ; inf ; inf ; inf ; inf ; inf ];
      ctype = "LLLLLLLLLLLLLLL";
      vtype = "IIIIIIII";
      sense = 1;
      [xstar,zstar,st,ex]=glpk(c,A,b,lb,ub,ctype,vtype,sense)
      

      Basically, we sum the number of cameras needed to cover all the parts of the stadium. You should be able to see the correlation between the variables I've defined above, and the mathematical representation that I linked.

      Now, in contrast, my LP will have a 199x199 constraint matrix, and a bunch of 199x1 and 1x199 other variables. In all of its glory, here it is:

      code was too long for the post, so moved to pastebin

      Step 4: We save the above to a file, and run it in Octave.

      After running the program, which takes fractions of a second, we get a solution array with an entry of 1 in the spot corresponding to each country whose passport is part of the solution set. Those are:

      • Afghanistan
      • Australia
      • Barbados
      • Benin
      • Canada
      • Jordan
      • Kenya
      • Maldives
      • Mali
      • North Korea
      • Papua New Guinea
      • Singapore
      • Turkmenistan

      To understand the results, we look at our data, which for certain restrictive countries which only allow themselves or one or two others, the minimal solution set to visit every country in the world would require a passport from that country or one of the few it accepts.

      Additionally, we know this is an optimal solution, but we don't know if it is unique. I.e., there could be other sets of 13 passports that grant access to every country in the world.

      Honestly, after futzing with the data, the actual LP writing was just a few minutes. So this is probably the most efficient way in programmer time to solve these sorts of optimization problems.

      I also hand wave away all the questions such as whether the ILP is in a convex space, but I'm not going to get that deep into it.

      Have a great night!

      18 votes
      1. [6]
        krellor
        Link Parent
        I made a quick adjustment based on comments from @xk3 about the years to acquire the passports. I quickly went through the CIA factbook and updated my spreadsheet to include years to...
        • Exemplary

        I made a quick adjustment based on comments from @xk3 about the years to acquire the passports. I quickly went through the CIA factbook and updated my spreadsheet to include years to naturalization. When a country prohibits naturalization without proving ancestry, I gave them a penalty years of 50, which will effectively exclude them from the solution. When the requirement was a range of years, I went with the lower end to see what the best case would be if you were clever. When different years were provided based on ancestry, I went with the value that didn't discriminate based on ethnicity.

        I then changed the objective function in the ILP from 1 for each country's variable to the years for naturalization, which is basically a 199x1 array of years. After running it, the initial results are:

        Years required: 116* (actually 61, see below)
        Solution:

        • Afghanistan, 5 years
        • Argentina, 2 years
        • Cameroon, 5 years
        • Canada, 3 years
        • India, 5 years
        • Kenya, 4 years
        • Liberia, 2 years
        • Libya, 3 years
        • New Zealand, 3 years
        • North Korea, 50 years
        • Pakistan, 4 years
        • Papua New Guinea, 8 years
        • Saudi Arabia, 5 years
        • Singapore, 10 years
        • Turkmenistan, 7 years

        Now, the elephant here is North Korea, which recognizes no passport for visa-free entry, so let's exclude it as an infeasible constraint. Additionally, Afghanistan is likely only included for entry to itself as well, although technically I think it recognizes one or two others. So let's give it a penalty value of 50 and see what we get.

        After running, I found that Afghanistan was still included, meaning it doesn't accept any of the other solution members' passports, and no other passport inclusion was more optimal.

        So, removing Afghanistan and North Korea, the actual optimal year is 61 years to be able to travel visa-free to almost every country in the world.

        A fun extension would be to make a tool that prompted people to rate countries in the order they would like to live, rank them again as interested in visiting, and build a program that would optimize for that user's preferences.

        7 votes
        1. [2]
          Tardigrade
          Link Parent
          Presumeably one could force the program to use passports one already owns by putting them with a naturalization weight of 0? It could be interesting to see how it would change if you already had a...

          Presumeably one could force the program to use passports one already owns by putting them with a naturalization weight of 0? It could be interesting to see how it would change if you already had a starting point of an EU option or US as neither of those are included.

          1 vote
          1. krellor
            Link Parent
            Surprisingly, that wouldn't guarantee it being selected. Because the underlying algorithms often traverse the boundary of the solution space, assuming linearity or convexity, it isn't a guarantee...

            Surprisingly, that wouldn't guarantee it being selected. Because the underlying algorithms often traverse the boundary of the solution space, assuming linearity or convexity, it isn't a guarantee that a freebie would be encountered before another optimal solution. The introduction of a freebie would possibly introduce the existence of multiple optimal solutions.

            So force a specific country in the solution, we would make it 0 cost like you said, and add a constraint forcing its inclusion.

            Great comment!

            2 votes
        2. [3]
          hydravion
          Link Parent
          I'm wondering how we could easily tell which countries each passport adds to the set of countries one will be able to visit visa-free to know which passports we can choose to forgo to reduce the...

          I'm wondering how we could easily tell which countries each passport adds to the set of countries one will be able to visit visa-free to know which passports we can choose to forgo to reduce the number of years we need to dedicate to acquire them.

          I suppose that if you wanted to minimize the number of years dedicated to acquiring passports while still being able to access most of the world visa-free, you would need to go with a combination like Canada + Australia + Singapore, but this is just me guessing, and I don't really know how many countries (and which ones) would still be left inaccessible visa-free. According to the data given by @xk3 this would probably take only around 9 years? This seems much more manageable, although it may not be the most optimal combination if you already own a powerful passport, like the French or German ones?

          1 vote
          1. krellor
            Link Parent
            The best way to do that is probably a filterable spreadsheet with some conditional highlighting logic that highlights duplicates in one color, and unique value in a second. So filter to your...

            The best way to do that is probably a filterable spreadsheet with some conditional highlighting logic that highlights duplicates in one color, and unique value in a second. So filter to your optional solution, and visually inspect.

            We could do it programmatically, but it might be overkill.

            Your second part is exactly what I envisioned worth reformulating the problem to once where we maximize countries visited subject to the constraints that limit the time spent, and force inclusion of passports we already have, or countries we want to live in.

            Great questions!

            1 vote
          2. xk3
            Link Parent
            Also, it would be interesting to see in comparison how many days it would take / how much money you'd need to spend to get a set of visas rather than another passport. Of course getting a visa and...

            Also, it would be interesting to see in comparison how many days it would take / how much money you'd need to spend to get a set of visas rather than another passport. Of course getting a visa and being able to use it to enter a country is not a "given" but neither is getting a passport a "given" for many countries

      2. TangibleLight
        (edited )
        Link Parent
        For anyone interested - there's a great SoME video on linear programming that might explain what a linear program actually does: The Art of Linear Programming. The chapter Integer Linear...

        For anyone interested - there's a great SoME video on linear programming that might explain what a linear program actually does: The Art of Linear Programming. The chapter Integer Linear Programming - at 14:01 - is particularly relevant, but the rest of the video leading up to that is probably required reading if you aren't already familiar.

        Also see the docs on glpk to get a sense of the format of these constraints. I'm going through to try to understand exactly how this formulation works, since I've never written a linear program before. To really spell things out (and as an invitation for corrections), here's my understanding of the representation:

        Each row of A represents a destination, and each column represents a passport. The variable x will represent the number of a given passport held, and then the matrix multiplication A x represents the number of held passports that grant access to a given destination.

        ctype=L, b=1 indicates a lower bound only on the result the equation is of the form A x >= 1 - we want to hold at least one passport that grants access to each country. We could discard a given destination or require holding more than one passport that grants access by changing the value of b for that country.

        lb = 0, ub = inf constrains the variables - we can't have less than 0 or more than infinite passports. We could indicate we already hold a particular passport by changing the value of lb on that country.

        Aside - it doesn't matter here since this is a minimization problem and taking a given passport more than once doesn't affect the cover, but I wonder if it is better to set ub=1? I'm not familiar with Octave to know the impact on performance that would have.

        vtype=I indicates this is an integer problem, we can't hold fractional passports.

        sense=1 indicates this is a minimization problem, we want to hold as few passports as possible.

        Then the linear program runs and finds a vector x that lists the count of each passport that we need to hold.


        I made a few a bunch of edits here for clarity as I'm understanding the problem better. @krellor could you please let me know if I've misunderstood something?

        4 votes
      3. [2]
        hydravion
        (edited )
        Link Parent
        Hey krellor, that rocks! I like your unusual angle, I actually didn't even know about such methods! Is there a way to list all the optimal solutions that exist using this method? PS - Interesting...

        Hey krellor, that rocks! I like your unusual angle, I actually didn't even know about such methods!

        Is there a way to list all the optimal solutions that exist using this method?

        PS - Interesting to see that many powerful passports fail to grant you visa-free access to India, but the Maldives' passport does, and also that Singapore's passport grants you visa-free access to China.

        1. krellor
          Link Parent
          There are a few obscure algorithms that could be used to enumerate all optimal solutions in polynomial through the use of lattices in polyhedra. We could also determine the family of optimal...

          There are a few obscure algorithms that could be used to enumerate all optimal solutions in polynomial through the use of lattices in polyhedra. We could also determine the family of optimal solutions by calculating the convex hull (shape or box) around all optimal solutions. There are probably lots of visualization tools that would make it easy to inspect.

          But in this case I suspect that there are only a small number of optimal solutions, and we might find them by perturbing our initial position or objective function. For example, if we know the optimal is 61 years and we remove Afghanistan and North Korea, then we could change the problem to maximize the number of countries visited, while futzing with the constraints to selectively include our exclude countries in our original optimal. Or simply force the inclusion of one of the strong passports and see what happens.

          We could also use repeated iterations of things like simulated annealing whose initial condition is situated near the optional solution.

          Finally, we could perform a constraint reduction algorithm that would dramatically decrease the number of constraints, and may allow us to explore the solutions better.

          Sorry to give a complicated answer. The gist is that yes, there are ways to do it, but they aren't trivial.

          1 vote
  4. [3]
    xk3
    Link
    I wrote a website that helps plan multiple passport trips: https://unli.xyz/city/honeymoon/ I wrote it to help plan honeymoons with my wife but it might also be useful for people planning...

    I wrote a website that helps plan multiple passport trips: https://unli.xyz/city/honeymoon/ I wrote it to help plan honeymoons with my wife but it might also be useful for people planning conferences.

    For people that own multiple passports I have another website which lets you choose "solo travel" in the passport-visa country filter: https://unli.xyz/city/calc/

    krellor answered the programming challenge aspect of this question pretty well, though you will probably want to consider how difficult it is to get a passport. If they are ordinary passports most countries require you to be present in the country for a certain length of time--and you might not live long enough:

    Country Naturalization minimum requirement (years) Parallel Sequence
    Afghanistan 5 5 5
    Australia 4 4 4
    Barbados 5 of 7 5 7
    Benin 3 3 3
    Canada 3 of 5 3 5
    Jordan 15 15 15
    Kenya 7 7 7
    Maldives 12 12 12
    Mali 1 if u born a kid; 5 if exceptional service 1 5
    North Korea not public information
    Papua New Guinea 8 8 8
    Singapore 2 of 6 2 6
    Turkmenistan 7 7 7
    Total 72 84
    4 votes
    1. krellor
      Link Parent
      So there's a few ways we could incorporate this into my ILP. Since the objective function that we seek to minimize is a 199x1 array of boolean variables, simple changing the coefficients from 1 to...

      So there's a few ways we could incorporate this into my ILP. Since the objective function that we seek to minimize is a 199x1 array of boolean variables, simple changing the coefficients from 1 to the number of years it takes to get a visa from that country would then minimize the problem of the least years it takes to collect them.

      Additionally, for the spirit of the problem, certain countries like North Korea should just be left off, along with any country that doesn't have any visa free travel.

      We could also reformulate the problem into a maximization problem where we try and maximize the number of countries we can visit visa free subject to a years waiting constraint, such as 20. So you could set a number of years and be given a solution that maximizes what you get for it.

      We could also introduce weights to indicate countries of preference.

      Have a great day!

      3 votes
    2. hydravion
      Link Parent
      Thanks for linking your websites! They look like interesting tools.

      Thanks for linking your websites! They look like interesting tools.

  5. [4]
    ra314
    Link
    Scraping the data feels like it's the biggest of the 4 challeneges. From a brief glance at the website I think going through all the country codes and visiting the URL ->...

    Scraping the data feels like it's the biggest of the 4 challeneges. From a brief glance at the website I think going through all the country codes and visiting the URL -> https://cdn.henleyglobal.com/storage/app/media/HPI/AD_visa_full.pdf and then somehow parsing the PDF is the best idea I can think of.

    But once you have the data, say as a 2d array indexed on the countries, it seemed to be that this was a known problem. Using chatGPT I found out that this problem is the optimization version of the set cover problem. wiki link

    3 votes
    1. [3]
      Greg
      Link Parent
      Good news, everyone! The same data is also available as JSON: https://api.henleypassportindex.com/api/v3/visa-single/AD

      Good news, everyone! The same data is also available as JSON: https://api.henleypassportindex.com/api/v3/visa-single/AD

      7 votes
      1. [2]
        hydravion
        Link Parent
        Nice! A quick Google search also yielded this and this. Thanks again!

        Nice! A quick Google search also yielded this and this. Thanks again!

        4 votes
        1. TangibleLight
          (edited )
          Link Parent
          Sharing my scraping and problem setup Python script - there is no solution here. I try to be a good internet citizen, so it rate-limits outgoing requests and caches all responses in a shelf so it...

          Sharing my scraping and problem setup Python script - there is no solution here. I try to be a good internet citizen, so it rate-limits outgoing requests and caches all responses in a shelf so it never requests the same country data more than once. Depends on the requests package.

          cc @ra314 in case you find this helpful.

          Scraping and problem setup
          import logging
          import time
          
          import requests
          import shelve
          
          RATE_LIMIT = 0.5  # seconds to wait between requests
          
          URL = "https://api.henleypassportindex.com/api/v3"
          
          SESSION = requests.session()
          SESSION.headers.update({"Accept": "application/json"})
          
          SHELF = "cache"
          
          
          def main(graph: dict[str, list[str]], names: dict[str, str]):
              """
              Solve the minimal set cover problem.
          
              :param graph: for each country passport, list of countries with visa-free travel.
              :param names: mapping from country codes to country names
              """
          
              print(names)
              print(graph)
          
          
          def rate_limit():
              try:
                  time.sleep(time.time() - rate_limit.stamp + RATE_LIMIT)
              except AttributeError:
                  pass  # first call; don't need to sleep
              except ValueError:
                  pass  # negative delta; don't need to sleep
          
              rate_limit.stamp = time.time()
          
          
          def build_graph():
              names = {}
              graph = {}
          
              with shelve.open(SHELF) as shelf:
                  if "index" in shelf:
                      index = shelf["index"]
                  else:
                      response = SESSION.get(f"{URL}/countries")
                      response.raise_for_status()
                      index = shelf["index"] = response.json()
          
                  for entry in index["countries"]:
                      if entry["code"] in shelf:
                          detail = shelf[entry["code"]]
                      else:
                          rate_limit()
                          response = SESSION.get(f'{URL}/visa-single/{entry["code"]}')
                          response.raise_for_status()
                          detail = shelf[entry["code"]] = response.json()
          
                      names[detail["code"]] = detail["country"]
                      graph[detail["code"]] = [e["code"] for e in detail.get("visa_free_access", ())]
          
              return graph, names
          
          
          if __name__ == "__main__":
              logging.basicConfig(level=logging.DEBUG)  # to be sure we aren't spamming requests
              graph, names = build_graph()
              main(graph, names)
          

          Note the API provides more than just a list of countries with visa-free access; there are also fields visa_on_arrival, visa_required, and electronic_travel_authorisation on most (but not all) entries. Since my script above caches only the responses, you can still fetch those after-the-fact from the cache. See how I fetch the visa_free_access for reference.


          Some general thoughts about the challenge:

          This is the dominating set minimization problem. After a simple reframing, it is the set cover minimization problem. There is a straightforward approximate greedy solution, but it does not always find the smallest cover. I don't know if that's an issue on this input. A strict solution is NP complete, but these problems are well-known so I'd like to do a bit of research on what the more sophisticated algorithms look like.

          6 votes