12 votes

Adversarial policies beat professional-level Go AIs

1 comment

  1. skybrian
    From the summary:

    We attack KataGo, a state-of-the-art Go AI system, by training an adversarial policy against a frozen KataGo victim. We achieve a >99% win rate against a KataGo victim playing without search, which is comparable in strength to a top-100 European Go player. We achieve a >80% win rate against a victim playing with 64 visits, which we estimate to have comparable strength to the best human Go players. Notably the games show that the adversarial policy does not win by playing a strong game of Go, but instead by tricking KataGo into ending the game prematurely at a point favorable to the adversary. Indeed, our adversary is easily beaten by a human amateur, despite being able to exploit policies that usually match or surpass the performance of the best human Go players.

    3 votes