Pluribus is the first AI bot capable of beating human experts in six-player no-limit Hold’em, the most widely played poker format in the world.
Over 12 days, Pluribus played against 12 pros in two different settings. In one, the AI played alongside five human players; in the other, five copies of the AI competed against one human player.
“Pluribus is a very hard opponent to play against. It’s really hard to pin him down on any kind of hand,” said Chris Ferguson, a six-time World Series of Poker champion. For the first time, an AI bot has shown it can defeat top professionals in a major benchmark game with more than two players.
For decades, poker has been a difficult challenge for the field of AI. Poker involves hidden information: players do not know their opponents’ cards. It also rewards bluffing and other strategies that do not arise in games where all information is visible to every player. Pluribus succeeds because it can efficiently handle the challenges of a game with both hidden information and more than two players. It uses self-play to teach itself how to win, with no examples or guidance on strategy.
AI has definitively beaten humans at another of our popular games. A poker bot designed by researchers from Facebook’s AI lab and Carnegie Mellon University has bested some of the world’s top players in a series of games of six-player no-limit Texas Hold’em poker.
In the words of Noam Brown, a research scientist at Facebook AI Research, “We’re at a superhuman level, and that’s not going to change.”
Machine learning has already reached superhuman levels in games like chess, Go, StarCraft II, and Dota. However, in a paper published in Science, the scientists behind Pluribus argue that this victory is a remarkable breakthrough in AI research.
In some ways, Pluribus plays like humans, while in other ways, it uses completely alien strategies. Specifically, Pluribus comes up with odd bet sizes and is good at randomization. “Its major strength is its ability to use mixed strategies. That’s the same thing that humans try to do. It’s a matter of execution for humans, to do this in a perfectly random way and to do so consistently. Most people just can’t,” says Darren Elias, a renowned poker champion.
Pluribus was taught to play poker by having it play against copies of itself (self-play), a very common technique in AI training. The system learns the game through trial and error. Pluribus was trained in just eight days on a 64-core server equipped with less than 512 GB of RAM. Training the program on cloud servers would cost an estimated $150.
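The self-play idea can be illustrated with a toy sketch. This is not Pluribus's code or algorithm. It swaps in regret matching, a simple and well-known learning rule, and applies it to rock-paper-scissors rather than poker. The function names, the payoff table, and the rock-biased starting point are all assumptions made for illustration. Two copies of the same learner play each other repeatedly, and each shifts probability toward the actions it regrets not having played.

```python
# Toy self-play sketch (NOT Pluribus's code): regret matching on
# rock-paper-scissors. All names and the starting bias are illustrative.
ACTIONS = ("rock", "paper", "scissors")

# PAYOFF[a][b]: payoff to the player choosing a against an opponent choosing b.
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive regret (uniform if none)."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations=100_000):
    """Train two copies of one learner against each other.

    Returns each copy's average strategy over all iterations.
    """
    # Copy 0 starts out biased toward rock so there is something to learn.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]

    for _ in range(iterations):
        # Both copies pick their strategy before either one updates.
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each action against the opponent's current mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(3))
                for a in range(3)
            ]
            expected = sum(
                strategies[player][a] * action_values[a] for a in range(3)
            )
            for a in range(3):
                # Regret: how much better action a would have done.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]

    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for i, avg in enumerate(self_play()):
        print(f"copy {i}:", {a: round(p, 3) for a, p in zip(ACTIONS, avg)})
```

In this toy game, the average strategy of each copy drifts toward playing every action about one third of the time, a mixed strategy that no opponent can exploit. The learner is never given examples of good play.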
“It can bluff better than any human.”
Brown believes this claim is justified. Bluffing has traditionally been thought of as a uniquely human trait, something that relies on our ability to lie and outwit someone. But it is an art that can still be reduced to mathematically optimal strategies. “The AI doesn’t see bluffing as deceptive. It just sees the decision that will make it the most money in that particular situation,” he says, adding, “What we show is that an AI can bluff, and it can bluff better than any human.”
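A standard textbook calculation, not taken from the Pluribus work, shows how bluffing reduces to arithmetic. Assume a simplified final betting round: a player bets an amount into a pot while holding either a winning hand or a bluff, and the opponent can only call or fold. The opponent's call breaks even when the share of bets that are bluffs equals bet / (pot + 2 × bet). The function names below are hypothetical.

```python
from fractions import Fraction


def indifference_bluff_fraction(pot, bet):
    """Share of the bettor's bets that must be bluffs for a call to break even.

    Assumes a simplified one-bet, call-or-fold situation (illustrative only).
    """
    pot, bet = Fraction(pot), Fraction(bet)
    if pot < 0 or bet <= 0:
        raise ValueError("pot must be non-negative and bet must be positive")
    return bet / (pot + 2 * bet)


def caller_ev(pot, bet, bluff_fraction):
    """Expected value of calling.

    The caller wins pot + bet when facing a bluff and loses bet otherwise.
    """
    pot, bet = Fraction(pot), Fraction(bet)
    bluff_fraction = Fraction(bluff_fraction)
    return bluff_fraction * (pot + bet) - (1 - bluff_fraction) * bet


if __name__ == "__main__":
    pot, bet = 100, 100  # a pot-sized bet
    q = indifference_bluff_fraction(pot, bet)
    print("bluff fraction:", q)
    print("caller EV at that fraction:", caller_ev(pot, bet, q))
```

For a pot-sized bet, the break-even bluff share is one third. If the bettor bluffs more often than that, calling becomes profitable. If the bettor bluffs less often, folding is better. Nothing in the calculation involves deception, only expected value.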
Does this mean that an AI has definitively bested humans at the world’s most popular form of poker? Well, as with past AI victories, humans can undoubtedly learn from computers. Brown and Sandholm, the developers of Pluribus, expect that the methods they have demonstrated could be applied in areas such as financial negotiations, fraud prevention, and cybersecurity.