This is an excerpt from GTO Poker Simplified by Dara O’Kearney and Barry Carter. The new book takes the most important lessons from the modern era of poker solvers and explains them in a way that anyone can understand.

What Rock, Paper, Scissors Teaches us About Poker

The best way to understand GTO strategy is to recognise its alternative, which is exploitative strategy. All players are essentially exploitative players because nobody can employ a perfect GTO strategy. You exploit your opponents whenever you adjust your strategy to capitalise on a weakness of theirs. You open yourself up to exploitation whenever you deviate from a perfect GTO strategy. You exploit others and open yourself up to exploitation all the time; when you learn GTO, however, you leave yourself open to a lesser degree.

Before we jump into poker it’s time to paint Mr Miyagi’s fence with an exercise that might seem pointless but ultimately will teach you the fundamental principles of GTO. You will no doubt be familiar with the schoolyard game Roshambo or Rock/Paper/Scissors. Rock blunts scissors, scissors cut paper, paper covers rock.

The Game Theory Optimal way to play Rock/Paper/Scissors is to pick each option an equal percentage of the time, but at random. If you pick Rock a third of the time, Scissors a third of the time and Paper a third of the time, in a random order, you cannot be exploited.

If you simulated Rock/Paper/Scissors 72 times and both players adopted this strategy, this is what the outcomes would be for Player 1:

		Player 2 (choose each option 1/3 of time, 24)
		Rock	Paper	Scissors	Result
Player 1 (choose each option 1/3 of time, 24)	Rock	Push	-8	+8	Breakeven
	Paper	+8	Push	-8	Breakeven
	Scissors	-8	+8	Push	Breakeven

This is what we call a balanced response, in that both players have a perfect balance of Rock/Paper/Scissors. What happens, however, if Player 2 has an unbalanced response? What if they have a preference for Rock, and will play it 36 times out of 72, playing Paper 18 times and Scissors 18 times?

This is what happens:

		Player 2 (choose rock 36, paper 18, scissors 18)
		Rock	Paper	Scissors	Result
Player 1 (choose each option 1/3 of time, 24)	Rock	Push	-6	+6	Breakeven
	Paper	+12	Push	-6	+6
	Scissors	-12	+6	Push	-6

Overall Player 1 breaks even again, but one of the plays is more profitable. When Player 1 picks Paper, they are up by six games overall, but when they pick Scissors they are down six games. By playing a Game Theory Optimal strategy, Player 1 gets the same overall outcome regardless of Player 2's strategy; there is just more variance involved.
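If you want to check the arithmetic in these tables yourself, here is a minimal Python sketch. The helper names (`payoff`, `expected_net`) are my own illustrations, and it assumes each win counts as +1 game, each loss as -1 and a push as 0, with both players choosing independently.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
# Each key beats the move it maps to.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """+1 if `mine` beats `theirs`, -1 if it loses, 0 for a push."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def expected_net(p1_counts, p2_counts):
    """Expected net games won by Player 1, broken down by Player 1's pick.

    Both arguments map each move to how many times it is chosen.
    Player 2's counts are treated as frequencies out of their total.
    """
    total = sum(p2_counts.values())
    return {
        mine: sum(
            Fraction(p1_counts[mine] * p2_counts[theirs], total) * payoff(mine, theirs)
            for theirs in MOVES
        )
        for mine in MOVES
    }


balanced = {"rock": 24, "paper": 24, "scissors": 24}
rock_heavy = {"rock": 36, "paper": 18, "scissors": 18}

rows = expected_net(balanced, rock_heavy)
for move in MOVES:
    print(move, float(rows[move]))   # rock 0.0, paper 6.0, scissors -6.0
print("overall", float(sum(rows.values())))  # overall 0.0
```

Swapping in any other set of counts for Player 2 still gives an overall result of zero for the balanced player, which is the point of the balanced strategy.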

We know that Player 2 having a preference for Rock is a mistake though, so what can we do to capitalise on that? Pick more Paper, obviously. This is what happens if Player 1 picks Paper every single time, knowing what they know about Player 2’s strategy:

		Player 2 (choose rock 36, paper 18, scissors 18)
		Rock	Paper	Scissors	Result
Player 1 (choose paper 72)	Rock
	Paper	+36	Push	-18	+18
	Scissors

As you can see, Player 1 gets crushed every time Player 2 picks Scissors, losing 18 games. However, that is more than made up for every time Player 2 picks Rock. That leads to Player 1 winning 36 games, and being up 18 games overall.
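Finding the most profitable pick against a known tendency is a small calculation. The sketch below is only an illustration: the function name `ev_per_game` is hypothetical, and it assumes you know Player 2's frequencies exactly, which you rarely would in practice.

```python
MOVES = ("rock", "paper", "scissors")
# Each key beats the move it maps to.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """+1 if `mine` beats `theirs`, -1 if it loses, 0 for a push."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def ev_per_game(move, opp_freqs):
    """Average games won per game when always picking `move`."""
    return sum(opp_freqs[theirs] * payoff(move, theirs) for theirs in MOVES)


# Player 2 plays Rock 36 times out of 72, Paper 18 and Scissors 18.
rock_heavy = {"rock": 0.5, "paper": 0.25, "scissors": 0.25}

evs = {move: ev_per_game(move, rock_heavy) for move in MOVES}
best = max(evs, key=evs.get)
print(best, 72 * evs[best])  # paper 18.0
```

The single best pick is always a pure one here, which is exactly why the maximum exploit is so easy for an observant opponent to spot.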

Can you see a potential issue with Player 1 adopting this new strategy? Quite simply, at some point Player 2 will realise that Player 1 is picking Paper every time, and adapt by picking Scissors more. In reality nobody would get away with this strategy for very long, so Player 1 would have to adopt a less extreme strategy. What if, for example, they chose to play Paper half the time and the other two options a quarter of the time each? That would look like this:

		Player 2 (choose rock 36, paper 18, scissors 18)
		Rock	Paper	Scissors	Result
Player 1 (choose rock 18, paper 36, scissors 18)	Rock	Push	-4.5	+4.5	Breakeven
	Paper	+18	Push	-9	+9
	Scissors	-9	+4.5	Push	-4.5

When Player 1 does this they are up 4.5 games overall, which is a long way off the +18 from picking Paper every time but much more sustainable. Against a weak Roshambo player this could be a long-term winning strategy which goes unnoticed. It also reminds me of a joke Scottish pro Ludo Geilich told me once:

A young boy enters a barber shop and the barber whispers to his customer: “This is the dumbest kid in the world. Watch while I prove it to you.”

The barber puts a dollar bill in one hand and two quarters in the other, then calls the boy over and asks, “Which do you want, son?”

The boy takes the quarters and leaves.

“What did I tell you?” said the barber. “That kid never learns!”

Later, when the customer leaves, he sees the same young boy coming out of the ice cream store.

“Hey, son! May I ask you a question? Why did you take the quarters instead of the dollar bill?”

The boy licked his cone and replied, “Because the day I take the dollar, the game is over!”

What if Player 1 misjudges Player 2, who starts to counter adjust? Player 2 notices Paper is coming up more often and makes a similar counter adjustment, switching to Scissors half the time and the other two options a quarter of the time each. The new outcome looks like this:

		Player 2 (choose rock 18, paper 18, scissors 36)
		Rock	Paper	Scissors	Result
Player 1 (choose rock 18, paper 36, scissors 18)	Rock	Push	-4.5	+9	+4.5
	Paper	+9	Push	-18	-9
	Scissors	-4.5	+4.5	Push	Breakeven

Now Player 1 has gone from winning to losing 4.5 games because of this counter-adjustment. The exploitative strategy that had put them ahead has the opposite effect once Player 2 notices what is happening.

This is the core of the benefits and costs of an exploitative strategy. You stand to win much more when your assumptions are correct, but you open yourself up to exploitation. If your opponent adjusts, you lose. If your assumptions are incorrect, you lose by exploiting yourself. If, however, you only play a GTO style you can only profit when your opponent leaves themselves open to exploitation. If you both play GTO you will end up playing to a stalemate, but whichever of you deviates from a GTO strategy leaves themselves open to exploitation.
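One way to put a number on "opening yourself up to exploitation" is to ask how badly a strategy does against the single most damaging reply. The sketch below is my own illustration rather than anything from a solver: `worst_case_ev` is a hypothetical helper, and it assumes the opponent knows your frequencies perfectly.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
# Each key beats the move it maps to.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """+1 if `mine` beats `theirs`, -1 if it loses, 0 for a push."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def worst_case_ev(mix):
    """Per-game result of `mix` against the opponent's most damaging pure reply."""
    return min(
        sum(mix[mine] * payoff(mine, reply) for mine in MOVES)
        for reply in MOVES
    )


balanced = {move: Fraction(1, 3) for move in MOVES}
paper_heavy = {
    "rock": Fraction(1, 4),
    "paper": Fraction(1, 2),
    "scissors": Fraction(1, 4),
}

# Scaled to 72 games so the numbers are comparable with the tables.
print(worst_case_ev(balanced) * 72)     # 0
print(worst_case_ev(paper_heavy) * 72)  # -18
```

The balanced mix cannot be made to lose, while the Paper-heavy mix can be pushed to 18 games down over 72 by an opponent who switches entirely to Scissors.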

How you get exploited in poker

Poker is no different to Roshambo in this sense, other than it is much more complex because of the number of card combinations, the betting structure, the stack depths, multiple players and the variance involved. The same principles apply: if you adjust to exploit your opponent you win more when your assumptions are correct, but you leave yourself vulnerable to counter-exploitation.

Let’s look at a typical example you will be familiar with as a player, which is when you flop the nut flush draw with an Ax suited type hand. This is a classic semi bluff situation and most good players know betting here is instantly profitable. If you take down the pot with an unmade hand, great. If you hit your flush you can get a lot of value in a bigger pot. If you hit your Ace that’s a good spot too. As such, most of us will bet in this spot and it will work out well most of the time.

What happens, however, if you check back the flop on a board with a flush draw and the third card of the same suit hits the turn? Against a bad player you can still represent the flush, but a thinking player who has shared some table time with you knows you always bet the flop when you have the big draw. As such, they can exploit you by check/raising when you bet the turn and put you in a tough spot, maybe even make you fold some of your better value hands. You cannot bluff in these spots because your opponent knows you never have the nuts.

The adjustment, therefore, is not to always check back with the nut flush draw but to mix the two strategies. Some of the time you bet with your semi bluff, sometimes you check back with it. This is what is known in poker as protecting your range or having a balanced range. Protecting a range means having the right balance of bluffs and value in all of your actions, so that your opponent does not know where you are in the hand.
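Mixing two actions with the same hand is just a weighted coin flip. Here is a minimal sketch of the idea; the function name `mixed_action` is hypothetical and the 70% betting frequency is a made-up number purely for illustration, not a solver output or a recommendation.

```python
import random


def mixed_action(bet_frequency, rng):
    """Return "bet" with probability `bet_frequency`, otherwise "check"."""
    return "bet" if rng.random() < bet_frequency else "check"


# Illustrative only: bet the nut flush draw 70% of the time on the flop.
ASSUMED_BET_FREQUENCY = 0.7

rng = random.Random(2024)  # seeded so the run is repeatable
actions = [mixed_action(ASSUMED_BET_FREQUENCY, rng) for _ in range(10_000)]
bet_share = actions.count("bet") / len(actions)
print(f"bet {bet_share:.1%} of the time, checked {1 - bet_share:.1%}")
```

At the table you do not have a random number generator, so you would need some other unpredictable cue to make the choice, but the principle is the same: the same hand takes both lines often enough that neither line gives you away.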

If you semi bluff the flop some of the time with the nut flush draw, your opponent will call you more on the flop. This means you have a protected flop bet range and as a result you can value bet your made hands and they will get called, because your opponent knows you are capable of bluffing here. If you check back with the nut flush draw some of the time you will have a protected turn betting range. This means you can bluff more on the turn when you don’t have a hand because your opponent knows you are capable of having a flush here.

When you are capable of having bluffs and value in every spot, you become difficult to exploit. When you are only ever bluffing or only ever value betting in a spot, you become very easy to play against.
