This chapter will examine a very simplified model of Bluff, in order to prove that never bluffing is not an optimal strategy.
The real Bluff game allows a wide range of bids, from nothing to five sixes. There are 252 possible bids, which would not be practical to examine here. Suppose instead the dice can only give three results: in ascending order of value, A, B, and C. All these results are equiprobable; a match consists of a single hand.
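Player 1 will adopt the following bidding strategy: with an A, bid A with probability 1-b-c, bluff and bid B with probability b, or bluff and bid C with probability c; with a B, bid B with probability 1-c', or bluff and bid C with probability c'; with a C, always bid C.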
Player 2, on the other hand, will never bluff.
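As for the trusting strategy, player 1 will never call a bluff, since player 2 never bluffs. As for player 2, the most generic trust strategy is: when player 1 bids B, call the bluff with probability pA, pB or pC, depending on whether player 2's own score is A, B or C; when player 1 bids C, call the bluff with probability qA, qB or qC; when player 1 bids A, never call, since a bid of A cannot be a bluff.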
Player 2's trust strategy is thus determined by six numbers.
Suppose now that a large number of matches is played. For each possible situation (player 1 score, player 1 bid, player 2 score), what is the expected score?
For example, if player 1 has an A, but bluffed and bid a C, while player 2 has a B, then player 2 will lose one point in any case, because player 1 bid a higher score; with probability qB, player 2 will call the bluff, and gain two points. The expected score for player 2 in this situation is therefore 2qB-1.
Computing the same result for all the possible situations, one obtains the following table (the probabilities in the third row are conditional on player 1's score, rather than absolute):
Player 1 score | A     | A     | A     | B     | B     | C     |
Player 1 bid   | A     | B     | C     | B     | C     | C     |
Probability    | 1-b-c | b     | c     | 1-c'  | c'    | 1     |
Player 2 score |       |       |       |       |       |       |
A              | 0     | 2pA-1 | 2qA-1 | -pA-1 | 2qA-1 | -qA-1 |
B              | 1     | 2pB   | 2qB-1 | -pB   | 2qB-1 | -qB-1 |
C              | 1     | 2pC+1 | 2qC   | -pC+1 | 2qC   | -qC   |
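Since the three scores of player 1 and the three scores of player 2 are all equiprobable, there are nine equally likely combinations of scores. The expected score for player 2 over nine matches is therefore just the sum of the numbers in the lower half of the table, each weighted by the number in the "probability" row of its column.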
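The weighted sum can be sketched in Python as follows. This is only an illustration: the function names (`cell`, `expected_score`) and the scoring rule inside `cell` are assumptions reverse-engineered from the table entries, not rules stated in the text. The parameter `c1` stands for c'.

```python
from fractions import Fraction

SCORES = "ABC"  # ascending order of value


def cell(p1_score, p1_bid, p2_score, p, q):
    """Expected score of player 2 for one cell of the table.

    Assumed rule (inferred from the table): player 2 gets +1, 0 or -1
    for having a score above, equal to or below player 1's bid, plus
    2 points for calling an actual bluff, minus 1 for calling an honest bid.
    p and q map player 2's score to the probability of calling a bid
    of B and of C respectively.
    """
    base = (p2_score > p1_bid) - (p2_score < p1_bid)
    if p1_bid == "A":
        return base  # a bid of A cannot be a bluff and is never called
    call = p[p2_score] if p1_bid == "B" else q[p2_score]
    is_bluff = p1_bid != p1_score
    return base + call * (2 if is_bluff else -1)


def expected_score(b, c, c1, p, q):
    """Player 2's expected total over the nine equiprobable score pairs."""
    bid_probs = {
        "A": {"A": 1 - b - c, "B": b, "C": c},
        "B": {"B": 1 - c1, "C": c1},
        "C": {"C": 1},
    }
    return sum(
        prob * cell(s1, bid, s2, p, q)
        for s1, bids in bid_probs.items()
        for bid, prob in bids.items()
        for s2 in SCORES
    )
```

For instance, `cell("A", "C", "B", p, q)` reproduces the entry 2qB-1 discussed above.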
One can immediately notice that this average score is a sum of seven contributions: some terms are independent of the pX and qX parameters, while some others are respectively proportional to pA, pB, pC, qA, qB and qC. There are no quadratic or higher-order terms in the parameters of player 2's strategy; therefore, maximizing player 2's score is achieved by maximizing each contribution separately.
For example, the term proportional to pA is the following:
pA (2b - 1 + c')
If 2b + c' > 1, this term is maximized by setting pA = 1 (the maximum allowed value); if 2b + c' < 1, player 2 had better set pA = 0 instead.
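By repeating the same reasoning for all six coefficients, player 2 obtains the following recipe: if 2b + c' > 1, always call a bid of B (pA = pB = pC = 1), and if 2b + c' < 1, never call it (pA = pB = pC = 0); if 2c + 2c' > 1, always call a bid of C (qA = qB = qC = 1), and if 2c + 2c' < 1, never call it (qA = qB = qC = 0).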
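The term-by-term reasoning can be written as a small Python sketch. The function name `best_response` is hypothetical, and `c1` stands for c'. The coefficients used are the ones read off the table: each pX is multiplied by 2b + c' - 1 and each qX by 2c + 2c' - 1.

```python
from fractions import Fraction


def best_response(b, c, c1):
    """Return player 2's best call probabilities (p, q).

    p applies to a bid of B (pA = pB = pC), q to a bid of C
    (qA = qB = qC). None means player 2 is indifferent, because
    the corresponding coefficient is exactly zero.
    """
    p_coeff = 2 * b + c1 - 1      # multiplies pA, pB and pC
    q_coeff = 2 * c + 2 * c1 - 1  # multiplies qA, qB and qC

    def choose(coeff):
        if coeff > 0:
            return 1  # calling pays off on average: always call
        if coeff < 0:
            return 0  # calling loses on average: never call
        return None   # threshold: any probability gives the same score

    return choose(p_coeff), choose(q_coeff)
```

With b = c = c' = 0 (player 1 never bluffs) this returns `(0, 0)`: player 2 should never call.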
If player 1 decides to operate at threshold, with 2b + c' = 2c + 2c' = 1, player 2's score will be the same whatever values are chosen for pX and qX. By calculating it in the case pX = qX = 0, player 2's score turns out to be
S = 2 (1-b-c) - 2c - 2c' - 2
  = -2 (b + 2c + c')
  = -2 ((b + c'/2) + (2c + 2c') - 3/2 c')
  = -2 (1/2 + 1 - 3/2 c')
  = 3 (c' - 1)
Since S is player 2's score, player 1 wants it as low as possible. The best threshold strategy for player 1 therefore consists in setting c' = 0, and hence b = c = 1/2. When player 1 rolls an A, he always bluffs and declares B or C with equal probability; when a B or C is rolled, player 1 does not bluff.
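The threshold computation can be checked with exact arithmetic. The sketch below is an illustration under stated assumptions: the names `score`, `threshold` and `check_thresholds` are hypothetical, `c1` stands for c', and the closed form used for the score is the constant term -2 (b + 2c + c') plus the two coefficients 2b + c' - 1 and 2c + 2c' - 1 multiplied by pA + pB + pC and qA + qB + qC respectively.

```python
from fractions import Fraction
from itertools import product


def score(b, c, c1, p_sum, q_sum):
    """Player 2's score over nine matches.

    p_sum = pA + pB + pC and q_sum = qA + qB + qC, each between 0 and 3.
    """
    return (-2 * (b + 2 * c + c1)
            + p_sum * (2 * b + c1 - 1)
            + q_sum * (2 * c + 2 * c1 - 1))


def threshold(c1):
    """Solve 2b + c' = 1 and 2c + 2c' = 1 for b and c."""
    return (1 - c1) / 2, Fraction(1, 2) - c1


def check_thresholds():
    """Map each tested c' to the set of scores player 2 can obtain."""
    outcomes = {}
    for c1 in (Fraction(0), Fraction(1, 4), Fraction(1, 2)):
        b, c = threshold(c1)
        # Try all extreme values of p_sum and q_sum (0, 1, 2, 3).
        outcomes[c1] = {score(b, c, c1, ps, qs)
                        for ps, qs in product(range(4), repeat=2)}
    return outcomes
```

For each tested value of c', the set of outcomes contains the single value 3 (c' - 1), whatever player 2 does. The tested values stay within 0 and 1/2, since c = 1/2 - c' must not be negative.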
Whatever trust strategy player 2 adopts, it will lead to an average loss of at least three points every nine matches.
This concludes the proof that, at least in this simplified model, never bluffing is not an optimal strategy. A similar result can be reasonably expected to hold true even in the more complicated case of the complete Bluff rules.