Hasbro Pass THE Pigs
About Hasbro Pass THE Pigs
Here you can find information about Hasbro Pass THE Pigs, such as the manual and user reviews.
The Hasbro Pass THE Pigs manual (user guide) is available to download for free.
Users can write a review at the bottom of the page. If you own a Hasbro Pass THE Pigs, please write about it to help other people.
Documents

Pig is a folk jeopardy dice game described by John Scarne in 1945, and is an ancestor of the commercial game Pass the Pigs (David Moffat Enterprises and Hasbro, Inc.).

PIG
The object of the game PIG is to be the first player to score 100 points. Each turn, a player repeatedly rolls a single die until either the player decides to hold (stop rolling) or a 1 is rolled. (Both end the turn.) If a 1 is rolled, the player scores nothing. If the player holds before a 1 is rolled, the player scores the turn total, the sum of the rolls of that turn. Players take turns until one player wins by holding and reaching a score of 100 or more points.
For example, the first player, Ann, begins a turn with a roll of 5. Ann could hold and score 5 points, but chooses to roll again. Ann rolls a 2, and could hold with a turn total of 7 points, but chooses to roll again. Ann rolls a 1, and must end her turn without scoring. The next player, Bob, rolls the sequence 4-5-3-5-5, after which he chooses to hold, and adds his turn total of 22 points to his score.
This game is often played with special dice where the 1 is replaced by an image of a pig.

TWO-DICE PIG
This variation is the same as PIG, except:
Two standard dice are rolled. If neither shows a 1, their sum is added to the turn total.
If a single 1 is rolled, the player scores nothing and the turn ends.
If two 1s are rolled, the player's entire score is lost, and the turn ends.

BIG PIG
This variation is the same as TWO-DICE PIG, except:
If two 1s are rolled, the player adds 25 to the turn total.
If other doubles are rolled, the player adds twice the value of the dice to the turn total.

SWINE HERD
by Todd Neller, based on the group variation Skunk
This variation allows a large group of people to play PIG together quickly. All players take turns at the same time as follows:
At the beginning of a turn, all players stand up.
One player (the swineherd) directs play and rolls a single die for everyone.
After each roll, the swineherd calls out the current turn total, and pauses to allow players to hold. A player holds by sitting down. The swineherd directs holding players to call out their new scores.
If all players are seated or a 1 is rolled, the turn is over.
In the event of tied scores of 100 points or more, all tied players share the victory.

PLAY PIG ONLINE
http://cs.gettysburg.edu/projects/pig
Play a perfect computer Pig player. See 3D visualizations of perfect strategy. Learn about the game's history and more variations.
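The basic PIG scoring rule can be sketched in a few lines of Java. This is only an illustration: the class name `PigTurnDemo` and the method `turnScore` are hypothetical, and the method assumes the player holds after the last roll listed (unless a 1 ends the turn first). It reproduces the Ann and Bob turns from the example above.

```java
public class PigTurnDemo {
    /**
     * Scores one PIG turn. The rolls are taken in order; a roll of 1 ends the
     * turn with no score, otherwise the player is assumed to hold after the
     * last listed roll and scores the sum.
     */
    public static int turnScore(int[] rolls) {
        int turnTotal = 0;
        for (int roll : rolls) {
            if (roll == 1)
                return 0; // a 1 wipes out the turn total
            turnTotal += roll;
        }
        return turnTotal; // player holds
    }

    public static void main(String[] args) {
        System.out.println("Ann scores " + turnScore(new int[] {5, 2, 1}));
        System.out.println("Bob scores " + turnScore(new int[] {4, 5, 3, 5, 5}));
    }
}
```

Ann's turn ends on a 1, so she scores nothing; Bob's five rolls sum to his turn total of 22.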

Solving the Dice Game Pig: an introduction to dynamic programming and value iteration
Todd W. Neller, Ingrid Russell, Zdravko Markov
July 5, 2005
The Game of Pig
The object of the jeopardy dice game Pig is to be the first player to reach 100 points. Each player's turn consists of repeatedly rolling a die. After each roll, the player is faced with two choices: roll again, or hold (decline to roll again).
If the player rolls a 1, the player scores nothing and it becomes the opponent's turn.
If the player rolls a number other than 1, the number is added to the player's turn total, the sum of the rolls during the turn, and the player's turn continues.
If the player holds, the turn total is added to the player's score, and it becomes the opponent's turn.
For such a simple dice game, one might expect a simple optimal strategy, such as in Blackjack (e.g., "stand on 17" under certain circumstances, etc.). As we shall see, this simple dice game yields a much more complex and intriguing optimal policy. In our exploration of Pig we will learn about dynamic programming and value iteration, covering fundamental concepts of reinforcement learning techniques. For the interested reader, there is a companion Game of Pig website (see footnote 1) that features an optimal Pig computer player, VRML visualizations of the optimal policy, and information about Pig and its variants.
Simple Tactics
The game of Pig is simple to describe, but is it simple to play well? More specifically, how can we play the game optimally? Knizia [5] describes simple tactics where each roll is viewed as a bet that a 1 will not be rolled:
"... we know that the true odds of such a bet are 1 to 5. If you ask yourself how much you should risk, you need to know how much there is to gain. A successful throw produces one of the numbers 2, 3, 4, 5, and 6. On average, you will gain four points. If you put 20 points at stake this brings the odds to 4 to 20, that is 1 to 5, and makes a fair game. ... Whenever your accumulated points are less than 20, you should continue throwing, because the odds are in your favor." [5, p. 129]
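Knizia's argument can be restated as an expected-value calculation. As a worked sketch (the symbol $\Delta$ for the change in points from one more roll is our own notation, not Knizia's), with a turn total of $k$ at stake:

```latex
\[
E[\Delta] \;=\; \underbrace{\tfrac{1}{6}(2 + 3 + 4 + 5 + 6)}_{\text{expected gain}}
          \;-\; \underbrace{\tfrac{1}{6}\,k}_{\text{expected loss}}
          \;=\; \frac{20 - k}{6}
\]
```

The expected change is positive for $k < 20$, zero at $k = 20$, and negative beyond it, which is where the "hold at 20" rule comes from. Note that this maximizes expected score per roll, not the probability of winning.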
Corresponding author: tneller@gettysburg.edu, Gettysburg College, Department of Computer Science, Campus Box 402, Gettysburg, PA 17325-
1 See URL http://cs.gettysburg.edu/projects/pig/index.html.
However, Knizia also notes that there are many circumstances in which one should deviate from this "hold at 20" policy. Why does this reasoning not dictate an optimal policy for all play? The reason is that risking points is not the same as risking the probability of winning. Put another way, playing to maximize expected score is different from playing to win. For a clear illustration, consider the following extreme example. Your opponent has a score of 99 and will likely win in the next turn. You have a score of 78 and a turn total of 20. Do you follow the "hold at 20" policy and end your turn with a score of 98? Why not? Because the probability of winning if you roll once more is higher than the probability of winning if the other player is allowed to roll. The "hold at 20" policy may be a good rule of thumb, but how good is it? Under what circumstances should we deviate from it and by how much?
Maximizing the Probability of Winning
Let $P_{i,j,k}$ be the player's probability of winning if the player's score is $i$, the opponent's score is $j$, and the player's turn total is $k$. In the case where $i + k \ge 100$, we have $P_{i,j,k} = 1$ because the player can simply hold and win. In the general case where $0 \le i, j < 100$ and $k < 100 - i$, the probability of a player who plays optimally (an optimal player) winning is
\[ P_{i,j,k} = \max\left(P_{i,j,k,\mathrm{roll}},\; P_{i,j,k,\mathrm{hold}}\right), \]
where $P_{i,j,k,\mathrm{roll}}$ and $P_{i,j,k,\mathrm{hold}}$ are the probabilities of winning for rolling or holding, respectively. That is, the optimal player will always choose the action yielding the higher probability of winning. These probabilities are
\[ P_{i,j,k,\mathrm{roll}} = \frac{1}{6}\left((1 - P_{j,i,0}) + P_{i,j,k+2} + P_{i,j,k+3} + P_{i,j,k+4} + P_{i,j,k+5} + P_{i,j,k+6}\right) \]
\[ P_{i,j,k,\mathrm{hold}} = 1 - P_{j,i+k,0} \]
The probability of winning after rolling a 1 or after holding is the probability that the other player will not win beginning with the next turn. All other outcomes are positive and dependent on the probabilities of winning with higher turn totals.
At this point, we can see how to compute the optimal policy for play. If we can solve for all probabilities of winning in all possible game states, we need only compare $P_{i,j,k,\mathrm{roll}}$ with $P_{i,j,k,\mathrm{hold}}$ for our current state and either roll or hold depending on which has a higher probability of resulting in a win.
Solving for the probability of a win in all states is not trivial, as dependencies between variables are cyclic. For example, $P_{i,j,0}$ depends on $P_{j,i,0}$, which in turn depends on $P_{i,j,0}$. This feature is easily illustrated when both players roll a 1 in subsequent turns. Put another way, game states can repeat, so we cannot simply evaluate probabilities from the end of the game backwards to the beginning, as in dynamic programming or its game-theoretic form, known as the minimax process (introduced in [12]; for a modern introduction to that subject, see [10, Ch. 6]).
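To make the two action equations concrete, here is a minimal Java sketch that evaluates their right-hand sides for a single state. It is illustrative only: the class `PigActionValues`, the helper `win`, and the idea of passing in a table `p` of current estimates are assumptions for this example. Because of the cyclic dependencies just described, evaluating one state this way does not by itself solve the game.

```java
public class PigActionValues {
    static final int GOAL = 100;

    /** Looks up P[i][j][k], treating any state with i + k >= GOAL as a certain win. */
    public static double win(double[][][] p, int i, int j, int k) {
        return (i + k >= GOAL) ? 1.0 : p[i][j][k];
    }

    /** Right-hand side of the roll equation for state (i, j, k). */
    public static double pRoll(double[][][] p, int i, int j, int k) {
        double sum = 1.0 - win(p, j, i, 0);  // rolled a 1: opponent moves with scores swapped
        for (int r = 2; r <= 6; r++)
            sum += win(p, i, j, k + r);      // rolled 2..6: turn total grows
        return sum / 6.0;
    }

    /** Right-hand side of the hold equation for state (i, j, k), assuming i + k < GOAL. */
    public static double pHold(double[][][] p, int i, int j, int k) {
        return 1.0 - win(p, j, i + k, 0);    // bank the turn total, opponent moves
    }

    public static void main(String[] args) {
        // Placeholder estimates (every state guessed at 0.5), purely for demonstration.
        double[][][] p = new double[GOAL][GOAL][GOAL];
        for (double[][] plane : p)
            for (double[] row : plane)
                java.util.Arrays.fill(row, 0.5);
        System.out.println("roll: " + pRoll(p, 99, 99, 0) + ", hold: " + pHold(p, 99, 99, 0));
    }
}
```

With real win probabilities in `p`, the optimal action in a state is simply whichever of the two values is larger.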
Exercises
1. Solving by hand: Solve for the following win probabilities. Each set is dependent on the solution of the previous set.
(a) $P_{99,99,0}$,
(b) $P_{98,99,0}$ and $P_{99,98,0}$, and
(c) $P_{97,99,0}$, $P_{97,99,2}$, and $P_{99,97,0}$.
2. The Gambler's Fallacy: In a Mathematics Teaching in the Middle School article [3], a variant of Pig called SKUNK was featured as part of an engaging middle-school exercise that encourages students to think about chance, choice, and strategy. The article states, "To get a better score, it would be useful to know, on average, how many good rolls happen in a row before a 'one' or 'double ones' come up." (In this Pig variant, a 'one' or 'double ones' are undesirable rolls.) Define the "Gambler's Fallacy" and explain why this statement is an example of a Gambler's Fallacy.
Dynamic Programming
Dynamic programming is a powerful technique which uses memory to reduce redundant computation. Although dynamic programming is not directly applicable to the solution of Pig, we will see that it can approximate the optimal policy through application to a very similar game. We will first introduce dynamic programming through the simple task of computing Fibonacci sequence numbers. Then we will introduce and solve Progressive Pig, a slight variation of Pig.
Remembering Fibonacci Numbers
As a simple illustrative example, consider the Fibonacci sequence 1, 1, 2, 3, 5, 8, 13, ... defined as follows:
\[ \mathit{fib}_n = \begin{cases} 1 & n = 1, 2 \\ \mathit{fib}_{n-1} + \mathit{fib}_{n-2} & n > 2 \end{cases} \]
Simple Recursive Approach
From this recursive definition, we can write a simple recursive algorithm to compute fib(n) for some positive integer n:

⟨Compute fib(n)⟩≡
    public static long fib(int n) {
        if (n <= 2)
            return 1;
        else
            return fib(n - 1) + fib(n - 2);
    }

Side note: This format of presenting code in chunks is called literate programming and is due to Donald Knuth. Code will be presented in named chunks that will appear inserted within other chunk definitions. The source document is used not only to generate the output you are now reading, but also to generate the example code. In this way, the code presented is both consistent with the example code, and has been tested for correctness.
Observe the behavior of this algorithm as we test it for successively higher values of n. (You will need to stop the process.)

⟨Test recursive implementation⟩≡
    for (int n = 1; n < MAX; n++)
        System.out.println("fib(" + n + ") = " + fib(n));
The reason that the process slows considerably is that the number of recursive calls increases exponentially with n. Consider a call to fib(5). This in turn causes recursive calls to fib(4) and fib(3). These in turn have recursive calls of their own, illustrated below:

    fib(5)
        fib(4)
            fib(3)
                fib(2)
                fib(1)
            fib(2)
        fib(3)
            fib(2)
            fib(1)

Note that the computation for fib(3) is performed twice. This problem of redundant computation becomes even more noticeable as n increases. The following table shows how many times each recursive call is performed as n increases (see footnote 2).

Recursive Calls for fib(n)
    n     fib(1)      fib(2)      fib(3)      fib(4)      fib(5)
    10    21          34          21          13          8
    20    2584        4181        2584        1597        987
    30    317811      514229      317811      196418      121393
    40    39088169    63245986    39088169    24157817    14930352
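The call counts in the table can be reproduced by instrumenting the recursive method. The following is a minimal sketch, not part of the original example code; the class name `FibCallCounter`, the `calls` array, and `countCalls` are hypothetical names introduced for illustration.

```java
public class FibCallCounter {
    // calls[m] counts how many times fib(m) is invoked during one top-level call
    static long[] calls;

    static long fib(int n) {
        calls[n]++;
        if (n <= 2)
            return 1;
        return fib(n - 1) + fib(n - 2);
    }

    /** Runs fib(n) once (n >= 1) and returns the per-argument call counts. */
    public static long[] countCalls(int n) {
        calls = new long[n + 1];
        fib(n);
        return calls;
    }

    public static void main(String[] args) {
        long[] c = countCalls(20);
        for (int m = 1; m <= 5; m++)
            System.out.println("fib(" + m + ") called " + c[m] + " times");
    }
}
```

Trying larger arguments such as `countCalls(40)` takes noticeably longer, which is exactly the exponential growth the table is meant to show.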
Dynamic Programming Approach
We can avoid this computational time inefficiency by storing the Fibonacci numbers that have been computed and retrieving them as needed.
2 It is interesting to note that the Fibonacci sequence appears in each table column, with the ratio of successive pairs asymptotically approaching the golden ratio of $(1 + \sqrt{5})/2$.
⟨Compute fib(n) with dynamic programming⟩≡
    public static final int MAX = 90;
    public static boolean[] computed = new boolean[MAX];
    public static long[] result = new long[MAX];

    public static long fibDP(int n) {
        // Compute and store value if not already stored
        if (!computed[n]) {
            if (n <= 2)
                result[n] = 1;
            else
                result[n] = fibDP(n - 1) + fibDP(n - 2);
            computed[n] = true;
        }
        // Retrieve and return stored value
        return result[n];
    }

Now observe the behavior of this dynamic programming algorithm as we test it for successively higher values of n.

⟨Test recursive implementation with dynamic programming⟩≡
    for (int n = 1; n < MAX; n++)
        System.out.println("fibDP(" + n + ") = " + fibDP(n));

The full test code implementation, which computes using dynamic programming first, is given as follows:

⟨FibDemo.java⟩≡
    public class FibDemo {
        ⟨Compute fib(n) with dynamic programming⟩
        ⟨Compute fib(n)⟩

        public static void main(String[] args) {
            ⟨Test recursive implementation with dynamic programming⟩
            ⟨Test recursive implementation⟩
        }
    }

The key tradeoff to observe between these algorithms is that dynamic programming uses additional memory to cut computational time complexity from exponential to linear (see footnote 3). Memory is used to save time. This is a common tradeoff in the art of algorithm design. Given the relative cheapness of memory, there are many problems where it makes sense to store the results of computations to avoid recomputing them.
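For comparison, Fibonacci numbers can also be computed in linear time without any table, by carrying only the two most recent values forward. The sketch below is one such approach; the class and method names (`FibIterative`, `fibIter`) are hypothetical and it is not part of the original example code.

```java
public class FibIterative {
    /**
     * Computes fib(n) for n >= 1 by iterating upward from fib(1) and fib(2).
     * The result fits in a long for n up to 92.
     */
    public static long fibIter(int n) {
        long previous = 1; // fib(m - 1), starting at fib(1)
        long current = 1;  // fib(m), starting at fib(2)
        for (int m = 3; m <= n; m++) {
            long next = previous + current;
            previous = current;
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++)
            System.out.println("fibIter(" + n + ") = " + fibIter(n));
    }
}
```

This works because each Fibonacci number depends only on its two predecessors. The memoized recursive form is still worth studying, since it carries over to problems such as Progressive Pig where the subproblem structure is less regular.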
Progressive Pig
Now we turn our attention back to Pig. We cannot simply solve Pig using a recursive approach with dynamic programming, because there are cyclic dependencies between the variables. Dynamic programming depends on acyclic dependencies between computations that allow us to compute results sequentially from simple (base) computations without such dependencies, to those computations which depend on the base computations, etc. In Figure 1, we visualize computations as nodes. Dependencies of computations on the results of other computations are indicated by arrows. Each stage of dynamic programming computation (concentric ellipses) is dependent only upon the computations of previous stages. The cyclic dependencies of Pig (e.g. $P_{i,j,0} \to P_{j,i,0} \to P_{i,j,0}$) prevent us from dividing its computation into such stages.
3 Of course, there is a simpler linear time algorithm for computing Fibonacci numbers. However, many problems are based on a complex set of interacting subproblems not amenable to such an approach.
Figure 1: Partitioning dynamic programming computation into stages

However, we can approximate optimal play for Pig by making a small change to the rules that makes the variable dependencies acyclic. That is, we slightly modify the game such that game states can never repeat and always progress towards the end of the game. We will call this modified game Progressive Pig. Optimal play for Progressive Pig will approximate optimal play for Pig.
Progressive Pig is identical to Pig except that a player always scores at least 1 point each turn:
If the player rolls a 1, the player scores 1 point and it becomes the opponent's turn.
If the player rolls a number other than 1, the number is added to the player's turn total and the player's turn continues.
If the player holds, the greater of 1 and the turn total is added to the player's score and it becomes the opponent's turn.
Thus the equations for $P_{i,j,k} = \max\left(P_{i,j,k,\mathrm{roll}},\; P_{i,j,k,\mathrm{hold}}\right)$, the probability of winning Progressive Pig with optimal play, are
\[ P_{i,j,k,\mathrm{roll}} = \frac{1}{6}\left((1 - P_{j,i+1,0}) + P_{i,j,k+2} + P_{i,j,k+3} + P_{i,j,k+4} + P_{i,j,k+5} + P_{i,j,k+6}\right) \]
\[ P_{i,j,k,\mathrm{hold}} = 1 - P_{j,\,i+\max(k,1),\,0} \]
Solving Progressive Pig
To solve Progressive Pig (abbreviated P-Pig), we keep track of the goal score, and establish variables to manage and store the results of our computations. The 3D boolean array computed keeps track of which (i, j, k) states have been computed. The computed value p[i][j][k] corresponds to $P_{i,j,k}$. The computed value roll[i][j][k] indicates whether or not it is optimal to roll in state (i, j, k).

⟨P-Pig variable definitions⟩≡
    int goal;
    boolean[][][] computed;
    double[][][] p;
    boolean[][][] roll;

When constructing a solution to P-Pig, we supply the goal score as a parameter.

⟨Solve P-Pig⟩≡
    PPigSolver(int goal) {
        this.goal = goal;
        computed = new boolean[goal][goal][goal];
        p = new double[goal][goal][goal];
        roll = new boolean[goal][goal][goal];
        ⟨Compute all win probabilities⟩
    }

After we initialize the variables, we compute all win probabilities for all states.

⟨Compute all win probabilities⟩≡
    for (int i = 0; i < goal; i++) // for all i
        for (int j = 0; j < goal; j++) // for all j
            for (int k = 0; i + k < goal; k++) // for all k
                pWin(i, j, k);

In method pWin below, we first check to see if one player has won, returning a win probability of 0 or 1 depending on which player reached 100. Note the implicit assumption that an optimal player with a winning turn total will hold and win. This limits us to a finite state space. Secondly, we check to see if this probability has already been computed and stored. If so, we return it. This is our dynamic programming step. Then, if we have not yet returned a result, we must compute it. However, these previous steps ensure that we will not be redundantly computing probabilities in the recursive calls.

⟨Compute the probability of winning with optimal play⟩≡
    public double pWin(int i, int j, int k) {
        if (i + k >= goal) return 1.0;
        if (j >= goal) return 0.0;
        if (computed[i][j][k]) return p[i][j][k];
        ⟨Recursively compute p[i][j][k]⟩
        return p[i][j][k];
    }

To recursively compute p[i][j][k], we merely translate the equations of Section 2.2 to code. Below, pRoll and pHold represent the probabilities of winning with a roll and a hold, respectively.
⟨Recursively compute p[i][j][k]⟩≡
    // Compute the probability of winning with a roll
    double pRoll = 1.0 - pWin(j, i + 1, 0);
    for (int roll = 2; roll <= 6; roll++)
        pRoll += pWin(i, j, k + roll);
    pRoll /= 6.0;

    // Compute the probability of winning with a hold
    double pHold;
    if (k == 0)
        pHold = 1.0 - pWin(j, i + 1, 0);
    else
        pHold = 1.0 - pWin(j, i + k, 0);

    // Optimal play chooses the action with the greater win probability
    roll[i][j][k] = pRoll > pHold;
    if (roll[i][j][k])
        p[i][j][k] = pRoll;
    else
        p[i][j][k] = pHold;
    computed[i][j][k] = true;

We now include code to summarize results of the computation. We first print the probability of a first player win with optimal play. Then for each i, j pair, we list the k values where the player changes policy (e.g. from roll to hold).

⟨Summarize results⟩≡
    public void summarize() {
        System.out.println("p[0][0][0] = " + p[0][0][0]);
        System.out.println();
        System.out.println("i\tj\tPolicy changes at k =");
        for (int i = 0; i < goal; i++) // for all i
            for (int j = 0; j < goal; j++) { // for all j
                int k = 0;
                System.out.print(i + "\t" + j + "\t" + ⟨Policy for (i,j,k)⟩);
                for (k = 1; i + k < goal; k++) // for all valid k
                    if (roll[i][j][k] != roll[i][j][k-1])
                        System.out.print(k + " " + ⟨Policy for (i,j,k)⟩);
                System.out.println();
            }
    }

Where the policy string "hold" or "roll" is chosen using the Java selection operator:

⟨Policy for (i,j,k)⟩≡
    (roll[i][j][k] ? "roll " : "hold ")

The output line "15 70 roll 25 hold 35 roll 38 hold 77 roll" would indicate that when a player has a score of 15 and their opponent has 70, the player should roll for turn total values 0-24, 35-37, and 77 or more. Note that, in practice, an optimal player would never reach a turn total of 35 points, as there is no way to pass from 0-24 to 35 without passing through a "hold" state.
Finally, we put these pieces together and test the computation with the goal set to 100.
⟨PPigSolver.java⟩≡
    public class PPigSolver {
        ⟨P-Pig variable definitions⟩
        ⟨Solve P-Pig⟩
        ⟨Compute the probability of winning with optimal play⟩
        ⟨Summarize results⟩

        public static void main(String[] args) {
            new PPigSolver(100).summarize();
        }
    }
3. Pig Solitaire: Consider the solitaire (single player) game of Pig where a player is challenged to reach a given goal score g within n turns.
(a) Define the state space.
(b) Write the equations that describe optimal play.
(c) Prove that the state space is acyclic, i.e. that states cannot repeat.
(d) Compute the optimal policy for g = 100 and n = 10. Again, assume that an optimal player with a winning turn total will hold and win.
(e) Summarize or visualize the policy, and describe it qualitatively in your own words.
(f) For g = 100, what is the smallest n for which the optimal player's initial win probability is at least .50?
4. Pig Solitaire 2: Consider the solitaire (single player) game of Pig where one is challenged to maximize one's score within n turns. Now, rather than seeking to maximize the probability of a win, one seeks to maximize the expected score.
(a) Define the state space.
(b) Write the equations that describe optimal play.
(c) Prove that the state space is acyclic, i.e. that states cannot repeat.
(d) Compute the optimal policy for n = 5. In order to limit ourselves to a finite state space, assume that the player will always hold with a sufficiently high score or turn total (e.g. i, k ≥ 500). You will need to experiment with different limits to be assured that your policy is optimal and not affected by your limits.
(e) Summarize or visualize the policy, and describe it qualitatively in your own words.
5. THINK Solitaire: In [4], Falk and Tadmor-Troyanski analyze a 2-dice Pig variant called THINK. THINK is identical to Pig, except that
Two standard dice are rolled. If neither shows a 1, their sum is added to the turn total.
If a single 1 is rolled, the player's turn ends with the loss of the turn total.
If two 1s are rolled, the player's turn ends with the loss of the turn total and score.
Each player gets only five turns, one for each letter of THINK.
The highest score at the end of five turns wins.
In this exercise, you will compute optimal play for a solitaire player seeking to maximize their THINK score in five turns.
(a) Define the state space.
(b) Write the equations that describe optimal play.
(c) Prove that the state space is acyclic, i.e. that states cannot repeat.
(d) Compute the optimal policy. In order to limit ourselves to a finite state space, assume that the player will always hold with a sufficiently high score or turn total (e.g. i, k ≥ 500). You will need to experiment with different limits to be assured that your policy is optimal and not affected by your limits.
(e) Summarize or visualize the policy, and describe it qualitatively in your own words.
Advanced Projects
The board game Risk®, first published in 1959, is arguably the most popular war game internationally. Players seek global domination through a series of battles between adjacent territories on a simplified world map. The outcome of conflicts between territories occupied by army pieces is determined by dice rolling. The attacker declares the intent of rolling 1, 2, or 3 dice. The attacker must have at least one more army than the number of dice rolled. The defender then declares the intent of rolling 1 or 2 dice. The defender must have at least as many armies as the number of dice rolled. Both players roll their declared number of dice and sort them. Highest dice and, if applicable, second highest dice of the players are compared. For each pair, the player with the lower number loses an army. If the pair is tied, the attacker loses an army. For example, suppose the attacker rolls 5-3-1 and the defender rolls 4-3. Comparing 5 and 4, the defender removes an army. Comparing 3 and 3, the attacker removes an army. After each roll, the attacker decides whether to retreat or continue the attack, repeating the process. Risk rules (see footnote 4) may be found at Hasbro's website.
Several probabilistic analyses of Risk battles have been published in recent years. In [9], Jason Osborne of North Carolina State University computed odds of victory in a Risk battle under the assumption that the attacker never retreats, pressing the attack until victorious or reduced to 1 army, and both players always roll the maximum permissible number of dice. Confirm the results of [9] (see footnote 5). As a more advanced exercise, devise and compute a more advanced Risk analysis. For example, one Risk tactic is to eliminate a player, claiming the defeated player's valuable Risk cards. Although one can often perceive a chain of attacks that might achieve this goal, it is difficult to assess the probable outcome of such a series of attacks.
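As a starting point for such an analysis, the dice comparison rule described above can be sketched as follows. This is an illustrative helper only: the class `RiskRoll` and method `resolve` are hypothetical names, and the sketch resolves a single roll from given dice rather than computing any battle probabilities.

```java
import java.util.Arrays;

public class RiskRoll {
    /**
     * Resolves one Risk roll from the dice actually thrown.
     * Returns {attackerLosses, defenderLosses}.
     */
    public static int[] resolve(int[] attackerDice, int[] defenderDice) {
        int[] a = attackerDice.clone();
        int[] d = defenderDice.clone();
        Arrays.sort(a); // ascending, so the highest die is last
        Arrays.sort(d);
        int pairs = Math.min(a.length, d.length);
        int attackerLosses = 0;
        int defenderLosses = 0;
        for (int n = 1; n <= pairs; n++) {
            int attack = a[a.length - n]; // n-th highest attacker die
            int defend = d[d.length - n]; // n-th highest defender die
            if (attack > defend)
                defenderLosses++;
            else
                attackerLosses++;         // ties go against the attacker
        }
        return new int[] {attackerLosses, defenderLosses};
    }

    public static void main(String[] args) {
        int[] losses = resolve(new int[] {5, 3, 1}, new int[] {4, 3});
        System.out.println("attacker loses " + losses[0] + ", defender loses " + losses[1]);
    }
}
```

For the example in the text (attacker 5-3-1, defender 4-3), each side loses one army. A probability analysis would enumerate or sample all dice outcomes and apply this rule to each.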
After confirming the results of [9], consider generalizing your work to answer the following question: Given a positive number of attacking armies $a$, and a sequence $d_1, \ldots, d_n$ of the positive number of defending armies in successively adjacent territories,
What is the probability of victory, i.e. total occupation of the chain of territories?
How many armies should the attacker expect to lose on average?
4 See URL http://www.hasbro.com/common/instruct/Risk1999.PDF.
5 already confirmed by the author
How many armies should the defender expect to lose on average?
Assume that the attacker never retreats, that both players always roll the maximum permissible number of dice, and that the attacker always moves all but one army into a conquered territory.
The author has computed such probabilities and confirmed interesting properties of defensive configurations. Given such a program, the author recommends the following puzzle: In a chain of 6 territories, suppose an attacker occupies the leftmost territory with 30 armies, and the defender occupies the remaining 5 territories with 30 armies. How should the defender distribute these armies so as to minimize the attacker's probability of successful chain occupation? (Each territory must contain at least one army.)
It is often advantageous to maintain a strong front in the game. That is, one's armies should usually be concentrated in those territories adjacent to opponent territories. Compute win probabilities for configurations with 30 attackers and 30 defenders distributed in chains as follows:
26, 1, 1, 1, 1
22, 2, 2, 2, 2
18, 3, 3, 3, 3
14, 4, 4, 4, 4
10, 5, 5, 5, 5
6, 6, 6, 6, 6
What do you observe? Compare your observations with those of section 3.3 of the Risk FAQ (see footnote 6).
Yahtzee
In each turn of the popular dice game Yahtzee®, players can roll and reroll dice up to three times in order to form a high-scoring combination in one of several scoring categories. Hasbro's Yahtzee rules (see footnote 7) can be found online. Phil Woodward has computed optimal solitaire play for Yahtzee, i.e. the policy that maximizes score for a single player [13]. Although Yahtzee can in principle be solved for any number of players with dynamic programming, the size of the state space as well as details such as the bonus rules make this a challenging project for the basic solitaire case.
For students wishing to compute optimal play for a simpler solitaire game similar to Yahtzee, there are a number of Yahtzee variants described in [5]. For example, the category dice game of Hooligan allows scoring in seven categories: six number categories (Ones, Twos, Threes, Fours, Fives, and Sixes), and Hooligan (a straight: 1-2-3-4-5 or 2-3-4-5-6). Rules for Hooligan can also be found online at DicePlay's Hooligan Dice Game page (see footnote 8). You may even wish to invent your own simplified variant of Yahtzee. Optimal solitaire play for even the simplest variants may surprise you!
6 See URL http://www.kent.ac.uk/IMS/personal/odl/riskfaq.htm.
7 See URL http://www.hasbro.com/common/instruct/Yahtzee.pdf.
8 See URL http://homepage.ntlworld.com/dice-play/Games/Hooligan.htm.
Value Iteration
Value iteration [11, 1, 2] is a process by which we iteratively improve estimates of the value of being in each state until our estimates are "good enough". For ease of explanation, we will first introduce a simpler game we have devised called "Piglet". We will then describe value iteration and show how it is applied to Piglet.
Piglet
Piglet is very much like Pig except it is played with a coin rather than a die. The object of Piglet is to be the first player to reach 10 points. Each turn, a player repeatedly flips a coin until either a "tail" is flipped or the player holds and scores the number of consecutive "heads" flipped. At any time during the player's turn, the player is faced with two choices: flip or hold. If the coin turns up tails, the player scores nothing and it becomes the opponent's turn. Otherwise, the player's turn continues. If the player chooses to hold, the number of consecutively flipped heads is added to the player's score and it becomes the opponent's turn.
The number of equations necessary to express the probability of winning in each state is still too many for a pencil and paper exercise, so we will simplify this game further. Now suppose the object is to be the first player to reach 2 points.
As before, let $P_{i,j,k}$ be the player's probability of winning if the player's score is $i$, the opponent's score is $j$, and the player's turn total is $k$. In the case where $i + k = 2$, $P_{i,j,k} = 1$ because the player can simply hold and win. In the general case where $0 \le i, j < 2$ and $k < 2 - i$, the probability of a player winning is
\[ P_{i,j,k} = \max\left(P_{i,j,k,\mathrm{flip}},\; P_{i,j,k,\mathrm{hold}}\right) \]
where $P_{i,j,k,\mathrm{flip}}$ and $P_{i,j,k,\mathrm{hold}}$ are the probabilities of winning if one flips and holds, respectively. The probability of winning if one flips is
\[ P_{i,j,k,\mathrm{flip}} = .5\left((1 - P_{j,i,0}) + P_{i,j,k+1}\right) \]
The probability $P_{i,j,k,\mathrm{hold}}$ is just as before. Then the equations for the probabilities of winning in each state are given as follows:
\[
\begin{aligned}
P_{0,0,0} &= \max(.5((1 - P_{0,0,0}) + P_{0,0,1}),\; 1 - P_{0,0,0}) \\
P_{0,0,1} &= \max(.5((1 - P_{0,0,0}) + 1),\; 1 - P_{0,1,0}) \\
P_{0,1,0} &= \max(.5((1 - P_{1,0,0}) + P_{0,1,1}),\; 1 - P_{1,0,0}) \\
P_{0,1,1} &= \max(.5((1 - P_{1,0,0}) + 1),\; 1 - P_{1,1,0}) \\
P_{1,0,0} &= \max(.5((1 - P_{0,1,0}) + 1),\; 1 - P_{0,1,0}) \\
P_{1,1,0} &= \max(.5((1 - P_{1,1,0}) + 1),\; 1 - P_{1,1,0})
\end{aligned}
\tag{1}
\]
Once these equations are solved, the optimal policy is obtained by observing which action maximizes $\max\left(P_{i,j,k,\mathrm{flip}},\; P_{i,j,k,\mathrm{hold}}\right)$ for each state.
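One way to solve the six equations numerically is to start from arbitrary estimates and repeatedly re-evaluate the right-hand sides until the values stop changing, which is the iterative-improvement idea behind value iteration. The sketch below does exactly that for the 2-point game; it is an illustration under our own assumptions (class name `PigletTwoSolver`, all-zero initial estimates, synchronous updates, and a sweep cap as a safety net), not the solver developed in this document.

```java
public class PigletTwoSolver {
    /**
     * Iterates equations (1) until no estimate changes by epsilon or more.
     * Returns {P000, P001, P010, P011, P100, P110}.
     */
    public static double[] solve(double epsilon) {
        double[] p = new double[6]; // initial estimates: all zero
        double maxChange;
        int sweeps = 0;
        do {
            // Each entry is max(flip value, hold value) from equations (1)
            double[] q = {
                Math.max(0.5 * ((1 - p[0]) + p[1]), 1 - p[0]), // P000
                Math.max(0.5 * ((1 - p[0]) + 1),    1 - p[2]), // P001
                Math.max(0.5 * ((1 - p[4]) + p[3]), 1 - p[4]), // P010
                Math.max(0.5 * ((1 - p[4]) + 1),    1 - p[5]), // P011
                Math.max(0.5 * ((1 - p[2]) + 1),    1 - p[2]), // P100
                Math.max(0.5 * ((1 - p[5]) + 1),    1 - p[5])  // P110
            };
            maxChange = 0;
            for (int s = 0; s < p.length; s++)
                maxChange = Math.max(maxChange, Math.abs(q[s] - p[s]));
            p = q;
            sweeps++;
        } while (maxChange >= epsilon && sweeps < 10000); // cap guards against non-termination
        return p;
    }

    public static void main(String[] args) {
        String[] names = {"P000", "P001", "P010", "P011", "P100", "P110"};
        double[] p = solve(1e-9);
        for (int s = 0; s < p.length; s++)
            System.out.println(names[s] + " = " + p[s]);
    }
}
```

Solving the equations by hand gives $P_{0,0,0} = 4/7$, $P_{0,0,1} = 5/7$, $P_{0,1,0} = 2/5$, $P_{0,1,1} = 3/5$, $P_{1,0,0} = 4/5$, and $P_{1,1,0} = 2/3$, which the iteration should approach to within the chosen tolerance.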
Value iteration is an algorithm that iteratively improves estimates of the value of being in each state. In describing value iteration, we follow [11], which we also recommend for further reading. We assume that the world consists of states, actions, and rewards. The goal is to compute which action to take in each state so as to maximize future rewards. At any time, we are in a known state $s$ of a finite set of states $S$. For each state $s$, there is a finite set of allowable actions $A$. For any two states $s, s' \in S$
⟨Output hold values⟩≡
    public void outputHoldValues() {
        for (int i = 0; i < goal; i++) {
            for (int j = 0; j < goal; j++) {
                int k = 0;
                while (k < goal - i && flip[i][j][k])
                    k++;
                System.out.print(k + " ");
            }
            System.out.println();
        }
    }

Finally, we construct a solution for Piglet with a goal score of 10 and a convergence epsilon of $10^{-9}$, and then output the hold values.

⟨Execute program⟩≡
    public static void main(String[] args) {
        new PigletSolver(10, 1e-9).outputHoldValues();
    }

Putting this all together, the program to solve Piglet is as follows:

⟨PigletSolver.java⟩≡
    public class PigletSolver {
        ⟨Variable definitions⟩
        ⟨Construct solution⟩
        ⟨Perform value iteration⟩
        ⟨Return estimated probability of win⟩
        ⟨Output hold values⟩
        ⟨Execute program⟩
    }
6. Pig: In the preceding text, a solution is outlined for solving Piglet. Now, you will modify this approach to solve 2-player Pig with a goal score of 100.
   (a) Define the state space.
   (b) Write the equations that describe optimal play.
   (c) Compute the optimal policy. Assume that an optimal player with a winning turn total will hold and win. What is the probability that the first player will win if both players play optimally?
   (d) Summarize or visualize the policy, and describe it qualitatively in your own words.

7. Pig Solitaire 3: Consider the solitaire (single player) game of Pig where a player is challenged to minimize the turns taken to reach a given goal score g. Hint: Let the only reward be a reward of -1 at the end of each turn. In this way the value of the initial state will be the negated expected number of turns to reach the goal score g.
   (a) Define the state space.
   (b) Write the equations that describe optimal play.
   (c) Compute the optimal policy for g = 100. Assume that an optimal player with a winning turn total will hold and win. What is the expected number of turns to reach 100 when playing optimally?
   (d) Summarize or visualize the policy, and describe it qualitatively in your own words.

8. Pass the Pigs: Pass the Pigs (a.k.a. Pigmania) is a popular commercial variant of Pig which involves rolling two rubber pigs to determine the change in turn total or score. Rules for Pass the Pigs can be found at the Hasbro website¹⁰. For simplicity, make the following assumptions:
   • Assume that the player can throw the pigs so as to make the probability of an oinker or piggyback effectively 0.
   • Assume probabilities for other pig rolls are accurately represented by the data at Freddie Wong's Pass the Pigs page¹¹:
     Right Sider         1344/3939
     Left (Dot) Sider    …
     Razorback           767/3939
     Trotter             …
     Snouter             …
     Leaning Jowler      …
   (a) Define the state space.
   (b) Write the equations that describe optimal play.
   (c) Compute the optimal policy. Assume that an optimal player with a winning turn total will hold and win. What is the probability that the first player will win if both players play optimally?
   (d) Summarize or visualize the policy, and describe it qualitatively in your own words.
Hog is a variation of Pig in which players have only one roll per turn, but may roll as many dice as desired. If no 1s are rolled, the sum of the dice is scored. If any 1s are rolled, no points are scored for the turn. It is as if a Pig player must commit to the number of rolls in a turn before the turn begins. (See exercise 2 on the Gambler's Fallacy.)

For a goal score of 100, we recommend a simplifying assumption that a player may roll up to 30 dice. The state transition probabilities can be computed once initially using dynamic programming. (Outcome probabilities for n + 1 dice can be computed by taking outcome probabilities for n dice, and considering the effects of the 6 possible outcomes of one more die.) If one graphs the optimal number of dice to roll for each (i, j) pair, one will notice a striking similarity to the shape of the optimal roll/hold boundary for Pig (see the Game of Pig website¹²).
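The dynamic programming step described above for Hog can be sketched as follows. This is an illustrative sketch rather than a prescribed implementation: the class `HogOutcomes` and its method names are hypothetical. The table entry `dist[n][s]` holds the probability that n dice show no 1s and sum to s; the sixth outcome of each added die (a 1) contributes to no entry, so the probability mass missing from row n is exactly the probability of scoring nothing.

```java
// Sketch only: class and method names are illustrative, not from the original text.
class HogOutcomes {
    // dist[n][s] = probability that n dice contain no 1s and their sum is s.
    static double[][] noOnesSumDistribution(int maxDice) {
        double[][] dist = new double[maxDice + 1][6 * maxDice + 1];
        dist[0][0] = 1.0; // zero dice: sum 0 with certainty
        for (int n = 0; n < maxDice; n++) {
            for (int s = 0; s <= 6 * n; s++) {
                if (dist[n][s] == 0.0) continue;
                // Faces 2..6 extend the sum; face 1 voids the roll, so its mass is dropped.
                for (int face = 2; face <= 6; face++) {
                    dist[n + 1][s + face] += dist[n][s] / 6.0;
                }
            }
        }
        return dist;
    }

    // Probability that a roll of n dice scores at all (no 1s).
    static double scoringProbability(double[][] dist, int n) {
        double total = 0.0;
        for (double prob : dist[n]) total += prob;
        return total;
    }

    // Expected points from rolling n dice, counting voided rolls as 0.
    static double expectedScore(double[][] dist, int n) {
        double expected = 0.0;
        for (int s = 0; s < dist[n].length; s++) expected += s * dist[n][s];
        return expected;
    }

    public static void main(String[] args) {
        double[][] dist = noOnesSumDistribution(30); // the suggested cap of 30 dice
        for (int n = 1; n <= 30; n++) {
            System.out.printf("%2d dice: P(score) = %.4f, E[score] = %.4f%n",
                    n, scoringProbability(dist, n), expectedScore(dist, n));
        }
    }
}
```

A table like this would serve as the transition model when solving Hog: from state (i, j), rolling n dice moves the player to score i + s with probability `dist[n][s]`, and leaves the score unchanged with the remaining probability.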
¹⁰ See URL http://www.hasbro.com/common/instruct/PassThePigs.PDF.
¹¹ See URL http://members.tripod.com/%7Epasspigs/prob.html.
¹² See URL http://cs.gettysburg.edu/projects/pig/index.html.
Ten Thousand
Among dice games, Ten Thousand is what we call a jeopardy race game. Jeopardy refers to the fact that each turn we are putting our entire turn total at risk. Race refers to the fact that the object is to be the first to meet or exceed a goal score. Pig is the simplest jeopardy race game. Most other jeopardy race games are variations of the game Ten Thousand.

In such games, players roll a set of dice (usually 6), setting aside various scoring combinations with each roll that increase the turn total until a player either (1) holds and scores the turn total, or (2) rolls the remaining dice such that there is no possible scoring combination and thus loses the turn total. Generally, if all dice are set aside in scoring combinations, then the turn continues with all dice back in play. Rules for Ten Thousand can be found in [5] and also online at Dice-Play's Ten Thousand page¹³.

Ten Thousand has much in common with Pig, and can also be solved with value iteration. However, writing a program to compute the 2-player solution is more difficult for the following reasons:
• The state space is larger. In addition to keeping track of the player score, opponent score, and turn total, one needs to keep track of which subset of 6 dice are in play.
• There are more actions. A player only needs to score one combination per roll, and can therefore possibly choose a subset of several possible scoring combinations.
• There are more possible state transitions. In rolling multiple dice, the many outcomes increase the computation necessary in order to recompute value estimates.
• Greater rule complexity leads to more complex equations.

These factors combine to make this a more advanced programming project. At the time of writing, optimal 2-player play of Ten Thousand is an open research problem. If one wishes to use value iteration to solve a different, simpler jeopardy dice game similar to Pig, additional variations are described in the appendix of [8], available at the Game of Pig website¹⁴.
Reinforcement Learning Explorations
Sutton and Barto's Reinforcement Learning: an introduction [11] is an excellent starting point for an advanced undergraduate or graduate exploration of reinforcement learning. The authors classify reinforcement learning methods in three categories: dynamic programming methods, Monte Carlo methods, and temporal-difference learning methods. In their text, dynamic programming is defined more generally so as to include value iteration¹⁵. Dynamic programming and value iteration make a strong assumption that the programmer has a complete model of state transition probabilities and expected rewards for such transitions. One cannot approach a solution of Pass the Pigs (Exercise 8) without a probability model gained through experience. By contrast, Monte Carlo methods improve state value estimates through simulation or experience, using entire simulated or real games to learn the value of each state. In some cases, a complete probabilistic model of the system is known, yet it is easier to simulate the system than to express and solve equations for optimal behavior. Temporal-difference (TD) learning methods blend ideas from value iteration and Monte Carlo methods. Like value iteration, TD learning methods use previous estimates in the computation of updates.
¹³ See URL http://homepage.ntlworld.com/dice-play/Games/TenThousand.htm.
¹⁴ See URL http://cs.gettysburg.edu/projects/pig/index.html.
¹⁵ Most algorithm text authors define dynamic programming as a technique for transforming recursive solutions. Recursive solutions require acyclic dependencies.
Like Monte Carlo methods, TD learning methods do not require a model and are informed through experience and/or simulation. The author has applied methods from each of these categories to the solution of Pig, and has found Pig a valuable tool for understanding their tradeoffs. For example, value iteration is advantageous in that all states are updated equally often. Monte Carlo and TD learning methods, by contrast, update states as they are experienced. Thus, value estimates for states that occur with low probability are very slow to converge to their true values.

If the reader has enjoyed learning the power of these simple techniques, we would encourage continued study in the field of reinforcement learning. With a good text, an intriguing focus problem, and a healthy curiosity, you are likely to reap a rich educational reward.
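To make the contrast with value iteration concrete, the simplest form of the Monte Carlo idea is to estimate the value of a single state under a fixed policy by averaging the outcomes of many simulated games. The sketch below does this for the initial state of Piglet. It is an illustration under stated assumptions, not one of the methods the author applied: the class `PigletMonteCarlo`, the "hold at a fixed turn total" policy, and the parameter names are all hypothetical, and no policy improvement is attempted.

```java
import java.util.Random;

// Sketch only: names and the fixed threshold policy are illustrative assumptions.
class PigletMonteCarlo {
    // Estimates the first player's win rate when both players flip until the
    // turn total reaches holdAt (or is enough to win), then hold.
    static double estimateFirstPlayerWinRate(int goal, int holdAt, int games, long seed) {
        if (holdAt < 1) throw new IllegalArgumentException("holdAt must be at least 1");
        Random rng = new Random(seed); // seeded so that runs are reproducible
        int firstPlayerWins = 0;
        for (int g = 0; g < games; g++) {
            int[] score = new int[2];
            int player = 0;
            while (true) {
                int turnTotal = 0;
                while (turnTotal < holdAt && score[player] + turnTotal < goal) {
                    if (rng.nextBoolean()) {
                        turnTotal++;      // heads: the turn total grows
                    } else {
                        turnTotal = 0;    // tails: the turn total is lost
                        break;
                    }
                }
                score[player] += turnTotal; // hold, or bank nothing after tails
                if (score[player] >= goal) break;
                player = 1 - player;
            }
            if (player == 0) firstPlayerWins++;
        }
        return (double) firstPlayerWins / games;
    }

    public static void main(String[] args) {
        for (int holdAt = 1; holdAt <= 4; holdAt++) {
            System.out.printf("hold at %d: first player wins about %.3f of games%n",
                    holdAt, estimateFirstPlayerWinRate(10, holdAt, 200000, 42L));
        }
    }
}
```

Only the start state is estimated here, and every simulated game passes through it. Estimating the value of a rarely visited state in the same way would need far more games, which is the slow convergence noted above.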
References
[1] Richard E. Bellman. Dynamic Programming. Princeton University Press, Princeton, New Jersey, USA, 1957.
[2] Dmitri P. Bertsekas. Dynamic Programming: deterministic and stochastic models. Prentice-Hall, Upper Saddle River, New Jersey, USA, 1987.
[3] Dan Brutlag. Choice and chance in life: The game of skunk. Mathematics Teaching in the Middle School, 1(1):28-33, April 1994. ISSN 1072-0839.
[4] Ruma Falk and Maayan Tadmor-Troyanski. THINK: a game of choice and chance. Teaching Statistics, 21(1):24-27, 1999. ISSN 0141-982X.
[5] Reiner Knizia. Dice Games Properly Explained. Elliot Right-Way Books, Brighton Road, Lower Kingswood, Tadworth, Surrey, KT20 6TD U.K., 1999.
[6] Michael L. Littman. Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the 11th International Conference on Machine Learning (ICML'94), pages 157-163, San Francisco, CA, USA, 1994. Morgan Kaufmann.
[7] Tom M. Mitchell. Machine Learning. McGraw-Hill, New York, New York, USA, 1997.
[8] Todd W. Neller and Clifton G. M. Presser. Optimal play of the dice game pig. UMAP Journal (Journal of Undergraduate Mathematics and Its Applications), 25(1), Spring 2004.
[9] Jason A. Osborne. Markov chains for the risk board game revisited. Mathematics Magazine, 76(2):129-135, April 2003. http://www4.stat.ncsu.edu/%7Ejaosborn/research/osborne.mathmag.pdf.
[10] Stuart Russell and Peter Norvig. Artificial Intelligence: a modern approach, 2nd ed. Prentice Hall, Upper Saddle River, NJ, USA, 2003.
[11] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: an introduction. MIT Press, Cambridge, Massachusetts, 1998.
[12] John von Neumann and Oskar Morgenstern. Theory of Games and Economic Behavior, 1st edition. Princeton University Press, Princeton, New Jersey, USA, 1944.
[13] Phil Woodward. Yahtzee®: The solution. Chance, 16(1):10-22, 2003.