http://behind-the-enemy-lines.blogspot.com/2009/12/prisoners-dilemma-and-mechanical-turk.html
(quick copy/paste, click the link for proper formatting, in-text links, and such)
Prisoner's Dilemma and Mechanical Turk
I have been reading lately about the differences between mathematical models of behavior and real human behavior. So, I decided to try the classical game-theory model, the Prisoner's Dilemma, on Mechanical Turk. (See also Brendan's nice explanations and diagrams if you have never been exposed to game theory before.)
From Wikipedia:
In its classical form, the prisoner's dilemma ("PD") is presented as follows:
Two suspects are arrested by the police. The police have insufficient evidence for a conviction, and, having separated both prisoners, visit each of them to offer the same deal. If one testifies (defects from the other) for the prosecution against the other and the other remains silent (cooperates with the other), the betrayer goes free and the silent accomplice receives the full 10-year sentence. If both remain silent, both prisoners are sentenced to only six months in jail for a minor charge. If each betrays the other, each receives a five-year sentence. Each prisoner must choose to betray the other or to remain silent. Each one is assured that the other would not know about the betrayal before the end of the investigation. How should the prisoners act?
If we assume that each player cares only about minimizing his or her own time in jail, then the prisoner's dilemma forms a non-zero-sum game in which two players may each cooperate with or defect from (betray) the other player. In this game, as in all game theory, the only concern of each individual player (prisoner) is maximizing his or her own payoff, without any concern for the other player's payoff. The unique equilibrium for this game is a Pareto-suboptimal solution, that is, rational choice leads the two players to both play defect, even though each player's individual reward would be greater if they both played cooperatively.
My first attempt was to post to Mechanical Turk this dilemma in a setting of the following game:
You are playing a game together with a stranger. Each of you have two choices to play: "trust" or "cheat".
If both of you play "trust", you win $30,000 each.
If both of you play "cheat", you get $10,000 each.
If one player plays "trust" and the other plays "cheat", then the player that played "cheat" gets $50,000 and the player that played "trust" gets $0.
You cannot communicate during the game, and CANNOT see the final action of the other player. Both actions will be revealed simultaneously.
What would you play? "Cheat" or "Trust"?
Basic game theory predicts that the participants will choose "cheat", resulting in a suboptimal equilibrium. However, the participants on Mechanical Turk did not behave like that. Instead, 48 out of the 100 participants decided to play "trust", which is above the 33% observed in the lab experiments of Shafir and Tversky (1992).
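As a quick aside (a rough Python sketch of my own, not part of the original post), the prediction can be read straight off the payoff matrix above: whatever the other player does, "cheat" pays more.

```python
# Rough sketch: best-response check for the payoff matrix above (amounts in $K).
# payoff[my_action][their_action] = my winnings.
payoff = {
    "trust": {"trust": 30, "cheat": 0},
    "cheat": {"trust": 50, "cheat": 10},
}

for their_action in ("trust", "cheat"):
    best = max(payoff, key=lambda mine: payoff[mine][their_action])
    print(f"if the other player plays {their_action!r}, the best response is {best!r}")

# Prints "cheat" in both cases: cheating is the dominant strategy, so the
# theory predicts (cheat, cheat) -- the outcome that 48 out of 100 participants
# deviated from.
```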
Next, I wanted to make the experiment more realistic. Would anything change if, instead of playing an imaginary game, I promised actual monetary benefits to the participants? So, I modified the game and asked the participants to play against each other. Here is the revised task description.
You are playing a game against another Turker. Your action here will be matched with an action of another Mechanical Turk worker.
Each of you have two choices to play: "trust" or "cheat".
If both of you play "trust", you both get a bonus of $0.30.
If both of you play "cheat", you both get a bonus of $0.10.
If one Turker plays "trust" and the other plays "cheat", then the Turker that played "cheat" gets a bonus of $0.50 and the Turker that played "trust" gets nothing.
What is your action? "Cheat" or "Trust"?
I asked 120 participants to play the game, paying just 1 cent for participation. Interestingly enough, I got a perfect split in the results: 60 Turkers decided to trust, and 60 Turkers decided to cheat. The final result was 20 pairs of trust-trust, 20 pairs of cheat-cheat, and 20 pairs of cheat-trust.
In other words, the theoretical prediction that people will be locked in a non-optimal equilibrium was not borne out, neither in the "imaginary" game nor in the case where the workers stood to gain an actual monetary benefit.
Finally, I decided to change the payoff matrix and replicate the structure of the TV game show "Friend or Foe". There, participants get $50K each if they both cooperate, $0 each if they both defect, and if one chooses trust and the other cheat, the one who cheats gets $100K and the one who trusts gets $0.
You are playing a game together with a stranger.
Each of you have two choices to play: "trust" or "cheat".
If both of you play "trust", you both win $50,000.
If both of you play "cheat", you both get $0.
If one player plays "trust" and the other plays "cheat", then the player that played "cheat" gets $100,000 and the player that played "trust" gets $0.
You cannot communicate during the game, and CANNOT see the final action of the other player. Both actions will be revealed simultaneously.
What would you play? "Cheat" or "Trust"?
Interestingly enough, in this setting ALL 100 players ended up playing "trust". This was quite different from the previous game, and also from the behavior of the players in the TV show, where in almost 25% of the played games both players chose "cheat" and ended up with $0, and in another 25% of the games the players collaborated, played "trust", and got $50K each.
So, in my final attempt, I asked Turkers to play this "Friend or Foe" game with monetary incentives. Here is the task that I posted on Mechanical Turk.
You are playing a game against another Turker. Your action here will be matched with an action of another Mechanical Turk worker.
Each of you have two choices to play: "trust" or "cheat".
If both of you play "trust", you both get a bonus of $0.50.
If both of you play "cheat", you both get $0.
If one Turker plays "trust" and the other plays "cheat", then the Turker that played "cheat" gets a bonus of $1.00 and the Turker that played "trust" gets nothing.
What is your action? "Cheat" or "Trust"?
In this game, 33% of the users decided to cheat, resulting in 6/50 games where both players got nothing, 23/50 games where both players got a 50 cent bonus, and 21/50 games where one player got $1 and the other player got nothing.
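As another aside (again a rough sketch of my own, not part of the original experiment), the same best-response check applied to this bonus matrix shows that "cheat" is only weakly dominant here: against a cheater you get nothing either way, so the theoretical case for cheating is less clear-cut than in the first game.

```python
# Sketch: best-response check for the "Friend or Foe"-style bonus game (amounts in $).
payoff = {
    "trust": {"trust": 0.50, "cheat": 0.00},
    "cheat": {"trust": 1.00, "cheat": 0.00},
}

for their_action in ("trust", "cheat"):
    t, c = payoff["trust"][their_action], payoff["cheat"][their_action]
    verdict = "cheat strictly better" if c > t else ("tie" if c == t else "trust strictly better")
    print(f"against {their_action!r}: {verdict}")

# against 'trust': cheat strictly better
# against 'cheat': tie
# Unlike the first matrix, cheating is not strictly better in every case here.
```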
I found the difference in behavior between the imaginary game and the actual one to be pretty interesting. Also, the deviation from the predictions of the game-theoretic model is striking.
Although I am not the first to observe this, the deviation got me wondering: Why do we use elaborate game-theoretic models for modeling user behavior, when not even the simplest such models correspond to reality? How can someone take the concept of an equilibrium seriously when a game introduced in the intro chapter of every game theory textbook simply does not correspond to reality? Do we really understand the limitations of our tools, or do mathematical and analytic elegance end up being more important than reality?
For the 100% trust there was a huge modifying variable that was overlooked.
It was a one-off large sum: people might cheat each other when the risk is low, but social pressure skews their choice here. They choose trust because no one wants to effectively lose a huge amount of money and then be exposed as the cheat. No one wants to feel morally and literally bankrupted.
The sums are also very small; in a more formal setting you'd use a sum more like $50 for the big payoff and $25 for the small one. Some of the questions are badly phrased as well. In junior high they had us play that game, and most people agreed everyone should use the 'trust' option. When we actually played the game, that went right out the window: one group played trust for 9 of the ten rounds, and they still both cheated on the tenth. Of course, 13 and 14 year olds aren't going to make the same choices as adults.
However, the answer to why we use game theory is that
A) Mathematicians don't always care if their stuff relates to the real world or not. Game theory works just fine in the world of math.
B) Mathematicians do care if they get grants, and pretending their stuff relates to the real world when talking to non mathematicians gets more grants.
This particular bit of game theory is also used to tear apart the idea of rational self-interest: if each person considers only their own interest, they will both cheat, and thus both lose.
The way some of them were worded, I wonder if any of the Mechanical Turkers believed that the multi-thousand-dollar bonuses were real money, like the $0.30 bonuses.
Because if they did, the obvious solution is to requisition at least 51 of the 100 job offers and play "trust" every time, which guarantees you at least 1 pair paying 50k for both sides, for a grand total of 100k. I wouldn't be terribly surprised if the 100 trusters were all one person.
I believe you can prevent that from happening, but he didn't specifically say he restricted people from entering more than once - and if you get paid for every answer, just clicking on the first response every time is a great way to earn a quick dollar. That's at least a few meals for a family in India.
Quote from: Requia ☣ on December 07, 2009, 11:42:19 PM
The sums are also very small; in a more formal setting you'd use a sum more like $50 for the big payoff and $25 for the small one. Some of the questions are badly phrased as well. In junior high they had us play that game, and most people agreed everyone should use the 'trust' option. When we actually played the game, that went right out the window: one group played trust for 9 of the ten rounds, and they still both cheated on the tenth. Of course, 13 and 14 year olds aren't going to make the same choices as adults.
However, the answer to why we use game theory is that
A) Mathematicians don't always care if their stuff relates to the real world or not. Game theory works just fine in the world of math.
B) Mathematicians do care if they get grants, and pretending their stuff relates to the real world when talking to non mathematicians gets more grants.
This particular bit of game theory is also used to tear apart the idea of rational self-interest: if each person considers only their own interest, they will both cheat, and thus both lose.
Game theory does apply to the real world. It may not predict anything, but it helps mathematicians sneak into economic psychologists' studies across the hall and win big bucks. Also with multiplayer card games and stuff - minimax may not describe what the others will do, but if you let it describe what you do, you win more.
I thought the point of PD was that there was a penalty for losing - you get stuck in prison and the other guy gets off free. A lack of bonus is not the same psychological pressure - it's (bad OR great) vs (pretty much the same as before OR great).
I think it's mathematically equivalent, I could be wrong though.
um, are you all missing the point here or something?
damn
if I needed someone to debunk the methodology I wouldn't have posted it here.
yes it's a bit unorthodox, and there are methodological problems, but they are small, and every experiment has them. why don't we focus on the things we can learn from this information instead of taking the shortest route of dismissing it, hm? Goddess forbid, you might actually learn something new.
of course the Turkers didn't believe there were large sums involved when he said there were. that was just a poll on the forum he posted. but that's not the interesting part of the experiment (and IMO he could have left that out entirely).
then he repeated the experiment with small amounts of money so it was "for real".
here are some interesting bits of info I got from this:
- there is a huge gap between what people say they would do, and what they would actually do if the stakes are real. that's some deep (also obvious) insight into human nature right there.
- these experiments have been done before [in psychological experiments, also with real money]. The results were again different from these. Which shows just about how big a grano salis you gotta take when interpreting results in psychological experiments.
- as I said, the Game Theory model predicts different results than the Psychological model. the game theoretical prediction is based purely on the numbers in the payoff matrix. and indeed, as Requia mentioned, it's about the net difference: gaining $9 versus $10 is mathematically equivalent to losing $1 versus breaking even.
whereas in the psychological model, there's a big difference between winning and losing. as Taleb said in Fooled by Randomness, people perceive a loss as worse than an equally sized gain.
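to make that net-difference point concrete, here's a tiny python sketch (my own illustration, not from the blog post): take the payoff matrix from the first game, subtract the same constant from every entry so the whole thing is framed as losses, and the best responses don't move an inch.

```python
# sketch: game theory only cares about payoff differences, so shifting every
# payoff by the same constant (gains -> losses) changes nothing in the prediction.
def best_response(payoff, their_action):
    return max(payoff, key=lambda mine: payoff[mine][their_action])

gains = {"trust": {"trust": 30, "cheat": 0},
         "cheat": {"trust": 50, "cheat": 10}}            # the blog's first game, in $K
losses = {a: {b: v - 50 for b, v in row.items()}         # same game shifted by -50:
          for a, row in gains.items()}                   # every outcome is now a loss or zero

for their_action in ("trust", "cheat"):
    assert best_response(gains, their_action) == best_response(losses, their_action)
print("best responses are identical in the gain framing and the loss framing")
```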
(what GA said), this experiment is also not about debunking Game Theory. GT is theoretically and mathematically solid. but when determining an optimal strategy, it assumes a rational agent with perfect reasoning. it should come as no surprise that humans are different. picking the option that's worst for everybody, creating inefficient markets, destroying the world and building prisons.
but that is my interest in these matters. where exactly lies the difference between this hypothetical theoretical perfect agent, and the warm-blooded human monkey?
and another question, is this another bar in our prison? when a group picks a solution that's obviously suboptimal. and let me be very specific here, we can learn from Game Theory that there are three different kinds of "suboptimal" (and currently I'm talking about the 3rd)
1 you know there's an optimal solution if everybody would play along but due to the nature of the game, you cannot depend on this. like in the prisoners dilemma, maximal total profit would be to both cooperate. but this can't happen because if the other knew you'd cooperate he would logically choose to defect because that'd increase his personal gain. this is classical game theory, and happens often enough in real life.
2 you know there's a stable optimal solution, if everybody would play along it would work, and nobody could break it on their own by defecting. problem is, current situation is at a less optimal stable point, and you need a certain majority on the playing field, playing the stable optimal solution to get there. but to do that, people need to lose some first. sacrifice. in short, you're stuck in a local optimum.
3 you know there's a stable optimal solution, but you don't go there because you're a thickheaded stubborn monkey that chooses to hurt others by hurting itself. remember the "stupidity quadrant", these are the stupid people. we are all stupid from time to time, don't get hung up on yourself.
there is also a 4th option, where there is a theoretically stable optimal solution, but you don't know about it, so you don't go there. I didn't list this, because it's a transient state. I personally believe that monkeys are smart enough to figure out the proper solutions to their problems, somebody will come up with the optimal solution sooner or later.
it's just that monkeys are stupid and stubborn in that when a good thing is staring them right in the face, they will shit on it.
so this 4th option will either move to the optimal solution, or to option 2 or 3, depending on the nature of the problem and the current mood of the group of monkeys.
that's the thing. I want you to look at options 1 2 and 3, and notice how they are different. I know you will say that one option is in some way equivalent to the other, but that's not the point. they are equivalent, but also different. focus on the difference.
focus on it.
inhale deeply.
smell that?
that's monkey poop slinging theory.
I once studied a paper somewhat similar to this. Basically, one person had a sum of money and the two other people had to negotiate how much of it they would like to share, between themselves. If they could not agree, then the money would not be given out.
Under these circumstances, by far the most beneficial strategy was to negotiate as close as possible to a 50% share, with a 50/50 split being by far the most popular and successful choice. The conclusion by the authors was that people probably have an innate sense of fairness, and when this is unjustly exploited, they would rather have nothing than receive a smaller share through an unjust allocation method (I believe they did versions of the test where one of the two players was instructed to hold out for a 75% share, and so on).
Wasn't there also a BBC documentary about game theory and how it was predicated on a false model of human behaviour? It was by the same guy who did The Power of Nightmares and The Century of the Self.
Quote from: Triple Zero on December 07, 2009, 10:47:24 PM
http://behind-the-enemy-lines.blogspot.com/2009/12/prisoners-dilemma-and-mechanical-turk.html
(quick copy/paste, click the link for proper formatting, in-text links, and such)
Prisoner's Dilemma and Mechanical Turk
I found the difference in behavior between the imaginary game and the actual one to be pretty interesting. Also, the deviation from the predictions of the game-theoretic model is striking.
Although I am not the first to observe this, the deviation got me wondering: Why do we use elaborate game-theoretic models for modeling user behavior, when not even the simplest such models correspond to reality? How can someone take the concept of an equilibrium seriously when a game introduced in the intro chapter of every game theory textbook simply does not correspond to reality? Do we really understand the limitations of our tools, or do mathematical and analytic elegance end up being more important than reality?
It's actually kind of inevitable. There is no valid scientific way [and that includes the whole range from the hardest to the softest sciences] of attempting to grapple with what actually constitutes "reality" without recourse to some form of numerical analysis and tabulation. Therefore - from a scientific point of view - number is inherent in reality; one cannot exist without the other. Conversely, I could write the most insightful and explanatory poem on the nature of "reality", but I suspect it would never be taken as a serious contribution to our Knowledge/Sciences because there weren't any measurements involved. Or as my Psychology prof said when challenged on the inadequacy of so much that passes for our knowledge of what is real:
"well its all just so much hand waving really"
Quote from: Triple Zero on December 08, 2009, 08:38:30 AM
1 you know there's an optimal solution if everybody would play along but due to the nature of the game, you cannot depend on this. like in the prisoners dilemma, maximal total profit would be to both cooperate. but this can't happen because if the other knew you'd cooperate he would logically choose to defect because that'd increase his personal gain. this is classical game theory, and happens often enough in real life.
2 you know there's a stable optimal solution, if everybody would play along it would work, and nobody could break it on their own by defecting. problem is, current situation is at a less optimal stable point, and you need a certain majority on the playing field, playing the stable optimal solution to get there. but to do that, people need to lose some first. sacrifice. in short, you're stuck in a local optimum.
3 you know there's a stable optimal solution, but you don't go there because you're a thickheaded stubborn monkey that chooses to hurt others by hurting itself. remember the "stupidity quadrant", these are the stupid people. we are all stupid from time to time, don't get hung up on yourself.
that's the thing. I want you to look at options 1 2 and 3, and notice how they are different. I know you will say that one option is in some way equivalent to the other, but that's not the point. they are equivalent, but also different. focus on the difference.
focus on it.
inhale deeply.
smell that?
that's monkey poop slinging theory.
1. Trusting a person leads to getting hurt.
2. Trusting the majority leads to getting hurt.
3. Since people keep hurting you, they deserve to be hurt back.
Solution: Since the game seems designed to make people hurt, stop playing. Punch the experimenter in the nose, and take all the money.
Quote from: LMNO on December 08, 2009, 06:51:26 PM
Solution: Since the game seems designed to make people hurt, stop playing. Punch the experimenter in the nose, and take all the money.
TITCM!
:lulz:
In the classic model, I keep my mouth shut for reasons having nothing to do with jail time.
In the cash model, I go trust. I stand to lose nothing, and I learn who can be trusted and who can't, which can be far more valuable than cash.
Douglas Hofstadter wrote a great deal on the Prisoner's Dilemma in Metamagical Themas.
The question he was exploring was, "Is cooperation rational?"
He pointed to a contest in 1980, where college kids competed to design the best computer program to play the prisoner's dilemma. It was basically a ten- or twenty-round game of the exact same scenario.
One robot's programming was to cooperate every round until his opponent defected, then defect for the rest of the game. Another robot randomly flipped back and forth, being totally unpredictable.
Hofstadter thought the most interesting program was one called Tit for Tat (http://en.wikipedia.org/wiki/Tit_for_tat). Its programming was to do the same thing its opponent did in the previous round.
This means that Tit for Tat almost always loses (although it can tie if both parties cooperate for every round), but only loses by a very narrow margin.
Quote from: Cramulus on December 08, 2009, 07:56:16 PM
Douglas Hofstadter wrote a great deal on the Prisoner's Dilemma in Metamagical Themas.
The question he was exploring was, "Is cooperation rational?"
On a related note, a paleontologist recently put forward the notion that altruism is a survival trait. I can't remember who it was, but I will try to dig it up. He makes a fairly good case, and we've seen what self-absorbed greed brings.
Quote from: The Good Reverend Roger on December 08, 2009, 07:58:25 PM
Quote from: Cramulus on December 08, 2009, 07:56:16 PM
Douglas Hofstadter wrote a great deal on the Prisoner's Dilemma in Metamagical Themas.
The question he was exploring was, "Is cooperation rational?"
On a related note, a paleontologist recently put forward the notion that altruism is a survival trait. I can't remember who it was, but I will try to dig it up. He makes a fairly good case, and we've seen what self-absorbed greed brings.
RAW argued for it in Prometheus Rising... tribal systems = bio-survival and tribal systems only work if the members are working for the good of all.
I think that cooperation can be the rational choice, but I also think that the "rationality" of the choice depends heavily on the environment that the participants grew up in.
JW's cooperate fantastically with each other. I spent several years on their "Quick Build" teams, where we would build an entire Kingdom Hall, including plumbing, electric, masonry, flooring, etc., in 3 days. The altruistic cooperation was amazing to watch... and all volunteer.
Yet those same JW's, individually, with a group of non-JW's, don't necessarily display those qualities at all.
So in Leary's labels... if your 1st and 2nd circuits are programmed "positively" with the "I'm OK, You're OK" script, then altruism would likely be a natural option, whereas if you imprint "I'm OK, You're Not OK", "I'm Not OK, You're OK", or "I'm Not OK, You're Not OK", then altruism seems much less likely.
Maybe?
Evolutionary psychologists (stop sniggering at the back) have suggested altruism, contrary to Ayn Rand et al, is indeed rational, for much the same reasons that Ratatosk and Roger have mentioned. 30,000 years ago, your survival was much enhanced if you were part of a group. Groups could lay ambushes for animals, pick berries, share child-care duties etc etc all of which enhanced survival compared to those who, for whatever reason, went it alone. Even only about 2500 years ago, banishment was still a favoured punishment, probably because of the greater risk an individual took by not being allowed to interact with his group. Of course, there are environmental and cultural inputs as to what is considered "rational", and as always, it doesn't seem to apply often to those outside the "tribe".
Quote from: Cain on December 09, 2009, 09:10:35 AM
Evolutionary psychologists (stop sniggering at the back) have suggested altruism, contrary to Ayn Rand et al, is indeed rational, for much the same reasons that Ratatosk and Roger have mentioned. 30,000 years ago, your survival was much enhanced if you were part of a group. Groups could lay ambushes for animals, pick berries, share child-care duties etc etc all of which enhanced survival compared to those who, for whatever reason, went it alone. Even only about 2500 years ago, banishment was still a favoured punishment, probably because of the greater risk an individual took by not being allowed to interact with his group. Of course, there are environmental and cultural inputs as to what is considered "rational", and as always, it doesn't seem to apply often to those outside the "tribe".
You don't need to look as far back as 30,000 years. Take a look at group dynamics in any species which operates as a pack nowadays. Lions, monkeys, caribou... etc. A degree of altruism is hardwired into them, or Darwin steps in to sort things out. Humans are, as far as I'm aware, the only species whose intellect has evolved to such a degree that they can break the coding and thus fail hilariously to pull off the kind of balanced society that even fucking ants don't find hard. :lulz:
Quote from: Cramulus on December 08, 2009, 07:56:16 PM
Douglas Hofstadter wrote a great deal on the Prisoner's Dilemma in Metamagical Themas.
The question he was exploring was, "Is cooperation rational?"
He pointed to a contest in 1980, where college kids competed to design the best computer program to play the prisoner's dilemma. It was basically a ten- or twenty-round game of the exact same scenario.
One robot's programming was to cooperate every round until his opponent defected, then defect for the rest of the game. Another robot randomly flipped back and forth, being totally unpredictable.
Hofstadter thought the most interesting program was one called Tit for Tat (http://en.wikipedia.org/wiki/Tit_for_tat). Its programming was to do the same thing its opponent did in the previous round.
This means that Tit for Tat almost always loses (although it can tie if both parties cooperate for every round), but only loses by a very narrow margin.
For a little extra background:
The idea motivating the contest was that a repeated P.D. game is a different beast than a single instance of it, since the idea of establishing yourself as "trustworthy" over the long term came into play, and an opponent you screwed over could take revenge later, etc. So they (I think it might have been someone at RAND or something) just organized a contest to see which strategies were good.
Each contestant submitted a program that played the Prisoner's Dilemma repeatedly. I don't remember if every program played every other program, or just a sampling, but basically the scores (5-0 for defect/trust, 3-3 for trust-trust, and 0-0 for defect/defect) were cumulative across the entire tournament, each round of which was a 100 consecutive games of the Prisoner's dilemma. So if in one round you managed to get trust-trust pairs every time, you were +300 points towards winning the tournament.
A lot of the submitted programs were pretty complex, since they were trying to figure out the opponents program over the 100 games. The way the scoring was set up, it was better to cooperate twice (6 points) than to outright win or lose every other game (5 + 0 = 5 points.) So the winning strategy would cooperate as little as possible while still managing to convince the opposing program to cooperate, while not being fooled in the same way itself.
The overall winning strategy turned out to be "Tit-for-Tat," which is conceptually very simple:
1. Game one, play "cooperate."
2. Every game, after the first, play what your opponent played the previous game.
So if your opponent defects, he's guaranteed to get nothing in the next round, since you defect just to spite him. He might get 5 points for defecting when you cooperated, but the next round he gets 0, no matter what. He'd have gotten 6 points if he had just played "cooperate" twice, since you would have cooperated with him. Tit-for-tat "loses" or ties every round, in that the opposing program might get a few more points than Tit-for-tat did, but wins the overall tournament, since its matches delivered more points. Tit-for-tat might "lose" 297-302 in a round, but when all the other programs had rounds something like 150-50, it still gets more points overall.
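For anyone who wants to play with this, here is a rough Python sketch of that kind of round-robin (my own toy version, using the payoffs as quoted above and a few of the strategies mentioned earlier; the details of the real 1980 tournament differed):

```python
import random

# Toy round-robin: 100-game matches, cumulative scoring, payoffs as quoted
# above (5/0 for cheat-vs-trust, 3/3 for mutual trust, 0/0 for mutual cheat).
# "C" = trust/cooperate, "D" = cheat/defect.

def payoff(me, them):
    return {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 0}[(me, them)]

def tit_for_tat(opp_history):
    # Cooperate first, then copy whatever the opponent did last round.
    return "C" if not opp_history else opp_history[-1]

def grudger(opp_history):
    # Cooperate until betrayed once, then defect forever.
    return "D" if "D" in opp_history else "C"

def always_defect(opp_history):
    return "D"

def random_player(opp_history):
    return random.choice("CD")

def match(strat_a, strat_b, rounds=100):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a, move_b = strat_a(hist_b), strat_b(hist_a)
        score_a += payoff(move_a, move_b)
        score_b += payoff(move_b, move_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

strategies = {"tit_for_tat": tit_for_tat, "grudger": grudger,
              "always_defect": always_defect, "random": random_player}
totals = {name: 0 for name in strategies}
for name_a in strategies:
    for name_b in strategies:
        if name_a < name_b:                    # each pair plays one 100-game match
            sa, sb = match(strategies[name_a], strategies[name_b])
            totals[name_a] += sa
            totals[name_b] += sb

for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(name, total)
```

Whether Tit-for-Tat tops this particular toy table depends on the mix of entrants, but the point above holds: it rarely beats any single opponent head-to-head, and racks up points by getting into long runs of mutual cooperation.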
After Tit-for-Tat won hands down in the tournament, they ran the tournament again, with the explicit goal of finding a strategy that would do better overall than Tit-for-Tat. No entrant sent in a program that did significantly better than Tit-for-Tat.
Actually it has even been mathematically proven you cannot do better than Tit-for-tat. It may seem obvious, but the proof is rather complex.
Here's an interesting version, and one that may be better suited for real-life mapping.
You play for a small amount of money ($0.50 or so), but your personal information is given to the person you play with, and vice versa; that is to say, it's not anonymous. There could be ramifications to acting like a dick, like getting phone calls or emails from the person you fucked over.
The intention being that your actions are not limited to a single, anonymous event; they are usually determined by judging future outcomes. If you can be a dick to an anonymous person on the internet, there's a greater chance you will be. If you know that your actions now can affect you in the future, you might not be as much of a dick.
Quote from: Triple Zero on December 11, 2009, 12:41:51 PM
Actually it has even been mathematically proven you cannot do better than Tit-for-tat. It may seem obvious, but the proof is rather complex.
Then my lifestyle is vindicated.
Quote from: LMNO on December 11, 2009, 02:45:21 PM
Here's an interesting version, and one that may be better suited for real-life mapping.
You play for a small amount of money ($0.50 or so), but your personal information is given to the person you play with, and vice versa; that is to say, it's not anonymous. There could be ramifications to acting like a dick, like getting phone calls or emails from the person you fucked over.
The intention being that your actions are not limited to a single, anonymous event; they are usually determined by judging future outcomes. If you can be a dick to an anonymous person on the internet, there's a greater chance you will be. If you know that your actions now can affect you in the future, you might not be as much of a dick.
LMNO invents game theory karma ITT