Prisoner's Dilemma and Mechanical Turk

Started by Triple Zero, December 07, 2009, 10:47:24 PM

Triple Zero

http://behind-the-enemy-lines.blogspot.com/2009/12/prisoners-dilemma-and-mechanical-turk.html

(quick copy/paste, click the link for proper formatting, in-text links, and such)

Prisoner's Dilemma and Mechanical Turk

I have been reading lately about the differences between mathematical models of behavior and real human behavior. So, I decided to try the classic game-theory model of the Prisoner's Dilemma on Mechanical Turk. (See also Brendan's nice explanations and diagrams if you have never been exposed to game theory before.)

From Wikipedia:

In its classical form, the prisoner's dilemma ("PD") is presented as follows:

Two suspects are arrested by the police. The police have insufficient evidence for a conviction, and, having separated both prisoners, visit each of them to offer the same deal. If one testifies (defects from the other) for the prosecution against the other and the other remains silent (cooperates with the other), the betrayer goes free and the silent accomplice receives the full 10-year sentence. If both remain silent, both prisoners are sentenced to only six months in jail for a minor charge. If each betrays the other, each receives a five-year sentence. Each prisoner must choose to betray the other or to remain silent. Each one is assured that the other would not know about the betrayal before the end of the investigation. How should the prisoners act?

If we assume that each player cares only about minimizing his or her own time in jail, then the prisoner's dilemma forms a non-zero-sum game in which two players may each cooperate with or defect from (betray) the other player. In this game, as in all game theory, the only concern of each individual player (prisoner) is maximizing his or her own payoff, without any concern for the other player's payoff. The unique equilibrium for this game is a Pareto-suboptimal solution, that is, rational choice leads the two players to both play defect, even though each player's individual reward would be greater if they both played cooperatively.

My first attempt was to post to Mechanical Turk this dilemma in a setting of the following game:

You are playing a game together with a stranger. Each of you have two choices to play: "trust" or "cheat".
If both of you play "trust", you win $30,000 each.
If both of you play "cheat", you get $10,000 each.
If one player plays "trust" and the other plays "cheat", then the player that played "cheat" gets $50,000 and the player that played "trust" gets $0.
You cannot communicate during the game, and CANNOT see the final action of the other player. Both actions will be revealed simultaneously.

What would you play? "Cheat" or "Trust"?

Basic game theory predicts that the participants will choose "cheat" resulting in a suboptimal equilibrium. However, participants on Mechanical Turk did not behave like that. Instead, 48 out of the 100 participants decided to play "trust", which is above the 33% observed in the lab experiments of (Shafir and Tversky, 1992).
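
To see why the textbook prediction is "cheat", it is enough to compare payoffs against each possible action of the other player. Here is a minimal sketch in Python, using the dollar amounts from the task above; the names PAYOFF and best_response are made up for illustration and are not part of the original experiment.

```python
# Payoff to "me" for (my_action, other_action), in dollars, taken from the task text.
PAYOFF = {
    ("trust", "trust"): 30_000,
    ("trust", "cheat"): 0,
    ("cheat", "trust"): 50_000,
    ("cheat", "cheat"): 10_000,
}
ACTIONS = ("trust", "cheat")


def best_response(other_action):
    """Return the action that maximizes my payoff against a fixed opponent action."""
    return max(ACTIONS, key=lambda mine: PAYOFF[(mine, other_action)])


for other in ACTIONS:
    print(f"if the other player plays {other!r}, my best response is {best_response(other)!r}")
```

Whatever the other player does, "cheat" pays strictly more, which is what makes it the dominant strategy in this version of the game.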

Next, I wanted to make the experiment more realistic. Would anything change if instead of playing an imaginary game, I promised actual monetary benefits to the participants? So, I modified the game, and asked the participants to play against each other. Here is the revised task description.

You are playing a game against another Turker. Your action here will be matched with an action of another Mechanical Turk worker.

Each of you have two choices to play: "trust" or "cheat".
If both of you play "trust", you both get a bonus of $0.30.
If both of you play "cheat", you both get a bonus of $0.10.
If one Turker plays "trust" and the other plays "cheat", then the Turker that played "cheat" gets a bonus of $0.50 and the Turker that played "trust" gets nothing.
What is your action? "Cheat" or "Trust"?

I asked 120 participants to play the game, paying just 1 cent for the participation. Interestingly enough, I had a perfect split in the results. 60 Turkers decided to trust, and 60 Turkers decided to cheat. The final result was 20 pairs of trust-trust, 20 pairs of cheat-cheat, and 20 pairs of cheat-trust.
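
As a rough illustration of what those pair counts mean for the players, the average bonus per "trust" player and per "cheat" player can be worked out directly. This is only a sketch: the function name average_bonus and its parameters are hypothetical, and amounts are kept in cents to avoid floating-point noise.

```python
def average_bonus(pairs_tt, pairs_cc, pairs_ct, both_trust, both_cheat, cheat_wins, trust_loses=0):
    """Average bonus (in cents) per trusting player and per cheating player.

    pairs_tt / pairs_cc / pairs_ct are the numbers of trust-trust,
    cheat-cheat and cheat-trust pairs; the remaining arguments are the
    per-player bonuses for each outcome.
    """
    trusters = 2 * pairs_tt + pairs_ct
    cheaters = 2 * pairs_cc + pairs_ct
    trust_total = 2 * pairs_tt * both_trust + pairs_ct * trust_loses
    cheat_total = 2 * pairs_cc * both_cheat + pairs_ct * cheat_wins
    return trust_total / trusters, cheat_total / cheaters


# 20 pairs of each kind, bonuses of 30 / 10 / 50 cents as in the task text.
trust_avg, cheat_avg = average_bonus(20, 20, 20, both_trust=30, both_cheat=10, cheat_wins=50)
print(f"trust: {trust_avg:.1f} cents, cheat: {cheat_avg:.1f} cents")
```

With these numbers the trusting players averaged 20 cents and the cheating players about 23.3 cents, so cheating paid only slightly better in this particular run.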

In other words, the theory's prediction that people will be locked in a non-optimal equilibrium was not borne out, either in the "imaginary" game or in the case where the workers stood to gain an actual monetary benefit.

Finally, I decided to change the payoff matrix, and replicate the structure of the TV game show "Friend or Foe". There, participants get $50K each if both cooperate and $0 each if both cheat, and if one chooses trust and the other cheat, the "cheat" gets $100K and the "trust" gets $0.

You are playing a game together with a stranger.

Each of you have two choices to play: "trust" or "cheat".
If both of you play "trust", you both win $50,000.
If both of you play "cheat", you both get $0.
If one player plays "trust" and the other plays "cheat", then the player that played "cheat" gets $100,000 and the player that played "trust" gets $0.
You cannot communicate during the game, and CANNOT see the final action of the other player. Both actions will be revealed simultaneously.

What would you play? "Cheat" or "Trust"?

Interestingly enough, in this setting ALL 100 players ended up playing "trust", which was quite different from the previous game and from the behavior of the players in the TV show, where, in almost 25% of the played games, both players chose "cheat" ending up with $0, and in 25% of the games the players collaborated and played "trust" getting $50K each.
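
One thing worth noting about the "Friend or Foe" payoffs is that they are not quite a textbook Prisoner's Dilemma: against a cheater, "trust" and "cheat" both pay $0, so "cheat" only weakly dominates. The sketch below enumerates the pure-strategy equilibria under that payoff table; the names FRIEND_OR_FOE and pure_nash_equilibria are illustrative, not from the original post.

```python
from itertools import product

ACTIONS = ("trust", "cheat")

# Payoff to a player for (own_action, other_action); the game is symmetric.
FRIEND_OR_FOE = {
    ("trust", "trust"): 50_000,
    ("trust", "cheat"): 0,
    ("cheat", "trust"): 100_000,
    ("cheat", "cheat"): 0,
}


def pure_nash_equilibria(payoff):
    """List action pairs where neither player can gain by switching alone."""
    equilibria = []
    for a, b in product(ACTIONS, repeat=2):
        a_ok = all(payoff[(a, b)] >= payoff[(alt, b)] for alt in ACTIONS)
        b_ok = all(payoff[(b, a)] >= payoff[(alt, a)] for alt in ACTIONS)
        if a_ok and b_ok:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria(FRIEND_OR_FOE))
```

Under these payoffs, trust-trust is still not an equilibrium, but cheat-cheat is no longer the only one: the two mixed outcomes (one trusts, one cheats) are equilibria as well, because the trusting player cannot do better than $0 by switching.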

So, in my final attempt, I asked Turkers to play this "Friend or Foe" game with monetary incentives. Here is the task that I posted on Mechanical Turk.

You are playing a game against another Turker. Your action here will be matched with an action of another Mechanical Turk worker.

Each of you have two choices to play: "trust" or "cheat".
If both of you play "trust", you both get a bonus of $0.50.
If both of you play "cheat", you both get $0.
If one Turker plays "trust" and the other plays "cheat", then the Turker that played "cheat" gets a bonus of $1.00 and the Turker that played "trust" gets nothing.
What is your action? "Cheat" or "Trust"?

In this game, 33% of the users decided to cheat, resulting in 6/50 games where both players got nothing, 23/50 games where both players got a 50 cent bonus, and 21/50 games where one player got $1 and the other player got nothing.
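
The reported game counts can be cross-checked against the 33% figure with a few lines of arithmetic. This is just a sanity check on the numbers quoted above, with amounts in cents; the variable names are made up for illustration.

```python
# Number of games per outcome, as reported in the text.
games = {"cheat-cheat": 6, "trust-trust": 23, "cheat-trust": 21}

players = 2 * sum(games.values())
cheaters = 2 * games["cheat-cheat"] + games["cheat-trust"]
trusters = players - cheaters

# Bonuses paid out: 50 cents to each of two trusting players, or 100 cents to a lone cheater.
total_bonus_cents = games["trust-trust"] * 2 * 50 + games["cheat-trust"] * 100

print(f"{cheaters} of {players} players cheated, {trusters} trusted")
print(f"total bonus paid: ${total_bonus_cents / 100:.2f}")
```

The counts are consistent: 6 cheat-cheat games and 21 mixed games account for 33 cheaters out of 100 players.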

I found the difference in behavior between the imaginary game and the actual one to be pretty interesting. Also, the deviation from the predictions of the game-theoretic model is striking.

Although I am not the first to actually observe that, this deviation got me wondering: Why do we use elaborate game theory models for modeling user behavior, when even the simplest such models do not correspond to reality? How can someone take seriously the concept of an equilibrium when a game, introduced in the intro chapter of every game theory textbook, simply does not correspond to reality? Do we really understand the limitations of our tools, or do mathematical and analytic elegance end up being more important than reality?
Ex-Soviet Bloc Sexual Attack Swede of Tomorrow™
e-prime disclaimer: let it seem fairly unclear I understand the apparent subjectivity of the above statements. maybe.

INFORMATION SO POWERFUL, YOU ACTUALLY NEED LESS.

Faust

For the 100% trust there was a huge modifying variable that was overlooked.

It was a once off large sum, people would cheat each other for the low risk but social pressure skews their choice. They choose trust because no one wants to effectively lose a huge amount of money and then be exposed as the cheat. No one wants to feel morally and literally bankrupted.
Sleepless nights at the chateau

Requia ☣

The sums are also very small, in a more formal setting you'd use a sum more like $50 for the big payoff and $25 for the small.  Some of the questions are badly phrased as well.  In Junior high they had us play that game, most people agreed everyone should use the 'trust' option. When we actually played the game that went right out the window, one group played trust for 9 of the ten rounds, and they still both cheated on the tenth.  Of course, 13 and 14 year olds aren't going to make the same choices as adults.

However, the answer to why game theory gets used anyway is that

A) Mathematicians don't always care if their stuff relates to the real world or not.  Game theory works just fine in the world of math.

B) Mathematicians do care if they get grants, and pretending their stuff relates to the real world when talking to non mathematicians gets more grants.

This particular bit of game theory is also used to tear apart the idea of rational self interest, if each person only considers their own interest, they will both cheat, and thus both lose.
Inflatable dolls are not recognized flotation devices.

Golden Applesauce

The way some of them were worded, I wonder if any of the mechanical turkers believed that the multi-thousand dollar bonuses were real money, like the $.30 bonuses.

Because if they did, the obvious solution is to requisition at least 51 of the 100 job offers and play "trust" every time, which guarantees you at least 1 pair paying 50k for both sides, for a grand total of 100k.  I wouldn't be terribly surprised if the 100 trusters were all one person.
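
The "at least 1 pair" part is a pigeonhole argument: 100 slots make 50 pairs, so 51 slots cannot all land in different pairs. A quick sketch to check it by brute force, assuming submissions are paired uniformly at random (the post does not say how pairs were formed, and own_pairs is a made-up helper):

```python
import random


def own_pairs(my_slots, total_slots, rng):
    """Randomly pair up all slots and count the pairs where both slots are mine."""
    slots = [True] * my_slots + [False] * (total_slots - my_slots)
    rng.shuffle(slots)
    # Neighbouring positions after the shuffle form the pairs.
    return sum(1 for i in range(0, total_slots, 2) if slots[i] and slots[i + 1])


rng = random.Random(0)
worst = min(own_pairs(51, 100, rng) for _ in range(10_000))
print("fewest own pairs seen with 51 of 100 slots:", worst)
```

However the shuffle comes out, the count can never drop below 1 with 51 slots, whereas with 50 or fewer slots nothing is guaranteed.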

I believe you can prevent that from happening, but he didn't specifically say he restricted people from entering more than once - and if you get paid for every answer, just clicking on the first response every time is a great way to earn a quick dollar.  That's at least a few meals for a family in India.
Q: How regularly do you hire 8th graders?
A: We have hired a number of FORMER 8th graders.

Golden Applesauce

Quote from: Requia ☣ on December 07, 2009, 11:42:19 PM
The sums are also very small, in a more formal setting you'd use a sum more like $50 for the big payoff and $25 for the small.  Some of the questions are badly phrased as well.  In Junior high they had us play that game, most people agreed everyone should use the 'trust' option. When we actually played the game that went right out the window, one group played trust for 9 of the ten rounds, and they still both cheated on the tenth.  Of course, 13 and 14 year olds aren't going to make the same choices as adults.

However, the answer to why game theory gets used anyway is that

A) Mathematicians don't always care if their stuff relates to the real world or not.  Game theory works just fine in the world of math.

B) Mathematicians do care if they get grants, and pretending their stuff relates to the real world when talking to non mathematicians gets more grants.

This particular bit of game theory is also used to tear apart the idea of rational self interest, if each person only considers their own interest, they will both cheat, and thus both lose.

Game theory does apply to the real world.  It may not predict anything, but it helps mathematicians sneak into economic psychologists' studies across the hall and win big bucks.  Also with multiplayer card games and stuff - minimax may not describe what the others will do, but if you let it describe what you do, you win more.

Captain Utopia

I thought the point of PD was that there was a penalty for losing - you get stuck in prison and the other guy gets off free. A lack of bonus is not the same psychological pressure - it's (bad OR great) vs (pretty much the same as before OR great).

Requia ☣

I think it's mathematically equivalent, I could be wrong though.

Triple Zero

um, are you all missing the point here or something?

damn

if I needed someone to debunk the methodology I wouldn't have posted it here.

yes it's a bit unorthodox, and there are methodological problems, but they are small, and every experiment has them. why don't we focus on the things we can learn from this information instead of taking the shortest route of dismissing it, hm? Goddess forbid, you might actually learn something new.

of course the Turkers didn't believe there were large sums involved when he said there were. that was just a poll on the forum he posted. but that's not the interesting part of the experiment (and IMO he could have left that out entirely).

then he repeated the experiment with small amounts of money so it was "for real".

here are some interesting bits of info I got from this:

- there is a huge gap between what people say they would do, and what they would actually do if the stakes are real. that's some deep (also obvious) insight into human nature right there.

- these experiments have been done before [in psychological experiments, also with real money]. The results were again different from these. Which shows just how big a grano salis you gotta take when interpreting results in psychological experiments.

- as I said, the Game Theory model predicts different results than the Psychological model. the game theoretical prediction is based purely on the numbers in the payoff matrix. and indeed, as Requia mentioned, it's about the net difference: gaining $9 versus $10 is mathematically equivalent to losing $1 versus breaking even.
whereas in the psychological model, there's a big difference between winning and losing. as Taleb said in Fooled by Randomness, people perceive a loss as worse than an equally large gain.
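
the "mathematically equivalent" part can be shown in a few lines: subtracting the same amount from every cell of the payoff matrix turns bonuses into penalties, but leaves every best response where it was. a minimal sketch, using the 30 / 10 / 50 / 0 cent bonuses from the blog post; the shift of 50 cents and the helper names are made up for illustration.

```python
ACTIONS = ("trust", "cheat")

# Bonus in cents for (my_action, other_action), from the real-money task.
BONUS = {
    ("trust", "trust"): 30,
    ("trust", "cheat"): 0,
    ("cheat", "trust"): 50,
    ("cheat", "cheat"): 10,
}


def shifted(payoff, constant):
    """Add the same constant to every payoff."""
    return {key: value + constant for key, value in payoff.items()}


def best_responses(payoff):
    """Map each opponent action to my payoff-maximizing reply."""
    return {
        other: max(ACTIONS, key=lambda mine: payoff[(mine, other)])
        for other in ACTIONS
    }


# Same game framed purely as losses: the best outcome is now "lose nothing".
LOSS_FRAMED = shifted(BONUS, -50)
print(best_responses(BONUS) == best_responses(LOSS_FRAMED))
```

the model sees the two framings as the same game; a person staring at "lose 50 cents" instead of "gain nothing" may well not.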


(what GA said), this experiment is also not about debunking Game Theory. GT is theoretically and mathematically solid. but when determining an optimal strategy, it assumes a rational agent with perfect reasoning. it should come as no surprise that humans are different. picking the option that's worst for everybody, creating inefficient markets, destroying the world and building prisons.

but that is my interest in these matters. where exactly lies the difference between this hypothetical theoretical perfect agent, and the warm-blooded human monkey?

and another question, is this another bar in our prison? when a group picks a solution that's obviously suboptimal. and let me be very specific here, we can learn from Game Theory that there are three different kinds of "suboptimal" (and currently I'm talking about the 3rd)

1 you know there's an optimal solution if everybody would play along but due to the nature of the game, you cannot depend on this. like in the prisoners dilemma, maximal total profit would be to both cooperate. but this can't happen because if the other knew you'd cooperate he would logically choose to defect because that'd increase his personal gain. this is classical game theory, and happens often enough in real life.

2 you know there's a stable optimal solution, if everybody would play along it would work, and nobody could break it on their own by defecting. problem is, current situation is at a less optimal stable point, and you need a certain majority on the playing field, playing the stable optimal solution to get there. but to do that, people need to lose some first. sacrifice. in short, you're stuck in a local optimum.

3 you know there's a stable optimal solution, but you don't go there because you're a thickheaded stubborn monkey that chooses to hurt others by hurting itself. remember the "stupidity quadrant", these are the stupid people. we are all stupid from time to time, don't get hung up on yourself.

there is also a 4th option, where there is a theoretically stable optimal solution, but you don't know about it, so you don't go there. I didn't list this, because it's a transient state. I personally believe that monkeys are smart enough to figure out the proper solutions to their problems, somebody will come up with the optimal solution sooner or later.
it's just that monkeys are stupid and stubborn in that when a good thing is staring them right in the face, they will shit on it.
so this 4th option will either move to the optimal solution, or to option 2 or 3, depending on the nature of the problem and the current mood of the group of monkeys.

that's the thing. I want you to look at options 1 2 and 3, and notice how they are different. I know you will say that one option is in some way equivalent to the other, but that's not the point. they are equivalent, but also different. focus on the difference.

focus on it.

inhale deeply.

smell that?

that's monkey poop slinging theory.

Cain

I once studied a paper somewhat similar to this.  Basically, one person had a sum of money and the two other people had to negotiate how much of it they would like to share, between themselves.  If they could not agree, then the money would not be given out.

Under these circumstances, by far the most beneficial strategy was to negotiate as close as possible to the 50% share value, with a 50/50 split being, by far, the most popular and successful choice.  The conclusion by the authors was that people probably had an innate sense of fairness, and when this was unjustly exploited, they would rather have nothing than receive a smaller share through an unjust allocation method (I believe they did versions of the test where one of the two players was instructed to hold out for a 75% share, and so on).

Wasn't there also a BBC documentary about game theory and how it was predicated on a false model of human behaviour?  It was by the same guy who did The Power of Nightmares and The Century of the Self.

MMIX

Quote from: Triple Zero on December 07, 2009, 10:47:24 PM
http://behind-the-enemy-lines.blogspot.com/2009/12/prisoners-dilemma-and-mechanical-turk.html

(quick copy/paste, click the link for proper formatting, in-text links, and such)

Prisoner's Dilemma and Mechanical Turk

I found the difference in behavior between the imaginary game and the actual one to be pretty interesting. Also, the deviation from the predictions of the game-theoretic model is striking.

Although I am not the first to actually observe that, this deviation got me wondering: Why do we use elaborate game theory models for modeling user behavior, when even the simplest such models do not correspond to reality? How can someone take seriously the concept of an equilibrium when a game, introduced in the intro chapter of every game theory textbook, simply does not correspond to reality? Do we really understand the limitations of our tools, or do mathematical and analytic elegance end up being more important than reality?

It's actually kind of inevitable. There is no valid scientific way [that includes the whole range from the hardest to the softest sciences] of attempting to grapple with what actually constitutes "reality" without recourse to some form of numerical analysis and tabulation. Therefore, from a scientific point of view, number is inherent in reality; one cannot exist without the other. Conversely I could write the most insightful and explanatory poem on the nature of "reality" but I suspect it would never be taken as a serious contribution to our Knowledge/Sciences because there weren't any measurements involved. Or as my Psychology prof said when challenged on the inadequacy of so much that passes for our knowledge of what is real:-
"well it's all just so much hand waving really"
"The ultimate hidden truth of the world is that it is something we make and could just as easily make differently" David Graeber

Reginald Ret

Quote from: Triple Zero on December 08, 2009, 08:38:30 AM
1 you know there's an optimal solution if everybody would play along but due to the nature of the game, you cannot depend on this. like in the prisoners dilemma, maximal total profit would be to both cooperate. but this can't happen because if the other knew you'd cooperate he would logically choose to defect because that'd increase his personal gain. this is classical game theory, and happens often enough in real life.

2 you know there's a stable optimal solution, if everybody would play along it would work, and nobody could break it on their own by defecting. problem is, current situation is at a less optimal stable point, and you need a certain majority on the playing field, playing the stable optimal solution to get there. but to do that, people need to lose some first. sacrifice. in short, you're stuck in a local optimum.

3 you know there's a stable optimal solution, but you don't go there because you're a thickheaded stubborn monkey that chooses to hurt others by hurting itself. remember the "stupidity quadrant", these are the stupid people. we are all stupid from time to time, don't get hung up on yourself.

that's the thing. I want you to look at options 1 2 and 3, and notice how they are different. I know you will say that one option is in some way equivalent to the other, but that's not the point. they are equivalent, but also different. focus on the difference.

focus on it.

inhale deeply.

smell that?

that's monkey poop slinging theory.

1. Trusting a person leads to getting hurt.

2. Trusting the majority leads to getting hurt.

3. Since people keep hurting you, they deserve to be hurt back.
Lord Byron: "Those who will not reason, are bigots, those who cannot, are fools, and those who dare not, are slaves."

Nigel saying the wisest words ever uttered: "It's just a suffix."

"The worst forum ever" "The most mediocre forum on the internet" "The dumbest forum on the internet" "The most retarded forum on the internet" "The lamest forum on the internet" "The coolest forum on the internet"

LMNO

Solution: Since the game seems designed to make people hurt, stop playing.  Punch the experimenter in the nose, and take all the money.

Dysfunctional Cunt

Quote from: LMNO on December 08, 2009, 06:51:26 PM
Solution: Since the game seems designed to make people hurt, stop playing.  Punch the experimenter in the nose, and take all the money.

TITCM!

:lulz:

The Good Reverend Roger

In the classic model, I keep my mouth shut for reasons having nothing to do with jail time.

In the cash model, I go trust.  I stand to lose nothing, and I learn who can be trusted and who can't, which can be far more valuable than cash.
" It's just that Depeche Mode were a bunch of optimistic loveburgers."
- TGRR, shaming himself forever, 7/8/2017

"Billy, when I say that ethics is our number one priority and safety is also our number one priority, you should take that to mean exactly what I said. Also quality. That's our number one priority as well. Don't look at me that way, you're in the corporate world now and this is how it works."
- TGRR, raising the bar at work.

Cramulus

Douglas Hofstadter wrote a great deal on the Prisoner's Dilemma in Metamagical Themas.

The question he was exploring was, "Is cooperation rational?"


he pointed to a contest in 1980, where college kids competed to design the best computer program to play the prisoner's dilemma. It was basically a ten or twenty round game of the exact same scenario.

One robot's programming was to cooperate every round until his opponent defected, then defect for the rest of the game. Another robot randomly flipped back and forth, being totally unpredictable.

Hofstadter thought the most interesting program was one called Tit for Tat. Its programming was to do the same thing its opponent did in the previous round.

This means that Tit for Tat almost always loses (although it can tie if both parties cooperate for every round), but only loses by a very narrow margin.
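
Here's a rough sketch of that in Python, in case anyone wants to poke at it. Assumptions: Tit for Tat cooperates on the first round (there's no previous move to copy), the payoffs are the 30 / 10 / 50 / 0 cent bonuses from the blog post rather than whatever the contest used, and a game lasts ten rounds. The function names are made up.

```python
# Payoff in cents for (my_move, other_move); "C" = cooperate/trust, "D" = defect/cheat.
PAYOFF = {("C", "C"): 30, ("C", "D"): 0, ("D", "C"): 50, ("D", "D"): 10}


def tit_for_tat(own_history, other_history):
    """Cooperate first, then copy the opponent's previous move."""
    return other_history[-1] if other_history else "C"


def always_defect(own_history, other_history):
    """Defect every round."""
    return "D"


def grudger(own_history, other_history):
    """Cooperate until the opponent defects once, then defect forever."""
    return "D" if "D" in other_history else "C"


def play(strategy_a, strategy_b, rounds=10):
    """Play an iterated game and return both players' total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


for opponent in (always_defect, grudger, tit_for_tat):
    print(opponent.__name__, play(tit_for_tat, opponent))
```

Against the always-defect bot, Tit for Tat gets burned once in round one and then matches it defect for defect, so it loses 90 to 140, a margin of exactly one sucker payoff. Against the grudge-holding bot or a copy of itself, both sides cooperate all ten rounds and tie at 300.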