American Psychological Association to publish controversial 'PSI' paper

Started by Telarus, January 10, 2011, 07:29:57 AM


The Good Reverend Roger

Quote from: LMNO, PhD on January 10, 2011, 07:51:00 PM
I suppose.  Plus, as noted above, what is being suggested violates known physical laws; I'm guessing no one attempts to reconcile this.

Violate away.

As long as you have good, repeatable data, I'm willing to re-scrutinize physical laws.

Let me say that again:  As long as you have good, repeatable data.

I'm not holding my breath.
" It's just that Depeche Mode were a bunch of optimistic loveburgers."
- TGRR, shaming himself forever, 7/8/2017

"Billy, when I say that ethics is our number one priority and safety is also our number one priority, you should take that to mean exactly what I said. Also quality. That's our number one priority as well. Don't look at me that way, you're in the corporate world now and this is how it works."
- TGRR, raising the bar at work.

Jasper

Experiment 1: Precognitive Detection of Erotic Stimuli

Basically, a computer test where you were supposed to look at two pictures of curtains and then try to guess which one had the porn behind it.  The actual methodology seems sound when reviewed in detail, but I would quibble over their conclusions.  They claim that a 53.1% success rate over 100 experimental observations is significantly higher than chance.  I would not be so eager to claim to have found anything.
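
Back-of-the-envelope check on that claim, treating it as roughly 53 hits out of 100 two-choice trials (that's my simplification, not their actual analysis):

```python
from scipy.stats import binomtest

# ~53 hits out of 100 two-choice trials: how surprising is that under pure chance?
result = binomtest(53, n=100, p=0.5, alternative='greater')
print(result.pvalue)  # around 0.3 -- not significant on its own
```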

Experiment 2: Precognitive Avoidance of Negative Stimuli

The way this test was presented: 

Quote: this is an experiment that tests for ESP (Extrasensory Perception). The experiment
is run entirely by computer and takes about 15 minutes....On each trial of the
experiment you will be shown a picture and its mirror image side by side and
asked to indicate which image you like better. The computer will then flash a
masked picture on the screen. The way in which this procedure tests for ESP will
be explained to you at the end of the session

and then,

Quote: the participant was shown a low-arousal, affectively
neutral picture and its mirror image side by side and asked to press one of two keys on the
keyboard to indicate which neutral picture he or she liked better

The computer did not determine which was which until after the choice was made, ruling out any remotely possible explanation by pattern matching.

Whenever the participant had indicated a preference for the target-to-be, the computer flashed a positively valenced picture
on the screen subliminally three times. Whenever the participant had indicated a preference for
the non-target, the computer subliminally flashed a highly-arousing, negatively valenced picture.


So yes, this test should show whether a precognitive participant could foresee and avoid negative stimuli.
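
Putting the procedure into rough pseudocode, as I read it (the function and variable names here are mine, not anything from the paper):

```python
import random

def run_trial(show, flash_subliminal, neutral_pair, positive_pic, negative_pic):
    # One Experiment 2 trial as I understand the write-up -- a paraphrase, not Bem's code.
    left, right = neutral_pair                   # a neutral picture and its mirror image
    choice = show(left, right)                   # participant picks whichever they like better
    target = random.choice([left, right])        # the target is only decided AFTER the choice
    if choice == target:
        flash_subliminal(positive_pic, times=3)  # "hit": pleasant picture flashed three times
        return True
    flash_subliminal(negative_pic, times=3)      # "miss": negative picture flashed (I'm assuming
    return False                                 # the same three flashes here)
```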

Page 19, table 2 shows their results. 

Quote: As Table 2 reveals, all four analyses yielded comparable results, showing significant psi
performance across the 150 sessions. Recall, too, that the RNG used in this experiment was
tested in the simulation, described above in the discussion of Experiment 1, and was shown to be
free of nonrandom patterns that might correlate with participants' response biases.



Quote: Stimulus Seeking. In the present experiment, the correlation between stimulus seeking and
psi performance was .17 (p = .02). Table 3 reveals that the subsample of high stimulus seekers
achieved an effect size more than twice as large as that of the full sample. In contrast, the hit rate
of low stimulus seekers did not depart significantly from chance: 50.7%–50.8%, t < 1, p > .18,
and d < 0.10 in each of the four analyses.

Which I don't have enough stats under my belt to interpret confidently.  Anybody? 

Requia ☣

Correlation values of .2 are normal in psychology, and usually accepted.  (The highest value I've ever seen is .4).
Inflatable dolls are not recognized flotation devices.

Cain

Still seems an overall very weak test, though.  Like Sigmatic, I think a 53.1% success rate over 100 observations, where participants are offered two choices, is nowhere near strong enough to stake claims on.

I'd want to see a lot more tests done, first of all replicating this one and seeing if the results are steady.  I'd then want to introduce tests with more options.  And then tests with brain scans, where the activity in the brain of both successful and non-successful participants could be studied and compared.

Because if ESP does exist, there must be a point at which it has a physical effect on the brain.  And if a physical effect of some sort cannot be found, then the only reasonable conclusion is that the participants are guessing, and some are getting lucky and others are not, in roughly the ratios one would expect given the test parameters (near 50%).

Richter

53.1% out of 100 observations isn't significant in a test where only two choices are presented (if I'm following their methodology right).  Flip a coin 100 times, and 53.1% of the flips coming up tails wouldn't be proof of anything.

Make it a 4- or 5-choice test, get the same results, replicate them, and THEN you've got something to base a claim on.
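
A quick simulation of that coin-flip intuition (my own throwaway numbers, nothing from the paper):

```python
import random

# How often does a fair coin come up tails at least 53 times in 100 flips?
runs = 100_000
at_least_53 = sum(
    sum(random.random() < 0.5 for _ in range(100)) >= 53
    for _ in range(runs)
)
print(at_least_53 / runs)  # roughly 0.3 -- i.e. nothing remarkable
```
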
Quote from: Eater of Clowns on May 22, 2015, 03:00:53 AM
Anyone ever think about how Richter inhabits the same reality as you and just scream and scream and scream, but in a good way?   :lulz:

Friendly Neighborhood Mentat

The Johnny


What I have learned about correlation value significance (if I learned it correctly  :fnord:) is that the value has to be...

2/3... that is, 66.6%... OR above, to pass the "could go either way" threshold... otherwise it's just a matter of insignificant correlation that may fall within the "error range".

And no, I don't know the proper terms.
<<My image in some places, is of a monster of some kind who wants to pull a string and manipulate people. Nothing could be further from the truth. People are manipulated; I just want them to be manipulated more effectively.>>

-B.F. Skinner

Cramulus

Quote from: Sigmatic on January 11, 2011, 12:30:09 AM
Page 19, table 2 shows their results. 

Quote: As Table 2 reveals, all four analyses yielded comparable results, showing significant psi
performance across the 150 sessions. Recall, too, that the RNG used in this experiment was
tested in the simulation, described above in the discussion of Experiment 1, and was shown to be
free of nonrandom patterns that might correlate with participants' response biases.



Quote: Stimulus Seeking. In the present experiment, the correlation between stimulus seeking and
psi performance was .17 (p = .02). Table 3 reveals that the subsample of high stimulus seekers
achieved an effect size more than twice as large as that of the full sample. In contrast, the hit rate
of low stimulus seekers did not depart significantly from chance: 50.7%–50.8%, t < 1, p > .18,
and d < 0.10 in each of the four analyses.

Which I don't have enough stats under my belt to interpret confidently.  Anybody? 

If I recall my stats correctly...

The important thing to look at here is the p value. The lower p is, the lower the chance that the data came about by random chance.

Most psych studies consider something a "real" correlation at a p value of .05 - or maybe it's more like .02. Parapsychological studies generally use a more rigorous p threshold because they need airtight proof that their finding isn't due to chance.

A p value of .009 (Table 2, column 1) is highly significant - basically it means the odds of this data being due to random chance are less than 1%.
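
Roughly what that kind of test looks like under the hood, with made-up numbers (I'm assuming a one-sample t-test of per-session hit rates against the 50% chance level; this is NOT Bem's data):

```python
import numpy as np
from scipy import stats

# Toy data: 150 per-session hit rates hovering a little above 50% (invented for illustration)
rng = np.random.default_rng(0)
hit_rates = rng.normal(loc=0.515, scale=0.08, size=150)

# One-sample t-test against the 50% chance level; the p value is the probability of seeing
# data at least this extreme if the true hit rate were exactly 50%
t_stat, p_value = stats.ttest_1samp(hit_rates, popmean=0.5)
print(t_stat, p_value)
```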



I've been digging around for replication - which is what will make or break this paper.  So far it looks like three groups have registered replication attempts, but I can't find data on them.  Somebody did a replication of experiment 8 and did not replicate results, but I didn't read the paper.

Requia ☣

Quote: Somebody did a replication of experiment 8 and did not replicate results, but I didn't read the paper.

Title of the paper?
Inflatable dolls are not recognized flotation devices.

Kai

I took a look at the methods and results for experiment one, and I don't trust their use of a t-test or their lack of a control. Were I designing the experiment, I would have devised some sort of control group with equal numbers, so that there were two variable sample sets; a one-way ANOVA would then allow a much more rigorous test of significance. On their work alone I would not consider 53.1% to be significant, and from a common-sense standpoint of what psi would really entail: why, if it exists, would it be just slightly higher than the cutoff point? Also, the sample size, given the size of the overall population (6 billion), is way, way too small. And furthermore, why oh why do they draw the conclusion of psi on so little evidence? Why did they even FUCKING USE that word in their paper?
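
For illustration, here's roughly the kind of two-group comparison I mean, with invented numbers and scipy's f_oneway (obviously not their data or their design):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical design: an experimental condition plus a genuine control condition,
# equal group sizes, per-participant hit rates (all numbers invented for illustration)
experimental = rng.normal(loc=0.531, scale=0.10, size=50)
control = rng.normal(loc=0.500, scale=0.10, size=50)

# One-way ANOVA across the two groups
f_stat, p_value = stats.f_oneway(experimental, control)
print(f_stat, p_value)
```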

Goddammit. Now I'm pissed off.
If there is magic on this planet, it is contained in water. --Loren Eisley, The Immense Journey

Her Royal Majesty's Chief of Insect Genitalia Dissection
Grand Visser of the Six Legged Class
Chanticleer of the Holometabola Clade Church, Diptera Parish

Requia ☣

There is a control in the experiments: specifically, the control group is the set of neutral pictures.  Since the hypothesis is that *all* humans have reverse-causative abilities, a second group of test subjects cannot be used as a control.  If I understand what a t-test is correctly, one cannot be used in this kind of experiment (at least, as I remember it, a t-test assumes a control group with random assignment).  This is perfectly normal scientific procedure.

On the 53.1% thing, as Cram has pointed out, the strength of the effect is irrelevant; what matters is that the effect not have a high chance of being obtained at random (p < .05 is the standard).  Nor would it be rational to expect a large effect (if the effect were large, it wouldn't take a tightly controlled experiment to detect it; we would already know from day-to-day experience).
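
Rough illustration of why a small effect can still clear p < .05 once enough trials pile up (toy binomial calculation, not from the paper):

```python
from scipy.stats import binomtest

# The same 53.1% hit rate at increasing trial counts
for n in (100, 1000, 10000):
    hits = round(0.531 * n)
    p = binomtest(hits, n=n, p=0.5, alternative='greater').pvalue
    print(n, hits, p)
# A tiny deviation from 50% becomes "significant" once the sample is large enough.
```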

The PSI thing... yeah that's fucking silly.
Inflatable dolls are not recognized flotation devices.

Requia ☣

There's a shocking number of criticisms out there from PhDs who don't appear to have read the paper.

My favorite so far is a guy arguing, in a peer-reviewed paper, that 'Bem's Psychic' (apparently he was under the impression that Bem was working with a single psychic, instead of the typical group of students used for psych experiments) would be able to bankrupt a casino with a 53.1% accuracy rate unless there was some reason he couldn't use the power for roulette (which, according to the paper in the OP, you can't, for a couple of different reasons).

However, there's no reason to doubt his math, and while I'm not really qualified to judge his premise (that the statistical methods used in psychology should be replaced), what I do understand suggests that there's little reason to expect successful repetition even if Bem is being honest.
Inflatable dolls are not recognized flotation devices.

Kai

Quote from: Requia ☣ on January 12, 2011, 08:45:34 AM
There is a control in the experiments: specifically, the control group is the set of neutral pictures.  Since the hypothesis is that *all* humans have reverse-causative abilities, a second group of test subjects cannot be used as a control.  If I understand what a t-test is correctly, one cannot be used in this kind of experiment (at least, as I remember it, a t-test assumes a control group with random assignment).  This is perfectly normal scientific procedure.

Then it's pseudoreplication. Either way, I don't trust the results on that account.

Quote: On the 53.1% thing, as Cram has pointed out, the strength of the effect is irrelevant; what matters is that the effect not have a high chance of being obtained at random (p < .05 is the standard).  Nor would it be rational to expect a large effect (if the effect were large, it wouldn't take a tightly controlled experiment to detect it; we would already know from day-to-day experience).

If there is really something going on, and it's going on in all humans, then this experiment could be run many more times and should obtain the same result each time, eliminating the issue of pseudoreplication. I'm waiting.
If there is magic on this planet, it is contained in water. --Loren Eisley, The Immense Journey

Her Royal Majesty's Chief of Insect Genitalia Dissection
Grand Visser of the Six Legged Class
Chanticleer of the Holometabola Clade Church, Diptera Parish

Requia ☣

What exactly is pseudoreplication?
Inflatable dolls are not recognized flotation devices.

LMNO


Kai

Quote from: Requia ☣ on January 14, 2011, 03:59:28 AM
What exactly is pseudoreplication?

It's when the experimenter attempts to replicate, but the replicates aren't independent of one another. It could be that the control isn't independent of the treatment, as in this case, or that the treatments aren't independent in space (i.e., the place or study object/organism introduces bias) or in time (i.e., something about the time of day, or timing in general, introduces bias). Other examples of pseudoreplication would be taking multiple samples from the same organism and only that organism, or conducting all the experimental replicates on the same day or in the same place. As an entomologist, these are the thoughts that go through my head: what about weather? What if it was cloudy one day and sunny the next? I can't control it, but at least I can make it random by replicating across different days. How about my study location? Maybe it's just a bad location. I can eliminate location bias by setting up in several different places.

To have this be truly replicated, the authors would have had to have controls separate from the treatment, OR repeat the experiment on multiple occasions and in multiple locations with multiple /sets/ of people.
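
A toy simulation of why that matters (invented numbers, nothing to do with the actual study): if the "replicates" share a hidden common factor but you analyze them as if they were independent, the nominal false-positive rate gets badly inflated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

false_positives = 0
simulations = 2000
for _ in range(simulations):
    # 10 "sites" with 10 measurements each; there is NO real effect,
    # but each site contributes a shared offset (the non-independence)
    site_offsets = rng.normal(0.0, 1.0, size=10)
    data = (site_offsets[:, None] + rng.normal(0.0, 1.0, size=(10, 10))).ravel()
    # Naive analysis: pretend all 100 numbers are independent observations
    _, p = stats.ttest_1samp(data, popmean=0.0)
    if p < 0.05:
        false_positives += 1

print(false_positives / simulations)  # well above the nominal 5% error rate
```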

If there is magic on this planet, it is contained in water. --Loren Eisley, The Immense Journey

Her Royal Majesty's Chief of Insect Genitalia Dissection
Grand Visser of the Six Legged Class
Chanticleer of the Holometabola Clade Church, Diptera Parish