A commentary on “Problems in using text-mining and p-curve analysis to detect rate of p-hacking”

J. C. F. de Winter
31 July 2015

Bishop and Thompson (2015) concluded that “For uncorrelated variables, simulated p-hacked data do not give the signature left-skewed p-curve that Head et al. took as evidence of p-hacking.” The simulations by Bishop and Thompson were conducted with one particular form of p-hacking, the “use of ghost variables”. Thus, their simulations show that p-hacking by means of ghost variables will not be detected in the p-curve. It is useful to emphasize that other forms of p-hacking, such as ‘optional stopping’, do yield a distinct peak just below 0.05.

I performed simulations in which a one-sample t-test was repeated after each added participant until it reached statistical significance (p < 0.05), up to a maximum of 50 participants. For each effect size, 10,000 studies were simulated. The MATLAB code is shown below. The results in Figure 1 show that a left-skewed distribution arises, even when the true effect size is large (d = 1.0).

Figure 2 shows the consequences of another form of p-hacking: selecting a nonparametric test if it gives a statistically significant result while the parametric test does not. Again, a left skew can be seen. This time the simulations were conducted with 1,000,000 two-sample t-tests, with a sample size of 25 per group and a true effect size of d = 0.

Bishop and Thompson (2015) cite a relatively limited number of papers. It is important to note that a substantial number of papers have previously discussed or analysed p-values extracted from papers. Examples are: Brodeur et al. (2013), De Winter and Dodou (2015), Ioannidis (2014), Jager and Leek (2014), Krawczyk (2015), Lakens (2015), Leggett et al. (2013), Masicampo and Lalande (2012), Mathôt (2013), Nuijten et al. (2015), and Vermeulen et al. (in press). Some of these papers (e.g., Ioannidis, 2014) are negative towards the idea of automatic p-value mining and raise arguments similar to yours. Others, such as De Winter and Dodou (2015), discuss both strengths and weaknesses of automatically extracted p-values compared to manually harvested p-values.
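As a rough reading aid for the second simulation (1,000,000 tests, d = 0), the count expected in each histogram bin without any p-hacking can be worked out by hand. This is only a sketch: it assumes that p-values from a single two-sample t-test on normal data are uniformly distributed under d = 0, and it uses the bin width of 0.005 from the MATLAB code. The symbols N (number of simulated tests) and Δp (bin width) are labels introduced here and do not appear in the original text.

```latex
\[
E[\text{count per bin}] \;=\; N \cdot \Delta p \;=\; 1{,}000{,}000 \times 0.005 \;=\; 5{,}000
\]
```

Under these assumptions, strategic test selection should show up as counts above 5,000 in the bins just below 0.05 and counts below 5,000 in the bins just above it, because the only p-values that change are those where the parametric test gave p > .05 and the rank-based test gave p < .05.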
[Figure 1: line plot of Count (y-axis) against p-value (x-axis, 0 to 0.4), with one curve each for d = 0, d = 0.3, d = 1.0, and d = 10.]

Figure 1. Distribution of p-values for ‘optional stopping’, for four effect sizes.

[Figure 2: line plot of Count (y-axis) against p-value (x-axis, 0 to 0.4).]

Figure 2. Distribution of p-values when ‘strategically’ selecting a non-parametric test when the parametric test yields a result that is not statistically significant.

References

Bishop, D., & Thompson, P. A. (2015). Problems in using text-mining and p-curve analysis to detect rate of p-hacking. Retrieved from https://peerj.com/preprints/1266/

Brodeur, A., Lé, M., Sangnier, M., & Zylberberg, Y. (2013). Star wars: The empirics strike back (Discussion Paper No. 7268). Discussion Paper Series. Bonn: Forschungsinstitut zur Zukunft der Arbeit.

De Winter, J. C. F., & Dodou, D. (2015). A surge of p-values between 0.041 and 0.049 in recent decades (but negative results are increasing rapidly too). PeerJ, 3, e733.

Head, M. L., Holman, L., Lanfear, R., Kahn, A. T., & Jennions, M. D. (2015). The extent and consequences of p-hacking in science. PLOS Biology, 13, e1002106.

Ioannidis, J. P. (2014). Discussion: Why “An estimate of the science-wise false discovery rate and application to the top medical literature” is false. Biostatistics, 15, 28–36.

Jager, L. R., & Leek, J. T. (2014). An estimate of the science-wise false discovery rate and application to the top medical literature. Biostatistics, 15, 1–12.

Krawczyk, M. (2015). The search for significance: a few peculiarities in the distribution of p values in experimental psychology literature. PLOS ONE, 10, e0127872.

Lakens, D. (2015). On the challenges of drawing conclusions from p-values just below 0.05. PeerJ, 3, e1142.

Leggett, N. C., Thomas, N. A., Loetscher, T., & Nicholls, M. E. (2013). The life of p: “Just significant” results are on the rise. The Quarterly Journal of Experimental Psychology, 66, 2303–2309.

Masicampo, E. J., & Lalande, D. R. (2012). A peculiar prevalence of p values just below .05. The Quarterly Journal of Experimental Psychology, 65, 2271–2279.

Mathôt, S. (2013). No particular prevalence of p values just below .05 [blog post]. Retrieved from http://www.cogsci.nl/blog/miscellaneous/221-no-particular-prevalence-of-p-values-just-below-05

Nuijten, M. B., Hartgerink, C. H. J., Van Assen, M. A. L. M., Epskamp, S., & Wicherts, J. M. (2015). The prevalence of statistical reporting errors in psychology (1985-2013). Retrieved from https://mbnuijten.files.wordpress.com/2013/01/nuijtenetal_2015_reportingerrorspsychology.pdf

Vermeulen, I. E., Beukeboom, C. J., Batenburg, A. E., Stoyanov, D., Avramiea, A., Van de Velde, R. N., & Oegema, D. (in press). Blinded by the light: How a focus on statistical “significance” may cause p-value misreporting and an excess of p-values just below .05 in communication science. Communication Methods and Measures.

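Matlab code

%% Tested in Matlab R2015a. It takes a few minutes to run on a standard laptop.
clear variables; close all; clc

% Simulation 1 (Figure 1): optional stopping with a one-sample t-test
d = [0 0.3 1 10];              % true effect sizes
reps = 10000;                  % simulated studies per effect size
pp = NaN(length(d), reps);     % final p-value of each study
for k = 1:length(d)
    for j = 1:reps
        S = [];
        for i = 1:50           % add one participant at a time, up to 50
            S(i) = randn + d(k);
            [~, pp(k,j)] = ttest(S, 0);
            if pp(k,j) < .05; break; end   % stop as soon as p < .05
        end
    end
end
V = 0:0.005:1;                 % histogram bin edges (bin width 0.005)
figure; plot(V+mean(V(1:2)), histc(pp',V), '-o', 'Linewidth', 2);
xlabel('\itp\rm-value')
ylabel('Count')
legend('\itd\rm = 0', '\itd\rm = 0.3', '\itd\rm = 1.0', '\itd\rm = 10')
set(gca, 'xlim', [0 .4])
h = findobj('FontName', 'Helvetica');
set(h, 'FontSize', 20, 'Fontname', 'Arial')

%% Simulation 2 (Figure 2): report the rank-based test only if it "rescues" significance
pp2 = NaN(1, reps*100);        % 1,000,000 simulated comparisons
for i = 1:reps*100
    S = randn(25,1); S2 = randn(25,1);        % two groups of 25, d = 0
    SR = tiedrank([S;S2]);                    % ranks of the pooled sample
    [~, p1] = ttest2(S, S2);                  % parametric test
    [~, p2] = ttest2(SR(1:25), SR(26:end));   % t-test on ranks (nonparametric)
    if (p1 > .05 && p2 < .05)
        pp2(i) = p2;
    else
        pp2(i) = p1;
    end
end
figure; plot(V+mean(V(1:2)), histc(pp2',V), 'k-o', 'Linewidth', 2);
xlabel('\itp\rm-value')
ylabel('Count')
set(gca, 'xlim', [0 .4])
h = findobj('FontName', 'Helvetica');
set(h, 'FontSize', 20, 'Fontname', 'Arial')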
