Last changed 14 Mar 2019. Length: about 900 words (8,000 bytes).
(Document started on 12 Mar 2019.) This is a WWW document maintained by Steve Draper, installed at http://www.psy.gla.ac.uk/~steve/talks/pedesize.html. You may copy it.


Three cases of reasoning with effect sizes in pedagogical research: the good, the bad, and the downright disgraceful

By Steve Draper,   Department of Psychology,   University of Glasgow.

Title: Three cases of reasoning with effect sizes in pedagogical research: the good, the bad, and the downright disgraceful
Presenter: Steve Draper,   School of Psychology,   University of Glasgow.
Date/time: Wednesday 20 March 2019. Session: 3-4pm.
Occasion: Pedagogical lab (monthly)
Place: The Level 6 Meeting Room, School of Psychology, 62 Hillhead Street.

Abstract

"Three cases of reasoning with effect sizes in pedagogical research: the good, the bad, and the downright disgraceful" Or to put them in reverse order:

1) Microscopic effect sizes.
Sievertsen et al. (2016) "Cognitive fatigue influences students' performance on standardized tests" PNAS (Proceedings of the National Academy of Sciences of the USA) doi:10.1073/pnas.1516947113 https://www.pnas.org/content/113/10/2621
It reports a variety of effect sizes (Cohen's d) in the range 0.005 to 0.03, where 0.2 is the conventional threshold for "small".

2) The small but intriguing.
Perkins, K.K. and Wieman, C.E. (2005) "The Surprising Impact of Seat Location on Student Performance" The Physics Teacher vol.43 January pp.30-33 doi:10.1119/1.1845987 https://doi.org/10.1119/1.1845987

3) The inspirational.
Bloom, B.S. (1984) "The 2 Sigma Problem: The Search for Methods of Group Instruction as Effective as One-to-One Tutoring" Educational Researcher vol.13 no.6 (Jun. - Jul., 1984) pp.4-16 https://www.jstor.org/stable/1175554

It would be good if people had read the abstracts of these three in advance. Whether you read more depends on your degree of interest.

Provisional commentary

Practical uses of effect sizes are relative: the question is which effect is bigger when you are choosing between two "treatments" you might use or do research on. Bloom understood this, and his paper exemplifies practical reasoning about research, even though it was published before the term "effect size" was in widespread use. (In his title, "sigma" means "standard deviation"; a "2 sigma" difference would today be called a Cohen's d of 2: a pretty big effect.)
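As a rough illustration of the scale Bloom is talking about, here is a minimal Python sketch. The exam-score numbers in it are invented for illustration, not taken from Bloom: it computes Cohen's d as the difference in group means divided by a pooled standard deviation, and then uses the normal distribution to show that a d of 2 puts the average tutored student at roughly the 98th percentile of the conventionally taught group, which is how Bloom himself describes the tutoring result.

    # Minimal sketch of Cohen's d and of what a "2 sigma" effect implies.
    # The score numbers are invented for illustration only.
    from statistics import NormalDist

    def cohens_d(mean_treatment, mean_control, pooled_sd):
        # Difference in group means, expressed in units of the pooled SD.
        return (mean_treatment - mean_control) / pooled_sd

    # Invented exam scores: control class mean 60%, SD 10 percentage points.
    d_bloom = cohens_d(mean_treatment=80.0, mean_control=60.0, pooled_sd=10.0)  # d = 2.0
    d_tiny = cohens_d(mean_treatment=60.1, mean_control=60.0, pooled_sd=10.0)   # d = 0.01

    # Percentile of the control distribution reached by the average treated student.
    percentile_for_d2 = NormalDist().cdf(d_bloom)  # about 0.977
    print(round(d_bloom, 2), round(d_tiny, 2), round(percentile_for_d2, 3))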

The recent seminar here by Daniel Lakens seemed rather depressing to me: even though it was about improving the choices journal editors must make, it still reduced them to choosing different magic numbers, when the rational view is that there are no magic numbers, only practical decisions to make. The Sievertsen paper seems to show a researcher who, in the end, thinks that a big number of participants (in his case, 100% of the children in the Danish public school system over four years, giving over 0.5 million data points) must somehow make his analysis valuable. In reality it shows that the effect size is so small that if he picked any other published effect in the literature and acted on it, it would do more good than acting on his findings. The Perkins paper may give us a bit of sympathy for that: it is an example of the irrational tug many of us feel when an effect is demonstrated in an RCT but is really too small to be worth acting on, yet we still want to know how it could be possible that the seats students occupy in a lecture theatre measurably change their course grades.
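To make the point about sample size concrete, here is a minimal sketch under stated assumptions: the Cohen's d of 0.01 and the group sizes of 250,000 are round illustrative numbers in the general region of the Sievertsen study, not figures taken from it. With groups that large, a d of 0.01 is comfortably "statistically significant"; with two classes of 50 the same effect is undetectable; and in neither case is it big enough to be worth acting on.

    # Minimal sketch: statistical significance of a tiny effect at two sample sizes.
    # d = 0.01 and the group sizes are illustrative round numbers, not from the paper.
    from math import sqrt
    from statistics import NormalDist

    def z_for_difference(d, n_per_group):
        # Approximate z statistic for a standardised mean difference d
        # between two equal-sized groups (SD treated as known).
        return d * sqrt(n_per_group / 2)

    def two_sided_p(z):
        return 2 * (1 - NormalDist().cdf(abs(z)))

    d = 0.01
    for n in (250_000, 50):
        z = z_for_difference(d, n)
        print(f"n per group = {n:>7,}: z = {z:.2f}, p = {two_sided_p(z):.4f}")
    # n per group = 250,000: z = 3.54, p = 0.0004  (significant, but trivially small)
    # n per group =      50: z = 0.05, p = 0.9601  (same effect, invisible at class scale)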

Bloom's paper is exemplary because:

Partly related material

