Power Calculations for Replication Studies
The reproducibility crisis has led to an increasing number of replication studies. Sample sizes for replication studies are often calculated using conditional power based on the effect estimate from the original study. However, this approach is not well suited because it ignores the uncertainty of the original result. Bayesian methods are often used in clinical trials to incorporate prior information into power calculations. We propose to adapt this methodology to the replication framework and to use predictive instead of conditional power. Moreover, we describe how the methodology used in sequential clinical trials can be tailored to replication studies. The predictive interim power, i.e., the predictive power conditioned on the data already collected, is shown to be useful for deciding whether to stop a replication study at interim. Predictive power is generally smaller than conditional power and does not always increase with the sample size: in some cases, adding more subjects to the replication study decreases it. We illustrate these properties using data from a recent project on the replicability of social sciences.
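As a rough illustration of the difference between conditional and predictive power, the sketch below assumes a normally distributed effect estimate, a flat prior on the true effect, and a one-sided significance level. None of these choices is stated in the abstract. The function names, the symbol c for the relative sample size (replication over original), and the example z-value are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal distribution, used for quantiles and tail probabilities.
_STD_NORMAL = NormalDist()


def conditional_power(z_o: float, c: float, alpha: float = 0.025) -> float:
    """Power of the replication, treating the original estimate as the true effect.

    z_o   : z-value of the original study (assumed input)
    c     : relative sample size n_replication / n_original (assumed notation)
    alpha : one-sided significance level (assumed convention)
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    # Given the true effect equals the original estimate, z_r ~ N(sqrt(c) * z_o, 1).
    return _STD_NORMAL.cdf(sqrt(c) * z_o - z_alpha)


def predictive_power(z_o: float, c: float, alpha: float = 0.025) -> float:
    """Power of the replication, averaging over uncertainty in the original estimate.

    Assumes a flat prior on the true effect, so that z_r ~ N(sqrt(c) * z_o, c + 1).
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    return _STD_NORMAL.cdf((sqrt(c) * z_o - z_alpha) / sqrt(c + 1))


if __name__ == "__main__":
    z_o = 2.5  # hypothetical original z-value
    for c in (0.5, 1, 2, 5):
        cp = conditional_power(z_o, c)
        pp = predictive_power(z_o, c)
        print(f"c = {c}: conditional = {cp:.3f}, predictive = {pp:.3f}")
```

Under these assumptions, the extra variance term pulls predictive power toward 0.5, so it falls below conditional power whenever conditional power exceeds 0.5. Conditional power tends to 1 as c grows, whereas predictive power stays below the standard normal CDF evaluated at the original z-value. This simple setting does not cover the interim analysis described above.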