Economists often use balance tests to demonstrate that the treatment and control groups are comparable prior to an intervention. We show that typical implementations of balance tests have poor statistical properties. Pairwise t-tests leave it unclear how many rejections indicate overall imbalance. Omnibus tests of joint orthogonality, in which the treatment is regressed on all the baseline covariates, address this ambiguity but substantially over-reject the null hypothesis using the sampling-based p-values that are typical in the literature. This problem is exacerbated when the number of covariates is high compared to the number of observations. We examine the performance of alternative tests, and show that omnibus F-tests of joint orthogonality with randomization inference p-values have the correct size and reasonable power. We apply these tests to data from two prominent recent articles, where standard F-tests indicate imbalance, and show that the study arms are actually balanced when appropriate tests are used.
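As a rough illustration of the kind of test described above, the following sketch regresses the treatment indicator on all baseline covariates, computes the omnibus F-statistic, and obtains a randomization inference p-value by re-drawing the treatment assignment. It is not the authors' implementation. The function names `joint_f_stat` and `ri_balance_pvalue` are hypothetical, and the sketch assumes complete randomization with a fixed number of treated units, so that permuting the treatment vector reproduces the assignment mechanism. Stratified or clustered designs would need to re-draw assignments within strata or at the cluster level instead.

```python
import numpy as np


def joint_f_stat(treat, X):
    """Omnibus F-statistic for H0: all slopes are zero in an OLS
    regression of the treatment indicator on the covariates X
    (an intercept is added here)."""
    n, k = X.shape
    Z = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Z, treat, rcond=None)
    resid = treat - Z @ beta
    rss = resid @ resid                          # residual sum of squares
    tss = np.sum((treat - treat.mean()) ** 2)    # total sum of squares
    return ((tss - rss) / k) / (rss / (n - k - 1))


def ri_balance_pvalue(treat, X, n_perm=999, seed=0):
    """Randomization inference p-value for the omnibus F-statistic.

    Assumption: treatment was assigned by complete randomization,
    so each permutation of `treat` is an equally likely assignment.
    """
    rng = np.random.default_rng(seed)
    observed = joint_f_stat(treat, X)
    draws = np.array(
        [joint_f_stat(rng.permutation(treat), X) for _ in range(n_perm)]
    )
    # Share of assignments (counting the observed one) whose
    # F-statistic is at least as large as the observed value.
    p_value = (1 + np.sum(draws >= observed)) / (1 + n_perm)
    return observed, p_value
```

Because the p-value is computed from the distribution of the F-statistic under re-drawn assignments rather than from the asymptotic F distribution, it does not rely on the sampling-based approximation that becomes unreliable when the number of covariates is large relative to the number of observations.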