ARTICLE
TITLE

On Unbalanced Sampling in Bankruptcy Prediction

Marek Gruszczynski    

Abstract

The paper discusses methodological topics of bankruptcy prediction modelling: unbalanced sampling, sample bias, and unbiased predictions of bankruptcy. Bankruptcy models are typically estimated with the use of non-random samples, which creates sample choice biases. We consider two types of unbalanced samples: (a) when bankrupt and non-bankrupt companies enter the sample in unequal numbers; and (b) when sample composition allows for different ratios of bankrupt and non-bankrupt companies than those in the population. An imbalance of type (b), being more general, is examined in several sections of the paper. We offer an extended view of the relationship between the biased and unbiased estimated probabilities of bankruptcy, that is, the probability of default (PD). A common error in applications is neglecting the possibility of calibrating the PD obtained from a bankruptcy model to the unbiased PD that is population adjusted. We show that Skogsviks' formula of 2013 coincides with the prior correction known for the logit model. This, together with solutions for other binomial models, serves as practical advice for obtaining the calibration of unbiased PDs from popular bankruptcy models. In the final section, we explore sample bias effects on classification.
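
As an illustration of the calibration idea for the logit model, the following is a minimal sketch of the standard prior correction, in which the estimated log-odds are shifted by an offset that depends on the share of bankrupt companies in the sample and in the population. The function name `prior_corrected_pd` and its parameter names are hypothetical and are not taken from the paper; the sketch assumes a logit model estimated on a sample whose bankrupt share differs from the population share.

```python
import math


def prior_corrected_pd(p_sample: float, sample_rate: float, population_rate: float) -> float:
    """Adjust a logit-model PD estimated on an unbalanced sample to the population.

    p_sample        -- PD produced by the logit model fitted on the sample (hypothetical name)
    sample_rate     -- share of bankrupt companies in the estimation sample
    population_rate -- share of bankrupt companies in the population
    All three arguments must lie strictly between 0 and 1.
    """
    for name, value in (("p_sample", p_sample),
                        ("sample_rate", sample_rate),
                        ("population_rate", population_rate)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")

    # Log-odds implied by the biased (sample-based) PD.
    sample_logit = math.log(p_sample / (1.0 - p_sample))

    # Prior-correction offset: log of the ratio of sample odds to population odds.
    offset = math.log(((1.0 - population_rate) / population_rate)
                      * (sample_rate / (1.0 - sample_rate)))

    # Shift the log-odds and map back to a probability.
    return 1.0 / (1.0 + math.exp(-(sample_logit - offset)))


# Example: a balanced (50/50) sample and an assumed 5% population bankruptcy rate.
# A sample PD of 0.5 is mapped to approximately 0.05.
adjusted = prior_corrected_pd(0.5, sample_rate=0.5, population_rate=0.05)
```

When the sample and population shares coincide, the offset is zero and the PD is returned unchanged, which gives a quick sanity check on the adjustment.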