How and when to stop the co-training process

Edita Grolman, Dvir Cohen, Tatiana Frenklach, Asaf Shabtai, Rami Puzis

Expert Systems with Applications 187, 115841, 2022

Co-training is a semi-supervised learning approach used when only a small portion of the available training data is labeled. Using multiple classifiers, the co-training process leverages the small labeled set to label an additional set of samples. The classifiers gradually augment the training data in an iterative process: in each iteration, a new co-training model is derived and used to label the unlabeled samples, and a few of the newly labeled samples are added to the training dataset to improve the classifiers' performance. The main challenge in applying co-training is ensuring that the co-trainer assigns accurate labels to the unlabeled samples. Many empirical studies have shown that the performance (accuracy) of the co-trainer could not be further improved once a certain number of iterations was reached, and in some cases, the performance …
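The iterative process described above can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes two synthetic feature views, a simple nearest-centroid classifier per view, a confidence measure based on the distance margin between class centroids, and a fixed iteration budget as the stopping criterion (the quantity this paper studies). All names and parameters (`make_view`, the pseudo-label count of 5 per view, the budget of 5 iterations) are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_view(n, cls, shift):
    # One feature view: a 2-D Gaussian blob per class, centered at +/- shift
    return rng.normal(loc=shift * (2 * cls - 1), scale=1.0, size=(n, 2))

n = 200
y = rng.integers(0, 2, n)                     # ground-truth labels
X1 = make_view(n, y[:, None], 2.0)            # view 1
X2 = make_view(n, y[:, None], 2.0)            # view 2

# Small labeled seed set (5 samples per class); the rest is unlabeled
labeled = np.concatenate([np.where(y == 0)[0][:5], np.where(y == 1)[0][:5]])
unlabeled = np.setdiff1d(np.arange(n), labeled)

def centroids(X, idx, labels):
    # Per-class mean of the currently labeled training samples
    return np.stack([X[idx[labels == c]].mean(axis=0) for c in (0, 1)])

def predict_conf(X, cents):
    # Predict the nearer centroid; confidence = margin between the two distances
    d = np.linalg.norm(X[:, None, :] - cents[None, :, :], axis=2)
    return d.argmin(axis=1), np.abs(d[:, 0] - d[:, 1])

idx_train = labeled.copy()
y_train = y[labeled].copy()

for it in range(5):                           # fixed iteration budget (stopping point)
    if unlabeled.size == 0:
        break
    added = {}
    for X in (X1, X2):                        # each view labels for the other
        cents = centroids(X, idx_train, y_train)
        pred, conf = predict_conf(X[unlabeled], cents)
        for t in conf.argsort()[-5:]:         # k most confident pseudo-labels per view
            added[unlabeled[t]] = pred[t]
    new_idx = np.array(sorted(added))
    # Augment the training set with the pseudo-labeled samples
    idx_train = np.concatenate([idx_train, new_idx])
    y_train = np.concatenate([y_train, np.array([added[i] for i in new_idx])])
    unlabeled = np.setdiff1d(unlabeled, new_idx)

# Evaluate the view-1 classifier trained on the augmented set
cents = centroids(X1, idx_train, y_train)
pred, _ = predict_conf(X1, cents)
acc = (pred == y).mean()
```

In this toy setting the pseudo-labels are mostly correct because the views are well separated; the paper's concern is precisely that in realistic settings the gain from each added iteration diminishes, so the budget in the loop above should be chosen (or detected) rather than fixed arbitrarily.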