Detecting clickbait in online social media: You won’t believe how we did it

Aviad Elyashar, Jorge Bendahan, Rami Puzis

International Symposium on Cyber Security, Cryptology, and Machine Learning …, 2022

This paper proposes a machine learning approach for detecting clickbait posts published on social media. Clickbait posts are short, catchy phrases that point to a longer online article; in many cases, they are written to entice users to click through and read the full article. The suggested approach differentiates between clickbait and legitimate posts by training mainstream machine learning (ML) classifiers on various features extracted through image, linguistic, and behavioral analysis. For evaluation, we used two datasets provided by the Clickbait Challenge 2017. The XGBoost classifier obtained the best performance, with an AUC of 0.8, an accuracy of 0.812, a precision of 0.819, and a recall of 0.966. Finally, we found that counting the number of formal English words in the given content is helpful for clickbait detection.
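
As an illustration of the last finding, the following is a minimal sketch of how a formal-word count could be computed as a feature for a post. The abstract does not specify how formal English words are identified, so everything here is an assumption: the tiny `FORMAL_WORDS` vocabulary is a hypothetical stand-in for a full English dictionary, and the names `tokenize` and `formal_word_features` are illustrative only.

```python
import re

# Hypothetical stand-in vocabulary (assumption); a real system would
# load a full English dictionary or word list instead.
FORMAL_WORDS = frozenset({
    "the", "government", "announced", "new", "policy",
    "study", "finds", "report", "on", "of",
})


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def formal_word_features(text, vocabulary=FORMAL_WORDS):
    """Count tokens that appear in the vocabulary of formal words.

    Returns the raw count together with the token total and the ratio,
    so a classifier can use whichever form turns out to be useful.
    """
    tokens = tokenize(text)
    formal = sum(1 for token in tokens if token in vocabulary)
    ratio = formal / len(tokens) if tokens else 0.0
    return {
        "num_tokens": len(tokens),
        "num_formal_words": formal,
        "formal_ratio": ratio,
    }


if __name__ == "__main__":
    for post in ("The government announced a new policy",
                 "OMG u won't believe this!!!"):
        print(post, formal_word_features(post))
```

The resulting dictionary could be joined with other linguistic, image, and behavioral features before being passed to a classifier such as XGBoost; the choice of vocabulary would largely determine how informative the count is.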