Intelligent technology for content monitoring on the Web

Mark Last, Bracha Shapira, Yuval Elovici, Omer Zaafrany, Abraham Kandel

Computational Web Intelligence: Intelligent Technology for Web Applications …, 2004

International terrorists are increasingly using the Internet for covert communications, for collecting information on topics of interest to them, and for publicizing their activities around the world. One way to detect terrorist activities on the Internet is to monitor the content accessed by web users. This study presents an innovative, data mining (DM)-based methodology for web content monitoring. The normal behavior of a group of similar users is learned by applying unsupervised clustering algorithms to the textual content of the publicly available web pages they usually view. The induced model of normal behavior is then used in real time to reveal anomalous content accessed at a specific computer. To speed up the detection process, dimensionality reduction is applied to the content data. We evaluate the proposed methodology by ROC analysis.
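The pipeline outlined in the abstract (learn clusters of normal content, then flag a page that resembles none of them) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the page representation, the similarity measure, or the reduction technique, so everything below is an assumption made for illustration: raw term-frequency vectors, a small k-means with cosine similarity, vocabulary truncation as a crude stand-in for dimensionality reduction, and a hypothetical threshold of 0.3. The sample pages are invented toy data, not the study's data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocab(pages, max_terms):
    """Keep only the most frequent terms (assumed, crude dimensionality reduction)."""
    counts = Counter(t for page in pages for t in tokenize(page))
    return [term for term, _ in counts.most_common(max_terms)]


def vectorize(text, vocab):
    """Represent a page as a term-frequency vector over the vocabulary."""
    tf = Counter(tokenize(text))
    return [float(tf[term]) for term in vocab]


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def kmeans(vectors, k, iterations=10):
    """Simple k-means using cosine similarity; the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            best = max(range(k), key=lambda i: cosine(v, centroids[i]))
            groups[best].append(v)
        for i, group in enumerate(groups):
            if group:  # leave an empty cluster's centroid unchanged
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def is_anomalous(text, vocab, centroids, threshold):
    """Flag a page whose best similarity to any normal centroid is below the threshold."""
    v = vectorize(text, vocab)
    return max(cosine(v, c) for c in centroids) < threshold


# Toy "normal" browsing content for a group of similar users (invented data).
normal_pages = [
    "football match score goal team league",
    "recipe bake oven flour sugar butter",
    "team wins league football goal",
    "butter sugar recipe cake oven",
]

vocab = build_vocab(normal_pages, max_terms=20)
centroids = kmeans([vectorize(p, vocab) for p in normal_pages], k=2)

THRESHOLD = 0.3  # hypothetical value; a real system would tune it
on_topic = is_anomalous("football team goal", vocab, centroids, THRESHOLD)
off_topic = is_anomalous("stock market bond yields", vocab, centroids, THRESHOLD)
print(on_topic, off_topic)
```

In a sketch like this, the threshold controls the trade-off between detected anomalies and false alarms. Sweeping it over a labeled set of normal and anomalous pages yields the true-positive and false-positive rates that an ROC analysis plots.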