Liron Samama-Kachko, Rami Puzis, Roni Stern, Ariel Felner
Proceedings of the International Symposium on Combinatorial Search 5 (1 …, 2014
The Target Oriented Network Intelligence Collection (TONIC) problem is the task of finding, via automated crawling, profiles in a social network that contain publicly available information about a given target profile. Such profiles are called leads. Leads are found by crawling the network, using each profile's friend list (its immediate neighborhood) to decide which profile to crawl next. Assuming that leads tend to cluster together, prior work limited the search for new leads to the immediate neighbors of previously found leads. In this paper we relax this limitation and extend the scope of the search to a wider neighborhood, including the possibility of crawling non-leads, i.e., profiles that have no publicly available information about the target. We propose a set of heuristics to guide this search. Experimental results show that in the new setting more leads are found, and they are found faster. In addition, we perform a cost-benefit analysis of the search, weighing the reward of finding leads against the costs of the search.
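
The difference between the restricted and the extended search scope can be illustrated with a minimal sketch. This is not the authors' implementation: the function `crawl`, the toy `GRAPH`, the `LEADS` set, and the scoring rule (count of already-found leads that list a profile as a friend) are all hypothetical and chosen only for illustration. The sketch assumes that crawling a profile reveals both its friend list and whether it is a lead.

```python
def crawl(graph, leads, seeds, budget, extended=True):
    """Best-first crawl over a toy network (illustrative assumption only).

    graph:    dict mapping a profile to its friend list
    leads:    set of profiles that are leads (revealed only when crawled)
    seeds:    profiles known before the crawl starts
    budget:   maximum number of profiles to crawl
    extended: if False, only friends of leads join the frontier
              (restricted scope); if True, friends of non-leads join too
    """
    frontier = set(seeds)
    score = {p: 0 for p in seeds}  # number of found leads adjacent to p
    crawled_order, crawled_set, found = [], set(), []

    while frontier and len(crawled_order) < budget:
        # Pick the profile with the most known lead neighbors;
        # ties are broken alphabetically to keep the run deterministic.
        p = min(frontier, key=lambda q: (-score[q], q))
        frontier.remove(p)
        crawled_order.append(p)
        crawled_set.add(p)

        is_lead = p in leads
        if is_lead:
            found.append(p)

        # Expand the frontier: always from leads, from non-leads only
        # when the extended scope is enabled.
        if is_lead or extended:
            for friend in graph.get(p, []):
                if friend in crawled_set:
                    continue
                score[friend] = score.get(friend, 0) + (1 if is_lead else 0)
                frontier.add(friend)

    return found, crawled_order


# Toy network: leads c and d are reachable only through non-lead x.
GRAPH = {
    "a": ["b", "x"],
    "b": ["a"],
    "x": ["a", "c"],
    "c": ["x", "d"],
    "d": ["c"],
}
LEADS = {"a", "b", "c", "d"}

found_restricted, _ = crawl(GRAPH, LEADS, ["a"], budget=10, extended=False)
found_extended, _ = crawl(GRAPH, LEADS, ["a"], budget=10, extended=True)
print(found_restricted)
print(found_extended)
```

In this toy network the restricted crawl stops after the non-lead `x`, because it never expands the friends of `x`, while the extended crawl continues through `x` and reaches the remaining two leads. The example makes no claim about how often this happens in real networks.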