TY - JOUR
T1 - SCENT
T2 - Scalable compressed monitoring of evolving multirelational social networks
AU - Lin, Yu Ru
AU - Selçuk Candan, K.
AU - Sundaram, Hari
AU - Xie, Lexing
PY - 2011/10
Y1 - 2011/10
N2 - We propose SCENT, an innovative, scalable spectral analysis framework for internet scale monitoring of multirelational social media data, encoded in the form of tensor streams. In particular, a significant challenge is to detect key changes in the social media data, which could reflect important events in the real world, sufficiently quickly. Social media data have three challenging characteristics. First, data sizes are enormous; recent technological advances allow hundreds of millions of users to create and share content within online social networks. Second, social data are often multifaceted (i.e., have many dimensions of potential interest, from the textual content to user metadata). Finally, the data is dynamic; structural changes can occur at multiple time scales and be localized to a subset of users. Consequently, a framework for extracting useful information from social media data needs to scale with data volume, and also with the number and diversity of the facets of the data. In SCENT, we focus on the computational cost of structural change detection in tensor streams. We extend compressed sensing (CS) to tensor data. We show that, through the use of randomized tensor ensembles, SCENT is able to encode the observed tensor streams in the form of compact descriptors. We show that the descriptors allow very fast detection of significant spectral changes in the tensor stream, which also reduce data collection, storage, and processing costs. Experiments over synthetic and real data show that SCENT is faster (17.7x-159x for change detection) and more accurate (above 0.9 F-score) than baseline methods.
AB - We propose SCENT, an innovative, scalable spectral analysis framework for internet scale monitoring of multirelational social media data, encoded in the form of tensor streams. In particular, a significant challenge is to detect key changes in the social media data, which could reflect important events in the real world, sufficiently quickly. Social media data have three challenging characteristics. First, data sizes are enormous; recent technological advances allow hundreds of millions of users to create and share content within online social networks. Second, social data are often multifaceted (i.e., have many dimensions of potential interest, from the textual content to user metadata). Finally, the data is dynamic; structural changes can occur at multiple time scales and be localized to a subset of users. Consequently, a framework for extracting useful information from social media data needs to scale with data volume, and also with the number and diversity of the facets of the data. In SCENT, we focus on the computational cost of structural change detection in tensor streams. We extend compressed sensing (CS) to tensor data. We show that, through the use of randomized tensor ensembles, SCENT is able to encode the observed tensor streams in the form of compact descriptors. We show that the descriptors allow very fast detection of significant spectral changes in the tensor stream, which also reduce data collection, storage, and processing costs. Experiments over synthetic and real data show that SCENT is faster (17.7x-159x for change detection) and more accurate (above 0.9 F-score) than baseline methods.
KW - Multirelational learning
KW - Social media
KW - Social network analysis
KW - Stream mining
KW - Tensor analysis
UR - http://www.scopus.com/inward/record.url?scp=84863634824&partnerID=8YFLogxK
U2 - 10.1145/2037676.2037686
DO - 10.1145/2037676.2037686
M3 - Review article
SN - 1551-6857
VL - 7 S
JO - ACM Transactions on Multimedia Computing, Communications and Applications
JF - ACM Transactions on Multimedia Computing, Communications and Applications
IS - 1
M1 - 29
ER -