TY - GEN
T1 - Video genetics
T2 - 18th ACM International Conference on Multimedia, ACM Multimedia 2010, MM'10
AU - Kender, John R.
AU - Hill, Matthew L.
AU - Natsev, Apostol
AU - Smith, John R.
AU - Xie, Lexing
PY - 2010
Y1 - 2010
N2 - We explore in a single but large case study how videos within YouTube, competing for view counts, are like organisms within an ecology, competing for survival. We develop this analogy, whose core idea shows that short video clips, best detected across videos as near-duplicate keyframes, behave similarly to genes. We report work in progress, on a dataset of 5.4K videos with 210K keyframes on a single topic, which traces sequences, not bags, of "near-dups" over time, both within videos and across them. We demonstrate their utility to: cleanse responses to queries contaminated by over-eager YouTube query expansion; separate videos temporally according to their responses to external events; track the evolution and lifespan of continuing video "stories"; automatically locate video summaries already present within a video ecology; quickly verify video copying via a direct application of the Smith-Waterman algorithm used in genetics - which also provides useful feedback for tuning the near-dup detection and clustering process; and quickly classify videos via a kind of Lempel-Ziv encoding into the categories of news, monologue, dialogue, and slideshow. We demonstrate a number of novel visualizations of this large dataset, including a direct use of the Matlab black-body "hot" false-color map, together with the GraphViz package, to display the gene-like inheritance of viral properties of keyframes. We further speculate that, as with genes, there are "functional roles" for semantic categories of clips, and, as with species, there are differing rates of "genetic drift" for each video genre.
KW - near-duplicate keyframes
KW - video ecology visualization
KW - video evolution
KW - video genetics
KW - video mashups
KW - video memes
KW - video Smith-Waterman matching
KW - video species
UR - http://www.scopus.com/inward/record.url?scp=78650969012&partnerID=8YFLogxK
U2 - 10.1145/1873951.1874198
DO - 10.1145/1873951.1874198
M3 - Conference contribution
SN - 9781605589336
T3 - MM'10 - Proceedings of the ACM Multimedia 2010 International Conference
SP - 1253
EP - 1258
BT - MM'10 - Proceedings of the ACM Multimedia 2010 International Conference
Y2 - 25 October 2010 through 29 October 2010
ER -