TY - GEN
T1 - Picture tags and world knowledge
T2 - 21st ACM International Conference on Multimedia, MM 2013
AU - Xie, Lexing
AU - He, Xuming
PY - 2013
Y1 - 2013
N2 - This paper studies the use of everyday words to describe images. The common saying has it that a picture is worth a thousand words; here we ask which thousand? The proliferation of tagged social multimedia data presents a challenge to understanding collective tag use at large scale: one can ask whether patterns from photo tags help understand tag-tag relations, and how they can be leveraged to improve visual search and recognition. We propose a new method to jointly analyze three distinct visual knowledge resources: Flickr, ImageNet/WordNet, and ConceptNet. This allows us to quantify the visual relevance of tags and learn their relationships. We propose a novel network estimation algorithm, Inverse Concept Rank, to infer incomplete tag relationships. We then design an algorithm for image annotation that takes into account both image and tag features. We analyze over 5 million photos with over 20,000 visual tags. The statistics from this collection lead to good results for image tagging, relationship estimation, and generalizing to unseen tags. This is a first step in analyzing picture tags and everyday semantic knowledge. Other potential applications include generating natural language descriptions of pictures, as well as validating and supplementing knowledge databases.
AB - This paper studies the use of everyday words to describe images. The common saying has it that a picture is worth a thousand words; here we ask which thousand? The proliferation of tagged social multimedia data presents a challenge to understanding collective tag use at large scale: one can ask whether patterns from photo tags help understand tag-tag relations, and how they can be leveraged to improve visual search and recognition. We propose a new method to jointly analyze three distinct visual knowledge resources: Flickr, ImageNet/WordNet, and ConceptNet. This allows us to quantify the visual relevance of tags and learn their relationships. We propose a novel network estimation algorithm, Inverse Concept Rank, to infer incomplete tag relationships. We then design an algorithm for image annotation that takes into account both image and tag features. We analyze over 5 million photos with over 20,000 visual tags. The statistics from this collection lead to good results for image tagging, relationship estimation, and generalizing to unseen tags. This is a first step in analyzing picture tags and everyday semantic knowledge. Other potential applications include generating natural language descriptions of pictures, as well as validating and supplementing knowledge databases.
KW - Folksonomy
KW - Knowledge graph
KW - Social media
UR - http://www.scopus.com/inward/record.url?scp=84887420405&partnerID=8YFLogxK
U2 - 10.1145/2502081.2502113
DO - 10.1145/2502081.2502113
M3 - Conference contribution
SN - 9781450324045
T3 - MM 2013 - Proceedings of the 2013 ACM Multimedia Conference
SP - 967
EP - 976
BT - MM 2013 - Proceedings of the 2013 ACM Multimedia Conference
Y2 - 21 October 2013 through 25 October 2013
ER -