Abstract
For the Australasian Language Technology Association (ALTA) 2016 Shared Task, we devised the Pairwise FastText Classifier (PFC), an efficient embedding-based text classifier, and used it for entity disambiguation. PFC achieved an F1 score of 0.72 (under the team name BCJR), higher than the few baseline algorithms we compared it against. To generalise the model, we also created a method to bootstrap the training set deterministically, without human labelling and at no financial cost. By releasing PFC and the dataset augmentation software to the public, we hope to invite more collaboration.
Original language | English |
---|---|
Title of host publication | Proceedings of the Australasian Language Technology Association Workshop 2016 |
Editors | Trevor Cohn |
Place of Publication | Pennsylvania, USA |
Publisher | Association for Computational Linguistics |
Pages | 175−179pp |
Edition | Peer Reviewed |
ISBN (Print) | 9781510833166 |
Publication status | Published - 2016 |
Event | Australasian Language Technology Association Workshop (ALTA 2016) - Caulfield, Australia Duration: 1 Jan 2016 → … http://alta2016.alta.asn.au/U16/U16-1.pdf |
Conference
Conference | Australasian Language Technology Association Workshop (ALTA 2016) |
---|---|
Period | 1/01/16 → … |
Other | December 5–7 2016 |