Abstract
quanteda is an R package providing a comprehensive workflow and toolkit for natural language processing tasks such as corpus management, tokenization, analysis, and visualization. It has extensive functions for applying dictionary analysis, exploring texts using keywords-in-context, computing document and feature similarities, and discovering multi-word expressions through collocation scoring. Based entirely on sparse operations, it provides highly efficient methods for compiling document-feature matrices and for manipulating these or using them in further quantitative analysis. Using C++ and multithreading extensively, quanteda is also considerably faster and more efficient than other R and Python packages in processing large textual data.
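To make the idea of a sparse document-feature matrix concrete, here is a minimal concept sketch in plain Python. It is not quanteda's implementation or API: the function name `build_dfm`, the whitespace tokenizer, and the dict-per-row storage are all illustrative assumptions. It shows only the principle that counts are stored for the features that actually occur in a document, and zeros are never stored.

```python
from collections import Counter


def build_dfm(docs):
    """Return (features, rows) for a list of text strings.

    features: sorted list of all distinct tokens.
    rows: one dict per document mapping feature index -> count.
    Zero counts are never stored, which is what makes the layout sparse.
    """
    # Naive tokenization (an assumption): lowercase, split on whitespace.
    counts = [Counter(text.lower().split()) for text in docs]
    features = sorted(set().union(*counts)) if counts else []
    index = {feat: j for j, feat in enumerate(features)}
    rows = [{index[f]: n for f, n in c.items()} for c in counts]
    return features, rows


docs = ["the cat sat", "the dog sat on the mat"]
features, rows = build_dfm(docs)

# Compare the number of stored cells with the size of a dense matrix.
stored = sum(len(r) for r in rows)
dense = len(docs) * len(features)
print(features)
print(rows)
print(f"stored {stored} of {dense} cells")
```

With only two short documents the saving is small, but on a real corpus most documents contain only a tiny share of the vocabulary, so the gap between stored and dense cells grows very large.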
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 1-4 |
| Journal | The Journal of Open Source Software |
| Volume | 3 |
| Issue number | 30 |
| Publication status | Published - 2018 |
| Title | quanteda: An R package for the quantitative analysis of textual data |