An analysis of student representation, representative features and classification algorithms to predict degree dropout

Rubén Manrique, Bernardo Pereira Nunes, Olga Marino, Marco Antonio Casanova, Terhi Nurmikko-Fuller

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    29 Citations (Scopus)

    Abstract

    Identifying and monitoring students who are likely to drop out is a vital issue for universities. Early detection allows institutions to intervene, addressing problems and retaining students. Prior research into the early detection of at-risk students has opted for the use of predictive models, but a comprehensive assessment of the suitability of different algorithms and approaches is complicated by the large number of variable features that constitute a student's educational experience. Predictive models vary in terms of their amplitude, temporality and the learning algorithms employed. While amplitude refers to the ability of the model to operate on multiple degrees, temporality is often considered due to the natural temporal aspect of the data. In the absence of a comparative framework of learning algorithms, the aim of this paper is to provide such an analysis, based on a proposed classification of strategies for predicting dropouts in Higher Education Institutions. Three different student representations are implemented (namely Global Feature-Based, Local Feature-Based, and Time Series) in conjunction with the appropriate learning algorithms for each of them. A description of each approach, together with its implementation process, is presented in this paper as a technical contribution. The experiment uses a dataset of student information from two degrees, namely Business Administration and Architecture, acquired through an automated management system from a university in Brazil. Our findings can be summarized as follows: (i) of the three proposed student representations, the Local Feature-Based representation was the most suitable approach for predicting dropout. In addition to providing high-quality results, Local Feature-Based representations are simple to build, and the construction of the model is less expensive than for more complex ones; (ii) the results obtained with the Local Feature-Based representation indicate that dropout can be accurately predicted using the grades of a few core courses, so there is no need for a complex feature extraction process; (iii) considering temporal aspects of the data does not seem to contribute to prediction performance, although it increases computational costs as the model complexity increases.
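
    Finding (ii) can be illustrated with a minimal sketch. It assumes that a Local Feature-Based representation is a fixed-length vector of one student's grades in a few core courses of a single degree; the abstract does not give the exact features or learning algorithms, so the course names, the 0-10 grade scale, the zero-fill rule for missing grades, and the nearest-centroid classifier below are all hypothetical stand-ins rather than the authors' method.

```python
from statistics import mean

# Hypothetical core courses for one degree; the source does not list them.
CORE_COURSES = ["calculus_1", "intro_accounting", "microeconomics"]


def local_features(grades, missing=0.0):
    """Map a student's {course: grade} record to a fixed-length vector.

    Courses without a grade are filled with `missing` (an assumption).
    """
    return [float(grades.get(course, missing)) for course in CORE_COURSES]


def fit_centroids(vectors, labels):
    """Average the feature vectors of each class (nearest-centroid model)."""
    centroids = {}
    for label in set(labels):
        rows = [v for v, y in zip(vectors, labels) if y == label]
        centroids[label] = [mean(column) for column in zip(*rows)]
    return centroids


def predict(centroids, vector):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vector, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Synthetic toy records, invented for illustration only (not from the paper).
train = [
    ({"calculus_1": 8.5, "intro_accounting": 7.0, "microeconomics": 9.0}, "retained"),
    ({"calculus_1": 7.0, "intro_accounting": 8.0, "microeconomics": 6.5}, "retained"),
    ({"calculus_1": 3.0, "intro_accounting": 4.5}, "dropout"),
    ({"calculus_1": 2.0, "microeconomics": 3.5}, "dropout"),
]
X = [local_features(grades) for grades, _ in train]
y = [label for _, label in train]
model = fit_centroids(X, y)

new_student = local_features({"calculus_1": 2.5, "intro_accounting": 3.0})
print(predict(model, new_student))
```

    The point of the sketch is only that the representation needs no feature engineering beyond looking up a handful of grades; any standard classifier could take the place of the nearest-centroid rule.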

    Original language: English
    Title of host publication: Proceedings of the 9th International Conference on Learning Analytics and Knowledge
    Subtitle of host publication: Learning Analytics to Promote Inclusion and Success, LAK 2019
    Publisher: Association for Computing Machinery
    Pages: 401-410
    Number of pages: 10
    ISBN (Electronic): 9781450362566
    Publication status: Published - 4 Mar 2019
    Event: 9th International Conference on Learning Analytics and Knowledge, LAK 2019 - Tempe, United States
    Duration: 4 Mar 2019 to 8 Mar 2019

    Publication series

    Name: ACM International Conference Proceeding Series

    Conference

    Conference: 9th International Conference on Learning Analytics and Knowledge, LAK 2019
    Country/Territory: United States
    City: Tempe
    Period: 4/03/19 to 8/03/19
