The assessment of the results of a massive open online course using Data Mining methods

Information and Telecommunications Technologies in Education

The paper presents an analysis of grade reports from five sessions of the massive open online course “Data Management” at For our research, we used clustering and classification in the R programming environment. Clustering revealed four groups of course participants, each with similar course results. These clusters were similar across all five sessions we analyzed. We also showed that it is possible to predict whether a participant will complete the course or drop out, based on test results from the first half of the course. Course lecturers can use these results to plan measures for retaining students. This type of analysis also helps to understand why students drop out, and lecturers can take these reasons into account when modifying the course structure and learning content. This new knowledge about the course participants can be applied in subsequent course sessions. We expect the clustering results to be similar for other courses with a similar structure. The approach used in the paper to predict whether a student drops out or completes the course is also applicable to other courses.
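As a rough illustration of the two steps described above, the following minimal Python sketch clusters grade-report rows into four groups and then fits a simple rule that predicts completion from first-half test results. It is not the authors' R code. The data are synthetic and generated inside the script, and the number of tests, the score ranges, and all function names are assumptions made for this example. The one-feature threshold rule is a simple stand-in for whichever classifier was actually used.

```python
import random

N_TESTS = 6          # hypothetical number of graded tests per session
HALF = N_TESTS // 2  # tests available by the midpoint of the course


def synthetic_participant(rng, completes):
    """Return one made-up grade-report row (scores 0-100, 0 = not attempted)."""
    if completes:
        return [rng.uniform(55, 100) for _ in range(N_TESTS)]
    taken = rng.randint(0, HALF)  # tests attempted before dropping out
    return [rng.uniform(20, 80) if i < taken else 0.0 for i in range(N_TESTS)]


def sq_dist(a, b):
    """Squared Euclidean distance between two score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=100, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each row goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(r, centroids[c]))
                      for r in rows]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def fit_threshold(features, completed):
    """Pick the cut-off on one feature that maximises training accuracy."""
    def accuracy(t):
        hits = sum((f >= t) == y for f, y in zip(features, completed))
        return hits / len(features)
    best = max(sorted(set(features)), key=accuracy)
    return best, accuracy(best)


# Synthetic data only: 200 made-up participants, half of whom complete.
rng = random.Random(42)
completed = [i % 2 == 0 for i in range(200)]
rows = [synthetic_participant(rng, y) for y in completed]

# Step 1: cluster full grade reports into four groups.
centroids, labels = kmeans(rows, k=4)
for c, centre in enumerate(centroids):
    print(c, labels.count(c), [round(v) for v in centre])

# Step 2: predict completion from the mean score on first-half tests.
first_half_mean = [sum(r[:HALF]) / HALF for r in rows]
threshold, train_acc = fit_threshold(first_half_mean, completed)
print(f"predict 'completes' if first-half mean >= {threshold:.1f} "
      f"(training accuracy {train_acc:.2f})")
```

The choice of k = 4 simply mirrors the four groups reported above. On real grade reports, the number of clusters and the classifier would need to be chosen and validated on held-out sessions rather than on the training data used here.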