Towards feature engineering at scale for data from massive open online courses. arXiv preprint arXiv:1407.5238
Contributor(s): The Pennsylvania State University CiteSeerX Archives
Abstract: We examine the process of engineering features for developing models that improve our understanding of learners' online behavior in MOOCs. Because feature engineering relies so heavily on human insight, we argue that extra effort should be made to engage the crowd for feature proposals and even their operationalization. We show two approaches where we have started to engage the crowd. We also show how features can be evaluated for their relevance in predictive accuracy. When we examined crowd-sourced features in the context of predicting stopout, not only were they nuanced, but they also considered more than one interaction mode between the learner and platform and how the learner was *relatively* performing. We were able to identify different influential features for stop out prediction that depended on whether a learner was in 1 of 4 cohorts defined by their level of engagement with the course discussion forum or wiki. This report is part of a compendium which considers different aspects of MOOC data science and stop out prediction.