Essay Prompt Adherence Dataset

This page distributes essay data for the task of Essay Prompt Adherence Scoring. The data available here include annotated prompt adherence scores and errors for 830 essays from the International Corpus of Learner English (ICLE).

The task of Essay Prompt Adherence Scoring, for which this dataset is intended, is described in:

Isaac Persing and Vincent Ng, Modeling Prompt Adherence in Student Essays, Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2014.

Essay Prompt Adherence Dataset

Essay Prompt Adherence Scores: Includes prompt adherence scores for 830 essays (one essay per line).

Essay Prompt Adherence Errors: Includes prompt adherence error type annotations for 830 essays (one essay per line).

Cross Validation Folds: Includes lists of the essays in each of the five folds used for cross validation in the experiments described in Modeling Prompt Adherence in Student Essays.

Thesis Clarity Keyword Features: Describes the thesis clarity keyword features discussed in section 4.3 of Modeling Thesis Clarity in Student Essays.

Prompt Adherence Keyword Features: Describes the prompt adherence keyword features discussed in section 4.3 of Modeling Prompt Adherence in Student Essays.

LDA Data: Includes data used to generate LDA Topic features as discussed in section 4.3 of Modeling Prompt Adherence in Student Essays.

Manually Annotated LDA Data: Includes data used to generate Manually Annotated LDA Topic features as discussed in section 4.3 of Modeling Prompt Adherence in Student Essays.
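
As a rough illustration of how the score file and the fold lists might be used together, the sketch below parses score lines and builds the five train/test splits. The line layout (an essay ID followed by a score, separated by whitespace), the essay IDs, and the function names are assumptions made for illustration only; check the downloaded files for their actual format before relying on this.

```python
from typing import Dict, Iterator, List, Tuple


def parse_scores(lines: List[str]) -> Dict[str, float]:
    """Parse score lines into {essay_id: score}.

    Assumes each non-empty line looks like 'ESSAY_ID SCORE'
    (hypothetical layout; the real files may differ).
    """
    scores: Dict[str, float] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        essay_id, score = line.split()[:2]
        scores[essay_id] = float(score)
    return scores


def cross_validation_splits(
    folds: List[List[str]],
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (train_ids, test_ids), holding out each fold exactly once."""
    for i, test_ids in enumerate(folds):
        train_ids = [
            essay_id
            for j, fold in enumerate(folds)
            if j != i
            for essay_id in fold
        ]
        yield train_ids, test_ids


if __name__ == "__main__":
    # Toy data with made-up essay IDs, standing in for the real files.
    scores = parse_scores(["essay_a 3.5", "essay_b 2.0", "essay_c 4.0"])
    folds = [["essay_a"], ["essay_b"], ["essay_c"]]
    for train_ids, test_ids in cross_validation_splits(folds):
        print(len(train_ids), "training essays;", len(test_ids), "test essays")
```

Reusing the distributed fold lists, rather than drawing new random splits, keeps results comparable with the experiments reported in the paper.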

Note that we own only the annotations on the ICLE essay dataset. The essays themselves must be purchased separately.

The creation of this website is based upon work supported in part by National Science Foundation (NSF) Grants IIS-1147644 and IIS-1219142. Any opinions, findings, and conclusions or recommendations expressed above are those of the authors and do not necessarily reflect the views of NSF and should not be interpreted as representing the official policies, either expressed or implied, of any sponsoring institution, the U.S. government or any other entity.