Organization Dataset
This page distributes essay data for the task of Essay Organization Scoring. The data available here include annotated organization scores for 1,003 essays from the International Corpus of Learner English (ICLE).
The problem of Essay Organization Scoring for which this dataset is intended was described in:
Essay Scoring Dataset
- Essay Scores: Includes organization scores for 1,003 essays (one per line). Some essays were annotated by multiple graders; in these cases, the organization scores are separated by commas. Although multiple organization scores are available for most essays, our experiments used only the first organization score listed for each essay.
- Cross Validation Folds: Includes lists of the essays in each of the five folds used for cross-validation in the experiments described in Modeling Organization in Student Essays.
- Sentence and Paragraph Labeling Heuristics: Describes the rules used in Modeling Organization in Student Essays for heuristically applying labels to paragraphs and sentences.
- Note that we only own the annotations on the ICLE essay dataset. The essays themselves must be purchased here.
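As an illustration of the score convention described above (one essay per line, multiple scores separated by commas, only the first score used), the following Python sketch shows one way the scores might be read. The exact line layout of the Essay Scores file is not specified on this page: the sketch assumes each line holds an essay identifier, whitespace, and then the comma-separated scores. The function names and the sample identifiers `ESSAY_A` and `ESSAY_B` are hypothetical and should be adapted to the actual file.

```python
from typing import Dict, Iterable, List, Tuple


def parse_score_field(field: str) -> List[float]:
    """Split a comma-separated score field into a list of numeric scores."""
    return [float(part) for part in field.split(",") if part.strip()]


def parse_line(line: str) -> Tuple[str, List[float]]:
    """Parse one line of the score file.

    ASSUMPTION: the line layout is '<essay_id> <score>[,<score>...]'.
    This layout is a guess for illustration, not taken from the dataset.
    """
    essay_id, field = line.split(maxsplit=1)
    return essay_id, parse_score_field(field)


def first_scores(lines: Iterable[str]) -> Dict[str, float]:
    """Map each essay to the first organization score listed for it."""
    result: Dict[str, float] = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        essay_id, scores = parse_line(line)
        result[essay_id] = scores[0]
    return result


# Hypothetical sample lines; the real identifiers and scores will differ.
sample = ["ESSAY_A 3.0,3.5", "ESSAY_B 2.5"]
print(first_scores(sample))  # {'ESSAY_A': 3.0, 'ESSAY_B': 2.5}
```

To use it on the downloaded file, the lines of an open file handle could be passed directly to `first_scores`, provided the layout assumption holds.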
The creation of this website is based upon work supported in part by National Science Foundation (NSF) Grant IIS-0812261. Any opinions, findings, and conclusions or recommendations expressed above are those of the authors and do not necessarily reflect the views of NSF and should not be interpreted as representing the official policies, either expressed or implied, of any sponsoring institution, the U.S. government or any other entity.