Object: Automated essay scoring refers to the computer techniques and algorithms that evaluate and score essays automatically. Compared with human raters, automated essay scoring offers the advantages of fairness, lower human resource cost, and timely feedback. In previous work, automated essay scoring has been regarded as a classification or regression problem, and machine learning techniques such as K-nearest-neighbor (KNN) and multiple linear regression have been applied to solve it. In this paper, we regard it as a ranking problem and apply a new machine learning method, learning to rank, to solve it. We describe in detail the steps for applying learning to rank to automated essay scoring, such as feature extraction and scoring. Experiments in this paper show that learning to rank outperforms other classical machine learning techniques in automated essay scoring.
Keywords-Automated essay scoring; Learning to rank; Feature extraction; Machine learning
Automated Essay Scoring (AES) is defined as the computer techniques and algorithms that evaluate and score essays automatically. In this paper, we mainly discuss automated essay scoring of English essays. In recent years, with the growing need for essay scoring in large-scale English tests and in the teaching of English writing skills, automated essay scoring has become a hot topic in natural language processing research. Large-scale English tests such as GRE, GMAT, and TOEFL are now widely administered around the world. On the one hand, essay scoring in such tests consumes substantial human resources, yet its efficiency is low. On the other hand, the score given by a human rater is largely determined by the rater’s personal will, emotion, and energy. An essay scored highly by one rater may receive a low score from another, and even the same rater may give different scores to the same essay at different times. Thus, the fairness of essay scoring cannot be guaranteed. Moreover, automated essay scoring is also needed in the teaching of English writing skills. It is usually challenging for one teacher to score all student essays in a short time. As a result, students cannot get timely feedback on their essays, which makes it hard for them to improve their writing skills. Against this background, researchers proposed automated essay scoring techniques, which have the advantages of fairness, lower human resource cost, and timely feedback.
In general, automated essay scoring is a machine learning problem. More specifically, it is a supervised learning problem: scored essays can be seen as labeled training data, and unscored essays as unlabeled test data. The main process is to learn a scoring function or model from the training data and then use it to score the essays in the test data. Previous solutions fall into two main categories: classification and regression. When automated essay scoring is regarded as a classification problem, the score is treated as a class label, and classical classification algorithms such as KNN are applied. When it is regarded as a regression problem, the score is treated as a comparable numeric value, and classical regression algorithms such as multiple linear regression are applied.
In this paper, we regard automated essay scoring as a ranking problem and propose to solve it with learning to rank algorithms. Learning to rank is a family of supervised learning algorithms that automatically construct a ranking model or function to rank objects, such as retrieved documents. Its major advantage is its flexibility in incorporating diverse kinds of features into the ranking process.
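To make the difference between the classification view and the ranking view concrete, the following minimal Python sketch runs both on toy data. Everything in it is an illustrative assumption rather than the method of this paper: the two features, the data values, the function names, the pairwise formulation of ranking, and the simple perceptron used as a stand-in for a real learning to rank algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: (feature vector, human score). The two features are
# placeholders (say, normalized length and vocabulary richness), not the
# feature set used in this paper.
TRAIN = [
    ((0.2, 0.1), 1),
    ((0.3, 0.2), 1),
    ((0.5, 0.5), 2),
    ((0.6, 0.4), 2),
    ((0.9, 0.8), 3),
    ((0.8, 0.9), 3),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_classify(features, train, k=3):
    """Classification view: the score is a class label.

    Predict the majority score among the k nearest training essays.
    """
    nearest = sorted(train, key=lambda item: squared_distance(features, item[0]))[:k]
    votes = Counter(score for _, score in nearest)
    return votes.most_common(1)[0][0]


def preference_pairs(train):
    """Ranking view (pairwise, assumed here): keep only which essay of a
    pair was scored higher.

    Returns feature differences, higher-scored essay minus lower-scored one.
    """
    pairs = []
    for (fa, sa), (fb, sb) in combinations(train, 2):
        if sa == sb:
            continue  # equal scores express no preference
        better, worse = (fa, fb) if sa > sb else (fb, fa)
        pairs.append(tuple(b - w for b, w in zip(better, worse)))
    return pairs


def train_pairwise_perceptron(pairs, epochs=10):
    """Learn weights w such that w . diff > 0 for each preference pair.

    A plain perceptron update, used only as a simple stand-in.
    """
    w = [0.0] * len(pairs[0])
    for _ in range(epochs):
        for diff in pairs:
            if sum(wi * di for wi, di in zip(w, diff)) <= 0:
                w = [wi + di for wi, di in zip(w, diff)]
    return w


def rank_value(features, w):
    """Linear ranking function: a larger value means a better-ranked essay."""
    return sum(wi * xi for wi, xi in zip(w, features))


if __name__ == "__main__":
    print("KNN label:", knn_classify((0.85, 0.85), TRAIN))
    weights = train_pairwise_perceptron(preference_pairs(TRAIN))
    ordered = sorted(TRAIN, key=lambda item: rank_value(item[0], weights), reverse=True)
    print("Essays ranked best to worst:", ordered)
```

The classifier predicts a score label directly, whereas the ranking model only learns an ordering over essays. Turning that ordering back into a score is a separate step that this sketch does not cover.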
The major contributions of this paper are two-fold....