A deep-learning-based grading system (ASAG) for reading comprehension assessment by using aphorisms as open-answer-questions

Bibliographic Details
Published in:	Education and information technologies 2024-03, Vol.29 (4), p.4565-4590
Main Authors:	Mardini G, Ivan D; Quintero M, Christian G; Viloria N, César A; Percybrooks B, Winston S; Robles N, Heydy S; Villalba R, Karen
Format:	Article
Language:	English
Subjects:	Computer Appl. in Social and Behavioral Sciences; Computer Science; Computers and Education; Deep learning; Education; Educational Technology; Grading; Information Systems Applications (incl. Internet); Reading comprehension; Skills; Undergraduate Students; User Interfaces and Human Computer Interaction

Description
Summary:	Reading comprehension is considered an essential skill in modern life; higher education students therefore require more specific skills to understand, interpret, and evaluate texts effectively. Short answer questions (SAQs) are a relevant and appropriate tool for assessing reading comprehension skills. Unlike multiple-choice questions, SAQs allow for the assessment of cognitive abilities such as attention, language, perception, and problem solving. However, scoring SAQs is time-consuming and susceptible to ambiguity. Automatic Short Answer Grading (ASAG) is a new paradigm that could help solve these problems. This experimental analysis implements ASAG using several deep-learning-based sentence embedding approaches with a multilayer perceptron regression layer on top, trained on a reading comprehension dataset based on aphorisms. For experimental testing, the available dataset consists of answers given in Spanish by 199 undergraduate students. BERT and Skip-Thought models are tested with different hyperparameters to find the best performance in terms of Pearson correlation coefficient and root mean square error (RMSE) against human experts' grades. The results showed that the BERT model performed better than the other approaches.
ISSN:	1360-2357 1573-7608
DOI:	10.1007/s10639-023-11890-7
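
Note on the evaluation metrics: the Summary reports performance as the Pearson correlation coefficient and RMSE between model scores and human experts' grades. The following is a minimal sketch of those two metrics only, not the authors' evaluation code. It uses only the Python standard library; the function names `pearson` and `rmse` and the sample scores are hypothetical and chosen purely for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, reference):
    """Root mean square error between predicted and reference scores."""
    n = len(predicted)
    if n != len(reference) or n == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = ((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(sum(squared_errors) / n)


if __name__ == "__main__":
    # Hypothetical grades, NOT data from the study.
    human_grades = [1.0, 3.5, 4.0, 2.0, 5.0]
    model_scores = [1.5, 3.0, 4.5, 2.5, 4.5]
    print("Pearson r:", pearson(model_scores, human_grades))
    print("RMSE:", rmse(model_scores, human_grades))
```

A higher Pearson value means the model ranks answers more like the human graders do, while a lower RMSE means its scores are numerically closer to theirs; the two can disagree, which is one reason to report both.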