7 August 2026
HUFLIT University
Asia/Ho_Chi_Minh timezone

ChatGPT-5.0 vs Human Scoring in IELTS Writing Task 2: A Comparison Across Prompt Designs and Criteria

Not scheduled
20m
Main Conference Hall (HUFLIT University)

828 Sư Vạn Hạnh street, Hòa Hưng ward, Hồ Chí Minh city, Vietnam
Technology and Digital Support for ESL Development

Speaker

Thanh Trinh (HCMC University of Technology and Engineering)

Description

This study explores whether different prompt designs produce significantly different ChatGPT-5.0 scores for IELTS Writing Task 2 essays, and examines the extent to which those scores align with human ratings. Using a dataset of 56 essays, scores generated under two prompt designs (with and without calibration examples) were compared with each other and with a human benchmark derived from multiple raters. Scores were analyzed across the four IELTS criteria (Task Response, Coherence and Cohesion, Lexical Resource, and Grammatical Range and Accuracy) as well as at the overall level, using descriptive statistics, repeated-measures ANOVA, Pearson correlation, and intraclass correlation coefficients (ICC). The findings revealed that prompt design, i.e., the inclusion of calibration examples, influenced ChatGPT’s automated essay scoring: scores differed significantly between the two prompting conditions. While ChatGPT and human scores did not differ significantly at the overall level, systematic discrepancies were detected for three of the four marking criteria, the exception being Task Response. Specifically, ChatGPT tended to assign higher scores for Coherence and Cohesion and lower scores for Lexical Resource and Grammatical Range and Accuracy. Pearson correlation analyses found moderate to strong relationships between ChatGPT and human scores, suggesting that ChatGPT could rank writing performance reliably, whereas lower intraclass correlation coefficients indicated weaker agreement in absolute scores.
Keywords: ChatGPT-5.0, IELTS Writing, reliability
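
The abstract's contrast between Pearson correlation (agreement in ranking) and ICC (agreement in absolute scores) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the band scores are made-up toy values, not data from the study, and the ICC form shown (two-way random effects, absolute agreement, single rater, often written ICC(2,1)) is one common choice that the abstract does not specify.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per essay, each holding one score
    per rater (assumed form; the study may have used a different ICC).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between essays
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical band scores: the model rates every essay exactly one band higher.
human = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5]
model = [h + 1.0 for h in human]

r = pearson(human, model)
icc = icc_2_1([[h, m] for h, m in zip(human, model)])
print(f"Pearson r = {r:.3f}, ICC(2,1) = {icc:.3f}")
```

In this toy case the ranking of essays is identical, so Pearson r is 1.0, while the constant one-band offset lowers ICC(2,1) to about 0.64. That is the pattern the abstract describes: a scorer can order essays consistently with human raters and still disagree with them on the absolute band.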

Author

Thanh Trinh (HCMC University of Technology and Engineering)
