Preprint
Article

This version is not peer-reviewed.

Low Inter-rater Reliability of a High Stakes Assessment of Teacher Candidates

A peer-reviewed version of this preprint has also been published.

Submitted:

14 August 2021

Posted:

16 August 2021


Abstract
The Performance Assessment for California Teachers (PACT) is a high stakes summative assessment that was designed to measure pre-service teacher readiness. We examined the inter-rater reliability (IRR) of trained PACT evaluators who rated 19 candidates. As measured by Cohen’s weighted kappa, the overall IRR estimate was .17 (poor strength of agreement). IRR estimates ranged from -.29 (worse than expected by chance) to .54 (moderate strength of agreement); all were below the standard of .70 for consensus agreement. Follow up interviews of 10 evaluators revealed possible reasons we observed low IRR, such as departures from established PACT scoring protocol, and lack of, or inconsistent, use of a scoring aid document. Evaluators reported difficulties scoring the materials that candidates submitted, particularly the use of Academic Language. Cognitive Task Analysis (CTA) is suggested as a method to improve IRR in the PACT and other teacher performance assessments such as the edTPA.
Subject: 
Social Sciences - Education
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and the preprint are cited in any reuse.