Preprint (not peer-reviewed); preserved in Portico.
Low Inter-rater Reliability of a High Stakes Assessment of Teacher Candidates
Received: 14 August 2021 / Approved: 16 August 2021 / Online: 16 August 2021 (10:51:52 CEST)
A peer-reviewed article of this Preprint also exists.
Journal reference: Educ. Sci. 2021
The Performance Assessment for California Teachers (PACT) is a high-stakes summative assessment designed to measure pre-service teacher readiness. We examined the inter-rater reliability (IRR) of trained PACT evaluators who rated 19 candidates. As measured by Cohen’s weighted kappa, the overall IRR estimate was .17 (poor strength of agreement). IRR estimates ranged from -.29 (worse than expected by chance) to .54 (moderate strength of agreement); all were below the .70 standard for consensus agreement. Follow-up interviews with 10 evaluators revealed possible reasons for the low IRR, such as departures from the established PACT scoring protocol and absent or inconsistent use of a scoring aid document. Evaluators reported difficulty scoring the materials that candidates submitted, particularly with respect to Academic Language. Cognitive Task Analysis (CTA) is suggested as a method to improve IRR in the PACT and in other teacher performance assessments such as the edTPA.
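For readers unfamiliar with the statistic, the abstract's IRR figures can be illustrated with a minimal pure-Python sketch of Cohen's weighted kappa. The rater scores below are hypothetical, not the study's data; the function and variable names are our own illustration.

```python
# Illustrative sketch (hypothetical scores, NOT the study's data):
# Cohen's weighted kappa for two raters on an ordinal rubric.

def weighted_kappa(r1, r2, categories, weights="linear"):
    """Cohen's weighted kappa; weights is 'linear' or 'quadratic'."""
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    n = len(r1)
    # Observed joint proportions for each (rater1, rater2) score pair.
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        obs[idx[a]][idx[b]] += 1.0 / n
    # Chance-expected proportions from each rater's marginal distribution.
    m1 = [sum(row) for row in obs]
    m2 = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    exp = [[m1[i] * m2[j] for j in range(k)] for i in range(k)]
    # Disagreement weights: 0 on the diagonal, growing with score distance.
    def w(i, j):
        d = abs(i - j) / (k - 1)
        return d * d if weights == "quadratic" else d
    num = sum(w(i, j) * obs[i][j] for i in range(k) for j in range(k))
    den = sum(w(i, j) * exp[i][j] for i in range(k) for j in range(k))
    return 1.0 - num / den

# Two hypothetical evaluators scoring ten candidates on a 1-4 rubric.
rater_a = [2, 3, 3, 1, 4, 2, 3, 2, 1, 3]
rater_b = [2, 2, 3, 2, 4, 3, 3, 1, 1, 2]
kappa = weighted_kappa(rater_a, rater_b, [1, 2, 3, 4], weights="quadratic")
print(f"Weighted kappa: {kappa:.2f}")
```

Quadratic weights penalize large score disagreements more than adjacent ones, which is why they are commonly chosen for ordinal rubric data; kappa of 1 indicates perfect agreement, 0 chance-level agreement, and negative values agreement worse than chance.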
Keywords: Inter-rater reliability; preservice teacher performance assessment; PACT; edTPA; weighted kappa; cognitive task analysis; qualitative; quantitative
SOCIAL SCIENCES, Education Studies
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.