Standard 19: Implementation of the Teaching Performance Assessment (TPA): Assessor Qualifications, Training, and Scoring Reliability

The teacher preparation program establishes selection criteria for assessors of candidate responses to the TPA. The selection criteria include but are not limited to pedagogical expertise in the content areas assessed within the TPA. 

Trained assessors who have substantial content area and pedagogical expertise score all FAST tasks. The FAST pool of assessors is limited to KSOEHD education faculty, Single Subject content faculty from many departments throughout the University, supervising teachers, field supervisors and local BTSA directors.

The program provides assessor training and/or facilitates assessor access to training in the specific TPA model(s) used by the program. The program selects assessors who meet the established selection criteria and uses only assessors who successfully complete the required TPA model assessor training sequence and who have demonstrated initial calibration to score candidate TPA responses. 

Training modules have been developed for each of the four FAST assessment tasks. All scorers (including those who double-score for statistical purposes) must attend scorer training and calibrate for the particular task they score. They must repeat that training if the task or rubric changes, if their ratings are determined to be unfair or biased, or if they cannot recalibrate during the annual recalibration period. Two Kremen School of Education faculty members conduct all training. The FAST Coordinator maintains a list of task-specific trained and calibrated scorers.

The program periodically reviews the performance of assessors to assure consistency, accuracy, and fairness to candidates within the TPA process, and provides recalibration opportunities for assessors whose performance indicates they are not providing accurate, consistent, and/or fair scores for candidate responses. 

As approved by CCTC, FAST assessment tasks and scoring patterns are reviewed on a two-year cycle. This cycle assures that scores from every task are carefully scrutinized every two years and that an in-depth analysis is conducted to ensure accurate, consistent, and fair scoring. An annual Equity Analysis and Reliability Report focuses on two tasks and on the degree of alignment among scorers, with particular attention to any pass/fail discrepancies and to the influence of gender, ethnic group affiliation, and self-identified level of English language fluency on the scoring of those tasks.
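The alignment check described above can be illustrated with a minimal Python sketch. It is not the program's actual analysis: the `DoubleScore` record, the `agreement_summary` function, and the passing cutoff of 2 on the four-level rubric are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumption: a hypothetical passing cutoff on the four-level rubric.
PASSING_SCORE = 2


@dataclass(frozen=True)
class DoubleScore:
    """One double-scored response (hypothetical record layout)."""
    candidate_id: str
    first: int   # score from the original scorer, 1-4
    second: int  # score from the double scorer, 1-4


def agreement_summary(pairs: List[DoubleScore]) -> Dict[str, object]:
    """Summarize scorer alignment for a set of double-scored responses."""
    if not pairs:
        raise ValueError("at least one double-scored response is required")
    n = len(pairs)
    # Exact agreement: both scorers awarded the same rubric level.
    exact = sum(p.first == p.second for p in pairs)
    # Adjacent agreement: scores differ by at most one rubric level.
    adjacent = sum(abs(p.first - p.second) <= 1 for p in pairs)
    # Pass/fail discrepancy: one score passes and the other does not.
    conflicts = [
        p.candidate_id
        for p in pairs
        if (p.first >= PASSING_SCORE) != (p.second >= PASSING_SCORE)
    ]
    return {
        "exact_agreement": exact / n,
        "adjacent_agreement": adjacent / n,
        "pass_fail_discrepancies": conflicts,
    }
```

Running the same summary separately for each candidate subgroup would be one way to look for the equity effects the report examines.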

Two-Year Review Cycle of Task Review

Semester/Phase        | Task                              | 08-09        | 09-10        | 10-11        | 11-12
SS Sem. 1/MS Phase 1  | Comprehensive Lesson Plan Project | Double Score | Review       | Double Score | Review
SS Sem. 1/MS Phase 2  | Site Visitation                   | Review       | Double Score | Review       | Double Score
SS Sem. 2/MS Phase 3  | Teaching Sample Project           | Double Score | Review       | Double Score | Review
SS Sem. 2/MS Phase 3  | Holistic Proficiency              | Review       | Double Score | Review       | Double Score

The program complies with the assessor recalibration policies and activities specific to each approved TPA model, including but not limited to at least annual recalibration for all assessors, and uses and retains only TPA assessors who consistently maintain their status as qualified, calibrated, program-sponsored assessors. The program monitors score reliability through a double-scoring process applied to at least 15% of TPA candidate responses. 

The recalibration of assessors occurs annually. For the Comprehensive Lesson Plan Project, assessors are recalibrated at the scoring session, just prior to the actual scoring. Recalibration on the other three tasks occurs at task-specific recalibration seminars scheduled prior to scoring sessions or, most recently, by electronic submission. In all situations, scorers score a common example of student work using the task-specific rubric. These scores are checked by one of the two FAST trainers. If scorers do not initially recalibrate, they must attend formal scorer training and calibrate before they can score student work. The date of each scorer’s most recent calibration is maintained in an assessor database.
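As an illustration only, a check of calibration dates might look like the Python sketch below. The record layout, the `eligible_scorers` function, and the reading of "annually" as "within the past 365 days" are assumptions, not a description of the actual assessor database.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional

# Assumption: "annual" is read here as "within the past 365 days".
RECALIBRATION_WINDOW = timedelta(days=365)


def eligible_scorers(
    last_calibrated: Dict[str, Optional[date]], today: date
) -> List[str]:
    """Return scorers whose most recent calibration is still current.

    `last_calibrated` maps a scorer name to the date of that scorer's most
    recent successful calibration, or None if the scorer has never calibrated.
    """
    return sorted(
        name
        for name, when in last_calibrated.items()
        if when is not None
        and timedelta(0) <= today - when <= RECALIBRATION_WINDOW
    )
```

Scorers missing from the returned list would need to recalibrate, or attend formal training, before scoring.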

The FAST system has established a bi-annual training, recalibration, and double-scoring schedule for each task in the FAST system. Based on a random sampling by the FAST Coordinator, fifteen to twenty percent of Single Subject Program and Multiple Subject Program candidate responses on each of the four tasks are double scored to evaluate inter-rater reliability. Scorers must initially attend scorer training and calibrate to the rubric in order to score; annual recalibration then allows them to continue scoring candidates’ work. The majority of FAST scorers are faculty fieldwork supervisors, and because scoring is considered a responsibility of supervising student fieldwork, there is a professional impetus to be trained and to maintain one’s calibration. Both the Single Subject and Multiple Subject Programs have consistently high levels of reliability on all four tasks, due in part to the discrete nature of the scoring rubrics and to a highly skilled and consistent scorer pool.
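The random selection of responses for double scoring can be sketched in Python as follows. This is a minimal illustration, not the Coordinator's actual procedure: the function name, the `percent` and `seed` parameters, and the choice to round the sample size up are assumptions.

```python
import random
from typing import Iterable, List, Optional


def select_for_double_scoring(
    candidate_ids: Iterable[str], percent: int = 15, seed: Optional[int] = None
) -> List[str]:
    """Randomly pick roughly `percent` percent of responses for double scoring.

    The sample size is rounded up, so the share selected never falls below
    `percent` percent of the pool.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ids = list(candidate_ids)
    if not ids:
        return []
    # Integer ceiling of len(ids) * percent / 100; avoids floating-point rounding.
    sample_size = (len(ids) * percent + 99) // 100
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return sorted(rng.sample(ids, sample_size))
```

Rounding up is a deliberate choice here: with a small pool, rounding down could leave the sample under the 15% minimum set by the standard.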

The program establishes and maintains policies and procedures to assure the privacy of assessors as well as of information about assessor scoring reliability. In addition, the program maintains the security of assessor training materials and protocols in the event that the program uses its own assessors (such as, for example, a designated Lead Assessor) to provide local assessor training.

The FAST Coordinator assures the privacy of assessors and of information about assessor scoring reliability by maintaining records of each trained and calibrated assessor by task. Only those who have been trained and calibrated are allowed to score candidate responses to the tasks. Records of assessor scoring patterns are maintained in the Excel database that records all scores earned by each candidate in the program. Periodic checks are made of assessor scoring patterns. Access to aggregated candidate information and to scorer records is limited to the FAST Coordinators, a designated staff member responsible for data entry, and to the Dean, Associate Dean, Department Chairpersons, and credential program coordinators of the KSOEHD on a need-to-know basis and upon request.

Two designated FAST trainers conduct assessor training and maintain its continuity through scripted training modules for each of the four tasks. The script for each task is designed to be applicable regardless of the educational role of the scorer. Apart from minor, albeit important, programmatic modifications in training procedures and materials for each task, all university faculty, field supervisors, supervising teachers, and BTSA directors scoring or double scoring the same task receive the same scorer training.

All training modules include the same basic training elements: assessor guidelines, bias training, and calibration to the rubric. Specifically, trainers provide an overview of the Project using the FAST Manual. Assessor guidelines, such as the need for confidentiality, reliance on the rubric, and a caution against being swayed by a candidate’s writing skill, are reviewed. A discussion of scorer bias follows, covering how scorers must remain aware of their own personal biases and how those biases can influence scoring. The training then focuses on the rubric, having scorers identify specifically what elements of the TPE are required and highlighting the qualitative descriptors that differentiate the four levels of performance.

Once this is completed, scorers work in groups of three or four to reach a consensus score for each TPE evaluated by the task, using a piece of real student work. Group consensus scores are then compared to the actual scores awarded by experienced scorers. Once the group is in alignment with the actual score, each individual calibrates to the rubric by independently scoring a candidate’s actual Task response. Scorers who calibrate are allowed to score student work; scorers who do not calibrate return to a rubric discussion group with the Trainer and repeat the highlighting, consensus, and independent-scoring steps. Materials are uniform and maintained in a secure location.

Such uniformity in scorer training enhances reliability and provides for input from an array of expert scorers with multiple pedagogical perspectives.
