This Year’s STAAR Tests in Texas Will Use Computer Grading for Written Responses


Automated Scoring Engine for STAAR Exams Saves Texas Education Agency Millions

The Texas Education Agency (TEA) is implementing a new automated scoring engine for the State of Texas Assessments of Academic Readiness (STAAR) exams, which will allow written answers on open-ended questions to be graded automatically by computers. This move is expected to save the TEA an estimated $15–20 million per year that would otherwise have been spent hiring human scorers through a third-party contractor.

The decision to introduce this automated scoring system follows the redesign of the STAAR test in 2023. The test now includes fewer multiple-choice questions and more open-ended questions, known as constructed response items. Grading these constructed responses, however, was time-consuming and required a large number of temporary scorers. In 2023, the TEA hired approximately 6,000 temporary scorers; with the new automated scoring process, it expects to need fewer than 2,000 this year.

To develop the scoring system, the TEA collected 3,000 responses that underwent two rounds of human scoring. The automated scoring engine learns the characteristics of responses from this field sample and is programmed to assign the same scores that a human would have given. During the current testing period, the computer will first grade all the constructed responses. Then, a quarter of the responses will be rescored by humans. The computer will automatically reassign responses to human scorers if it has “low confidence” in the score it assigned or if it encounters a response that uses slang or words in a language other than English. Additionally, a random sample of responses will be handed off to humans to verify the computer’s work.
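The routing rules described above can be sketched in code. Everything below is a hypothetical illustration of the described workflow, not the TEA's actual system: the function name, the confidence threshold, and the audit rate are all assumptions invented for clarity (the article specifies only that "a quarter" of responses are rescored by humans).

```python
import random

CONFIDENCE_THRESHOLD = 0.8  # hypothetical cutoff for "low confidence"
AUDIT_RATE = 0.25           # roughly a quarter of responses rescored by humans

def route_response(auto_score, confidence, flagged_unusual, rng=random):
    """Decide whether a machine-scored response also goes to a human.

    Returns ("human", reason) if the response must be rescored by a person,
    or ("auto", auto_score) if the machine's score stands.
    """
    if flagged_unusual:
        # Responses using slang or words in another language go to humans.
        return ("human", "unusual language")
    if confidence < CONFIDENCE_THRESHOLD:
        # The engine defers to humans when it has low confidence.
        return ("human", "low confidence")
    if rng.random() < AUDIT_RATE:
        # A random sample is handed off to verify the computer's work.
        return ("human", "random audit")
    return ("auto", auto_score)
```

The key design point mirrored here is that the machine never has the final word on hard cases: any unusual or low-confidence response, plus a random slice of the rest, is re-checked by a person.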

It is important to note that despite similarities to artificial intelligence chatbots like GPT-4, TEA officials do not consider the scoring engine to be artificial intelligence. The system does not “learn” from responses and always defers to its original programming set up by the state.

The move to automated scoring reflects a growing trend in the education sector to use technology and artificial intelligence to improve efficiency and reduce costs. With advances in natural language processing, automated scoring engines have become increasingly reliable at assessing open-ended responses. By adopting this technology, the TEA is streamlining the grading process, allowing for faster feedback to students and making it more efficient for the state agency to evaluate Texas schools.

One of the primary advantages of automated scoring is its ability to provide consistent and standardized grading. Human scoring can be subjective, influenced by individual biases or fatigue. Automated scoring engines, on the other hand, follow pre-determined criteria and algorithms consistently, ensuring fairness in evaluating students’ responses. This consistency is particularly crucial in high-stakes examinations like the STAAR test, where the results have significant implications for students, teachers, and schools.

Another benefit of automated scoring is its potential to provide faster feedback to students. With traditional manual scoring, it can take weeks or even months for students to receive their results. With automated scoring, results can be processed and delivered almost immediately after the exam is completed. This not only allows students to gauge their performance but also enables teachers to identify areas for improvement and provide targeted support promptly.

While automated scoring offers several advantages, it is essential to address potential concerns and limitations. One common concern is the reliance on artificial intelligence and its potential biases. Critics argue that AI systems may perpetuate existing biases in grading, potentially disadvantaging certain student groups. It is therefore crucial that automated scoring engines be regularly audited, monitored, and refined to minimize bias. The TEA’s decision to manually rescore a subset of responses demonstrates its commitment to ensuring the accuracy of the automated system.

Additionally, some argue that automated scoring may not capture the full complexity of students’ responses. Open-ended questions often require critical thinking, creativity, and nuanced arguments, which can be difficult for a machine to assess accurately. However, automated scoring engines have significantly improved in recent years, employing sophisticated algorithms and natural language processing techniques to evaluate responses more effectively. The TEA’s field sample and human rescoring process further enhance the system’s accuracy and reliability.

The introduction of the automated scoring engine for STAAR exams in Texas is a significant step towards a more efficient and cost-effective evaluation process. By leveraging technology, the TEA is reducing the burdensome task of manual scoring, reallocating resources, and providing timely feedback to students. As with any technological innovation, continuous monitoring and improvement are crucial to address any potential biases or limitations. Nonetheless, automated scoring holds great promise in enhancing the assessment process in education and ensuring a fair and objective evaluation of students’ knowledge and skills.

