
Holistic Scoring Guides, Part 2

A reader has requested a deeper look at holistic rubric scoring. This year, Measured Progress scored the Massachusetts Comprehensive Assessment System (MCAS) test, which uses a holistic system to score all of its mathematics items. We will use two MCAS math items and their rubrics (released in 1999) to look more closely at holistic math scoring.

After reading about the “points-for-parts” system of scoring mathematics, many of you may be wondering how mathematics can be scored any way other than quantitatively. When I first encountered holistic scoring within such a quantitative content area, I asked the same question. However, after using the holistic system extensively, I can help you understand both the method and the advantages.

We have two four-point constructed-response items from the MCAS grade eight test, both available on the Massachusetts Department of Education Web site: Mix and Match Clues (Question 8) and Pendulum (Question 23). Question 23 is accompanied by a table and graphic. (You might find it helpful to print out the items, graphics, and sample student responses, and look at the material as we refer to it throughout the article.) The first item, Mix and Match Clues, gives students clues labeled A through E and requires them to provide numerical answers that fit certain clues, as an indication of their understanding of various concepts of number theory.

The rubric indicates that to receive a 4-score on this item, the student must demonstrate a “comprehensive understanding” of number theory and problem solving. What does that mean? What constitutes a comprehensive understanding, as opposed to a general understanding (at the 3-score level) or a basic or minimal understanding at lower score levels? These descriptor words found in the rubric are the most important part of holistic scoring; they provide us with a framework in which to score. However, we need to begin somewhere in defining these words in terms of the student responses that fall into each category.

The best way to “anchor” a solid understanding of these descriptors when doing holistic scoring is to begin by viewing student responses (referred to as anchor papers). Please note that hundreds of student responses are considered before any are chosen to represent each score-point group, in a process called “benchmarking.” Eventually, this process produces samples of student work that are used to train scorers to take the proper holistic view of each item and to sort the responses into qualitative categories.

The first step, however, is to draw the lines that distinguish the “comprehensive” understanding shown in a 4-score response, the “general” understanding of a 3-score, the “basic” understanding of a 2-score, and the “minimal” understanding of a 1-score. During training sessions, scorers view all of the student responses in succession and compare how well each addresses the standard and the required parts of the item. Clear differences in the quality of the work are evident.

Unlike points-for-parts scoring, holistic scoring does not hold us to strict guidelines that award points for particular material the student presents. It is more effective to get a sense of the complete response as the student has written it. After reading through the response, a holistic scorer must “take a step back” and view it as a whole. Does the student's work show complete (or, in the words of this rubric, “comprehensive”) understanding of the standard being assessed? Once a scorer is comfortable with the score given to each response, he or she returns to viewing the student responses separately, noting any errors or weaknesses that might change the holistic score. (A benefit of holistic scoring is that each of the qualities “comprehensive,” “general,” “basic,” and “minimal” allows for a range of responses without disrupting the score point given.)

Scoring annotations for Mix and Match Clues are provided below.

Mix and Match Clues

Score Point 4: The student shows exemplary, or comprehensive, understanding of the concepts of number theory (divisibility and prime numbers) by accurately answering all parts of the prompt. The student consistently chooses a correct solution that is between 150 and 200 and demonstrates a thorough understanding of prime numbers. 

Score Point 3: The student provides a correct solution for parts a through c, which deal with divisibility. (Please note that the question does not ask the students to prove divisibility.) Understanding of prime numbers in part d is not clear; the student has stated it is “not possible,” but the explanation is flawed. Because of this flaw, the response shows only a general understanding and merits a score of 3.

Score Point 2: This response shows mixed evidence of understanding the concepts involved. In the first three parts, the student consistently includes the numbers 150 and 200, although the prompt specifically asked for numbers greater than 150 and less than 200. While the solutions provided by the student are otherwise correct, the inclusion of these numbers shows some lack of understanding. In part d, the student shows a weak understanding of prime numbers. This evidence of basic, or partial, understanding merits a 2-score.

Score Point 1: This response shows only a minimal understanding of number theory. All of the solutions provided for parts a, b, and c are incorrect. The definition of a prime number in part d is enough evidence of understanding prime numbers to warrant a score of 1. 
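
For readers who like to check answers by machine, the boundary issue noted under Score Point 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the actual clues are not reproduced in this article, so the divisor used below and the function names are hypothetical. The sketch shows why 150 and 200 do not satisfy “greater than 150 and less than 200,” and includes a simple primality test of the kind part d touches on.

```python
def strictly_between(n, low=150, high=200):
    """Return True when n is greater than low and less than high.

    The endpoints themselves are excluded, matching the wording
    "greater than 150 and less than 200" quoted in the annotations.
    """
    return low < n < high


def is_prime(n):
    """Return True when n is a prime number (simple trial division)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def candidates(divisor, low=150, high=200):
    """List the numbers strictly between low and high divisible by divisor.

    The divisor is a hypothetical stand-in for a clue; the real clues
    are not given in this article.
    """
    return [n for n in range(low + 1, high) if n % divisor == 0]


# The endpoints fail the strict test, so a response that lists 150 or 200
# has not met the stated condition even if the divisibility is right.
print(strictly_between(150), strictly_between(200), strictly_between(175))
print(candidates(25))   # hypothetical clue: divisible by 25
print(is_prime(151))
```

A check like this only confirms whether an individual answer is correct. It does not replace the holistic judgment described above, which weighs the response as a whole.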

The procedure we followed for Mix and Match Clues can be used to assess any mathematics item that has a prepared holistic rubric. Take this opportunity to follow the same procedure for the Pendulum item. Scoring annotations for this item are provided below.

Pendulum 

Score Point 4: This response shows an exemplary, or comprehensive, understanding of the concepts involved by correctly answering all parts of the question. The graph is correctly labeled and plotted, the verbal description in part c is clear, and the equation in part d is correct. 

Score Point 3: This response shows a good general understanding of the concepts involved, but there is a flaw that reduces this to a 3-score paper. In part c, the student specifically describes the calculation for a time of 12 seconds instead of generalizing the statement as requested. All other parts are correct, and while part c is correct at 12 seconds, it is not what students were asked to do. As per the rubric, the student shows general understanding of the pattern by identifying extensions and describing the pattern, with an error in part c. 

Score Point 2: This response shows an overall partial, or basic, understanding of the concepts involved. The graph in part b is not properly labeled on either axis; therefore, the three points the student plotted have no reference and the graph cannot be evaluated. The student's demonstrated inability to describe this pattern graphically shows mixed evidence of understanding and merits a 2-score.

Score Point 1: This response shows only minimal evidence of understanding, answering only part b. While the points are plotted correctly, there is no continuous curve. This response does not contain enough other work to merit more than a 1-score.

I hope this article has helped your understanding of holistic scoring. Having students create, refine, and use a holistic scoring guide in the classroom to assess their own work can be a powerful teaching tool. This type of guide may be easier for students to create, and it can prompt discussion of assessment terms such as thorough, complete, partial, general, minimal, and irrelevant in relation to the work they are assessing.

E-mail us any questions you have about scoring so we can address them in future issues.


About the Author
Sarah Gagnon is an acting chief reader at Measured Progress.

Benchmarking: the process of identifying samples of student work representing each score-point group.