The Standards Observation Form: Feedback to teachers on classroom implementation of the Standards

Stonewater, Jerry K

This article describes the Standards Observation Form, a new instrument developed to assess the degree to which teaching performance is consistent with the criteria of the NCTM Standards. The instrument is described, two variations of the instrument are presented, and portions of actual teacher assessments are illustrated as examples of the kinds of data that can be obtained when the Standards Observation Form is used.

NCTM’s Curriculum and Evaluation Standards for School Mathematics (1989) challenges mathematics teachers to change the focus of their teaching. Standards-based teaching is grounded in content based on worthwhile mathematical tasks that enhance reasoning and problem solving, in student and teacher discourse that creates an environment in which mathematics is thought about and communicated in a collaborative way, in appropriate tools for such mathematics learning and an environment that supports it, and in a continuous analysis of the teaching and learning process. Helping teachers implement these changes is complex and multifaceted, and includes such activities as inservice training to model effective and perhaps new teaching strategies, discussions and planning for revised curricula and classroom lessons that are consistent with the Standards (NCTM, 1989), practice at implementing new approaches, and feedback on the extent to which new teaching practices are meeting the criteria of the Standards. The focus of this article is on the feedback part of the process.

This article describes the Standards Observation Form¹ (Stonewater, 1993), an instrument developed to gather information about classroom implementation of the criteria of the NCTM Standards. The instrument has been used in two projects (Johnson, 1992; Walters, 1991) designed to assess implementation of the Standards in classroom teaching. Results indicate that the Standards Observation Form: (1) reliably assesses the degree to which classroom instruction is consistent with the Standards; (2) provides adequate criterion-referenced information about what aspects of each standard have been met; (3) focuses on the interconnected nature of teaching and learning; and (4) sets a very high standard for teaching excellence. These results indicate that the Standards Observation Form can be readily used by teachers who want to gather Standards-based feedback on their classroom teaching and student learning. In the following, the Standards Observation Form is described, its development and inter-rater reliability are discussed, and two variations of the form are presented. Then, portions of three example assessments from Standard 1, Worthwhile Mathematical Tasks, are presented as illustrations of the kind of data that can be obtained using this form. Finally, three ways in which teachers can use this form to gather information about their own teaching are discussed (self-assessment, peer-assessment, and expert observer assessment). It should be emphasized that the instrument is intended to stimulate a climate of growth and development for teachers, not to build a punitive system where the information is used for evaluation of teacher performance (raises, tenure, etc.).

Description of Standards Observation Form

An example page from the Standards Observation Form for Standard 1, Worthwhile Mathematical Tasks, appears in Figure 1 and includes sample comments. In this particular example, an observer other than the classroom teacher was completing the form. The form has a similar page for each of the remaining five standards: teacher’s role in discourse; student discourse; tools for discourse; the learning environment; and analysis of teaching and learning. Additionally, there is a cover page (Figure 2) where the person completing the form can keep track of relevant data about the class (observer, teacher, school, date, etc.) and can describe the class. Note under “other comments” that this observer listed what to look for during the next observation. This was an important comment, because in that particular project teachers were observed numerous times in order to document the process of changing teaching approaches.

At the top of each page is a brief description of the particular standard and in the boxes below are the specific criteria for that standard, which are included as a prompt for the person completing the form. Below the boxes is a place for descriptive comments about what occurred during the lesson relative to the standard under consideration. For the sample in Figure 1, where an expert observer completed the form, it was noted that there were three mathematical tasks covered in the observed class: a review of mixed number arithmetic; a probability cooperative group experiment; and a worksheet on theoretical probability. Following this description is room for positive and negative examples of the implementation of the standard. For example, in the sample in Figure 1, the observer noted a positive example in the teacher’s implementation of criterion 1.3, “tasks convey ‘doing’ math,” by noting that students were actively engaged in the cooperative probability experiment even though there was evidence of off-task behavior. The observer also noted a negative example in that the experiment appeared void of a purpose or focus. There was no problem solving emphasis as expected in criterion 1.4, “skill developed in context of problem solving/reasoning.”

Finally, each page includes a quantitative overall rating for how well the criteria for the particular standard were implemented. The rating ranges from “effectively implemented; no improvement needed” (rating of 1), to “very poor implementation; much improvement needed” (rating of 5). Again, for the Figure 1 example, the observer rated the implementation of standard 1 as a “3,” mixed implementation.

Development of the Instrument

The form was originally developed based on the text of NCTM’s Curriculum and Evaluation Standards for School Mathematics (1989). The criteria used on the form came directly from this document. While the form was under development prior to the publication of NCTM’s Professional Standards for Teaching Mathematics (1991), the criteria are certainly consistent with this later document.

After an initial draft of the form was developed, four expert teachers and one university mathematics faculty member pilot tested the instrument as expert observers. After watching a videotape of a teacher conducting a middle school mathematics lesson, the group used the form to analyze her instruction. On the basis of this trial, all participants felt the form was usable, was accurately based in the Standards, and would be a useful device for gathering information for feedback to teachers. The four teachers involved in the pilot test included three from a leadership team in the state’s Project Discovery (the statewide systemic initiative) and one veteran teacher who had returned to the university for a master’s degree. One of the Discovery leadership team members had extensive industrial experience as a trained engineer.

Inter-rater reliability results indicated that when using different expert observers, the form can be used to obtain consistent results. Pairs of observers rated eight different teachers on each of the six standards. Thus, there were 48 pairs of data, i.e., the ratings of each observer for each teacher for each standard. Of these 48 pairings, 50% were identical and an additional 39.6% had a difference of one unit. (For example, one observer might rate the implementation of a standard as a “4,” the other observer as a “5.”) Only four pairings differed by two units and one differed by three units. Overall, the average difference between ratings was 0.625 units, less than one unit difference, indicating very good inter-rater agreement.
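The agreement figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the author's analysis: the count of 19 one-unit differences is inferred here from the reported 39.6% of 48 pairings, and the names `diff_counts`, `percent`, and `mean_abs_difference` are hypothetical.

```python
from fractions import Fraction

# Eight teachers, each rated on six standards by a pair of observers.
TOTAL_PAIRS = 8 * 6

# Pairings grouped by the absolute difference between the two ratings.
# Reported in the text: half identical, 4 two units apart, 1 three units apart.
diff_counts = {0: 24, 2: 4, 3: 1}
# Assumption: the remaining pairings are the ones that differ by one unit.
diff_counts[1] = TOTAL_PAIRS - sum(diff_counts.values())


def percent(count, total=TOTAL_PAIRS):
    """Share of all pairings, as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


def mean_abs_difference(counts):
    """Average absolute rating difference across all pairings."""
    total = sum(counts.values())
    return Fraction(sum(diff * n for diff, n in counts.items()), total)


exact_agreement = percent(diff_counts[0])
one_unit_apart = percent(diff_counts[1])
mean_diff = mean_abs_difference(diff_counts)

print(f"identical ratings: {exact_agreement}%")
print(f"one unit apart:    {one_unit_apart}%")
print(f"mean difference:   {float(mean_diff)} units")
```

Under these assumptions the counts reproduce the reported 50%, 39.6%, and 0.625 figures, which suggests the published numbers are internally consistent.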

Alternative Versions of Instrument

The original Standards Observation Form is used primarily to document classroom instruction. If the form is being used by a teacher for self-assessment, the documentation provides description that can be useful later when the teacher redesigns the lesson. If it is being used by a peer, the documentation can be useful for recalling specific examples about the teaching in discussions the peer might have with the observed teacher. Or, if an expert observer is using the form, documentation is useful in preparing descriptive reports where clarity is needed about what actually occurred in the classroom.

Figures 3 and 4 present two variations of the Standards Observation Form and both provide more quantitative data than the original version. The example in Figure 3, based on Standard 2, Teacher’s Role in Discourse, uses a Likert-scale type rating for each of the criteria for the standard. While the criteria are exactly the same as in the original version of the instrument, they are each assigned a numerical rating based on a scale of 1 to 5: effectively implemented to very poor implementation.

The instrument version in Figure 4 for Standard 6, Analysis of Teaching and Learning, is also a replication of the original form, only the criteria are now presented in checklist fashion. Here, the person completing the form merely checks off whether or not the specific criterion occurred during the instruction. This version is easy to use, quick, and can pinpoint areas of strength or weakness, but does not provide the level of detail of either of the other two versions.

Example Assessments

The form has been used in two projects where the main goal was to assist teachers in developing instructional practices based on the Standards. In each case, the comments were made by an expert observer other than the classroom teacher. The following example assessments from Standard 1, Worthwhile Mathematical Tasks, are excerpts from the Standards Observation Form of observers’ comments on three different teachers. These examples are presented to illustrate the kind of descriptive data that can be obtained using this form and to show how the descriptions are keyed to specific Standards-based criteria. The first is an example of an “excellent” implementation of this standard, as it was rated 1 by the observer. By way of contrast, the second example is a “middle of the road” implementation (rated 3), and the last is a “poor” implementation (rated 5). Thus, these examples also illustrate how the observation instrument can be used to discriminate between various levels of excellence in the implementation of the various criteria of the Standards.²

Example 1 – Mary’s “Excellent” Implementation

Mary’s eighth grade class was working on a geometry lesson in which they were discovering the angle measures in various polygons by determining, with a mirror, which vertex angle arrangement fit exactly to 360 degrees. Some angles worked, in which case the students knew to divide by the number of angles appearing in the mirror to determine each angle’s measure, while other angles did not work, requiring the students to develop a new method for deciding the angle’s measure. Students worked in pairs and were observed freely conjecturing and testing out ideas. The teacher provided directions for the activity and moved easily between whole class discussion and individual group consultation. The observer’s comments from the Standards Observation Form follow. Note particularly the detailed examples that relate to the numbered criteria on the sample page in Figure 1.

With respect to Standard 1, two criteria were very well implemented: promoting student understanding, reasoning and communication (1.1); and developing skill in context of problem solving and reasoning (1.4). With respect to the first criterion, students did some routine measurements and then got stuck on an obtuse angle in a rhombus. They had decided that the acute angle was 30 degrees, but found that the obtuse angle would not tessellate (since it was 150 degrees, but they did not know this yet). When two girls observed that this would not tessellate, they were not sure what to do next. The tessellation model failed, but they didn’t have another model in its place. They went on to another angle. Next, they discovered that the trapezoid (“redzoid,” one called it, since it was made of red construction paper) had a 120 degree angle. They used this as a comparison for the troublesome angle in the rhombus. Then the two girls reasoned that the troublesome obtuse angle must be between 120 and 180 degrees. At this point, they hypothesized that the angle must tessellate 2.5 times, which ultimately turned out to be incorrect. Criterion 1.1, promoting student understanding, was clearly met.

Then they started talking with the boys working together nearby. The boys had figured out that the obtuse angle in the rhombus was 150 degrees because the sum of the angles in a quadrilateral is 360 degrees. (How they knew this was not obvious; later, the teacher was even surprised.) Then the girl said, “We were right! 150 is between 120 and 180.” She then checked their conjecture that the obtuse angle would tessellate 2.5 times and concluded that this conjecture was false.

This was tremendous reasoning, conjecturing, and testing, as well as collaboration with another group.
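The arithmetic behind this exchange can be written out as a short check. This is a reconstruction, assuming the students relied on the fact that opposite angles of a rhombus are equal; the observer's notes do not record the boys' actual steps, and the symbol x for the obtuse angle is introduced here only for illustration.

```latex
\begin{align*}
2(30^\circ) + 2x &= 360^\circ && \text{(angle sum of a quadrilateral)} \\
x &= 150^\circ, && 120^\circ < 150^\circ < 180^\circ \\
\frac{360^\circ}{150^\circ} &= 2.4 \neq 2.5 && \text{(so the 2.5 conjecture fails)}
\end{align*}
```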

Example 2 – Martha’s “Middle of the Road” Implementation

Martha was also teaching an eighth grade class in a suburban school. The content was pre-algebra and the task was for students to write an algebraic equation from the sentence: “Start with a number and add four to it. Multiply this sum by three and you will get 39.” The observer’s comments follow:

This task, while important in algebraic reasoning, was presented without much attention to what the teacher knows about students and how they learn (criterion 1.5). Students had difficulty with this task. Of the three students who put their “algebra” on the board, none had the appropriate algebraic expression, nor did it appear from discussion that any other student had either. Something was missing between their numerical understanding of the problem and writing an algebraic statement. It was important, however, that the class was working on the idea of variable.

It was interesting to note student work on this task. Two of them had worked the problem backwards numerically, i.e., they divided 39 by 3 to get 13, then figured out what they had to add to 4 to get 13. The teacher seemed unaware that some students had implemented a reasonable numerical working backwards strategy and did not help them move from this approach to using a variable. Rather, the instruction was presented in one big leap. Thus, criterion 1.6, considering student prerequisite knowledge, was weak.

In general, the teacher was not thinking about how the students seem to think. She had predetermined answers and approaches. Finally, this particular problem was void of context and was not presented in a problem solving situation (criterion 1.4).
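The link the observer describes between the students' numerical strategy and the algebraic statement can be sketched side by side. The variable n for the starting number is an assumption made here; the source does not say what notation the teacher used.

```latex
\begin{align*}
3(n + 4) &= 39 && \text{(the sentence as an equation)} \\
n + 4 &= 39 \div 3 = 13 && \text{(students' first step: divide 39 by 3)} \\
n &= 13 - 4 = 9 && \text{(students' second step: what must be added to 4 to get 13)}
\end{align*}
```

Read this way, each step of the working-backwards strategy corresponds to one line of the algebraic solution, which is the connection the observer found missing from the instruction.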

Example 3 – Nancy’s “Poor” Implementation

Nancy’s class was a self-contained sixth grade class in a rural setting. This class started with a review of fraction to percent conversion and decimal and mixed number arithmetic. Homework, which was about 30 text problems on fraction operations, was reviewed by putting answers only on the board and students reciting the answers when called on. Then, they went over answers on a worksheet, again by the teacher calling on students to give the answer. The teacher responded to incorrect answers by taking the class through the algorithm. No new lesson was introduced. Observer comments are based primarily in negative examples:

Problems students worked were void of context and meaning (criterion 1.4) and included unrealistic computations like dividing (by hand) 76.82 by 6.5 or converting 42 6/7 % to a decimal. No use of technology was observed and instruction can be characterized as routine, procedural, and algorithmic with no opportunity for reasoning, problem solving or cooperative learning (criteria 1.1 and 1.3). Feedback to students was non-diagnostic. This presentation was routine and algorithmic and did not promote student understanding (criterion 1.1) or develop skill in a problem solving context (criterion 1.4). This was particularly true when going over homework.

Explanations were not based on underlying concepts (criterion 1.2). Changing decimals to percent was explained as, “Move two places to the right,” or when multiplying two mixed numbers like 4 5/9 * 6 7/8, she recited an algorithm: “Denominator times whole number plus numerator – remember that step?” She also recited the long division algorithm when dividing 76.82 by 6.5. In general, problems and instruction were meaningless and void of mathematical context.

Use of the Instrument

As illustrated by the above examples, one use of the Standards Observation Form is as a data collection device for projects where specific examples are needed about classroom teaching. The examples and expert observer comments can be used later in redesigning training programs or in preparing reports for funding agencies. However, the instrument can also be used in a more localized setting, where the intent is for teachers themselves to gather information that they can use to more closely align their teaching with the Standards. For example, teachers who are new to the Standards, or who have had little experience trying out new types of teaching, might only use the instruments for self-assessment. On the other hand, teachers who are more comfortable working in teams to implement the criteria might use the instruments in a peer-assessment context. Finally, teachers who are really familiar with the Standards, who have been working to implement them for some time, and who have an environment of trust and openness with their principals, might use the instrument in a more formative context and build it into the actual system of evaluation in the district or building. Thus, a staggered implementation of the instrument, based on teachers’ readiness and comfort level with the Standards, can be used depending on local needs.


These three examples point out the richness and detail that can be obtained when using the Standards Observation Form and exemplify how the comments can be directly related to effective and ineffective implementation of the various criteria for each standard. Thus, the form keys instructional assessment to the criteria embedded in the Standards and provides information not only about teaching behaviors but also about how students are reacting. The form can be used if detailed examples are needed, or one of the variations can be used if more quantitative information is required. It is suggested that these instruments can be used by teachers themselves when conducting peer or self-improvement assessments, or by expert observers or researchers when documentation is needed about the extent to which a teaching sample meets the criteria of the Standards.

Footnotes: 1 The Standards Observation Form and the alternate versions described can be obtained by contacting the author.

2 In this example, the instrument was used with expert observers and the data were used in the preparation of reports to the funding agency, not to provide individual teachers with feedback.


Johnson, I.D. (1992). Miami Preble County Middle Grade Mathematics Project. Ohio Board of Regents grant no. 2-56.

National Council of Teachers of Mathematics. (1989). Curriculum and Evaluation Standards for School Mathematics. Reston, VA: The Council.

National Council of Teachers of Mathematics. (1991). Professional Standards for Teaching Mathematics. Reston, VA: The Council.

Stonewater, J.K. (1993). Standards Observation Form. Unpublished instrument.

Walters, E.G. (1991). The Ohio Mathematics/Science Discovery Project. NSF grant.

Author’s Note: The work for this article was partially supported by Ohio Board of Regents grant no. 2-56, the Miami Preble County Middle Grade Mathematics Project (Johnson, 1992).

Editor’s Note: Jerry K. Stonewater’s postal address is Miami University, Department of Mathematics and Statistics, Oxford, OH 45056, and e-mail address is

Copyright School Science and Mathematics Association, Incorporated Oct 1996
