Will standardized testing of students really weed out Seattle’s bad teachers?

In Seattle, the MAP test and the new MSP, which replaced the controversial WASL, are part of the red
Elaine Porterfield  |   April 2011   |  FROM THE PRINT EDITION

Taking the measure of a teacher: It’s the cry of education reform in the 21st century—in Seattle and nearly everywhere else. If you can figure out who is a good teacher, or at least a good-enough teacher, you can reward him or her while simultaneously culling the bad ones from the classroom. And when you do that, the thinking continues, student achievement improves along with the country’s competitiveness, and we’ll all live happily ever after, as in Lake Wobegon, where all the children are above average, every one of them.

It sounds so simple and sensible. Few disagree on the need for accountability in teaching. But exactly how do you measure a teacher’s performance? And how precisely do you know if such measures are accomplishing what they’re supposed to?

One big push under way in education reform circles, including here in Seattle and within the Obama administration, is evaluating teacher performance in core subjects like math and reading via the results of standardized tests taken by students. If a teacher’s class scores are going up over time or coming in at a high level on tests like the statewide Measurements of Student Progress (MSP) or Seattle Public Schools’ additional Measures of Academic Progress (MAP), then that teacher must be doing well, correct? Maybe. Maybe not.

Both tests are in their second year in the Seattle district. The statewide MSP, given in May, replaces the controversial WASL (Washington Assessment of Student Learning) for students in third through eighth grades. (The MSP test is more streamlined, with fewer long-form answers.) The MAP test is given by computer three times a year to students in kindergarten through ninth grade. Critics say that neither test is a consistently reliable measure of student learning. English-language learners, for example, must take the tests despite having limited fluency, as must students with learning differences.

The topic made up a big chunk of the negotiations late last summer between Seattle Public Schools and the teachers’ union. The same idea is being debated in other districts around the country. This winter in Massachusetts, for example, the state’s largest teachers’ union broke ranks with other major teacher labor groups around the nation and offered a proposal to tie students’ test results to decisions about which teachers should be promoted and which should be fired.

Talk about gasoline on a fire! To get an idea of how polarizing that notion is within teaching circles, consider that the American Federation of Teachers solidly opposes the idea of evaluating teachers predominantly on single measures of student achievement. The Obama administration, meanwhile, required states to do away with any rules that prevent schools from evaluating teachers on student performance if they wanted to be considered for hundreds of millions of dollars in education grants offered in a competition among the states.

Seattle Public Schools, in its latest three-year contract with teachers, agreed to last summer, has for the present ended up somewhere in the great middle (though the exact details were still being hashed out between the union and the district at press time). Yes, standardized scores will be used as part of evaluating teachers, but only as part of a greater whole. Teachers will now be evaluated as unsatisfactory, basic, proficient or innovative, instead of just satisfactory or unsatisfactory, within the framework of the grades or subjects they teach (which naturally varies considerably from kindergarten to high school).

The way it will work here looks something like this: “If [a given teacher’s] student growth over a two-year period of all students’ averages doesn’t meet a certain mark, it will trigger more scrutiny of your teaching practices,” says Olga Addae, president of the Seattle Education Association.

Seattle Public Schools says that evaluating teachers in part on test scores is an essential component of ensuring educational excellence. “Research shows that teaching quality is the single most important school-based factor in student success,” the district says in a statement. “The new agreement will create a system that recognizes teaching excellence and offers specific support to teachers who need help to ensure that all of their students are learning. A robust new evaluation system, developed in collaboration with teachers, will support them as professionals, recognize their excellence, and encourage more collaboration to strengthen instruction....We will work with teachers to develop measures of student growth so that principals and teachers have reliable and timely information to inform their conversations about student learning in every classroom.”

In Chicago, a pilot project under way in 100 schools seeks a multidimensional way to evaluate teachers. The project, called Excellence in Teaching, examines factors such as classroom performance on many levels. A detailed 22-item list is used to rate teachers, including in areas such as classroom discussion: Is the teacher engaging in real discussions with students or merely encouraging the recitation of facts?

While education reformers are gonzo for using test scores to evaluate teachers, other influential groups urge caution. The Board on Testing and Assessment of the National Academy of Sciences believes attaching high stakes to standardized tests is a misuse of the tests, says Seattle parent and activist Joan Sias. Relying on standardized tests to evaluate teachers, as well as individual schools and school-district practices, leads to score inflation and invalidation of the tests so they’re no longer genuine measures of student achievement, she says. It also fosters a narrowing of the curriculum and encourages adult cheating to ensure that kids get better scores so teachers and schools won’t face punitive sanctions, says Sias, who has a background in data analysis. In addition, the MAP is simply not designed to precisely evaluate teachers, she says. It’s a test to determine how much a student has learned in certain areas and is ill suited to formative assessment.

Studies that track teachers’ standardized test results over time have found that it’s common for a teacher’s students to score in the top quarter one year, only for that teacher to drop to the lowest quarter the following year with a different group of students who bring different educational backgrounds and abilities to the class, Sias says.

Another school district critic, Melissa Westbrook of the community blog SaveSeattleSchools.blogspot.com, agrees that looking at results exclusively from tests like the MAP probably isn’t the way to go in evaluating teachers.

“There are a whole bunch of issues about appropriateness of the test, but put that aside,” Westbrook says. Using it for tasks such as evaluating teachers means the district is not using the test the way it was designed to be used. A basic problem is that the test may not tell educators much about what a student is actually supposed to be learning in class, she says. For example, an English teacher can give a middle school class an assessment to see how well students comprehended a novel they just read. The teacher crafts the exam based on the book, with the intent of knowing how well the students understood the novel. But a standardized assessment such as the MAP or MSP, with generic questions, could be used to determine how “well” that teacher taught, based on the students’ scores. “Do those assessments quantifiably say how well a given teacher is teaching?” Westbrook asks. Is that test really getting at the class’s subject matter?

The Economic Policy Institute, a nonprofit, nonpartisan organization based in Washington, D.C., issued a report on this very subject last fall, concluding that heavy reliance on student test scores when evaluating teachers is misguided and generally not helpful. In fact, some harm can come from the practice, it says, such as teachers “teaching to the test,” which leads to “narrowing and oversimplifying the curriculum to only the subjects and formats that are tested, reducing the attention to science, history, the arts, civics and foreign language, as well as to writing, research, and more complex problem-solving tasks.” In other words, teaching to the test crowds out the challenging stuff we actually want our kids to embrace, such as clear writing, creativity, approaching an issue or problem from a number of angles, communicating in another world language or being an active citizen.

Another problem, according to the Economic Policy Institute report, is that standardized test scores can swing widely from year to year in schools with at-risk students. This means teachers can become demoralized and even discouraged from working with needy students if their evaluations are tied to standardized test results. It also means it’s tougher to retain teachers who work with poor students. “Legislatures should not mandate a test-based approach to teacher evaluation that is unproven and likely to harm not only teachers, but also the children they instruct,” the report says.

Retired Seattle high school math teacher Dan Dempsey, a perpetual observer of math instruction in the district, goes a little ballistic at the thought of evaluating teachers on standardized test results. Dempsey, who was involved in a winning lawsuit last year against the district’s adoption of a controversial “discovery”-based high school math curriculum, says teachers’ hands are already tied by being forced to use an inadequate method of math instruction. In discovery-type math, concepts are emphasized over calculating skills, and students are encouraged to work in groups and to explain their answers. Dempsey says many of the students who came to his high school classes with a discovery-math background struggled to recall basic concepts, and were often way behind where they should have been in their studies.

“They [the district officials] are spending an enormous amount of money to get data,” Dempsey says. “The point is, we ought to be doing what works in the first place. They say it’s basically the teacher’s fault that students aren’t learning [math], and we just need to figure out which teachers aren’t doing the job, get rid of them. But we don’t have a tool to do that.”

So what’s next for teacher evaluations? The pressure to evaluate both students and teachers by standardized tests isn’t likely to go away, as politicians in a time of budget woes search for results and accountability. The toughest question, then, would seem to be: Can we figure out how to do it right?