Popham on standardized testing

Konopak (jkonopak who-is-at ou.edu)
Tue, 27 Apr 1999 14:55:41 -0500

Fy eieio--------this off the AASA site:
http://www.aasa.org/Issues/assessment8-26-98.htm
==============begin forwarded material=====================
Your School Should Not Be Evaluated
By Standardized Test Scores!

By W. James Popham (Prof. Emeritus, UCLA)

Teachers these days are experiencing almost relentless pressure to show
that they are effective. Unfortunately, in many communities, the chief
indicator by which people judge a school staff's success is the
performance of that school's students on standardized achievement tests.

If a school's standardized test scores are high, it is thought the
school's staff is competent. If a school's standardized test scores are
low, the school's staff is seen as ineffectual. In either case, because
school success is being measured by the wrong yardstick, those evaluations
are apt to be in error.

One of the reasons that students' standardized test scores continue to be
used as the most important factor in evaluating a school is that teachers
and administrators do not really understand why it is that standardized
test scores provide a misleading estimate of a school staff's effectiveness.

What's in a Name?
A standardized test is any examination that's administered and scored in a
predetermined, standard manner. There are two major kinds of standardized
tests, aptitude tests and achievement tests.

Standardized aptitude tests predict how well students are likely to perform
in some subsequent educational setting. The most common examples of these
are the SAT (Scholastic Assessment Test) and the ACT Assessment, both of
which attempt to forecast how well high school students will perform in
college.

But standardized achievement test scores are what parents and school board
members rely on when they evaluate a school's effectiveness. Nationally,
there are five such tests in use, namely, the California Achievement Test,
Comprehensive Test of Basic Skills, Iowa Test of Basic Skills, Metropolitan
Achievement Test and Stanford Achievement Test.

A Standardized Test's Function
The measurement mission of a standardized achievement test is to permit
comparisons among students. In order to do so, a standardized achievement
test is initially administered to a nationally representative sample of
students. Scores of these students, known as the test's norm group,
provide a comparative scale by which the scores of future test-takers are
interpreted. So, when a teacher hears that "Sally scored at the 75th
percentile" on such a test, this means that Sally's score was better than
scores of 75 percent of the students in the norm group.
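
The percentile idea above can be sketched in a few lines of Python. This is only an illustration, not anything from the essay or from a test publisher: the function name `percentile_rank`, the norm-group scores, and Sally's raw score are all made up, and real publishers use more refined conversions than a simple "percent scoring below" count.

```python
def percentile_rank(score, norm_scores):
    """Return the percent of norm-group scores strictly below `score`."""
    if not norm_scores:
        raise ValueError("norm group must not be empty")
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


# Hypothetical norm group of 20 raw scores (invented for illustration).
norm_group = [12, 15, 18, 20, 21, 22, 23, 24, 25, 26,
              27, 28, 29, 30, 31, 33, 34, 36, 38, 40]

# Hypothetical raw score for Sally.
sally = 32

# 15 of the 20 norm-group scores fall below 32.
print(percentile_rank(sally, norm_group))  # prints 75.0
```

The point of the sketch is that the number says nothing about how much Sally knows in absolute terms; it only locates her relative to the norm group.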

Standardized achievement tests usually cover four subject areas:
mathematics, language arts, social studies and science. The mission of such
tests is to allow comparisons among students with respect to the knowledge
and skills they possess. In social studies, for example, such a comparison
would be based on how a student's test performance stacks up to the test
performances of students in the norm group.

To make these comparisons properly, standardized achievement tests must
"spread out" students' scores. If too many students earn similar scores,
it is difficult to make
the fine-grained comparisons among those students that are necessary for
such tests to do their sorting job.

Three Strikes and You're Out!
Teachers need to understand three central reasons why a school's staff
shouldn't be judged via standardized test scores. The first of these is
the enormous, and usually unrecognized, mismatch between what's tested and
what's taught.
Because of the considerable curricular diversity in this nation, what's
taught in different school districts often varies substantially. Test
companies do their best to accommodate these curricular differences by
making their content-coverage very general. Yet, in many instances more
than half of what's tested is not even supposed to be taught. It's obvious
that teachers should not be judged on the basis of tests that don't
measure what those teachers ought to be teaching. That's strike one.

The second problem with standardized tests is that they fail to measure
the most important things teachers teach. This problem stems from the need
of these tests to produce a sufficient spread of scores. Test items that do
the best job of spreading out students' scores are those items that are
answered correctly by about half of the test-takers. Test items that are
answered correctly by large proportions of students, for instance, 80 percent or
more, are usually not put in standardized tests in the first place, and
will most likely be eliminated when the tests are revised.
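
A small Python sketch may help show why items near the 50 percent mark spread scores best. It relies on a standard statistical fact that the essay itself does not spell out: a right/wrong item answered correctly by a proportion p of test-takers has score variance p(1 - p), which peaks at p = 0.5. The function name `item_variance` and the sample proportions are chosen here purely for illustration.

```python
def item_variance(p):
    """Variance of a right/wrong (0/1) item answered correctly by proportion p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return p * (1.0 - p)


# Illustrative proportions: a mid-difficulty item and two easier ones.
for p in (0.5, 0.8, 0.95):
    print(f"proportion correct {p:.2f} -> variance {item_variance(p):.4f}")
```

Under this assumption, an item that 95 percent of students answer correctly contributes less than a fifth of the spread of an item that half of them answer correctly, so it does little for the test's sorting job.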

But here's the catch. The items on which many students score well usually
cover the most important content that teachers try to teach. Yet, the more
effectively teachers teach such content, the better students will perform
on items dealing with that content. As a consequence, the test items
covering this content will often be discarded. In short, if teachers do a
great job in promoting students' mastery of important knowledge and skills,
it is unlikely that this content will be measured on a standardized
achievement test. That's strike two.

A third problem with using students' standardized achievement test
scores to judge a school staff's success may make some teachers a mite
uncomfortable, but it must be acknowledged. The problem is that teachers'
familiarity with a test's content can, over time, artificially inflate
students' test scores. Here's how this takes place. These tests are
frequently reused in a school district for quite some time, often as long
as 5-10 years. Teachers themselves usually administer these tests each
year. In many instances, teachers are directed by test publishers to first
"take" the standardized test themselves so they'll be familiar with the
test when they administer it.

Think about an elementary teacher (let's call her Mrs. Hill) who, each
spring, gives her students an end-of-school-year nationally standardized
achievement test on which there are two items dealing with "the origins of
the steam engine." After a year or two of administering the test, don't
you think Mrs. Hill might want to teach her future students about the
ancestry of steam engines? It's not that she's trying to teach students how
to respond to the test's two steam-engine items. Instead, she simply might
conclude that if the developers of a "national" test believe the origins of
steam engines are so significant, perhaps she should be spending a bit of
classroom time on that important topic.

Over time, therefore, students' scores on standardized achievement
tests almost always get better. And these increases in test scores take
place even if not one teacher in a school is improperly "teaching to the
test items." Such increases in students' scores often reflect teacher
familiarity with test content rather than genuine rises in student
accomplishments. This difficulty has received enough attention from the
media that, frankly, many citizens are downright skeptical about sharp rises
in a school's standardized test scores.

What's Really Measured?
Consistent with their mission to compare students, standardized
achievement tests are designed to permit individual students' performances
to be contrasted with the performances of a norm group. By doing so,
relative comparisons can be made among students with regard to their
mastery of a very small sample of content. Sometimes there are as few as 40
items on a standardized achievement test. Those items are intended to
represent a huge domain of content (for example, all the mathematical
skills and knowledge possessed by a typical 8th-grader).

Performances on standardized achievement tests, however, are most
influenced by (1) students' genetically transmitted intellectual abilities
and (2) the extent to which students were raised in a stimulus-rich
environment. If students in a school are socioeconomically well off, the
school's test scores invariably will be high. The opposite is also true.
Indeed, if you want to predict how well a school's students will perform on
a standardized achievement test, simply find out what the school's average
parental income is.

In light of these assessment realities, it is clearly inappropriate to
judge a school staff's effectiveness on the basis of students' standardized
achievement test scores. A school's teachers and administrators can be
doing a superb instructional job, but their school's standardized test
scores may not be all that high. Conversely, even if an affluent school's
students score well on standardized achievement tests, that does not
necessarily signify that the school's educators are all that effective.
Standardized tests measure what students come to school with, not what they
learn there.

A Matter of Morality
Parents and other citizens have a right to know how well our schools are
doing. But those individuals need to be informed about how to judge a
school staff's skill.

It is almost immoral to ask teachers to boost students' scores on tests
that are fundamentally impervious to detecting the effects of even
superlative instruction.

Teachers who wish their schools to be evaluated properly must (1) educate
themselves as well as other concerned constituencies about the deficits of
standardized test scores as indicators of school effectiveness and (2)
provide other credible evidence of instructional effectiveness. That's a
tough order. But the current use of standardized tests to evaluate schools
must be stopped.

W. James Popham, IOX Assessment Associates,
301 Beethoven St., Suite 208, Los Angeles, CA
90066-7601, is an emeritus professor in the
UCLA Graduate School of Education and
Information Studies. A former president of the
American Educational Research Association, Dr.
Popham is the author of more than 20 books,
many of which are devoted to educational
testing.

This essay may be duplicated.
