March/April 2007
by Sharon L. Nichols and David C. Berliner
Since the fall of 2003, after NCLB required high-stakes
testing in all 50 states, we have systematically scoured news outlets
and scholarly journals for accounts of the impact of high-stakes
testing. We have amassed a significant collection of evidence highlighting
the distortion, corruption, and collateral damage that occur when
high-stakes tests become commonplace in our public schools.
We found reports and research about individuals
and groups of individuals from across the nation whose lives have
been tragically and often permanently affected by high-stakes testing.
We found hundreds of instances of adults who were cheating, including
many instances of administrators who “pushed” children
out of school, costing thousands of students the opportunity to
receive a high school diploma. We also found administrators and
school boards that had drastically narrowed the curriculum, and
who forced test-preparation programs on teachers and students, taking
scarce time away from genuine instruction. We found teacher morale
plummeting, causing many to leave the profession.
Supporters of high-stakes testing might dismiss
these anecdotal reports as idiosyncratic or too infrequent to matter.
But all of these problems could have been foretold. A little-known
but powerful social science law known as Campbell’s law explains
the etiology of the problems we document. Ignorance of this law
endangers the health of our schools and erodes the commitment of
those who work in them.
Campbell’s Law
Campbell’s law was formulated in 1975 by
the late Donald T. Campbell, a respected social psychologist, evaluator,
methodologist, and philosopher of science. Campbell’s law
stipulates that “the more any quantitative social indicator
is used for social decision-making, the more subject it will be
to corruption pressures and the more apt it will be to distort and
corrupt the social processes it is intended to monitor.”
Testing experts George Madaus and Marguerite Clarke
agree with Campbell, noting that whenever you have high stakes attached
to some indicator of performance, you have a corrupted measurement
system. The higher the stakes, the more uncertain are the conclusions
you can draw from the measures you have. Put another way, the higher
the stakes, the more likely it is that the construct being measured
has somehow been changed. High stakes, therefore, lead inexorably
to invalidity.
Evidence of Campbell’s law is everywhere.
In business, if stock market price is the indicator and incentives
such as big bonuses are given for short-term stock gains, then a
system has been created to encourage poor or even counterproductive
management practices, as well as outright fraud. In medicine, malpractice
suits are an indicator of the quality of health care received and
determine the reputations of physicians. So high stakes are associated
with the threat of malpractice suits and thus contribute to the
spiraling costs of health care, as physicians prescribe unnecessary
tests and interventions. At the same time, financial incentives
reward those who spend less time with patients, eroding the quality
of care. Examples of corruption, cheating, gaming the system, taking
short cuts, and so forth are found wherever high stakes are attached
to performance in athletics, academia, politics, government agencies,
and the military.
High-stakes testing is exactly the kind of practice
Campbell warned us about (see Campbell’s
Law in Action). Serious, life-altering decisions that affect
teachers, administrators, and students are made on the basis of
testing. Tests determine who is promoted and who is retained; who
will receive a high school degree and who will not. Test scores
can determine if a school will be reconstituted and whether there
will be job losses or cash bonuses for teachers and administrators.
Under these conditions, we must worry that the process that is being
monitored by these test scores—the quality of our children’s
education—is also becoming corrupted and distorted, rendering
the test scores themselves meaningless.
Alternatives to High-Stakes Testing
It is legitimate for the citizenry, who
have designed and paid for schools, to want external measures of
how those schools, teachers, and students are doing. However, there
are many forms of evaluation that, separately or in combination,
can avoid the pitfalls associated with high-stakes tests. A more
effective system of assessment could combine low-stakes tests with
some or all of the following:
Formative assessments.
Most tests in the United States are assessments of learning. The
tests are designed to tell us what and how much students know at
any one point in time. By contrast, formative assessments are assessments
for learning, used to improve teaching and learning. They often
entail a range of activities embedded into the curriculum. Tests
and other classroom activities (classroom discussion, projects,
homework) are specifically designed to provide feedback to teachers
and students regarding what they know, what they don’t know,
and where they might go next.
An independent inspectorate.
Australia, England, Holland, Germany, Sweden, and a few other countries
have a school inspectorate devoted to visiting schools and providing
feedback on their performance. To evaluate whether a school is performing
satisfactorily means, first and foremost, having inspectors watch
teachers teach. Inspectors make judgments about the depth and breadth
of the curriculum, its conformity to national or state standards,
and the competency of teachers to implement it in an exemplary manner.
They also check to see if improperly certified teachers are employed
at the school, and may hold focus groups to determine community
satisfaction. Inspectors visit with students to evaluate whether
their motivational needs are being met and assess the school’s
plans for staff development.
End-of-course examinations.
Yet another alternative to high-stakes testing is to build a low-stakes
accountability system that involves teachers at the district level
in making the tests themselves. Imagine local teachers meeting and
working on understanding the subject-matter standards, sharing designs
and teaching tips for the classroom teaching of the standards,
sharing course syllabi, and making decisions about text selections.
Imagine also that teachers are paid for these activities, for picking
the cut scores to determine student proficiency, and for scoring
the tests. Having teachers score tests in groups is a great way
to stimulate discussion of curriculum content and student capabilities.
Several states have taken steps to implement these types of end-of-course
evaluation systems.
Performance tests. Performance
tests are student projects or portfolios of student work that are
presented for evaluation by a panel of judges. The judges are asked
to determine whether a student has mastered a sufficient body of
knowledge to be considered competent. The format places the teacher
in the role of mentor, coach, and advisor rather than judge, and
teachers invariably work hard to prepare students to do well. This
is a democratic form of accountability, since the public is invited
in to see what has been learned. New York’s Central Park East
School, the Coalition of Essential Schools, and International
Baccalaureate programs use performance tests.
Value-added assessment.
More and more educators and politicians are pushing for value-added
assessment, which looks at the achievement of individual students
and schools over time and perhaps—if the statistics ever are
refined enough—can pinpoint the effects of particular teachers.
Although value-added models of growth still need to be refined,
they appear promising. However, if achievement-growth reports become
high-stakes, as now occurs with the NCLB test scores used throughout
the nation, then value-added models of assessment will suffer the
same problems as the current accountability tests.
We believe that the costs associated with high-stakes
testing are simply not worth it. Campbell’s law informs us
that high-stakes testing of the type associated with NCLB can never
be used successfully in our schools. Despite the sheer number of
examples showing negative effects, however, many people still believe
high-stakes testing is a viable way to improve education. They
defy a perfectly valid social science principle—at their peril.
Sharon L. Nichols is an assistant professor
at the University of Texas at San Antonio. David C. Berliner is
the Regents’ Professor of Education at Arizona State University
in Tempe. This article is adapted from their book Collateral
Damage: How High-Stakes Testing Corrupts America’s Schools
(Harvard Education Press, 2007).