September/October 2000
Amid reports of test-score gains, researchers ask some tough questions
about the consequences for Latino and African American students.
By Michael Sadowski
Choose the best answer to complete the following sentence: Standardized tests that are linked to graduation, promotion, and other high-stakes outcomes are . . .
a) a good idea because they create incentives for students, teachers, and schools to meet high achievement standards.
b) a good idea because they help to ensure that all students will graduate with at least a basic foundation of academic skills.
c) a bad idea because they stigmatize students who do poorly and exacerbate educational inequities along socioeconomic, racial, and ethnic lines.
d) a bad idea because they encourage a curriculum driven by fact memorization and test-taking "tricks" instead of critical thinking and other higher-order skills.
Poll the staff of any elementary, middle, or high school and you will probably get the full range of responses to this question in equal numbers. Similarly, education researchers are far from reaching a consensus about whether testing students for high-stakes outcomes actually improves learning. While some researchers seem to focus primarily on the potential and others on the pitfalls, many seem to agree that some key questions are not being asked in the current rush toward high-stakes testing.
According to the latest figures released by the Education Commission of the States, a bipartisan policy group, 24 states now require students to pass exit tests before they receive their high school diplomas, and this number continues to grow as additional states phase in such requirements.
Such assessments appear to be popular, and increasingly so, with the majority of Americans: in a 1999 poll by the ICR Survey Research Group, 93 percent of respondents agreed with "making students meet adequate academic standards to be promoted or graduated." Taken in the light of a 1994 poll by Public Agenda that found that 80 percent of adults believed students should have to pass standardized tests in order to graduate from high school, those results suggest that public support for more student accountability is growing.
Why are testing programs so attractive to the general public? In part because they have a largely unquestioned reputation for objectivity, says Aaron M. Pallas, professor of sociology and education at Teachers College, Columbia University, and co-author of a report on high-stakes testing for the Civil Rights Project at Harvard University.
"Most standardized tests are viewed by the public at large as objective, which means several things: there are right and wrong answers to the test questions; unlike grades, which are awarded at the 'whim' of a teacher, standardized tests are standardized (scores don't depend on who is performing the assessment); tests yield numerical scores, which are precise measures of performance; and, like a laboratory measurement, test scores are reliable. Testing experts acknowledge that some of these assumptions are questionable," says Pallas. "Test construction is a social and political process, and we cannot afford to lose sight of that fact."
Indeed, policymakers and political candidates have responded to (if not partly created) the public appetite for high-stakes testing as more and more of them propose accountability programs modeled after those already established in other states. Two of the most extensively studied, and most controversial, of these programs are those in Texas and Chicago.
Texas Miracle or Mirage?
Education is the single most important issue to voters in this year's presidential election, according to a July Gallup Poll, so Texas governor George W. Bush's bid for the White House has turned all eyes to the Lone Star State's schools, and helped turn high-stakes testing into one of this year's big issues. The Texas Assessment of Academic Skills (TAAS), a group of tests required for graduation and used diagnostically in the lower grades, has both strong supporters and detractors.
Proponents of the TAAS, some of whom have referred to the tests as the "Texas miracle," point to the rise in test scores as evidence that the system is working. According to data released this year by the Texas Education Agency, students in that state set their seventh straight record-high passing rate on the TAAS. Preliminary results show that 80 percent of all students tested in grades three through ten passed the English version of the TAAS this past spring, a rise of two percentage points over last year and 27 percentage points over the 1994 pass rate of 53 percent. In addition, the score gap has narrowed between white youth and African American and Latino students since the tests were first implemented in 1989. Between 1996 and 1998 alone, pass rates for African American students rose from 76 to 82 percent, for Latino students from 76 to 83 percent, and for white students from 92 to 94 percent.
In July, Texas education officials received more good news when a report by the RAND research organization ranked the state second (only to North Carolina) among 44 states for its score gains on the National Assessment of Educational Progress (NAEP). Laura Bush, wife of the presidential candidate, cited the RAND report in her speech at the Republican National Convention in Philadelphia, saying it showed "that education reforms in Texas have resulted in some of the highest achievement gains in the country among all racial, socioeconomic, and family backgrounds."
However, some researchers contend that the rise in TAAS scores may be nothing more than that: an improvement in students' ability to perform well on that particular test. "Texas has made much of its claims of narrowing test-score gaps, but there is a lack of evidence of improvement except for the tests taught to," says Monty Neill, executive director of the National Center for Fair and Open Testing (FairTest) in Cambridge, MA. As for the RAND report, Neill says it paints an incomplete picture because, as RAND acknowledges, it does not include 1998 data and is heavily weighted toward mathematics. While RAND produced its rankings based on five NAEP math scores in three testing years (1990, 1992, and 1996), only two grade-four reading tests for 1992 and 1994 were included in the analysis.
"If you look at 1998 reading data, you see that Texas has not had a statistically significant gain in reading on the NAEP," Neill says. He adds that the NAEP reading score gap between African American and white students in Texas widened between 1992 and 1998, and reading scores for African American students actually went down. "One has to be careful of these kinds of analyses when they leave out reading," Neill says. "The data certainly suggest that the vaunted gains the TAAS is supposed to be having on learning do not show up elsewhere, except possibly in math." Whatever students' scores on the NAEP or the TAAS, Neill is quick to add that these kinds of tests are hardly adequate measures of what students should know and be able to do: "Though they are very hard to track, we should be asking ourselves instead what our schools do that makes a difference in terms of real-world outcomes."
Other researchers have also suggested that the passing rates tell only half the story. The other half, they say, is told by dropout and retention statistics for Texas students. In a recent report for the National Board on Educational Testing and Public Policy at Boston College, authors Marguerite Clarke, Walter Haney, and George Madaus note that high school dropout rates in Texas, particularly among minority students, are considerably higher than they were before the TAAS, and they speculate that there may be some connection between the attrition figures and the high-stakes assessments. The researchers cite previous research by Haney that showed minor fluctuations in dropout rates through the late 1970s and 1980s, but a sudden, sharp decline in on-schedule high school completion in the 1990-1991 school year, the first year the TAAS was required for graduation. Dropouts among black and Hispanic students were about 50 percent greater than among whites. According to Haney's findings, which are based on Texas Education Agency statistics, about 60 percent of all black and Hispanic 9th-graders in Texas went on to complete high school on schedule through most of the late 1970s and early 1980s, but in the years under TAAS, the numbers for each group hovered around 50 percent or slightly lower. White students graduated on schedule at a rate of about 70 percent in 1998 (the last year for which data were available), down slightly from the 72-78 percent range of figures seen in the late 1970s and early 1980s.
Also significant, the researchers say, are retention statistics for 9th grade, the year before students are required to take the exit-level TAAS. According to a Texas Education Agency report, 18 percent of all 9th-graders in Texas were retained at that level in 1997, and roughly one in four African American and Latino students were held back. This 9th-grade retention rate has been dramatically higher than the rate for all other grade levels through most of the 1990s. By contrast, only 2 percent of 8th-graders, 8 percent of 10th-graders, 5 percent of 11th-graders, and 4 percent of 12th-graders were retained in Texas schools in 1997. While it is difficult to draw conclusions about the reasons for the high 9th-grade retention rate, some have suggested that the TAAS is a major contributing factor.
At a recent Harvard Graduate School of Education forum on high-stakes testing, Angela Valenzuela, a research associate at the University of Texas at Austin, suggested that there may be strong reason to believe that some weaker students are being held at the 9th-grade level so that they will not lower their schools' average TAAS scores. "The state's accountability system was originally designed to hold school administrators and teachers accountable, but the main people who are being punished here are the children," Valenzuela said.
Displacement and Distortion
Some researchers are also documenting what they consider to be the detrimental effects of the Texas tests on curriculum. Linda M. McNeil, professor of education and director of the Center for Education at Rice University, has studied the effects of TAAS on Texas schools and sees in her case studies a pattern of "displacement and distortion" of curriculum to make way for TAAS preparation. "There are classrooms where children read no prose from September to February," she says. Instead, McNeil adds, students read short, disconnected passages and answer questions about them in patterns similar to those seen on the TAAS exam. "They study information they are meant to forget. It's all artificial content to raise test scores."
Curriculum changes, McNeil says, when superintendents and school boards respond to political pressure to raise test scores by passing that pressure on to teachers and building-level administrators. A recent study by James V. Hoffman, Julie Pennington, and Lori Assaf of the University of Texas at Austin and Scott G. Paris of the University of Michigan supports this finding. In their survey of 200 Texas teachers, 85 percent agreed that areas not directly tested on the TAAS "receive less and less attention in the curriculum." The modification of curriculum is affecting poor and minority youth the most, says McNeil, since many of them attend schools where scores are lowest and the pressure to raise them is greatest. "[Students of color] are not getting the same educational experience as kids in suburban schools," she says.
Finally, the TAAS poses special challenges to the large number of Latino students in Texas for whom English is a second language. Catherine E. Snow, a professor at the Harvard Graduate School of Education and president of the American Educational Research Association (AERA), says tests like the TAAS pose a difficult conundrum regarding the inclusion of these second-language students: "We can't ask questions about how these tests affect language-minority kids unless we include them, but how do we do this without bringing negative consequences down on them?"
Despite such complications, however, some researchers contend that a test-driven curriculum is better than no real curriculum at all. Lauren Resnick, director of the Center on Education at the University of Pittsburgh, has done extensive work in the areas of standards and accountability. While agreeing that "teaching to the test" is not the most effective approach to instruction, Resnick suggests that testing offers a kind of structure and coherence that is lacking in some teachers' classrooms, especially those teaching in poorly funded schools. "There are certainly some places where the curriculum is being dramatically narrowed to whatever types of items are on the test," Resnick says. "There are also places that five years ago were hardly teaching kids at all, especially poor kids. So now at least they're teaching them something, and it appears this is coming in the wake of high-stakes testing."
Under a new measure against social promotion passed by the Texas legislature, the TAAS will soon affect more than just 10th-graders seeking the state's permission to graduate: 3rd-, 5th-, and 8th-graders will also be required to pass TAAS exams in order to advance to the next grade.
Chicago Hope?
A similar promotional testing requirement has been in place in the Chicago Public Schools since the 1996-1997 school year, and research results are in on its first two years of implementation. Under the policy, Chicago 3rd-, 6th-, and 8th-grade students must achieve a certain cut score on the Iowa Test of Basic Skills (ITBS) in reading and mathematics to advance to the next grade. Because of low baseline test scores among Chicago students, school officials set the score requirements for promotion at one year below grade level for grade three, 1.5 years below grade level for grade six, and 1.8 years below grade level for grade eight. Students who do not meet the required score must attend a six-week program called Summer Bridge and repeat the test at the end of the summer. If they fail the test again, they are retained in grade for that year. Students can also attend an extended-day remediation program called Lighthouse. School officials have made exceptions to the policy for students participating in bilingual and special education programs.
In a study called "Ending Social Promotion: Results from the First Two Years," researchers from the Consortium on Chicago School Research have reported some encouraging preliminary results. The data show that ITBS scores have improved significantly under just two years of the policy, with 20 percent more 6th-graders and 21 percent more 8th-graders reaching the minimum cut score in 1997 than in 1995 (before the scores were used as promotion criteria). The evidence suggests that the high stakes of the tests and the remediation programs are in some way combining to help students raise their scores, the researchers say. They also note that the positive news about test scores has brought a great deal of attention to the Chicago policy, including this mention in President Clinton's 1999 State of the Union address: "When we promote a child from grade to grade who hasn't mastered the work, we do that child no favors. It is time to end social promotion in America's schools. Last year in Chicago, they made that decision. . . . I propose to help other communities follow Chicago's lead."
Some educational researchers, however, are less eager to call for an end to social promotion based on these findings. Jay P. Heubert, associate professor of education at Teachers College, reports that the news we get from research about retention is almost all bad: "Nearly all of the research on retention shows that it has strong negative effects on kids," he says. Heubert and Robert M. Hauser, a sociology professor at the University of Wisconsin at Madison, cite numerous studies on the effects of retention in a 1999 National Research Council report they edited, entitled High Stakes: Testing for Tracking, Promotion, and Graduation. The preponderance of studies, they note, links retention to such negative student outcomes as lower levels of academic and social success and a much higher risk of dropping out. (See "Retention vs. Social Promotion: Schools Search for Alternatives," HEL, January/February 1999.)
The Chicago consortium's findings, though impressive in terms of test scores, also seem to suggest that retention may be having a detrimental effect on students. Under the promotion test policy, the researchers note, "only one-fourth of retained 8th-graders and one-third of retained 3rd- and 6th-graders in 1997 made 'normal' progress during the following school year, meaning that they stayed in the school system, were again subject to the policy, and passed the test cutoff the next May."
Like the rise in TAAS scores, the Chicago students' higher scores on the ITBS have also led some researchers to wonder if they represent real gains in academic skill or just improved test-taking ability. The data are inconclusive, but lend some support to the latter hypothesis. According to the consortium's report, "the picture is mixed on whether getting students up to a test-score cutoff in one year allows them to do better the next year." Test-score increases for students participating in the Summer Bridge program, for example, were not followed by improved performance during the subsequent school year and may be the result of "testing effects versus learning gains," the researchers say.
Finally, Heubert notes that the publishers of the ITBS, Riverside Publishing, have said that the tests are invalid for retention and promotion decisions. Heubert says, "Chicago is failing tens of thousands of kids each year, almost all minority and almost all likely dropouts. And the top brass knows the test they're using isn't even valid." Philip Hansen, chief accountability officer for the Chicago Public Schools, disputes that claim: "The tests are used to determine who goes to summer school, and students get three chances to retake the test, with remediation in between, before promotion decisions are made. There's no evidence our promotion policy doesn't work, but those who are philosophically opposed to standardized tests will blast anything we do."
New Guidelines
Even those researchers who are holding high-stakes testing programs up to the closest scrutiny insist they are not against testing; they simply want a more critical review of its results and a more careful consideration of all its consequences. "Used properly, tests can be very helpful. Used poorly, they can do considerable harm," says Heubert.
To provide guidance to educators and policymakers on the fair and appropriate use of testing, the AERA issued a position statement this past summer outlining a set of conditions that should be met by any educational testing program (see sidebar, page 4). These include using more than a single test for making high-stakes decisions about students, the provision of adequate resources and opportunities to learn, the alignment of tests with curriculum, and the full disclosure of the likely negative consequences of testing. The U.S. Department of Education's Office for Civil Rights is also preparing a resource guide on the use of high-stakes testing for educators and policymakers. It will focus on considerations for appropriate test use and the legal ramifications of high-stakes testing, especially those affecting second-language learners and students with disabilities.
"Tests can be a valuable part of a student's education," says Marguerite Clarke, associate director of the National Board on Educational Testing and Public Policy and assistant professor of research at Boston College. "But when they become the driving force behind educational reform, they can become corrupted. In this kind of environment, attention focuses almost exclusively on the test at the expense of other aspects of the education system. High-stakes testing can then lead to low-level learning." That's an outcome a public hungry for accountability may not be able to stomach.
Michael Sadowski, former HEL assistant editor, is a doctoral student at the Harvard Graduate School of Education.
For further information
M. Clarke, W. Haney, and G. Madaus. "High Stakes Testing and High School Completion." Boston: Boston College, National Board on Educational Testing and Public Policy, January 2000.
R.F. Elmore and R. Rothman, eds. Testing, Teaching, and Learning: A Guide for States and School Districts. Washington, DC: National Research Council, 1999.
J.P. Heubert and R.M. Hauser, eds. High Stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: National Research Council, 1999.
L.M. McNeil. "Creating New Inequalities: Contradictions of Reform." Phi Delta Kappan 81, no. 10 (June 2000): 729-734.
G. Natriello and A.M. Pallas. "The Development and Impact of High-Stakes Testing." Paper presented at High Stakes K-12 Testing Conference sponsored by The Civil Rights Project, Harvard University, December 1998. Revised November 1999.
M. Neill et al. Testing Our Children: A Report Card on State Assessment Systems. Cambridge, MA: FairTest, 1997.
M. Roderick, A.S. Bryk, B.A. Jacob, J.Q. Easton, and E. Allensworth. "Ending Social Promotion: Results from the First Two Years." Consortium on Chicago School Research, December 1999.
U.S. Department of Education, Office for Civil Rights. "The Use of Tests When Making High-Stakes Decisions for Students: A Resource Guide for Educators and Policymakers" (draft). July 6, 2000.