Reading and math scores for the nation’s 12th graders have stagnated since 2009, according to new data published today, prompting U.S. Secretary of Education Arne Duncan to call for an overhaul of the nation’s high school model and amplified efforts to narrow the achievement gap for minority students. But how motivated are students to try their best on the “Nation’s Report Card,” a low-stakes test that has no bearing on their academic records? And what might happen to scores if those stakes were raised?
There’s been plenty of pushback in recent months from parents and teachers who say students already take too many tests, and some states have been reconsidering the requirements. But the National Assessment of Educational Progress, administered every two years to a representative sampling of students in grades 4, 8 and 12, is a unique situation. Because each state otherwise uses its own mix of assessments, NAEP (along with the SAT and ACT college entrance exams) represents one of the few ways of making comparisons nationally on student performance.
My colleague Mikhail Zinshteyn has a full breakdown of the NAEP results, and I encourage you to take a closer look. One takeaway that jumped out at me was how strongly a parent’s educational attainment level influenced their child’s performance. When compared with their classmates who scored below the 25th percentile in reading and mathematics, students who performed above the 75th percentile were more than twice as likely to have parents who completed college.
That being said, it’s important to remember the statistician’s adage that correlation is not causation. NAEP offers a snapshot of 12th graders’ performance on one assessment. What it doesn’t tell us is why scores flatlined, or which initiatives or interventions helped a particular subgroup to improve. (Stephen Sawchuk of Education Week is required reading for a pointed warning about misuses of NAEP data by education advocates seeking to bolster their own positions.)
I’d like to circle back to the question of student motivation, and how it might be influencing NAEP results. One claim that’s frequently made is that high schoolers focus on the assessments they know could directly affect their academic records: the exit exam many states require for graduation and the college entrance exams (ACT and SAT). Students know a poor showing on NAEP won’t have consequences for them, so they give it less effort – or so the argument goes.
But what if there were an incentive for students to do well on NAEP? That’s the question Boston College researcher Henry Braun posed in a 2011 study, conducted with the Educational Testing Service. The study involved 2,600 students attending 59 schools in seven states, who were broken up into three groups. (As the researchers point out, the study’s results aren’t directly comparable to NAEP’s because the schools selected ended up having slightly higher-achieving student populations than what’s found in NAEP’s sampling.)
The control group received no incentives. The second group of students was offered a $20 gift card at the outset of the test. And the third group received $5 up front, plus an additional $15 for correct answers on each of two randomly selected questions.
Reading scores went up when money was offered, both overall and for most of the student subgroups, including those defined by gender and race. The gains were larger for the students who were offered the contingent incentive based on whether they answered particular questions correctly. For boys and girls both, the gain in the reading score was the equivalent of more than five NAEP points, which is considered statistically significant.
But the report also found that the incentives didn’t have as big an effect on students who were already struggling academically. That’s telling, said Cornelia Orr, executive director of the National Assessment Governing Board, which oversees the administration of NAEP. “If you don’t have the foundational knowledge going into a test, an incentive isn’t going to make much difference,” Orr told me.
In a call with reporters Tuesday, Orr said, “I think that it is a little bit more of an urban myth about students just blowing off the test when they sit down with it. Because there is really no evidence that students are blowing off this test.” I followed up with her on that point, and she said there is a need for more investigation – and fewer assumptions – about how motivation is really influencing the NAEP results. A forthcoming report from the National Center for Education Statistics that will examine 12th graders’ motivation and engagement on NAEP is expected to help advance that conversation, Orr said.
In Tennessee, educators have drawn attention for a campaign urging students in grades 4 and 8 to try harder on NAEP, complete with a motivational video. According to reporting by The Tennessean, the state teachers’ union contends that those efforts, rather than aggressive reforms put in place by Gov. Bill Haslam, are responsible for students’ improved performance on the assessment.
Orr said that video – which featured then-Tennessee Titans quarterback Matt Hasselbeck and the state’s First Lady Crissy Haslam – was “just one part” of a larger push to improve student achievement at all levels, not just on NAEP.
“The question will be whether next year’s fourth graders are doing as well,” Orr said. “That would be a clue that the underlying educational programs are sustaining that growth.”
Jack Buckley, the former commissioner of the National Center for Education Statistics (which administers NAEP), who is now with the College Board, told me Tuesday that there are certainly risks to increasing an assessment’s stakes. The accuracy of the assessment is influenced by the amount of accountability pressure that’s placed on it, he said.
While there are always going to be people who will try to cheat on assessments, “as the stakes go up on tests so do those numbers,” Buckley said. “But there can be problems at the other end of the spectrum as well. If there’s no stakes at all, there’s no motivation and students might not even try.”
The challenge for NAEP is to strike a careful balance, Buckley said.
“How much pressure is going to be put on the data, and for what purpose?” Buckley asked. “You need just enough pressure not to distort the outcome – that’s the ideal measure.”