I will protect your pensions. Nothing about your pension is going to change when I am governor. - Chris Christie, "An Open Letter to the Teachers of NJ" October, 2009

Monday, January 15, 2018

The End of the Teacher-Bashing, Chris Christie Era


Allow me a few personal thoughts:

Eight years ago, I started this blog in direct response to what I and many other teachers around the state perceived as a climate of teacher bashing brought on by New Jersey's governor, Chris Christie.

Today is Christie's last day as governor. Somehow, we teachers survived.

I'm only being a little hyperbolic when I say this. When I look back on Christie's two terms, I see both a series of policies and a set of attitudes that were -- and are -- a threat to the teaching profession in New Jersey.

- The value of our modest pensions and health care benefits (which are less generous than those found in the private sector) continues to erode, and current retirees have lost their cost-of-living increases. We were already paying a wage penalty for choosing to become teachers. Now, Christie's appointees want us to give up even more of our compensation, even as pension fund managers collect outrageously high fees.

- New Jersey teachers are subject to an innumerate, illogical evaluation system that uses arbitrary weights of error-prone measures of "growth" that appear to be significantly biased. In short, NJDOE under Chris Christie has created an unvalidated mess of a teacher evaluation system that wastes time and money.

- Despite these serious problems with NJ's evaluation system, Christie has worked for years to undermine tenure and other workplace protections for teachers -- which happen to also be protections for taxpayers and students.

- Christie has demeaned the professionalism of educators by consistently appointing people to leadership positions who lack the experience, the qualifications, and the track records necessary for success.

- Christie has promoted the expansion of charter schools, which hire less-experienced teachers at lower pay than hosting public district schools. Many of these charters have serious issues with accountability and transparency, yet Christie enthusiastically supports them. Christie's administration has also turned a blind eye toward charters that clearly do not enroll the same types of students as their hosting public district schools. He has also actively promoted policies that disproportionately affect teachers of color through public school "renewal" and charter school expansion.

- Perhaps most important: Christie has tarnished New Jersey's legacy as a leader in school funding reform by promoting inequitable, inadequate school funding schemes and repeatedly ignoring the state's own law regarding school funding.

Add to all this Christie's bullying, preening, sneering, dismissive, sexist attitude toward teachers -- no, not just their unions, but teachers themselves. 

When you total up all of the above, it really is remarkable New Jersey's teachers are still doing the great work they do every day on behalf of this state's children.

Does our profession need to improve? Of course. But Chris Christie's constant undermining of public schools and public educators has done nothing to improve the evaluation, compensation, work conditions, or prestige of New Jersey's teachers. He has made the profession less attractive, which inevitably means fewer of our best and brightest are considering teaching as a career path.

Phil Murphy has a very difficult job ahead. Years of fiscal irresponsibility -- and, yes, I will be the first to say this recklessness long predates Chris Christie -- have put him in a tough spot. There's little doubt in my mind health benefits for teachers will have to be reconfigured, and it's hard to think of a scenario where we come out ahead. Our policy on state aid to schools needs an overhaul, as does our tax policy.

None of this will be easy. But at least we will have a governor who supports public schools and public school teachers. 

After eight years, I'm looking forward to going to work on Tuesday knowing the governor isn't going to publicly blame me and my fellow teachers for the many problems New Jersey faces that we didn't create and we can't be expected to fix on our own.

Best of luck, Governor Murphy.

And Mr. Christie: have a nice life...

ADDING: As I said, the whole reason for starting Jersey Jazzman was to push back on the anti-teacher rhetoric of the Christie era. I do believe this blog has evolved to become something more than that... and yet this first mission has come to an end.

What's next? Stand by...

UPDATE: You have got to be kidding me...
On his final full day in office, Gov. Chris Christie on Monday signed a controversial bill into law that will increase the pensions of former Camden Mayor Dana Redd -- a Democratic ally -- and some other elected New Jersey officials.
Christie made no statement on the measure, which was one of 150 he took action on before he's set to leave office Tuesday. His office did not immediately return a message seeking comment.
The Democratic-controlled state Legislature fast-tracked the Democratic-sponsored legislation in the final weeks before a new set of lawmakers were sworn in and Christie finished his eight-year tenure.
The new law (S3620) allows some politicians to re-enroll in the state's Public Employees' Retirement System after being kicked out because they switched positions.
Look, I really don't have a problem with anyone getting a pension. I really don't have a problem with this law. But it's utterly, totally hypocritical for Chris Christie, of all people, to sign this on his last day in office after eight years of relentlessly hammering away at the idea that public employee pensions are too generous and must be curtailed.

The excuse that this only affects a few people, and therefore doesn't have a major impact, is completely beside the point. If we can't afford to meet our obligations as a state to workers for work they have already done, how can we possibly justify additional pension spending for a lucky few, no matter how small?

See that quote at the top of this blog?
I will protect your pensions. Nothing about your pension is going to change when I am governor. - Chris Christie, "An Open Letter to the Teachers of NJ" October, 2009
We've been lied to, we've been insulted, and now this -- on Christie's very last day.

I'll tell you something, though: tomorrow, I will join with thousands of New Jersey's teachers, and our cops and firefighters and social workers and DPW workers and everyone else at the municipal, county, state, and school district levels as we head off to work and do our damn best to make this the greatest state in the nation.

Chris Christie was never good enough to lead us. He was never our equal. We are the backbone of the Garden State. We do the work that needs to be done. We are Jersey Strong.

I will always be proud to be a New Jersey public school teacher, a New Jersey union member, and a New Jersey public employee.

I know what I do every day for the people of this state. Every public employee does.

Saturday, January 6, 2018

"Miracle" School "Journalism" and Gorilla Channel Values

UPDATE: We seem to go through this every time I do one of these...

What I show here are the grade-level enrollments each year as a cohort -- the "Class of '17," for example -- passes through a school. This is not student-level data; so far as I know, neither NYSED nor NJDOE publishes attrition, backfilling, and retention data.

Yes, it's possible students are retained, which could make underclass grades larger than upperclass grades. But that raises a series of issues by itself -- is a school keeping students for five years at extra expense to the taxpayers?

The data is what it is. But the burden of proof is not on me -- it's on those who make claims about the effectiveness of certain schools.

I promise I'll get back to Newark in a bit. But before I do, let's talk a bit about education journalism and The Gorilla Channel...

By now, you've probably heard about the internet prank pulled by cartoonist Ben Ward. Ward tweeted out a completely phony "excerpt" from Michael Wolff's new book about Donald Trump, recounting a made-up story about Trump watching videos of gorillas fighting up to 17 hours a day.

It appears that quite a few people fell for the prank and passed on the tweet in ways that made it seem that they thought the story was real. One interpretation of the incident is that too many of us fall too quickly for stories that we desperately want to be true, even if, on their face, they are implausible.*

For me, the timing of all this was fortuitous, because I had just sent out a tweet that captured a paragraph from a blog post by Diane Ravitch asking journalists to be less credulous when covering "miracle" school stories:
Ahem. My take? Journalists should always question stories that involve miraculous claims about test scores and graduation rates. Skepticism should be their default attitude towards claims that sound too good to be true. If at first they take the bait, they will tend to stop digging and become defensive. Those who take the bait will look foolish, and indeed they are. When a school makes outlandish claims about test scores, ask first who graded the tests. Then check the process for excluding and selecting students. Ask whether the school has the same proportion of students with disabilities and English learners as neighborhood schools. Dig deeper. Ask whether it accepts students with cognitive disabilities, or only those with mild learning disabilities. Keep digging. It has been my experience that behind every “miracle” school there is either fraud, dubious practices (e.g., credit recovery), or careful selection and exclusion of students.**
What Diane describes here is painfully common in education journalism: some school official or PR hack (these days, usually from a charter school) calls up the op-ed page editor at a news outlet and pitches a story about a school that "beats the odds." The school almost always has some data point it is using to sell its "success." But the editor or reporter almost never follows through and puts that data into proper context.

What emerges is a story that feels right to folks with reformy predilections, but isn't based on a full and proper accounting of the facts. "Miracle" school stories are like The Gorilla Channel: because people want to believe them, they suspend critical thinking and accept them without subjecting them to appropriate scrutiny.

Which brings us to this recent op-ed from the NY Daily News. Apparently, no one at the News thinks they should check their op-ed writers' claims to see if they actually make any sense:
It’s college acceptance season, and if you’re a senior at one of Democracy Prep’s high schools in New York City chances are you are very happy: No charter network has demonstrated more success getting its students into and through college.
According to the network, last year 189 of the 195 seniors in its three high schools that had graduating classes went on to college. And although the sample size is small (the network has graduated fewer than 400 students), the network estimates that 80% of its graduates either are still in college or have graduated. 
Nearly all Democracy Prep graduates are from low-income families. Nationally, less than 20% of high school graduates below the median of household income receive bachelor's degrees within six years. 
Do you want this to be true? Would it make you feel better if you knew that children from low-income families, growing up in cities whose schools have been under-resourced for years, didn't suffer a penalty in college admissions? Would it make you feel better about our nation's appalling neglect of its most vulnerable children if you could point to a few schools that "beat the odds"?

We know there is systemic, structural inequality and racism in America. We know our schools reflect this, and that educational outcomes are influenced by these ugly societal realities. But here's a story that calls these truths into question. How much do you want to set aside your skepticism? Enough to keep you from examining the facts?

Ok, then...

According to Democracy Prep's website, the network runs five high schools in New York City:

  • Democracy Prep Charter High School
  • Democracy Prep Harlem High School
  • Bronx Prep High School
  • Democracy Prep Endurance High School
  • Harlem Prep High School

I am using NY State Education Department data for this analysis. I'll start by aggregating these five schools and asking a basic question: How many students who start as freshmen at Democracy Prep stay until their senior year?

This is important: if a school sheds students as they move from year to year, it suggests that school isn't placing all of its students in college -- it's only placing those who stayed for the full four years. What does the data show?

Year after year, Democracy Prep's charter schools shed substantial numbers of students between their freshman and senior year.

Is a school really placing all of its students into college if it loses substantial numbers of those students before their senior year? Is it fair to make a comparison of college placement rates between Democracy Prep and other high schools enrolling large numbers of low-income students if:
  1. Democracy Prep only enrolls students who apply to be there?
  2. Many students leave Democracy Prep before their senior year?
Keep in mind that the author of this piece, Charles Sahm, claims that cohort attrition is not an issue at Democracy Prep:
Unlike other charters that don’t accept new students after a certain grade, Democracy Prep — which operates elementary, middle, and high schools — takes in new students whenever a new spot opens up, even in later grades. [emphasis mine]
Maybe they do -- but Democracy Prep does not come close to replacing all of the students it sheds from year to year.

Did anyone at the News think to check Sahm's claim to see if it was both accurate and put into proper context? Did they think they have an obligation to their readers to make sure that relevant facts were being presented? Did they stop to consider that anyone making a bold claim like Sahm's should be required to subject that claim to rigorous scrutiny?

Education journalism in this country is too often shaped by Gorilla Channel values: because newspaper editors and publishers desperately want to believe certain things, they are willing to suspend critical thinking and merely report whatever they are fed.

How desperately do you want to believe in The Gorilla Channel? How desperately do you want to believe in "miracle" schools? Are you willing to have a good-faith argument about education policy?

Or is that too much to ask?

Another "miracle school" op-ed, courtesy of The Gorilla Channel!

ADDING: Sahm notes that Democracy Prep runs schools in other states:
Democracy Prep’s impact is not limited to New York. The network’s high school in Camden, N.J., is graduating its first class this year. All 34 of this year’s seniors have already received at least one college acceptance. This is revolutionary change for a city that five years ago only had three students graduate high school “college ready,” according to the College Board.
As luck would have it, I happen to have some familiarity with NJ education data...

Freedom Prep, the Democracy Prep affiliate in Camden, NJ, has not yet graduated a class. But we can see the cohort sizes up until last year.

This year's seniors at Freedom Prep were in a class of 78 freshmen. By the fall of their junior year, they were down to 33 students. That means 58% of the freshmen in the Class of 2018 at Freedom Prep had left by their junior year.
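The arithmetic behind that 58% figure is simple enough to sketch. (The `cohort_attrition` helper below is my own illustration, not anything from the state data files.)

```python
def cohort_attrition(start_count, later_count):
    """Percent of an entering cohort no longer enrolled by a later year."""
    return 100 * (start_count - later_count) / start_count

# Freedom Prep's Class of 2018 cohort: 78 freshmen, 33 juniors
print(round(cohort_attrition(78, 33)))  # prints 58
```

Note this is cohort shrinkage, not student-level attrition: without student-level data, we can't say exactly who left or who backfilled, only that the net count fell.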

Shouldn't this be discussed?

ADDING MORE: Gary Rubinstein calls out the Gorilla Channel values of the press's coverage of Success Academy. It ain't pretty...

* OK, look: I will be the first to say Donald Trump is unfit for office. But come on...

** Emphasis mine; lightly edited for clarity.

Tuesday, January 2, 2018

Only You Can Prevent School Finance Ignorance!

We interrupt our regularly scheduled, fact-based analysis of Newark's schools for a good, old-fashioned school finance debunking...

I'm thinking about trying to convince @SchlFinance101 to start a School Finance Hall of Shame, where we would regularly acknowledge great feats of public ignorance in the field of education fiscal policy.

My first nominee for 2018 would have to be this piece in today's edition of NJ Spotlight, a monumental display of school finance ignorance. For the academy's consideration, I offer this excerpt:

Senseless school funding

Compounding the problem for middle-class taxpayers is the state’s senseless school-funding plan that shovels truck loads of money to a handful of school districts while leaving suburban taxpayers to fund more than 90 percent of their school needs while they also pay more than 80 percent of other people's school costs. More state school aid money goes to Newark than to all of Monmouth and Ocean counties combined. Morris County, in total, gets less state aid than the City of Passaic; the same goes for Bergen County. Jersey City gets more annual state aid than either Salem, Somerset, Sussex, or Warren counties. Yet, the Legislature refuses to make changes to fairly distribute state aid that comes from the very taxpayers who are paying the most taxes to Trenton.
Where to begin?

First: true greatness in school funding ignorance requires comparisons that are not only specious, but utterly lacking in any attempt at fairness. Take, for example:
Jersey City gets more annual state aid than either Salem, Somerset, Sussex, or Warren counties.
We'll leave aside the many complex problems with this comparison and ask a very, very simple question: Should a small, sparsely populated county get as much aid as a large, densely populated city?

Even if you believe in absolutely insane ideas like "fair funding," where every student gets the same amount of state aid, Jersey City would get more aid than Salem, Sussex, and Warren Counties because Jersey City has more students!

I really should stop here -- this op-ed, which can't even get the most basic mathematical concepts right, clearly doesn't deserve any more attention. But let's use this as a teachable moment and dive into another concept that appears to elude so many who opine so loudly about school finance policy -- tax capacity.

Imagine two houses that are exactly the same in every detail, each costing the same (in the real world they wouldn't*), but located in two different towns. One is in hard-scrabble Palookaville, where it's one of the most expensive properties in town. The other is the cheapest house in Hoity-Toity Village, where McMansions abound.

In other words: one house is located in a relatively property-poor town, while one is in a property-rich community.
Now, for the purposes of this example, we will do something utterly unforgivable and set aside the tons and tons of research that shows students in greater economic disadvantage require more resources to achieve equal educational opportunity. Instead -- just this once -- we will imagine that each town wants to raise equal amounts of revenue per pupil for its schools.

Again, we are setting aside boatloads of research that shows less advantaged students need more resources and, instead, simply imagining what tax rates each town needs to set to get equal funding for its students. As a matter of basic math, property-poor towns must set higher tax rates to raise the same amount of money as property-rich towns.
In Palookaville, houses cost $100K on average. To raise $10K per house, the town has to set a tax rate of 10 percent. But in Hoity-Toity Village, houses cost $1 million on average. To raise $10K per house, they only have to levy a 1 percent property tax.
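The Palookaville/Hoity-Toity arithmetic can be written down in a few lines. (The `required_tax_rate` helper is my own illustration, not any actual state aid formula.)

```python
def required_tax_rate(levy_per_home, avg_home_value):
    """Tax rate (as a percent) needed to raise levy_per_home from the average home."""
    return 100 * levy_per_home / avg_home_value

# Palookaville: $100K homes, $10K per-home levy -> 10% rate
print(required_tax_rate(10_000, 100_000))    # prints 10.0
# Hoity-Toity Village: $1M homes, same levy -> 1% rate
print(required_tax_rate(10_000, 1_000_000))  # prints 1.0
```

Same revenue target, a tenfold difference in the rate each town must levy -- that gap is the whole point of equalization aid.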

Now let's go back to our two equivalent houses. How much does each pay?
Houses in property-poor communities must pay higher tax rates than similar houses in property-rich communities to raise the equivalent amount of revenue.

This disparity is exactly the reason that we have state aid to begin with. If the state didn't step in and try to equalize the different tax bases in communities with varying amounts of property wealth, poor communities would be at a perpetual disadvantage just trying to raise the same amounts of revenue.

Back in 2016, Ajay Srikanth and I broke this phenomenon down in detail when we critiqued Chris Christie's "Fairness Formula." I wish I could say we were making an original point -- we weren't. Because this is one of the most basic ideas in public policy: Wealthier communities have a greater capacity to generate revenues than less-wealthy communities. If you don't understand this, you have no business opining about... well, anything.

Now, I will be the first one to say that New Jersey, and all other states, should take a hard look at how they determine the taxing capacity of school districts before they implement their school funding formulas. In New Jersey, there is a legitimate argument to be made that some -- some -- communities getting large amounts of state aid should pitch in a greater share of local taxes.

But ignorance like this is keeping us from having the serious conversation we need to have. Stop the madness.

Sadly typical school finance roundtable.

* As one of America's best economists, Leah Platt Boustan, points out, people understand that living in a property-wealthy community can lower their tax rate. So they will pay more for their house to live there. The houses in our example wouldn't cost the same; the one in the wealthy community would cost more. And it would be worth it: better services, lower tax rates.

You get this, right? If not, I'll keep trying...

Saturday, December 30, 2017

Test Score Gains Are Not Necessarily a Sign of Better Instruction: A Cautionary Tale From Newark

This post is part of a series on recent research into Newark schools and education "reform."

Here's Part I.

Here's Part II.

* * *

In this series, I've been breaking down recent research about Newark, NJ's schools. Reformy types have been attempting to make the case that "reforms" in Newark over the past several years -- including charter school expansion, merit pay, Common Core alignment, school closures, and universal enrollment -- have led to gains in student learning. These "reforms" are purportedly the result of Facebook CEO Mark Zuckerberg's high-profile, $100 million grant to the city's schools back in 2010.

Zuckerberg recently funded a study, published by the Center for Education Policy Research at Harvard University this past fall, that shows a gain in "value-added" on tests for Newark compared to the rest of the state. (A technical paper, published by the National Bureau of Economic Research, is found here.)

Bruce Baker and I looked carefully at this study, and added our own analysis of statewide data, to produce a review of this research. One of our most important findings is that most of the "gains" -- which are, in our opinion, educationally small anyway (more on this later) -- can be tied to a switch New Jersey made in 2015 from the NJASK statewide exams to the newer PARCC exams.

As I noted in the last post, even the CEPR researchers suggest this is the most likely explanation for the "gains."
Assuming both tests have similar levels of measurement error, this implies that the PARCC and NJASK were assessing different sets of skills and the districts that excelled in preparing students for PARCC were not necessarily the same as the districts that excelled at preparing students for NJASK. Thus, what appears to be a single-year gain in performance may have been present before 2015, but was simply undetected by earlier NJASK tests. (p. 22, NBER, emphasis mine)
As I pointed out last time, there has never been, to my knowledge, any analysis of whether the PARCC does a better job measuring things we care about compared to the NJASK. So, while the PARCC has plenty of supporters, we really don't know if it's any better than the old test at detecting "good" instructional practices, assuming we can hold things like student characteristics constant.

But even if we did have reason to believe the PARCC was a "better" test, I still would find the sentence above that I bolded to be highly problematic. Let's look again at the change in "value-added" that the CEPR researchers found (p. 35 of the NBER report, with my annotations):

"Value-added" -- ostensibly, the measure of how much the Newark schools contributed to student achievement gains -- was trending downward prior to 2014 in English language arts. It then trended upward after the change to the new test in 2015. But the CEPR authors say that the previous years may have actually been a time when Newark students would have been doing better, if they had been taking the PARCC instead of the NJASK.

The first problem with this line of thinking is that there's no way to prove it's true. But the more serious problem is that the researchers assume, on the basis of nothing, that the bump upwards in value-added represents real gains, as opposed to variations in test scores which have nothing to do with student learning.

To further explore this, let me reprint an extended quote we used in our review from a recent book by Daniel Koretz, an expert on testing and assessment at Harvard's Graduate School of Education. The Testing Charade should be required reading for anyone opining about education policy these days. Koretz does an excellent job explaining what tests are, how they are limited in what they can do, and how they've been abused by education policy makers over the years.

I was reading Koretz's book when Bruce and I started working on our review. I thought it was important to include his perspective, especially because he explicitly takes on the writings of Paul Bambrick-Santoyo and Doug Lemov, who both just happen to hold leadership positions at Uncommon Schools, which manages North Star Academy, one of Newark's largest charter chains.

Here's Koretz:

One of the rationales given to new teachers for focusing on score gains is that high-stakes tests serve a gatekeeping function, and therefore training kids to do well on tests opens doors for them. For example, in Teaching as Leadership[i] – a book distributed to many Teach for America trainees – Steven Farr argues that teaching kids to be successful on a high-stakes test “allows teachers to connect big goals to pathways of opportunity in their students’ future.” This theme is echoed by Paul Bambrick-Santoyo in Leverage Leadership and by Doug Lemov in Teach Like a Champion, both of which are widely read by new teachers. For example, in explaining why he used scores on state assessments to identify successful teachers, Lemov argued that student success as measured by state assessments is predictive not just of [students’] success in getting into college but of their succeeding there. 
Let’s use Lemov’s specific example to unpack this. 
To start, Lemov has his facts wrong: test scores predict success in college only modestly, and they have very little predictive power after one takes high school grades into account. Decades of studies have shown this to be true of college admissions tests, and a few more recent studies have shown that scores on states’ high-stakes tests don’t predict any better. 
However, the critical issue isn’t Lemov’s factual error; it’s his fundamental misunderstanding of the link between better test scores and later success of any sort (other than simply taking another similar test). Whether raising test scores will improve students’ later success – in contrast to their probability of admission – depends on how one raises scores. Raising scores by teaching well can increase students’ later success. Having them memorize a couple of Pythagorean triples or the rule that b is the intercept in a linear equation[ii] will increase their scores but won’t help them a whit later. 
Some of today’s educators, however, make a virtue of this mistake. The[y] often tell new teachers that tests, rather than standards or a curriculum, should define what they teach. For example, Lemov argued that “if it’s ‘on the test,’ it’s also probably part of the school’s curriculum or perhaps your state standards… It’s just possible that the (also smart) people who put it there had a good rationale for putting it there.” (Probably? Perhaps? Possible? Shouldn’t they look?) Bambrick-Santoyo was more direct: “Standards are meaningless until you define how to assess them.” And “instead of standards defining the sort of assessments used, the assessments used define the standard that will be reached.” And again: “Assessments are not the end of the teaching and learning process; they’re the starting point.” 
They are advising new teachers to put the cart before the horse.[iii] [emphasis mine; the notes below are from our review]
Let's put this into the Newark context:

  • One of the most prominent "reforms" in Newark has been the closing of local public district schools while moving more students into charter schools like North Star.
  • By their own admission, these schools focus heavily on raising test scores.
  • The district also claims it has focused on aligning its curriculum with the PARCC (as I point out in our review, however, there is little evidence presented to back up the claim).
  • None of these "reforms," however, are necessarily indicators of improved instruction.
How did Newark get its small gains in value-added, most of which were concentrated in the year the state changed its tests? The question answers itself: the students were taught with the goal of improving their test scores on the PARCC. But those test score gains are not necessarily indicative of better instruction. 

As Koretz notes in other sections of his book, "teaching to the test" can take various forms. One of those is curricular narrowing: focusing on tested subjects at the expense of instruction in other domains of learning that aren't tested. Did this happen in Newark?

More to come...

[i] Farr, S. (2010). Teaching as leadership; The highly effective teacher’s guide to closing the achievement gap. San Francisco: Jossey-Bass. We note here that Russakoff reports that Teach for America received $1 million of the Zuckerberg donation “to train teachers for positions in Newark district and charter schools.” (Russakoff, D. (2016). The Prize; Who’s in charge of America’s schools? New York, NY: Houghton Mifflin Harcourt. p. 224)
[ii] A “Pythagorean Triple” is a memorized set of whole-number side lengths that satisfies the Pythagorean theorem for a right triangle. Koretz critiques the linear intercept rule, noting that b is often taught as the intercept of an equation in high school, but is usually a coefficient of an equation in college courses. In both cases, Koretz contends test prep strategies keep students from gaining a full understanding of the concepts being taught. See: Koretz, D. (2017) The testing charade; Pretending to make schools better. Chicago, IL: University of Chicago Press. pp. 104-108.
[iii] Koretz, D. (2017) The testing charade; Pretending to make schools better. Chicago, IL: University of Chicago Press. pp. 114-115.

Tuesday, December 19, 2017

What Are Tests Really Measuring? A Tale of Education "Reform" in Newark

This post is part of a series on recent research into Newark schools and education "reform."

Here's Part I.

"What is a test, and what does it really measure?"

I often get the sense that more than a few stakeholders and policy makers in education don't take a lot of time to think carefully about this question.

There aren't many people who would claim that a test score, by itself, is the ultimate product of education. And yet test scores dominate discussions of education policy: if your beloved program can show a gain in a test outcome, you're sure to cite that gain as evidence in favor of it.

That's what's been happening in Newark, New Jersey these days. As I said in my last post, research published this past fall by the Center for Education Policy Research at Harvard University purportedly showed a gain in "value-added" on tests for Newark compared to the rest of the state. The researchers have attempted to make the case that a series of reforms, initiated by a $100 million grant from Mark Zuckerberg, prompted those gains. (A more technical version of the research, published as a working paper by the National Bureau of Economic Research, is found here.)

To make their case, the CEPR researchers do what many others have done: take test scores from students, input them into a sophisticated statistical model, and compare the gains for various groups. To be clear, I do think using test scores this way is fine -- to a point.

Test outcomes can and often do contain useful information that, when properly used, tell us important things. But we always have to remember that a test is a sample of knowledge or ability at a particular point in time. Like all samples, test outcomes are subject to error. Give a child who ate a good breakfast and got enough sleep a test in a quiet room with the heat set properly, and you'll get one score. Give that same child the same test but on an empty stomach in a freezing cold room, and you'll almost certainly get something else.
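The point about sampling error can be made concrete with a toy simulation. Every number below is invented for illustration; the only claim is the structural one: an observed score is the student's true ability plus error from conditions that have nothing to do with ability.

```python
import random

def administer_test(true_ability, noise_sd=5.0, rng=None):
    """Simulate one test administration: the observed score is the
    student's true ability plus error from conditions unrelated to
    ability (sleep, hunger, room temperature, luck on the items)."""
    rng = rng or random.Random()
    return true_ability + rng.gauss(0, noise_sd)

rng = random.Random(42)
true_ability = 70.0  # the same child every time

# Ten administrations of the "same" test to the same child.
scores = [administer_test(true_ability, rng=rng) for _ in range(10)]

# The child's ability never changed, but the observed scores did.
print([round(s, 1) for s in scores])
print("spread:", round(max(scores) - min(scores), 1))
```

The spread across administrations is pure measurement error; nothing about the child changed between them.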

The variation in outcomes here illustrates a critical point: Often the scores on a test vary because of factors that have nothing to do with what the test is trying to measure. Psychometricians will often talk about construct validity: the extent to which a test is measuring what it is supposed to be measuring. Making a valid test requires not only creating test items that vary based on what we're trying to measure; it also requires defining what we're trying to measure.

Take, for example, New Jersey's statewide assessments in Grades 3 through 8 -- assessments required by federal law. For a number of years, the state administered the NJASK: the New Jersey Assessment of Skills and Knowledge. It was a paper-and-pencil test that assessed students in two domains: math and English language arts (ELA).

Those are very big domains. What, exactly, comes under ELA? Reading and writing, sure... but reading what? Fiction? Informational texts? Toward what end? Comprehension, sure... but what does that mean? How does anyone demonstrate they comprehend something? By summarizing the text, or by responding to it in an original way? Is there a foolproof way to show comprehension? And at what level?

These questions aren't merely a philosophical exercise -- they matter when building a test. What goes into the construct we are trying to measure? And, importantly, do the tests we give vary based on what we intend to use the tests to measure?

In the case of the recent Newark research, the economists who conducted the study made an assumption: they believed the test scores they used vary based on the actions of school systems, which implement programs and policies of various kinds. They assumed that after applying their models -- models that attempt to strip away differences in student characteristics and abilities to learn -- the remaining variation in outcomes can be attributed to things Newark's publicly financed schools, including the charter schools, do that differ from schools in other parts of the state.

It's a big assumption. It requires showing that the policies and programs implemented can be documented and, if appropriate, measured. It requires showing that those policies and programs only took place in Newark. And it requires making the argument that the variation found in test outcomes came only from those policies and programs -- what social scientists would call the treatment.
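To see the kind of reasoning a value-added model embodies, here is a deliberately crude sketch on simulated data. This is not the CEPR model -- their models are far richer -- and every number below is made up; the point is only the logic: regress current scores on prior scores and demographics, and read the coefficient on a district indicator as that district's "effect."

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical student data: prior-year score, a poverty indicator,
# and a flag for the district of interest vs. the comparison group.
prior = rng.normal(50, 10, n)
poverty = rng.binomial(1, 0.5, n)
district = rng.binomial(1, 0.3, n)

# Simulated current score: depends on prior achievement and poverty,
# plus a small district effect (the "treatment," set to 2.0) and noise.
current = 0.8 * prior - 3.0 * poverty + 2.0 * district + rng.normal(0, 5, n)

# The value-added logic: regress current scores on prior scores and
# demographics plus the district indicator. The district coefficient
# is read as the district's "effect" -- a valid reading only if nothing
# else relevant differs between the district and its comparisons.
X = np.column_stack([np.ones(n), prior, poverty, district])
beta, *_ = np.linalg.lstsq(X, current, rcond=None)
print("estimated district effect:", round(beta[3], 2))
```

Here the estimate lands near the true effect of 2.0 only because the simulation was built so that the district flag is the sole remaining difference -- exactly the assumption the post says the researchers must defend for real-world Newark.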

Further, this assumption requires making yet another assumption:

In 2015, New Jersey switched its statewide exam from the NJASK to the PARCC: the Partnership for Assessment of Readiness for College and Careers. PARCC is (mostly) a computerized exam. Its supporters often claim it's a "better" exam, because, they say, it measures things that matter more. I'm not going to get into that debate now, but I will note that, so far as I know, no one ever conducted a validity study of the PARCC compared to the NJASK. In other words: we're not sure how the two tests differ in what they measure.

What I can say is that everyone agrees the two exams are different. From what I've seen and heard from others, the PARCC math exam relies more on language skills than the NJASK math exam did, requiring students to do more verbal problem solving (which would put non-native English speakers at a disadvantage). The PARCC ELA exam seems to put more emphasis on writing than the NJASK, although how that writing is graded remains problematic.

Keeping this in mind, let's look at this graph from the CEPR research (p.35):

Until 2014, Newark's test score "growth" is pretty much the same as the other Abbott districts in the state. The Abbotts are a group of low-income districts that brought the famous Abbott v. Burke lawsuit, which forced the state toward more equitable school funding. They stand as a comparison group for Newark, because they have similar students and got similar test outcomes...

Until 2015. The Abbotts, as a group, saw gains compared to the rest of the state -- but Newark saw greater gains. Whether the size of those gains is educationally significant is something we'll talk about later; for right now, let's acknowledge they are statistically significant.

But why did they occur? Let me annotate this graph:

Newark's gains in "growth," relative to other, similar New Jersey districts, occurred in the same year the state switched exams.

And it's not just the CEPR research that shows this. As Bruce Baker and I showed in our review of that research, the state's own measure of growth, called Student Growth Percentiles (SGPs), also show a leap in achievement gains for Newark in the same year.
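For readers unfamiliar with SGPs, the basic idea is to rank a student's current score against peers who had similar prior scores. Real SGPs are computed with quantile regression over multiple prior years; this toy version, with invented scores and a crude peer-matching rule, only illustrates the concept:

```python
from bisect import bisect_left

def growth_percentile(student_prior, student_current, cohort):
    """Toy Student Growth Percentile: the percentile rank of a
    student's current score among peers with similar prior scores.
    (Real SGPs use quantile regression; this bins peers crudely.)"""
    # Peers: students whose prior score is within 2 points.
    peers = sorted(cur for prior, cur in cohort
                   if abs(prior - student_prior) <= 2)
    if not peers:
        return None
    rank = bisect_left(peers, student_current)
    return round(100 * rank / len(peers))

# Hypothetical cohort of (prior score, current score) pairs.
cohort = [(50, 52), (50, 55), (51, 48), (49, 60), (50, 58),
          (51, 53), (49, 50), (50, 62), (51, 57), (50, 54)]

# A student who started at 50 and now scores 59 outgrew most peers.
print(growth_percentile(50, 59, cohort))  # → 80
```

Because an SGP is defined entirely in terms of test scores, anything that shifts scores for construct-irrelevant reasons -- like a change in the test itself -- shifts the "growth" measure too.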

Again, the red line is the dividing point between the NJASK and the PARCC. In this case, however, we break down the districts into the Newark Public Schools, Newark's charter schools, and only those Abbotts in the same county as Newark. The districts close to Newark with similar demographics had similar gains in achievement "growth."

Let's step back and remember what the CEPR study was trying to understand: how a series of policies, initiated by Zuckerberg's donation, affected test score growth in Newark. What would we have to assume, based on this evidence, to believe that's true?
  • That the Newark reforms, which began in 2011, didn't kick in until 2015, when they suddenly started affecting test scores.
  • That the gains in the other Essex County Abbott districts (Irvington, Orange, and East Orange) were caused by some other factor completely separate from anything affecting Newark.
  • That the switch from the NJASK to the PARCC didn't create any gains in growth that were unrelated to the construct the tests are purportedly measuring.

Test makers will sometimes refer to the concept of construct-irrelevant variance: the idea that test outcomes vary because factors we do not want to measure still affect the scores. If two children with equal mathematical ability take a computerized test, but one has greater facility in using a computer, their test scores will differ. The problem is that we don't want their scores to differ, because we're trying to measure math ability, not familiarity with computers.
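A toy simulation makes the hazard visible. All numbers are invented: two groups are given identical math ability, but one group is less at ease with the computerized testing interface, and that alone opens a score gap.

```python
import random

rng = random.Random(1)

def computerized_test_score(math_ability, computer_familiarity):
    """Observed score mixes the construct (math ability) with an
    irrelevant factor (ease with the testing interface)."""
    interface_penalty = (1.0 - computer_familiarity) * 8.0
    return math_ability - interface_penalty + rng.gauss(0, 2)

# Two groups with IDENTICAL math ability (70) but different
# familiarity with computers.
group_a = [computerized_test_score(70, 0.9) for _ in range(200)]
group_b = [computerized_test_score(70, 0.5) for _ in range(200)]

mean_a = sum(group_a) / len(group_a)
mean_b = sum(group_b) / len(group_b)

# Group A appears several points "better at math" -- but the gap is
# pure construct-irrelevant variance, since both groups' math ability
# is the same by construction.
print(round(mean_a - mean_b, 1))
```

Any model that treats the gap as a difference in math achievement -- or in school quality -- is measuring the wrong thing.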

Did Newark's students -- and Orange's and East Orange's and Irvington's -- do better on the PARCC simply because they felt more at ease with the new PARCC test than students around the rest of the state? Did these districts engage in test prep activities specific to the PARCC that brought scores up, but didn't necessarily reflect better instruction?

The CEPR study admits this is likely:
Assuming both tests have similar levels of measurement error, this implies that the PARCC and NJASK were assessing different sets of skills and the districts that excelled in preparing students for PARCC were not necessarily the same as the districts that excelled at preparing students for NJASK. Thus, what appears to be a single-year gain in performance may have been present before 2015, but was simply undetected by earlier NJASK tests. (p. 22, NBER, emphasis mine)
I'll get into that last sentence more in a future post. For now, it's enough to note this: Even the CEPR team acknowledges that the most likely explanation for Newark's gains is the state's switch from the NJASK to the PARCC. But aligning instruction with one test more than another is not the same as providing better instruction. 

Gains like these are not necessarily an indication of curricular or instructional improvements. They are not necessarily brought about by moving students into "better" schools. They could very easily be the result of the tests measuring different things that we don't really want them to measure.

We'll talk more about this -- and get the views of a Harvard education expert -- next.