Roberto and Brad, it was a pleasure hearing your commentary at the February 20 Policy Forum “Using Evidence for a Change” and having a chance to meet you afterward. Roberto, we promised you a note summarizing the views expressed by several on the panel and raised in the question period.
We can contrast two views of research evident at the policy forum:
The first view holds that, because research is so expensive and difficult, only the federal government can afford it and only highly qualified professional researchers can be entrusted with it. The goal of such research activities is to obtain highly precise and generalizable evidence. In this view, practitioners (at the state, district, or school level) are put in the role of consumers of the evidence.
The second view holds that research should be made a routine activity within any school district contemplating a significant investment in an instructional or professional development program. Since all the necessary data are readily at hand (and without FERPA restrictions), it is straightforward for district personnel to conduct their own simple comparison group study. The result would be reasonably accurate local information on the program’s impact in that setting. In this view, practitioners are producers of the evidence.
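As a rough illustration (a sketch, not a prescription), here is the sort of simple comparison group analysis a district analyst might run on existing assessment records. The file name and column names (“group”, “post_score”) are hypothetical, and a real study would also need to check baseline equivalence and how students ended up in each group.

    import pandas as pd
    from scipy import stats

    # Hypothetical extract of district records: one row per student, with a
    # "group" column ("program" or "comparison") and a post-test score.
    scores = pd.read_csv("district_assessment_records.csv")
    program = scores.loc[scores["group"] == "program", "post_score"]
    comparison = scores.loc[scores["group"] == "comparison", "post_score"]

    # Welch t-test for the difference in means between the two groups.
    t_stat, p_value = stats.ttest_ind(program, comparison, equal_var=False)

    # Standardize the difference by the pooled within-group standard deviation
    # to get a rough local effect size in student-level SD units.
    pooled_var = (((len(program) - 1) * program.var(ddof=1)
                   + (len(comparison) - 1) * comparison.var(ddof=1))
                  / (len(program) + len(comparison) - 2))
    effect_size = (program.mean() - comparison.mean()) / pooled_var ** 0.5

    print(f"difference = {program.mean() - comparison.mean():.2f} points, "
          f"effect size = {effect_size:.2f} SD, p = {p_value:.3f}")

The arithmetic is the easy part; the real work for district personnel is assembling a comparison group that makes the estimate credible.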
The approach suggested by the second view is far more cost-effective than the first, as well as more timely. It is also driven directly by the immediate needs of districts. While each individual study would pertain only to a local implementation, in combination, hundreds of such studies can be collected and published by organizations like the What Works Clearinghouse or by consortia of states or districts. Turning practitioners into producers of evidence also removes the brakes on innovation identified in the policy forum. With practitioners as evidence producers, schools can adopt “unproven” programs as long as they do so as a pilot that can be evaluated for its impact on student achievement.
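To see how hundreds of such local estimates could add up to something more general, here is a minimal sketch of the simplest pooling method, a fixed-effect inverse-variance average. The numbers are invented for illustration; a real synthesis by the What Works Clearinghouse or a consortium would also screen study quality and examine how much results vary across sites.

    # Each local study contributes an effect size (in student-level SD units)
    # and a standard error for that estimate. These values are made up.
    local_studies = [
        (0.25, 0.15),
        (0.10, 0.20),
        (0.30, 0.18),
    ]

    # Weight each study by the inverse of its squared standard error, so that
    # more precise local studies count for more in the pooled estimate.
    weights = [1 / se ** 2 for _, se in local_studies]
    pooled = sum(w * es for (es, _), w in zip(local_studies, weights)) / sum(weights)
    pooled_se = (1 / sum(weights)) ** 0.5

    print(f"pooled effect = {pooled:.2f} SD (standard error {pooled_se:.2f})")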
A few tweaks to NCLB will be necessary to turn practitioners into producers of evidence:
1. Currently, NCLB implicitly takes the “practitioners as consumers of evidence” view in requiring that the scientifically based research be conducted prior to a district’s acquisition of a program. We have already published a blog entry analyzing the changes to the SBR language in the Miller-McKeon and Lugar-Bingaman proposals and how minor modifications could remove the implicit “consumers” view. These are small tweaks, such as changing a phrase that calls for:
“including integrating reliable teaching methods based on scientifically valid research”
to a call for
“including integrating reliable teaching methods based on, or evaluated by, scientifically valid research.”
2. Make clear that a portion of the program funds is to be used in piloting new programs so they can be evaluated for their impact on student achievement. Consider a provision similar to the “priority” idea that Nina Rees persuaded ED to use in awarding its competitive programs.
3. Build in a waiver provision such as that proposed by the Education Sciences Board that would remove some of the risk to a failing district in piloting a promising new program. This “pilot program waiver” should shield the participating schools from the consequences of failure for the period of the pilot. The waiver should also remove requirements that NCLB program funds be used only for the lowest-scoring students, since this would preclude having the control group needed for a rigorous study.
The view of “practitioners as consumers of evidence” is widely unpopular. It is viewed by decision-makers as inviting the inappropriate construction of an approved list, as was revealed in the Reading First program. It is seen as restricting local innovation by requiring compliance with the proclamations of federal agencies. In the end, science is reduced to a check box on the district requisition form. If education is to become an evidence-based practice, we have to start with the practitioners. —DN
Monday, October 15, 2007
Congress Grapples with the Meaning of “Scientific Research”
Good news and bad news. As reported recently in Education Week (Viadero, 2007, October 17), pieces of legislation currently being put forward contain competing definitions of scientific research. The good news is that we may finally be getting rid of the obtuse and cumbersome term “Scientifically Based Research.” Instead we find some of the legislation using the ordinary English phrase “scientific research” (without the legalese capitalization). So far, the various proposals for NCLB reauthorization are sticking with the idea that school districts will find scientific evidence useful in selecting effective instructional programs and are mostly just tweaking the definition.
So why is the definition of scientific research important? This gets to the bad news. It is important because the definition—whatever it turns out to be—will determine which programs are, in effect, on an approved list for purchase with NCLB funds.
Let’s take a look at two candidate definitions, just focusing on the more controversial provisions.
* The Education Sciences Reform Act of 2002 says that research meeting its “scientifically based research standards” makes “claims of causal relationships only in random assignment experiments or other designs (to the extent such designs substantially eliminate plausible competing explanations for the obtained results).”
* However, the current House proposal (the Miller-McKeon Draft) defines “principles of scientific research” as guiding research that (among other things) makes “strong claims of causal relationships only in research designs that eliminate plausible competing explanation for observed results, which may include but shall not be limited to random assignment experiments.”
Both say essentially the same thing, but the new wording takes the primacy off random assignment and puts it on eliminating plausible competing explanations. We see the change as a concession to researchers who find random assignment too difficult to pull off. These researchers are not, however, relieved of the requirement to eliminate competing explanations (for which randomized control remains the most effective method). Meanwhile, another bill, introduced recently by Senators Lugar and Bingaman, takes a radically different approach to a definition.
* This bill defines what it means for a reading program to be “research-proven” and proposes the requirements for the actual studies that would “prove” that the program is effective. Among the minimum criteria described in the proposal are the following (a rough translation of these thresholds into code appears after the list):
  * The program must be evaluated in not less than two studies in which:
    * The study duration is not less than 12 weeks.
    * The sample size of each study is not less than five classes or 125 students per treatment (10 classes or 250 students overall). Multiple smaller studies may be combined to reach this sample size collectively.
    * The median difference between program and control group students across all qualifying studies is not less than 20 percent of a student-level standard deviation, in favor of the program students.
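Here is that rough translation of the quoted thresholds into code. The study records are invented, the data format is ours rather than the bill’s, and the sketch ignores the provision for combining smaller studies as well as the proposal’s other requirements.

    from statistics import median

    # Invented study records: duration in weeks, classes and students per
    # treatment arm, and the program-minus-control difference in SD units.
    studies = [
        {"weeks": 14, "classes_per_arm": 6, "students_per_arm": 140, "effect_sd": 0.25},
        {"weeks": 12, "classes_per_arm": 5, "students_per_arm": 130, "effect_sd": 0.18},
    ]

    def meets_minimum_criteria(studies):
        # Keep only studies that clear the duration and sample-size floors.
        qualifying = [s for s in studies
                      if s["weeks"] >= 12
                      and s["classes_per_arm"] >= 5
                      and s["students_per_arm"] >= 125]
        if len(qualifying) < 2:
            return False
        # The median difference across qualifying studies must be at least 0.20 SD.
        return median(s["effect_sd"] for s in qualifying) >= 0.20

    print(meets_minimum_criteria(studies))  # True for the invented records above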
As soon as legislation tries to be this specific, counterexamples immediately leap to mind. For example, we are currently conducting a study of a reading program that meets the last two criteria but, because the program is designed as a 10-week intervention, it can never become research-proven under this definition. Another oddity is that the size of the impact and the size of the sample are specified, but not the level of confidence required: it is unlikely we would have any confidence in a finding of a 0.2 effect size with only 10 classrooms in the study. Perhaps the most unacceptable part of this definition is the term “research-proven.” This is far too strong and absolute. It suggests that as soon as two small studies are completed, the program gets a perpetual green light for district purchases under NCLB.
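To put a number on that concern, here is a back-of-the-envelope power calculation for the scenario just described: a true 0.2 SD student-level difference tested with five classes of about 25 students per arm. The intraclass correlation of 0.2 and the design-effect adjustment are our assumptions for illustration, not figures from the bill or from any particular study.

    from scipy import stats

    def clustered_power(effect_size, classes_per_arm, students_per_class, icc, alpha=0.05):
        """Approximate power of a two-arm comparison, shrinking the student-level
        sample size by the usual design effect for clustering within classes."""
        n_students = classes_per_arm * students_per_class      # per arm
        design_effect = 1 + (students_per_class - 1) * icc
        n_eff = n_students / design_effect                      # effective n per arm
        df = 2 * n_eff - 2
        ncp = effect_size * (n_eff / 2) ** 0.5                  # noncentrality of the t statistic
        t_crit = stats.t.ppf(1 - alpha / 2, df)
        return stats.nct.sf(t_crit, df, ncp) + stats.nct.cdf(-t_crit, df, ncp)

    print(round(clustered_power(0.2, classes_per_arm=5, students_per_class=25, icc=0.2), 2))
    # roughly 0.1 under these assumptions

Under these assumptions, the chance of detecting a true 0.2 SD effect at the conventional 5 percent level is on the order of one in ten, which is why we would place so little confidence in such a finding.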
As odd as this definition may be, we can understand why it was introduced. The most prevalent interpretation of the requirement for “Scientifically Based Research” in NCLB has been that the program under consideration should have been written and developed based on findings derived from scientific research. It was not required that the program itself have any scientific evidence of effectiveness. The Lugar-Bingaman proposal calls for scientific tests of the program itself. In Reading First, programs that had actual evidence of effectiveness were famously left off the approved list, while programs that simply claimed to be designed based on prior scientific research were put on. This proposal will help to level the playing field. To avoid the traps that open up when specific designs are legislated, perhaps the law could call for the convening of a broadly representative panel to hash out the differences between competing sets of criteria rather than enshrine one abbreviated set in federal law.
But even with consensus on the review criteria for acceptable research (and for explaining the trade-offs to the consumers of the research reviews at the state and local level), we are still left with an approved list: a set of programs with sufficient scientific evidence of effectiveness to be purchased. Meanwhile new programs (books, software, professional development, interventions, etc.) are becoming available every day that have not yet been “proven.”
There is a relatively simple fix that would help democratize the process for states and districts that want to try something because it looks promising but has not yet been “proven” in a sufficient number of other districts. Wherever the law says that a program must have scientific research behind it, also allow the state or district to conduct the necessary scientific research itself, as part of the federally funded activity. So for example, where the Miller-McKeon Draft calls for
“a description of how the activities to be carried out by the eligible partnership will be based on a review of scientifically valid research,”
simply change that to
“a description of how the activities to be carried out by the eligible partnership will be based on a review of, or evaluation using, scientifically valid research.”
Similarly, a call for
“including integrating reliable teaching methods based on scientifically valid research”
can instead be a call for
“including integrating reliable teaching methods based on, or evaluated by, scientifically valid research.”
This opens the way for districts to try things they think should work for them while helping to increase the total amount of research available for evaluating the effectiveness of promising new programs. Most importantly, it turns the static approved list into a process for continuous research and improvement. —DN