Examples of the Panel’s Reasoning In Its Top Tier and Near Top Tier Decisions

These examples are developed by the Panel on an ongoing basis, in cases where the Panel believes that its application of the Top Tier or Near Top Tier standard to a particular intervention required especially careful analysis that should be articulated. The purpose of these examples is: (a) to provide additional guidance to readers and applicants on what qualifies as Top Tier or Near Top Tier, drawn from the Panel’s case-by-case decisions; and (b) to serve as a body of precedents to help guide the Panel’s future reviews and decisions. 

Example #1 – Reasoning on (a) whether one multi-site RCT can qualify an intervention as Top Tier; and (b) what constitutes a “sustained” effect (as required by the Top Tier standard). This example is drawn from the Panel’s review of Success For All for grades K-2, a school-wide reform program.

On the question of whether Success For All “has been demonstrated effective, through well-designed RCTs, in more than one site of implementation,” as required for Top Tier (see Checklist for Reviewing A Randomized Controlled Trial, pdf page 6), the Panel’s majority view is that Success For All meets this guideline, on the following basis. The one RCT of this program – showing positive effects as described in our evidence summary – was a multi-site trial conducted in the real-world public school settings and conditions where this program is normally implemented (41 high-poverty elementary schools across 11 states).  Such a study falls within the stated guidelines for Top Tier, aimed at ensuring sufficient confidence that the program would be effective if faithfully replicated in other, similar schools.

On the question of whether the trial’s finding of an improvement in schoolwide reading ability, including comprehension, at the end of second grade constitutes a “sustained” effect (as required by the Top Tier standard), the Panel’s majority view is that Success For All meets this condition, on the following basis.  The end of second grade, though it was the end of the intervention, was three years after children entered the program, and reading ability increased over all three program years, with the largest effects found in year three.  A minority of Panel members would have preferred to see evidence that the effects were sustained through later grades – i.e., after the completion of the intervention – before identifying this program as Top Tier.  Based on the majority view, the Panel identified this program as meeting the Top Tier standard.

Example #2 – Reasoning on what constitutes a demonstration of effectiveness in more than one site of implementation (as required for Top Tier). This example is drawn from the Panel’s review of the Good Behavior Game – a classroom management strategy for decreasing disruptive behavior.

On the question of whether the program “has been demonstrated effective, through well-designed RCTs, in more than one site of implementation” (see Checklist for Reviewing A Randomized Controlled Trial, pdf page 6), the Panel’s majority view is that the program does not currently meet this guideline, for the following reason.

Our initiative identified three well-conducted RCTs of this program carried out in different sites. One trial evaluated the program when implemented in grades 1 and 2; a second trial evaluated the program when implemented in grades 2 and 3; a third trial evaluated the program when implemented in grade 1 only, as part of a comprehensive classroom curriculum that also included academic components and supplemental behavioral strategies. The first two trials – of the stand-alone version – produced a pattern of effects on behavior and substance use that was highly promising but not definitive (e.g., the effects did not always reach statistical significance, or were inconsistent across different cohorts). The third trial – of the combined intervention – produced a stronger pattern of effects, but these could not conclusively be attributed to the program as opposed to the other elements of the intervention.

Thus, the Panel believes that an additional, well-conducted trial confirming that one of these program versions produces sizable, sustained effects would likely be needed to meet this Top Tier guideline. (Note: the version of this program evaluated in the third trial described above was subsequently found by the Panel to meet the standard for Near Top Tier, which does not require demonstrated effectiveness in more than one site.)

Example #3 – Reasoning on what constitutes a demonstration of effectiveness in more than one site of implementation (as required for Top Tier). This example is drawn from the Panel’s review of the Perry Preschool Project – a high-quality preschool program for children from disadvantaged backgrounds.

On the question of whether the program “has been demonstrated effective, through well-designed RCTs, in more than one site of implementation” (see Checklist for Reviewing A Randomized Controlled Trial, pdf page 6), the Panel’s majority view is that the program does not yet meet this guideline, for the following reasons.

There have been two RCTs of this preschool program. The first trial had a sample of over 100 African American children living in poverty in the mid-1960s; it reported sizable, sustained effects on participants’ life outcomes. Because this study was conducted in a single site and population under conditions that may differ from today’s (e.g., control group members generally did not have access to other preschool or daycare options, as they would now), the Panel looked for corroboration of this finding in another site and/or population, and in a more contemporary setting. The second trial evaluated the preschool program in a more ethnically diverse, low-income sample, against a control group that participated in traditional nursery school (a treatment-as-usual condition). However, this trial did not produce clear evidence of effectiveness, because of both limitations in the study design¹ and the absence of statistically-significant effects.²

Importantly, this Panel finding does not imply that this program is not effective and/or evidence-based – just that an additional, well-implemented trial (i) conducted under conditions typical for today’s low-income children, and (ii) showing sizable, sustained effects, would likely be needed to meet the Top Tier standard.

Example #4 – Reasoning on what constitutes “sizable … benefits to participants and/or society” (as required for Top Tier). This example is drawn from the Panel’s review of the Infant Health and Development Program – an educational child care program for low birth weight children.

Background: This program was evaluated in a large, multi-site RCT with a long-term follow-up.  In both the childhood and young adult (age-18) follow-ups, the trial found statistically-significant effects on cognitive ability and reading and/or math achievement for a key subgroup (but not for the full sample).  At age 18, the standardized effect sizes were about 0.2 to 0.4, which translate to about 3-6 points on the IQ scale.
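The conversion from standardized effect sizes to IQ points in the background above can be checked with a short calculation. The sketch below assumes the conventional IQ scaling with a standard deviation of 15 points, which the text does not state explicitly; the function and constant names are illustrative only.

```python
# Assumption (not stated in the source): the IQ scale has a
# standard deviation of 15 points, the conventional scaling.
IQ_SD = 15.0


def effect_size_to_iq_points(effect_size, sd=IQ_SD):
    """Convert a standardized effect size (in SD units) to IQ points."""
    return effect_size * sd


# Approximate range of age-18 effect sizes reported in the text.
low = effect_size_to_iq_points(0.2)
high = effect_size_to_iq_points(0.4)
print(f"Approximate effect: {low:.0f} to {high:.0f} IQ points")
```

Under that assumption, effect sizes of 0.2 and 0.4 correspond to about 3 and 6 points, consistent with the range reported above.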

On the question of whether these positive effects constitute “sizable … benefits to participants and/or society” (as required by the Top Tier standard), the Panel’s majority view is that the program does not yet meet this condition, for the following reasons.

First, in the childhood and young adult follow-ups, the effects on cognitive and academic tests, while important, had not translated to improvements in other key school outcomes that were measured (e.g., special education placements, school drop-out rates, grade retentions) or long-term behavioral or health outcomes that were measured (e.g., substance use, arrests, jail time).  It is possible that such effects will appear in future follow-ups of this study, leading the Panel to revisit its findings for this program.

Second, this is an expensive program, costing about $20,000 per child per year over the program’s three-year span (in 2009 dollars). As discussed in our guidance on the Top Tier evidence standard, the Panel’s decisions are based mainly on the evidence of sizable, sustained effects; however, in some cases the Panel may also consider cost (see overview of Top Tier solicitation and review process). In light of the study’s finding of limited positive effects, the Panel decided to consider the program’s cost in relation to the benefits, and found this consideration to weigh against identifying the program as meeting the Top Tier standard.
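As a rough illustration of the cost side of this consideration, the per-child total over the full program can be approximated from the figures above. This is a simple undiscounted multiplication, not a figure the Panel itself reports, and the variable names are illustrative.

```python
# Figures taken from the text: approximate annual cost per child
# (2009 dollars) and the length of the program in years.
ANNUAL_COST_PER_CHILD = 20_000
PROGRAM_YEARS = 3

# Assumption: a plain sum with no discounting or inflation adjustment.
total_cost_per_child = ANNUAL_COST_PER_CHILD * PROGRAM_YEARS
print(f"Approximate total cost per child: ${total_cost_per_child:,}")
```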

Example #5 – Reasoning on whether an intervention has “no strong countervailing evidence” (as required for Top Tier). This example is drawn from the Panel’s review of a specific crime prevention program for juvenile offenders.

On the question of whether this program meets the guideline for “no strong countervailing evidence” (see Checklist for Reviewing A Randomized Controlled Trial, pdf page 6), the Panel’s majority view is that the program does not currently meet this guideline, for the following reason.

Our initiative’s review of 18 RCTs of this program identified two well-conducted trials showing large, sustained effects on key crime outcomes (arrests and incarceration), and two well-conducted RCTs showing no effects on such outcomes. Although several hypotheses have been offered for the discrepant findings – such as differences in the sample population, in adherence to the program model, or in services received by the control group – the Panel believes, based on its careful review, that the reasons are unknown at this time, and that further study is needed to determine the conditions under which this program is effective.

1 For example, although the study randomly assigned children to the preschool program versus traditional nursery school, it did not randomly assign teachers. This raises the possibility that there were differences in teacher quality between the two groups, which could account for any difference in outcomes.

2 This trial – in addition to evaluating the program compared to traditional nursery school – evaluated it against another preschool program using a very different curriculum, and reported some significant effects versus that condition.