Volunteer Tutoring

A note on this page's publication date

The content on this page has not been recently updated. This content is likely to be no longer fully accurate, both with respect to the research it presents and with respect to what it implies about our views and positions.

Published: 2010

In a nutshell

  • The Problem: The achievement gap opens early: by the age of 5, children who grow up in low-income households are already behind academically.
  • The Program: Programs employ non-professional tutors, who either volunteer or receive a small stipend, with the goal of improving young students' academic performance.
  • Track record: A review of more than 20 studies evaluating a variety of volunteer tutoring programs found substantial and statistically significant effects on students' academic achievement. We are unsure of when these results were measured but guess that they are short-term.
  • Cost-effectiveness: We have no definitive figure, but estimate a cost of $67-300 per student per year.
  • Bottom line: Volunteer tutors may offer a relatively low-cost method for improving students' academic achievement.

Program description

Volunteer tutoring is a common approach to improving students' academic achievement.1 This report relies heavily on the Campbell Collaboration's review of volunteer tutoring programs.2 The review considered "programs that were used for students in grades K – 8, and programs where adult, non-professional (volunteer) tutors were used."3 Tutors fell into one of three groups: parents trained to work with their own children, college-age students (often pre-service teachers or America Reads participants), or other community volunteers (ranging from older high school students to senior citizens).4

This review considered tutoring programs that lasted at least one month and aimed to increase academic achievement.5

Note: The review does not specify that the beneficiaries of the program are either low-income or low-achieving.6

Evidence of effectiveness

What studies were included in the review?

The review, Ritter et al. (2006), included studies that met the following criteria:

  • Randomized controlled trials of programs.7
  • Studies published during or after 1985.8
  • Studies of programs that were not specifically focused on students with limited English proficiency.9
  • Studies of programs with adult, non-professional tutors. Tutors either were volunteers or received only a small stipend.10
  • Studies where the tutees were in grades K-8.11 Students were most commonly in first grade (14 of 28 cohorts and 770 of 1,676 total students).12
  • Studies published in English and conducted in the United States.13

What were the results?

Ritter et al. (2006) reviewed 28 studies that included 1,676 participants,14 and reviewed evidence of improvement across different academic areas: general reading, particular skills within reading, writing and math.15 The reviewers originally intended to consider other measures in addition to academic achievement, but found that the studies they reviewed rarely included non-academic outcomes.16
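The review's footnoted subgroup breakdowns (by tutor type, grade level, and publication type) should each add up to the same 28 cohorts and 1,676 participants. The following is a minimal sketch of that consistency check, using only the figures quoted in our footnotes; the variable and function names are our own and are purely illustrative.

```python
# Figures quoted from Ritter et al. 2006 in this page's footnotes.
# Each subgroup maps to (number of cohorts, number of students).
TOTAL_COHORTS = 28
TOTAL_STUDENTS = 1676

breakdowns = {
    "tutor type": {
        "parents": (5, 338),
        "college-age": (12, 899),
        "community volunteers": (11, 439),
    },
    "grade level": {
        "grade 1": (14, 770),
        "grade 2 and above": (14, 906),
    },
    "publication type": {
        "refereed journals": (15, 772),
        "other (primarily dissertations)": (13, 904),
    },
}

# Treatment and control group sizes, also from the footnotes.
treatment_control = (873, 803)


def totals(groups):
    """Sum cohorts and students across the subgroups of one breakdown."""
    cohorts = sum(c for c, _ in groups.values())
    students = sum(s for _, s in groups.values())
    return cohorts, students


for name, groups in breakdowns.items():
    cohorts, students = totals(groups)
    consistent = (cohorts, students) == (TOTAL_COHORTS, TOTAL_STUDENTS)
    print(f"{name}: {cohorts} cohorts, {students} students, consistent={consistent}")

print("treatment + control:", sum(treatment_control))

# Share of all participants who were in first-grade cohorts.
first_grade_share = 770 / TOTAL_STUDENTS
print(f"first-grade share of students: {first_grade_share:.1%}")
```

All three breakdowns, and the treatment/control split, sum to the totals the review reports.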

The review concludes that volunteer tutoring had a statistically significant effect (at the .05 level) on students' academic achievement, particularly in reading. The overall reading effect was approximately 0.3 standard deviations, a sizable improvement in students' academic achievement.17 The review does not specify how long after the intervention outcomes were measured, but we would guess that these results are short-term, as students' progress was likely assessed shortly after each program's end.18 Results are provided in the table below.19


  • A * denotes statistical significance at the p < .05 level.
  • The effect sizes are in standard deviations.

Subject                      # of studies   # of students   Effect size
Reading - Overall            24             550             0.30*
Reading - Letters & Words    15             403             0.26*
Reading - Comprehension      8              293             0.18
Reading - Oral Fluency       12             336             0.30*
Writing                      6              111             0.45*
Mathematics                  5              292             0.27
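
One way to read the effect sizes in the table above is to convert them into percentile terms. The sketch below is our own illustration, not a calculation from the review: it assumes test scores are normally distributed with equal spread in the tutored and control groups, and asks where the average tutored student would rank within the control group. The function name is hypothetical.

```python
from statistics import NormalDist

# Effect sizes (in standard deviations) from the table above.
effect_sizes = {
    "Reading - Overall": 0.30,
    "Reading - Letters & Words": 0.26,
    "Reading - Comprehension": 0.18,
    "Reading - Oral Fluency": 0.30,
    "Writing": 0.45,
    "Mathematics": 0.27,
}


def control_group_percentile(effect_size_sd):
    """Percentile rank, within the control group, of the average tutored student.

    Assumes normally distributed scores with the same standard deviation
    in both groups (an assumption of this sketch, not of the review).
    """
    return 100 * NormalDist().cdf(effect_size_sd)


for subject, d in effect_sizes.items():
    print(f"{subject}: {control_group_percentile(d):.0f}th percentile")
```

Under these assumptions, an effect of 0.30 standard deviations would move the average student from the 50th to roughly the 62nd percentile of the control group. This is a rough interpretive aid only; it says nothing about whether a given effect is statistically significant.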

Do the results differ by program type?

Ritter et al. (2006) considered many different approaches to tutoring and evaluated the results to determine whether any particular approach was more successful. The review considered three distinctive factors: types of tutors, grade level of tutee, and program structure.20

The review found no significant difference in effect size for different types of tutors or grade level of tutees.21 There was a difference in effect size based on whether the program was deemed to be "high" or "low" structure. However, the authors report these results cautiously.22

Is publication bias a concern?

Ritter et al. (2006) compared results of studies published in journals to results of studies published as doctoral dissertations to determine whether there was evidence of publication bias.23 The review concluded that "the difference in effect sizes between studies published in journals and non-published studies was not statistically significant. Other tests of publication bias also suggested the included studies were an unbiased sample."24

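Cost-effectiveness

Ritter et al. (2006) does not provide cost data for the programs it reviews. We provide information as context for what a similar program would cost:

  • The Start Making a Reader Today (SMART) program costs $300 per child per year (2004 dollars).25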


  • 1

    "By 1987, the National Research Council estimated that there were over one million volunteer tutors who donated an average of four hours per week in the nation's public schools. The survey found that three-fourths of public elementary schools in the United States reported the involvement of volunteers, with schools having an average of 24 volunteers (Michael, 1990)." Ritter et al. 2006, Pg 5.

  • 2

    Ritter et al. 2006.

  • 3

    Ritter et al. 2006, Pg 3.

  • 4

    "Some programs train parents as tutors to help their own children. These programs are different from those that train college-age tutors to work with younger students; often these college-age students are in the America Reads program or are pre-service teachers. Finally, the remainder of the programs reviewed here used community volunteers from a variety of ages, ranging from older high school students to senior citizens. Programs that used a combination of these tutors are placed in the community volunteer category. In our sample of 28 study 'cohorts', 5 were from programs using primarily parents (study sample = 338), 12 were from programs using primarily college-age tutors (study sample = 899), and the remaining 11 were from programs using community volunteers across a variety of ages (study sample= 439)." Ritter et al. 2006.

  • 5

    "The interventions featured regular tutoring sessions with an academic focus for at least one month in duration. The duration restriction was included due to a belief that programs lasting only a few days were qualitatively different than longer programs with sustained exposure." Ritter et al. 2006, Pg 7.

  • 6

    Ritter et al. 2006.

  • 7

    "Only randomized field trials were included in the review. Quasi-experimental studies that employ treatment and control groups matched on pre-tests of key outcome variables were not included in this review. Pretest-posttest studies, or those in which a treatment group is compared to another treatment group, were not included." Ritter et al. 2006, Pg 7.

  • 8

    "Studies published before 1985 were not included." Ritter et al. 2006, Pg 7.

  • 9

    "Furthermore, we excluded studies of programs that were especially designed to address the needs of students with limited English proficiency (LEP), because such specialized programs are not representative of most volunteer tutoring programs for elementary and middle school students." Ritter et al. 2006, Pg 7.

  • 10

    "Only studies of programs involving adult, non-professional tutors were included. Although these tutors were almost always referred to as 'volunteers' in the literature, those programs that pay a small stipend to tutors (such as undergraduate tutors who are tutoring as part of a work-study program) were also included." Ritter et al. 2006, Pg 7.

  • 11

    "In terms of the tutees, only studies of programs that serve students in grades K-8 (elementary and middle school) were considered, since this is the population typically served by volunteer tutoring programs, and because such programs are fundamentally different than those provided to high school students." Ritter et al. 2006, Pg 7.

  • 12

    "We divided up our sample of tutoring programs into those that served the youngest students (grade 1) and those that served older students (grade 2 and above). In our sample of 28 study “cohorts”, 14 were focused on students in first grade (study sample = 770) and the remaining 14 were focused on older students (study sample = 906)." Ritter et al. 2006, Pg 17.

  • 13

    "Only English-language studies of programs conducted in the United States were considered, due to the limited resources for this review." Ritter et al. 2006, Pg 7.

  • 14

    "In the end, the search yielded 21 unique articles, reports, or dissertations; there are 28 unique “study cohorts” or “studies” identified in these 21 reports, as some reports focused on multiple cohorts analyzed separately. Each of the 28 studies is described in some detail in Table 4 in Section 9. The evidence base described here relies upon a sample of 1,676 study participants, 873 of whom were in the tutoring treatment groups and 803 of whom were in the control groups." Ritter et al. 2006, Pg 17.

  • 15

    Ritter et al. 2006, Pg 17.

  • 16

    "The original intent of the review was to consider all outcome measures related to student achievement, including distal outcomes (ones the program are actually intended to influence, such as school grades or standardized achievement measures), as well as proximal outcomes (intermediate measures that might be influenced by tutoring and then might lead to improved outcomes in the future, such as student attendance rates). However, the review yielded very few studies that analyzed school grades or attendance rates; rather, most studies focused on various standardized assessments of math and reading skills, or “authentic” measures of reading and writing skills. As a result, our review focused on these outcomes." Ritter et al. 2006, Pgs 7-8.

  • 17

    Ritter et al. 2006, Pg 21, Table 2.

  • 18

    In the studies we've reviewed, long-term follow-up is the exception, not the rule. We checked the follow-up time for the SMART program (see below). It had short-term follow-up.

    • The program: "Eighty-four beginning first grade students at risk of reading difficulties were randomly assigned to experimental and comparison groups. Adult volunteers tutored students in the experimental group in 30-minute sessions two times per week in first and second grade." Baker, Gersten, and Keating 2000, Pg 495.
    • "Students were tested three times in the study: at the beginning of Grade 1 (October, 1996), the end of Grade 1 (May, 1997), and the end of Grade 2 (May, 1998)." Baker, Gersten, and Keating 2000, Pg 502.

  • 19

    Ritter et al. 2006, Pg 21, Table 2.

  • 20

    "We examined the possibility of differential effects of different types of volunteer tutoring programs on the reading outcomes. We focus here only on these 'subgroup' effects in which there are at least 3 studies in each subgroup. Subgroups examined are described above and include: types of tutors, grade level of tutees, program structure, and publication type." Ritter et al. 2006, Pg 21.

    Note we deal with the fourth factor, publication type, below in our section on publication bias.

  • 21

    "None of the outcomes had a significant difference in effect size by tutor type. That is, programs using parent, college age, or community tutors did not differ significantly in their effectiveness. Similarly, programs that included Grade 1 were not significantly different from programs for higher grades in their effectiveness." Ritter et al. 2006, Pg 21.

  • 22

    "The only significant subgroup difference we found was that highly structured programs had a significant advantage over programs with low structure on the global reading outcome, with an effect size of .59 for structured programs and .14 for unstructured. The other reading outcomes did not differ significantly by amount of program structure. It should be noted that there were only three studies classified as highly structured that used global reading outcomes, and all three studies had the same lead author (Vadasy et al., 1997a, 1997b, 2000)." Ritter et al. 2006, Pg 21.

  • 23

    "In the field of systematic reviews, there is a real concern with “publication bias” or “file-drawer bias”. These terms refer to the concept that studies showing null effects are less likely to be submitted for publication and less likely to be accepted for publication, all else equal, if submitted. Thus, one might expect that studies published in journals would be more likely to show positive program effects as compared to those disseminated as unpublished reports, conference papers, or student dissertations. Consequently, we distinguished in our sample of studies of tutoring programs that were published in journals as a test of this “bias”. In our sample of 28 study “cohorts”, 15 were from studies in refereed journals (study sample = 772); the remaining 13 study cohorts were primarily from doctoral dissertations (study sample = 904)." Ritter et al. 2006, Pg 18.

  • 24

    Ritter et al. 2006, Pg 4.

  • 25

    "The program, which has primarily been paid for by donations, costs $300 per child per year (2004 dollars)." Coalition for Evidence-Based Policy, "Start Making a Reader Today."