The Effect of Tracking in Public Schools
A new NBER paper finds evidence to support tracking in middle schools
The proposed new California Math Framework recommends that districts avoid grouping students into separate classes by ability, a practice called “streaming” or “tracking”. The draft framework has received a lot of pushback, but even this exhaustive evisceration of it focuses more on revealing the shallow thinking and shoddy work in the draft than on engaging with its recommendation on tracking. Here in San Francisco, SFUSD has long been philosophically opposed to tracking: it has no gifted and talented programs in elementary schools, doesn’t offer multiple math streams in middle school, and went so far as to abolish Algebra I in middle schools.
I just came across a fascinating new (August 2022) NBER working paper: “Patterns, Determinants, and Consequences of Ability Tracking: Evidence from Texas Public Schools.” It is sure to become ammunition in the debate about the proposed new California Math Framework and the future of math education in SFUSD. Let’s dig in and see what the study has to say.
Scope of the Paper
The researchers had access to elementary and middle school records for all students in Texas public schools for the period 2011-2019. That’s a huge data set covering about 10% of U.S. students. They calculated the extent to which individual students were tracked (i.e. grouped by ability) and explored the factors that were associated with more or less tracking in schools. They also had access to students’ scores on standardized Math tests from the end of 3rd grade to the end of 8th grade. This enabled them to calculate how tracking affected student performance.
What is Tracking?
Tracking can take many forms:
Different groups of students might study different subject matter, e.g. some students take Algebra I and some take Pre-Algebra.
A small number of low-ability students might be grouped in one class and given remedial instruction with the vast majority of students taking standard grade-level courses.
High-ability students (e.g. those part of a gifted-and-talented program) might be deliberately grouped in one class and given more challenging work even if the course still has the same name (e.g. “Math 5”).
English learners might be grouped into a separate class from students who are fluent in English (perhaps because they are part of a bilingual program), and those English learners might have lower average math scores.
Richer, more educated parents (whose kids will tend to score higher than average) may be able to get a school to place their kids with a favored teacher even if the favored teacher is teaching the same material as all the other teachers in that grade.
Two schools in the same district might have very different student populations because school attendance boundaries group richer, higher scoring students in one school or because of the choices parents make about where to enroll their kids.
Some of these are deliberate choices by district or school staff; others are not. In a school with no tracking, the average math ability (as measured by prior year’s standardized test scores) in each class in a grade in a school will be the same. If the prior year’s scores are very different from class to class, this is evidence of tracking.
The researchers quantify tracking by looking at “how much of the variation in prior math test scores can be explained by current math class assignments.” That phrasing is awkward: the prior math test scores, by definition, came before the current math class assignments, so it’s odd to think of the current assignments explaining the prior scores.
Of course, smaller schools have less opportunity to track than larger ones. A school that has only one or two classes per grade has less opportunity to sort the students by ability than a school that has five classes per grade. To address this, the researchers use two different measures of tracking, an absolute one and one that is relative to the maximum amount of tracking that could theoretically occur.
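To make the two measures concrete, here is a minimal sketch of how they could be computed for one grade in one school. It is an illustration under stated assumptions, not the paper's actual estimator: the absolute measure is taken to be the share of variance in prior scores that lies between classes, and the relative measure divides that by the share you would get if the same students were perfectly sorted by score into classes of the same sizes. The function names and the toy scores are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import fmean


def tracking_share(prior_scores, class_ids):
    """Absolute measure: share of prior-score variance that lies between classes."""
    grand_mean = fmean(prior_scores)
    total_ss = sum((s - grand_mean) ** 2 for s in prior_scores)
    if total_ss == 0:
        return 0.0  # everyone has the same score, so there is nothing to sort on
    by_class = defaultdict(list)
    for score, class_id in zip(prior_scores, class_ids):
        by_class[class_id].append(score)
    between_ss = sum(
        len(group) * (fmean(group) - grand_mean) ** 2
        for group in by_class.values()
    )
    return between_ss / total_ss


def perfect_sorting_share(prior_scores, class_ids):
    """Benchmark (an assumption): rank students by score and fill the
    existing class sizes in order, then measure the resulting share."""
    sizes = list(Counter(class_ids).values())
    ranked = sorted(prior_scores)
    sorted_ids = []
    for k, size in enumerate(sizes):
        sorted_ids.extend([k] * size)
    return tracking_share(ranked, sorted_ids)


def relative_tracking(prior_scores, class_ids):
    """Relative measure: absolute share divided by the perfect-sorting benchmark."""
    ceiling = perfect_sorting_share(prior_scores, class_ids)
    if ceiling == 0:
        return 0.0  # e.g. a single class per grade leaves no room to track
    return tracking_share(prior_scores, class_ids) / ceiling


# Toy data (made up for illustration): eight students, two classes of four.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
mixed = ["A", "B", "A", "B", "A", "B", "A", "B"]    # classes look alike
tracked = ["A", "A", "A", "A", "B", "B", "B", "B"]  # low scorers in A, high in B

print(tracking_share(scores, mixed), relative_tracking(scores, mixed))
print(tracking_share(scores, tracked), relative_tracking(scores, tracked))
```

With the toy data, the mixed assignment yields an absolute share close to zero, while the tracked assignment reaches the benchmark and so has a relative measure of 1. A school with only one class per grade would score zero on both measures, which is why a relative measure is useful when comparing schools of different sizes.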
Where Tracking Occurs
Among the researchers’ findings:
Districts, rather than individual schools, set tracking policy. This is as expected but the data bears it out.
Across-school sorting (i.e. sorting between schools within a district) is much lower than sorting within schools: “it is rare for across-school sorting to explain more than 20% of the variation in prior scores”
Across classes within schools, sorting by prior test score is not equivalent to sorting by race/ethnicity. “There is much less sorting by race/ethnicity and socioeconomic status than by prior test scores.”
Tracking increases with grade level. This is to be expected because middle school is often when students in the same grade start to take different math classes. Some schools will start to stream in this way in grade 6, others in grade 8. Interestingly, K-8 schools track less in middle school grades than pure middle schools.
What factors give rise to tracking?
The researchers attempted to control for many of the factors that could affect tracking, such as district size, district property wealth (which affects school funding in Texas), urban/suburban/rural status, grade configuration (i.e. which grades are supported), and the share of students in a variety of special needs categories and programs (such as physical and other disabilities, gifted, English learners). After controlling for all these variables, the factors that were significantly associated with a higher level of tracking were:
curricular differentiation (i.e. whether the school offers two different math classes in the same year)
the share of students who are classified as having a non-physical disability
the share of students who are classified as being gifted
more experienced teachers
smaller classes
The first of these is entirely obvious. If a school offers two or more math classes in the same year that differ by content, one is going to be seen as easier and the other as harder, and the assignment of students to those classes will take prior math achievement into account. If a school offers the same math class to all students in a year, it is less likely to group the students by ability because why bother?
The second and third are interesting. The process of classifying a student as having an emotional or learning disability or as being gifted is at least partially subjective. Schools that are more aggressive about classifying their students in either way are more likely to have higher rates of tracking. Note that the researchers controlled for the shape of the achievement curve at each school so it is not that those schools have more students in the top 5% or bottom 5% of the distribution. It is just that the schools have classified more students as being gifted or disabled.
The fourth and fifth are puzzling. The researchers controlled for district property wealth, which affects funding directly, so it is unclear to me what the mechanism is whereby cohorts that are more tracked have smaller classes and more experienced teachers.
Local Characteristics That Drive Tracking
The researchers also examined which local characteristics predicted the observed levels of tracking.
A district’s partisan lean, as measured by the share of the Presidential vote going to Democrats, was not a statistically significant predictor of the tracking level. This was a bit surprising because previous studies had shown that conservatives were more supportive of tracking than liberals.
During the period of the study, Texas had a school accountability scheme that emphasized learning gains, not achievement levels. It was possible, for example, for a school to have high average achievement levels but still receive an “unacceptable” rating if progress was low. Did receipt of an “unacceptable” rating have any effect on tracking? The researchers found that tracking actually decreased after low performance ratings. That was a surprise because previous studies had shown that accountability pressures tended to lead schools to increase tracking.
They found “little relationship between student demographics - as proxied by racial composition and shares of students who are low income and limited English proficient - and the degree of tracking.”
It was known from previous studies that parents of high-achieving children are more inclined to favor tracking. The researchers found that what really matters is not the average level of math ability but how wide the distribution is:
“The standard deviation of lagged math test scores is a positive and statistically significant predictor of tracking…suggesting that the perceived net benefits of tracking are increasing with the heterogeneity of student ability.”
Districts with higher tracking levels have a lower share of students in private schools. This is an interesting finding. It suggests that districts which track are better able to prevent parents from leaving for private schools than those which don’t track.
Impact of Tracking on Achievement
Of course, the most interesting question of all is how tracking affects students of different achievement levels. Before reading this paper, I had heard that the consensus was that better students benefited from tracking and poorer students were harmed by it. The researchers found a more general benefit:
“Students exposed to more tracking experience higher test score growth at almost all points of the distribution”
The timing of the exposure to tracking seems to make a significant difference: it should be done in middle school, not elementary school. Exposure to more tracking in elementary school has no statistically significant impact on higher-achieving students but seems to have a temporarily negative effect on lower-achieving students, i.e. their performance is lower than expected in grade 5 but the reduction in predicted performance disappears by grades 6, 7, and 8. The authors summarize:
“Exposure to tracking in middle school is positively associated with test score growth for students at the top, suggesting that tracking increases inequities in educational outcomes but does not otherwise harm low-achieving students on average.”
Here is the key chart from the paper:
Implications for San Francisco
Level of Tracking
Without the detailed data the researchers had, it’s impossible to calculate tracking numbers for SFUSD or any of its schools. But we can make some educated guesses.
The researchers found that across-school sorting was usually much lower than within-school sorting. That’s probably true in San Francisco too even though SFUSD is philosophically opposed to tracking. Yes, there is a big difference in the average scores at, say, Visitacion Valley Middle and AP Giannini Middle but there is also huge variation within each of those schools. And a large chunk of the difference between the schools can be explained by their different racial and ethnic compositions.
The study showed that districts with low levels of tracking tend to have high levels of private school enrollment. A plausible explanation is that tracking enables public schools to retain parents who might otherwise send their kids to private school, because those parents know their kids will be sufficiently challenged in the public schools. San Francisco neatly follows the pattern revealed in the study: SFUSD tries not to track and the city has one of the highest levels of private school enrollment in the country. Nevertheless, it would be a mistake to blithely assign cause and effect here. I don’t know what SFUSD’s tracking policies were in earlier decades, but San Francisco has had high levels of private school enrollment for at least sixty years.
The study also showed that tracking is more prevalent in districts with wide achievement distributions. SFUSD goes against the trend here. Although it is opposed to tracking, SFUSD has one of the widest achievement distributions in the state. On the four-point scale used by the state, SFUSD has far more students in either Standard Not Met (Level 1) or Standard Exceeded (Level 4) than in either Standard Nearly Met (Level 2) or Standard Met (Level 3). When there are so many kids at both ends of the scale, it is hard to imagine any one course syllabus suiting all of them. Any single course will either be too hard for the kids in Level 1 or too easy for the kids in Level 4.
Middle School Algebra
The conclusion of the paper, namely that higher-achieving students are helped by tracking in middle schools and lower-achieving students are not hurt by it, is evidence that SFUSD should adopt the same approach as many private middle schools in the city, where students are tracked into separate Math streams from 6th grade.
It is important to note that this is not the same as what SFUSD was doing prior to eliminating Algebra I in 2014. Here’s a chart showing the Math classes taken by SFUSD 8th graders over time.
SFUSD didn’t really remove tracking in 2014 because it wasn’t really there in the first place. When SFUSD offered Algebra I in middle schools (prior to 2014), its approach was Algebra-for-all. Nearly 90% of students took Algebra I in 8th grade, with the exceptions being students taking more basic courses. Now 95% take Pre-Algebra. The district switched from Algebra-for-all to Algebra-for-none without stopping at Algebra-for-those-who-are-ready-for-it along the way. All it did was defer Algebra for a year in the hope that more students would be ready for the material in 9th grade. The district was wrong.
Eliminating middle school algebra did greatly reduce tracking in 9th grade because nearly everyone now takes Algebra I in 9th grade whereas previously there were many students taking Geometry, many repeating Algebra I, and some taking Algebra II. But the paper is only concerned with tracking up to the end of 8th grade, and eliminating middle school algebra had only a marginal effect on middle school tracking.
Thanks for the analysis. Hopefully with new members of the school board the focus can be more on education and less on equity.
Good essay. A few comments. Most of the difference between Visitacion Valley Middle and AP Giannini Middle can be explained by family income and parental education. Ethnicity takes a back seat to economic factors.
"The study showed that districts with low levels of tracking tend to have high levels of private school enrollment" Private school enrollment is almost entirely driven by (White) parental desires to segregate their children. San Francisco had low private school enrollment before Brown vs. Board of Education, but private school enrollment skyrocketed when San Francisco began busing students to enforce desegregation. There was also "White flight" to segregated suburban public schools around the same time, along with a generally declining population and falling public school enrollment, echoes of what we are seeing today, to a lesser extent, with the Asian population.
The Black student population did better in San Francisco in the 70s and 80s compared to the overall student population, but most of the Black middle class was displaced by the growing Asian population, leaving most of the remaining Black population poor and living in subsidized housing, which has led to the extremely poor student results we have now.