The purpose of grading is a hot-button issue in education. A recent SF Chronicle article described Palo Alto Unified’s attempt to switch to “evidence-based grading, which means rewarding students for demonstrating they know the subject matter, even if they need more time or test retakes to do so, without behavior, participation or obedience reflected in the calculation.”

In a different universe, one could imagine conservatives being the ones pushing for evidence-based grading: “if you want an ‘A’, take this test and prove that you deserve it. There are no participation trophies: you don’t get an ‘A’ just for showing up.” In our universe, it’s usually progressives who are pushing the policy and they’re motivated by the beliefs that homework discriminates against children from unstable families, that tests are stressful, and that expecting students to show up and behave is racist. Conservatives, therefore, reflexively oppose them.

I thought it would be interesting to examine, as the article puts it:

whether an A means students actually mastered the subject matter or simply showed up and didn’t make waves.

The California Department of Education publishes no data on grades, presumably because it knows that grades mean different things from school to school and district to district. If grades were consistent, we wouldn’t need the SBAC. SFUSD doesn’t usually publish its data either but it did make school-by-school grade reports public for the Fall Semester of 2022-23. Figure 1 is a screenshot from one of those grade reports:

There is a crude but effective way to assess grading standards: compare the percentage of students who receive an ‘A’ with the percentage who are proficient (i.e. score either Meets Standards or Exceeds Standards) in the corresponding SBAC test. Such a comparison will reveal whether there are students who are proficient but don’t get an ‘A’ (demonstrating that the school is a tough grader) or whether there are students who get an ‘A’ despite not being proficient (demonstrating that the school is an easy grader).

## Middle Schools

Figure 2 shows the ELA grading standards in SFUSD middle schools. The x-axis is the share of students at the school who are proficient (i.e. who meet or exceed standards). The y-axis is the share who meet or exceed standards minus the share of students who got an ‘A’. When this number is above zero, it means that there are students who were good enough to at least meet standards on the SBAC test who did not manage to get an ‘A’ in their class. When the number is below zero, it means that there are students who got an ‘A’ despite not meeting or exceeding standards on the SBAC test.

At Roosevelt, 75% of kids met or exceeded standards but only 44% received an ‘A’, leaving 31% who were good enough to meet standards but got a ‘B’ or lower grade. Meanwhile, at King, just 30% of kids met or exceeded standards but 53% got an ‘A’ meaning that 22% of the students received an ‘A’ despite not meeting standards. An ‘A’ clearly means very different things at each school.
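The arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration using the rounded percentages quoted above; the function names `grading_gap` and `describe` are my own labels, not anything taken from the SFUSD reports.

```python
def grading_gap(pct_proficient: float, pct_a: float) -> float:
    """Share proficient on the SBAC minus share receiving an 'A', in percentage points."""
    return pct_proficient - pct_a


def describe(gap: float) -> str:
    """Label a school by the sign of its gap (illustrative labels only)."""
    if gap > 0:
        return "tough grader"
    if gap < 0:
        return "easy grader"
    return "neutral"


# Rounded percentages quoted in the text: (proficient, received an 'A').
schools = {"Roosevelt": (75, 44), "King": (30, 53)}

gaps = {name: grading_gap(prof, a) for name, (prof, a) in schools.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:+.0f} points ({describe(gap)})")
```

With the rounded inputs, King comes out at -23 rather than the 22 quoted above, presumably because the quoted figure was computed from unrounded data. Note also that the gap is a net figure: it gives the minimum share of students who must fall into the mismatched group, not the exact share.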

There is an obvious trend whereby the more proficient students a school has, the harder it is to get an ‘A’. Some schools are less prone to grade inflation than others. Paul Revere has the lowest overall proficiency rate in the city but it doesn’t sugarcoat the situation by throwing around ‘A’s. Similarly, Willie Brown’s students are more likely to be proficient than Visitacion Valley’s (34% to 24%) but less likely to receive an ‘A’ (40% to 45%).

### Middle School Math

SFUSD has lower standards for Math grading than ELA grading. More students earn an ‘A’ in Math than in ELA (53% to 51%) even though SFUSD’s students are not as good at Math as ELA. Only 39% of 6th-8th graders were proficient in Math compared to 52% who were proficient in ELA.

As figure 3 shows, most schools fall below the zero line, meaning they give ‘A’ grades to students who are not proficient. 49% of Visitacion Valley’s students received an ‘A’ even though only 13% of them were proficient. Meanwhile, only 47% of Rooftop’s students received an ‘A’ even though 50% of them were proficient. Alice Fong Yu is at the opposite extreme: only 33% of its 7th graders earned an ‘A’ even though 73% of them were proficient (and 52% EXCEEDED standards).

A few months ago, I wrote about SFUSD’s new Math vision which made heavy use of this report produced by TNTP, an education nonprofit formerly known as The New Teacher Project. This report stressed the importance of grade-level assignments. The authors found that schools had such low expectations for students of color that they were often not taught the grade-level material they were expected to learn. But the students were graded on what they were taught, not on what the standards expected them to have learned. One of the conclusions of the report was:

“students of color received grades that less accurately reflected their mastery of rigorous content”

That is clearly what is happening in San Francisco too: most of the schools with easy grading standards have Latino/Black majorities.

## High Schools

High schoolers only sit the SBAC in 11th grade so I'm comparing the SBAC results with the grades of juniors only. Figure 4 shows the results for ELA. There are fewer high schools so I’m able to show two years of data on one chart.

The 2023 SOTA figure looks so anomalous that I double-checked it. Although 84% of juniors met or exceeded the standards, only 33% of them got an ‘A’. The previous year’s juniors were marginally more likely to be proficient (88% vs 84%) but far more likely to get an ‘A’ (74% vs 33%). Maybe there was one demanding teacher who transferred to SOTA from their neighbors at Academy because, in the previous year of 2021-22, only 25% of Academy’s juniors got an ‘A’ even though 58% of them were proficient.
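To put a number on how anomalous the swing is, here is a small sketch that computes the gap (share proficient minus share receiving an ‘A’) for each school-year quoted above and then the year-over-year change at SOTA. The dictionary layout and variable names are my own; the percentages are the rounded ones from the text, and I am assuming that "2023" refers to the 2022-23 school year.

```python
# (school, school year) -> (pct proficient, pct receiving an 'A'), as quoted in the text.
ela_juniors = {
    ("SOTA", "2021-22"): (88, 74),
    ("SOTA", "2022-23"): (84, 33),
    ("Academy", "2021-22"): (58, 25),
}

# Gap in percentage points: positive means proficient students outnumber 'A' students.
gaps = {key: prof - a for key, (prof, a) in ela_juniors.items()}

# Year-over-year change in SOTA's gap.
sota_change = gaps[("SOTA", "2022-23")] - gaps[("SOTA", "2021-22")]

for (school, year), gap in sorted(gaps.items()):
    print(f"{school} {year}: gap of {gap} points")
print(f"SOTA's gap widened by {sota_change} points in one year")
```

On these figures SOTA's gap went from 14 points to 51, which puts it above even Academy's 33-point gap from the year before.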

For 11th grade Math, the magnitude of the difference between the proficiency rate and the ‘A’ rate is even greater, as figure 5 shows.

There’s a partial explanation for this. By the time students reach 11th grade, they’re taking many different Math classes. Kids on the standard pathway are taking Algebra II; some others are taking the Algebra II + Precalculus compression course; some will be taking regular precalculus; others will be taking honors precalculus; finally, a few are already taking AP Calculus. At the other end of the spectrum, there are kids taking or retaking earlier Math classes. A kid who is good enough to earn an ‘A’ in one class (e.g. regular precalculus) might instead be earning a ‘B’ in a more advanced class (e.g. honors precalculus).

Over 80% of Lowell’s students are proficient in Math but only around half (49% in 2023 and 55% in 2022) receive an ‘A’, in part because they’re taking harder courses and thus being held to higher standards. More generally, there is one group of schools (Balboa, Galileo, Lincoln, Lowell, SOTA, Washington) where many students who are proficient in Math don’t receive an ‘A’ and another group of schools (Academy, Burton, Jordan, Marshall, Mission, O’Connell, SF International) where lots of students receive ‘A’s despite not being proficient. Only Wallenberg managed to be in one group one year and the other group the other year. The extreme examples came at Mission (47% received an ‘A’ even though only 17% were proficient) and Jordan (31% received an ‘A’ even though zero were proficient) in 2022.

## Conclusions

In SFUSD, grading is relative. Grading standards vary enormously from school to school. Students are effectively being compared against their classmates, not against some objective standard. An ‘A’ in one school is not worth the same as an ‘A’ in another.

Imagine that SFUSD were to switch to evidence-based grading in a consistent way so that grades actually tracked mastery of the material. The schools that I called hard graders would all see an increase in the number of ‘A’ grades and the schools that I called easy graders would all see a decrease in the number of ‘A’ grades. Some consequences of this would be:

- The average GPAs of Asian and White students would increase and the average GPAs of Black and Latino students would decrease.
- The number of Latino and Black students admitted to Lowell would fall (just one B in 8th grade is sufficient to exclude a student from Band One admissions).
- Students from the easy grading high schools would find it harder to get admitted to colleges because their GPAs would be lower.

## Caveats

While this analysis is very suggestive, it does suffer from a number of serious weaknesses:

Students sit SBAC tests in the Spring so it would make sense to compare SBAC results with grades from the Spring semester, not the preceding Fall semester. Alas, Spring semester grades are not public so the analysis had to make do with Fall semester grades. This temporal mismatch makes interpretation of the results more difficult. Suppose a school grades its students accurately based on their mastery of math but offers spectacularly good instruction that dramatically increases the students’ actual knowledge of math during the school year. The students’ SBAC scores in the Spring will reflect their increased mastery of math but the analysis will compare these high SBAC scores to the lower grades they received in the Fall and conclude, inaccurately, that the school grades harshly.

Some students don’t sit the SBAC tests but every student gets a class grade. If the students who don’t sit the SBAC tests are not representative of the class (and I would bet that they tend to be below average), this could bias the scores. Ideally, we would only compare the grades of students for whom we have SBAC scores. Unfortunately, we only have class averages to work with.

The grades data shows the “count of marks” in each subject, which may be different from the number of students. In 2022-23, Aptos MS had 303, 273, and 272 students in grades 6-8 respectively but, as figure 1 above shows, the ELA marks for those grades numbered 334, 301, and 307. That there are around 30 more ELA marks than students in each grade indicates that there are some students taking two ELA classes. These are probably weaker students receiving intensive reading support. As weaker students they are less likely to receive ‘A’ grades. The grades in their two ELA courses are thus dragging the school’s average down even though the same students are only sitting the SBAC once (if at all).
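A rough sketch of how much this double counting could matter, using the Aptos counts above. The adjustment assumes that every surplus mark is a non-‘A’ second mark for a student who is already counted once, which is my guess rather than something the data confirms, and the 40% ‘A’ rate in the usage line is a made-up placeholder, not a figure from the report.

```python
# Aptos MS, 2022-23: enrollment and ELA mark counts by grade, as quoted in the text.
students = {6: 303, 7: 273, 8: 272}
ela_marks = {6: 334, 7: 301, 8: 307}

# Surplus marks per grade: marks that cannot each belong to a distinct student.
excess = {grade: ela_marks[grade] - students[grade] for grade in students}


def per_student_a_rate(a_rate_per_mark: float, marks: int, n_students: int) -> float:
    """Convert an 'A' share measured per mark into a share per student.

    ASSUMPTION: every surplus mark is a non-'A' second mark for a student
    already counted once, so each 'A' mark belongs to a distinct student.
    """
    a_marks = a_rate_per_mark * marks
    return a_marks / n_students


# Hypothetical: if 40% of grade 6 ELA marks were 'A's (placeholder value).
adjusted = per_student_a_rate(0.40, ela_marks[6], students[6])

print(excess)
print(f"Adjusted per-student 'A' rate: {adjusted:.1%}")
```

Under that assumption, a 40% ‘A’ rate per mark would correspond to roughly 44% per student, so the distortion is a few percentage points rather than tens.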

The data may be incomplete. The same Aptos report card showed that there were only 150 Math marks in grade 8 in 2022-23 even though all 272 students were presumably taking Math. It’s impossible to know whether the missing students’ marks were better or worse than the students whose grades we do know.
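One way to see how much uncertainty the missing marks introduce is to compute worst-case bounds: assume first that none of the missing students earned an ‘A’, then that all of them did. The sketch below does this for the Aptos grade 8 Math counts. The function name `a_rate_bounds` is mine and the 50% observed ‘A’ rate is a placeholder for illustration only.

```python
def a_rate_bounds(observed_a_rate: float, n_observed: int, n_total: int) -> tuple[float, float]:
    """Worst-case bounds on the true 'A' share when some students' marks are missing.

    Lower bound: no missing student earned an 'A'.
    Upper bound: every missing student earned an 'A'.
    """
    a_observed = observed_a_rate * n_observed
    n_missing = n_total - n_observed
    lower = a_observed / n_total
    upper = (a_observed + n_missing) / n_total
    return lower, upper


# Aptos grade 8 Math: 150 marks reported for 272 students.
# The 0.50 observed 'A' rate is a hypothetical placeholder.
low, high = a_rate_bounds(0.50, n_observed=150, n_total=272)

print(f"True 'A' rate lies between {low:.1%} and {high:.1%}")
```

With 122 of 272 students unaccounted for, the bounds are very wide, which is why a school with this much missing data cannot be placed confidently on either side of the zero line.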

h/t to the reader who pointed these out to me.

The eagle-eyed may notice that Alice Fong Yu was missing from the middle school ELA chart. That’s because the published data contains no grades for ELA for Alice Fong Yu students.
