A friend who runs a climate-focused sustainability institute (not the Stanford Doerr School but similar) was bemoaning all the climate data that has disappeared from government websites. I sympathized but pointed out that education has been suffering from disappearing data for several years and Donald Trump can’t be blamed for these disappearances. I made a similar point to someone from SFUSD who was wondering if the subjects of my recent posts meant I’d lost my focus on the district. Alas, everything I write is based on data and there’s just not as much reliable education data as there used to be.
Here’s data that used to be published regularly but isn’t anymore.
Course Enrollment
The California Department of Education (CDE) used to publish data on course enrollments in every public school. Among the questions that could be answered with this data were:
How did enrollment in AP Math classes change after SFUSD stopped offering Algebra in middle school?
How many Black students take AP Math in San Francisco schools? If I recall correctly, more Black students took AP Calculus AB at KIPP San Francisco College Prep in 2018-19 than in all SFUSD high schools combined.
Districts have a choice between the traditional Math course sequence of Algebra I, Geometry, Algebra II and a three-year Integrated Math sequence that interweaves the Geometry and Algebra material each year. What share of California students take Integrated Math and in which districts?
Which districts offer Algebra in middle school and what percentage of students take it?
What is the gender and ethnic mix of students in various classes?
In which grades do students take various subjects?
The data wasn’t perfect. It didn’t distinguish language-immersion classes from general education classes, or regular classes from their honors versions, so you couldn’t tell whether a class was regular Chemistry or Chemistry Honors. In subjects with mixed-grade classes (e.g. Spanish II or Physics), the grade levels of the students were sometimes incorrectly recorded. But it was better than the nothing we have today.
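For readers curious what working with this data looked like, the questions above mostly reduce to simple aggregations. Here's a rough sketch; the field names, course labels, and enrollment figures are all invented for illustration, not the CDE's actual schema:

```python
# Toy sketch: tallying AP Math enrollment by district and student group
# from course-enrollment-style records. All fields and numbers invented.
from collections import defaultdict

records = [
    {"district": "District A", "course": "AP Calculus AB", "group": "Black", "enrollment": 20},
    {"district": "District A", "course": "AP Calculus AB", "group": "Latino", "enrollment": 45},
    {"district": "District B", "course": "AP Calculus AB", "group": "Black", "enrollment": 25},
    {"district": "District A", "course": "Geometry", "group": "Black", "enrollment": 200},
]

by_district_group = defaultdict(int)
for r in records:
    if r["course"] == "AP Calculus AB":
        by_district_group[(r["district"], r["group"])] += r["enrollment"]

print(dict(by_district_group))
```

With the real files, the same few lines answered the AP Math questions above for every district in the state.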
Tantalizingly, it’s possible that CDE will eventually publish updated data. When I first started asking about it, I was told updated data would be released in Summer 2021, then by the end of 2021. In 2023, I was told it was still on the to-do list but might not happen that year. I’m not holding my breath.
Class Enrollment
Class Enrollment is related to Course Enrollment. A school may offer multiple sections of the same course. More rarely, a class might contain students taking two or more different courses. Questions that could be answered with class enrollment data include:
How do class sizes vary from school to school within a district? Course enrollment data might show that a school had 90 1st graders taking course 2400 (“Self-contained class”: the standard all-subject course for all elementary schools). The class enrollment data would show whether those 90 were in three or four or five classes. Within SFUSD, it was class enrollment data that revealed the huge difference in class sizes from one middle school to another, with some schools averaging over 30 per class and others fewer than 20.
How do average class sizes vary from district to district? It was by looking at class enrollment data that I was able to show that, compared with other districts, SFUSD has much smaller class sizes in K-3 but average-sized classes in high school grades.
The class enrollment data is probably part of the same course enrollment data project that is four years overdue.
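The class-size calculation itself is nothing fancy: total students divided by the number of sections at each school. A minimal sketch, with made-up records loosely echoing the middle-school contrast described above:

```python
# Toy sketch: average class size per school from class-level records.
# Field names and enrollments are invented, not the CDE's actual layout.
from collections import defaultdict

classes = [
    {"school": "Middle School A", "course": "2400", "enrollment": 33},
    {"school": "Middle School A", "course": "2400", "enrollment": 31},
    {"school": "Middle School B", "course": "2400", "enrollment": 18},
    {"school": "Middle School B", "course": "2400", "enrollment": 19},
    {"school": "Middle School B", "course": "2400", "enrollment": 20},
]

totals = defaultdict(lambda: [0, 0])  # school -> [total students, class count]
for c in classes:
    totals[c["school"]][0] += c["enrollment"]
    totals[c["school"]][1] += 1

avg = {school: students / n for school, (students, n) in totals.items()}
print(avg)
```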
Staff Data
The CDE used to publish data on the demographics, qualifications, and assignments of school district staff. Staff names were not published but the data could be used to answer such questions as:
How many people are employed as teachers? As pupil services staff? As administrators?
How many people are employed out of the central office rather than at school sites? How does this compare with other districts? Has it changed over time?
How does the ethnic mix of SFUSD’s teachers compare to that in other districts?
How many years of experience does the average teacher have in SFUSD compared to other districts?
Do schools with more disadvantaged students have teachers who are less experienced?
What is the average tenure at a school?
What percentage of a district’s teachers have fewer than 5 years’ experience?
How many teachers in a district have credentials to teach Science?
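As with the enrollment files, these questions mostly come down to filtering the staff file to a role and aggregating by district. A sketch with invented records (the real files had many more fields):

```python
# Toy sketch: average teacher experience and share of novice teachers
# by district. Record layout and numbers are invented for illustration.
staff = [
    {"district": "District A", "role": "Teacher", "years_experience": 12},
    {"district": "District A", "role": "Teacher", "years_experience": 3},
    {"district": "District A", "role": "Administrator", "years_experience": 20},
    {"district": "District B", "role": "Teacher", "years_experience": 8},
]

teachers = [s for s in staff if s["role"] == "Teacher"]

by_district = {}
for t in teachers:
    by_district.setdefault(t["district"], []).append(t["years_experience"])

averages = {d: sum(v) / len(v) for d, v in by_district.items()}
novice_share = sum(1 for t in teachers if t["years_experience"] < 5) / len(teachers)
print(averages, novice_share)
```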
AP Results
College Board and CDE both used to publish data on AP participation and results that was quite detailed. College Board published, for each AP subject:
the exact score distribution, i.e. the number of students who scored 1, 2, 3, 4, and 5 on the test
the score distribution by gender
the score distribution by ethnic group
the score distribution by grade
the score distribution for public school students (the distribution for private school students could be calculated given this and the aggregate numbers).
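That last calculation is just a subtraction: the aggregate score distribution minus the public-school distribution leaves the private-school distribution. With made-up numbers:

```python
# Toy sketch: recovering the private-school AP score distribution
# by differencing. All counts are invented for illustration.
all_students = {1: 1000, 2: 2000, 3: 3000, 4: 2500, 5: 1500}
public_only  = {1:  900, 2: 1700, 3: 2400, 4: 1900, 5: 1000}

private_only = {score: all_students[score] - public_only[score]
                for score in all_students}
print(private_only)  # {1: 100, 2: 300, 3: 600, 4: 600, 5: 500}
```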
I used this data in my last post to produce the following chart:
Other interesting factoids that can be gleaned from this data include:
40% of California students who take AP Calculus AB take it in 11th grade or earlier. The percentage in SFUSD is probably lower than 5%, given the hoops students have had to go through to take it.
Across all subjects, roughly 20% of AP tests are taken in 10th grade or earlier, 40% in 11th grade, and 40% in 12th grade. A parent at one SFUSD school told me that students at her child’s school are not allowed to take AP tests before 12th grade.
In many subjects, those who take the test in 10th grade or earlier score higher than those who take it in 11th grade and they in turn score higher than those who take it in 12th grade. This is probably due to two things. First, there’s a selection effect: students who are allowed by their schools to take Art History or Biology or Chemistry or Calculus, or whatever, in earlier grades are on average smarter than those who are allowed to take it in later grades. Second, college admissions decisions are in by the time 12th graders sit their AP exams. Depending on the policies of the colleges they’re going to attend, they might no longer care about their results or they might only need a ‘3’ to get credit instead of a ‘5’ to impress an admissions committee.
87% of those who take AP Chinese are Asian; 83% of those who take AP Spanish Language and 93% of those who take AP Spanish Literature are Latino. 39% of those who take AP French, 59% of those who take AP German, and 48% of those who take AP Latin are White.
College Board published this level of detail for every state, so it was possible to compare students in California with students in other states. Post-pandemic, they slimmed it down and now publish just national score distributions across all students. Higher Ed Data Stories noticed the same thing:
The transparency The College Board touts as a value seems to have its limits, and I understand this to some extent: Racists loved to twist the data using single-factor analysis, and that's not good for a company who is trying to make business inroads with under-represented communities as they cloak their pursuit of revenue as an altruistic push toward access.
CDE also used to publish AP data. It didn’t publish scores by subject but it did show the number of students who took at least one test at each school and the score distribution for all the test takers at each school. It would also break these results down by ethnic group. This data could be used to answer questions such as:
What percentage of SFUSD’s Latino students take AP classes? How does this compare with other districts?
How does the average AP score obtained by SFUSD’s Latino students compare to the average score obtained by Latino students in other districts? In 2019, the average score of Latino students in San Francisco was 0.2 points higher than the statewide average for Latino students.
Which schools and districts have the most successful AP programs, whether measured by participation or average score? Across California, Lowell had an exceptionally high number of test takers but would barely crack the top 50 by average score. At some of the schools with higher average scores, the typical student took fewer than half as many tests. Which approach is better?
In San Francisco, it’s no surprise that Lowell and SOTA had the most participants and the highest scores but would you have guessed that KIPP had the third-highest participation level (i.e. tests taken per student)? KIPP also had the second-lowest average score on those AP tests. A (since departed) manager there told me that it was their policy to push students to take AP tests, even if they were likely to fail them, because their data told them exposure to the material would stand them in good stead in college.
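The participation-versus-score tradeoff is easy to make concrete. A sketch with invented figures: one school pushes everyone to test (high participation, lower average), the other restricts testing to its strongest students:

```python
# Toy sketch: ranking AP programs by participation vs. average score.
# School names, test counts, and distributions are all invented.
schools = {
    "School A": {"tests": 1200, "students": 600,
                 "dist": {1: 100, 2: 200, 3: 400, 4: 300, 5: 200}},
    "School B": {"tests": 300, "students": 500,
                 "dist": {1: 10, 2: 30, 3: 80, 4: 100, 5: 80}},
}

results = {}
for name, s in schools.items():
    n = sum(s["dist"].values())
    avg_score = sum(score * count for score, count in s["dist"].items()) / n
    results[name] = {"avg": round(avg_score, 2),
                     "tests_per_student": s["tests"] / s["students"]}

print(results)
```

School A wins on participation, School B on average score; neither metric alone settles which program serves students better.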
This data is long gone. Unlike the course enrollment, class enrollment, and staff demographic data, which remains accessible, if outdated, on the CDE website, this data has been purged[1]. The only relic that survives is a count of the number of graduating students who scored at least a ‘3’ on two or more exams.
SAT Scores
Average SAT scores used to be published for every high school[2]. The data showed the average Math, English, and Total scores, and the number of test takers, for each California public high school. Interesting questions that could be answered include:
What proportion of students at each school take the SAT?
How does the number of SAT takers compare with the number of AP takers?
How does the average SAT score compare to the average AP score?
Conclusion
It saddens me a little just to list out all those topics I can no longer address. I’ll continue to hunt for reliable data that’s interesting enough to write about. CDE does still publish some data and I have thought of an SFUSD-relevant question that I can address with it. That’ll be the subject of the next post.
[1] Fortunately, 20 years of data does live on in my Google Drive.
[2] Embarrassingly, I don’t remember the source. One cell in one tab in my spreadsheet has “University of California” in it, which seems like a big clue, but if it were the UC I don’t know why it wouldn’t have SAT scores for private schools too.
You said:
"but if it were the UC I don’t know why it wouldn’t have SAT scores for private schools too"
According to a document I received from UCOP, in January 2021, UC paid $5400 to license data only for public school students:
Data on California public school students, including personal identifying information without SSN:
a. 2020 cohort SAT® Suite of Assessment Data and full SAT Questionnaire, most recent SAT test scores
b. 2020 Advanced Placement Program® student level administration data as one record per student with testing history
c. 2020 cohort SAT® Subjects Examination Data (collectively, the “Data”).
The 'data recipient' was "Tongshan Chang or Institutional Research designee will receive
the data on behalf of the Licensee."