23 votes

Course evaluations are garbage science

14 comments

  1. [9]
    skybrian

    From the letter (archive link):

    Williams and Ceci’s meta-analysis of “a very large number of studies across all academic fields” provides further quantitative grounding for what everyone already knows: Teaching evaluations are a mess. As David Delgado Shorter has discussed, also in our pages, the evidence that teaching evaluations reflect racial and gender bias on the part of students dates back over 40 years. Even more damning, “good” student evaluations may be negatively correlated with academic achievement. According to Bill Harbaugh, an economist at the University of Oregon who has studied the matter, students enrolled in classes taught by professors with high student ratings actually learn less.

    The most comprehensive recent analysis of the situation is Wolfgang Stroebe’s widely cited 2020 article in Basic and Applied Social Psychology, whose title says it all: “Student Evaluations of Teaching Encourages Poor Teaching and Contributes to Grade Inflation.” Teaching evaluations, Stroebe concludes, fail to measure actual learning, illegitimately reward teacher attractiveness, penalize minorities and women, and trigger cascading grade inflation. And because “there is evidence that faculty members in precarious positions (e.g., young tenure-track faculty) will be particularly motivated to improve the ratings they receive for their course by grading leniently,” teaching evaluations corrupt the classroom at its root. As a junior faculty member said to me earlier this month, “I’ve arrived at the point of the semester where I consider giving everyone a big grade boost on their last paper to juice my pre-tenure-review evals.”

    Despite their well-documented failings, such evaluations are now almost ubiquitous: 94 percent of colleges collected course evaluations in 2010.

    14 votes
    1. [6]
      rosco

      Even more damning, “good” student evaluations may be negatively correlated with academic achievement.

      This makes sense to me. Student evals are normally based on enjoyment, even if they aren't supposed to be. It would be easy to see a positive skew towards teachers that have less rigorous standards because there is less stress and negative feedback from low scoring or heavy curves.

      Education is a spectrum, and that is hard to capture in a 5-question or open-ended course evaluation. I think what universities should consider isn't just "knowledge deployed" as this article reflects, but also interest and engagement in a subject. The professors that have positive reviews (and not just "they are so easy", spicy pepper emoji) probably drive much more enrollment and engagement within majors. I ended up minoring in History because I was so taken with an intro to Mexican history course that I only needed as a gen-ed requirement. The professor was front and center in that. The course wasn't the most difficult and I'd be pressed to say how much I took away, but it really sparked my interest in the subject.

      All that to say, I'd be interested to see whether other metrics for teachers with positive student evaluations show any beneficial correlations. Similar to a teammate who is good at facilitating discussion and collaboration within a team but may be an inefficient worker on their own, it may be that these professors are benefiting the department as a whole.

      Also, I'd love to read Wolfgang's students' evaluations.

      13 votes
      1. [5]
        skybrian

        Here’s why I imagine grades and evaluations should be closely related:

        Getting good grades is a sign things are going well, and poor grades are a sign things are going poorly. This is important feedback for students. It would be weird to get poor grades and think everything is fine. (Easy grading says “everything’s okay here” even if it isn’t.)

        These are two signals measuring the quality of the student-teacher relationship from different perspectives. They should largely agree.

        Figuring out whose fault it is when things go wrong might be harder?

        2 votes
        1. [4]
          rosco

          That makes sense, and I agree they should be correlated.

          Figuring out whose fault it is when things go wrong might be harder?

          I think this would be easy to figure out. If it's a small portion of students with low grades that is a student problem, a high proportion of students with low grades is a teacher problem. Compare the proportion of positive/negative student evals against the grade distribution in the course.

          3 votes
          1. [3]
            kfwyre
            • Exemplary

            If it's a small portion of students with low grades that is a student problem, a high proportion of students with low grades is a teacher problem.

            I see this sort of thing a lot (usually as a dig at teachers (not saying that’s what you’re doing here at all, by the way, just that it’s a common trope)), but it’s a bit too simplified IMO.

            A teacher could have a large percentage of failing students due to students lacking prerequisite skills, for example. This could be because they had a bad teacher previously in a course that built up to their current one, or the course path itself is bad, or advisors sign up students for courses concurrently versus consecutively, or it’s an 8 AM class and a bunch of students can’t be bothered to show up for it, etc.

            A preponderance of failing grades is definitely a signal that something is wrong, but it isn’t necessarily on the teacher themselves. Furthermore, the absence of failing grades isn’t necessarily an indicator that everything is hunky dory either. In fact, I’d argue that the pressure to inflate grades is probably a far bigger problem in education than failing students.

            A teacher in a prerequisite class who inflates grades will likely be rewarded for it (passing students! good surveys! “you’re a good teacher!”), but they’ve passed the buck to the following teacher who now has whole swaths of students who are unprepared for the demands of the following class. That teacher now faces the same pressure to inflate grades, with severe consequences should they not cave to it (failing students! bad surveys! “YOU’RE A BAD TEACHER!”).

            It’s educational hot potato, and way too many people blame the person who’s left holding it when time runs out rather than looking at the entire game and its impact as a whole.

            This doesn’t just happen in higher education either. I once taught an Algebra II course to high school juniors, and the majority of the class could not fluently complete simple integer arithmetic problems. We’re talking single-digit stuff, like -4+7.

            I once taught an 8th grade Algebra I course. The students were seemingly scared shitless of decimals, more so than any group I’ve ever had. They didn’t know how to work with them and hated seeing them. I’d never seen decimal skills so low.

            One morning I began class by writing the fraction 3/4 on the board. I asked my students to simply write it down as a decimal. It is a benchmark fraction, and should be something that is automatic for them.

            Several students in the class — several — wrote down 3.4.

            I had students whose performance was so low that they not only didn’t immediately know what 3/4 was as a decimal just on sight and couldn’t figure it out by division, but they didn’t have the number sense to realize that it should be less than 1. They literally just looked at the two digits and put a decimal between them. This is a strategy of someone who is completely out of their depth. They were in an Algebra I class.

            I’m not saying this to lambaste the students. Many of them probably had very good reasons for why they didn’t know that. Regardless of those reasons, however, someone who is performing at that level simply isn’t ready for the demands of Algebra I. Unfortunately, this is rarely considered, and instead the reality is that I look like I’m a jerk who’s bad at my job if they end up failing. You didn’t teach them well enough! You didn’t work hard enough! You didn’t care enough!

            My favorite crystallization of this concept is none other than Jaime Escalante, the subject of the (in)famous teacher movie Stand and Deliver. In the movie, Escalante is portrayed as a passionate teacher who reaches disaffected youth who have fallen far behind in math. By the end of the movie and their school year with Escalante, the students take and pass the AP Calculus exam — a result so unexpected for that population that the College Board assumes the students cheated and forces them to retake the test (which they, again, pass).

            It’s a story of a good teacher who helps his students through sheer inspiration and determination!

            And it’s also so misleading that it is outright misinformation:

            Stand and Deliver shows a group of poorly prepared, undisciplined young people who were initially struggling with fractions yet managed to move from basic math to calculus in just a year. The reality was far different. It took 10 years to bring Escalante's program to peak success. He didn't even teach his first calculus course until he had been at Garfield for several years. His basic math students from his early years were not the same students who later passed the A.P. calculus test.

            The timeline was completely falsified for the sake of making a better movie, but that’s only a small sin compared with what the movie outright omitted:

            Unlike the students in the movie, the real Garfield students required years of solid preparation before they could take calculus. This created a problem for Escalante. Garfield was a three-year high school, and the junior high schools that fed it offered only basic math. Even if the entering sophomores took advanced math every year, there was not enough time in their schedules to take geometry, algebra II, math analysis, trigonometry, and calculus.

            So Escalante established a program at East Los Angeles College where students could take these classes in intensive seven-week summer sessions. Escalante and Gradillas were also instrumental in getting the feeder schools to offer algebra in the eighth and ninth grades.

            Inside Garfield, Escalante worked to ratchet up standards in the classes that fed into calculus. He taught some of the feeder classes himself, assigning others to handpicked teachers with whom he coordinated and reviewed lesson plans. By the time he left, there were nine Garfield teachers working in his math enrichment program and several teachers from other East L.A. high schools working in the summer program at the college.

            Escalante’s students didn’t succeed because he was an inspiring teacher gifted with amazing pedagogy. His students succeeded because Escalante was able to make significant structural changes to their educational paths that allowed them to properly develop the skills to pass calculus in advance of their calculus class, not during it.

            I’m not trying to take a dig at Escalante. He’s a genuinely inspiring figure. He’s just not inspiring for the reasons shown in the movie, which promotes a teacher trope that has been so internalized in culture that it has shaped the ways entire generations look at education. We think that if teachers simply care enough, then their students will magically be successful. We’ve been taught by examples like this one and many others and plenty of other supporting tropes that education is essentially a cult of personalities rather than a sequence of groups of professionals all working within a specifically designed system.

            The reality is that the typical student of mine is one that I interact with for less than an hour each day. In their typical day, they will engage with approximately seven to ten teachers each. By the time they graduate high school, they will likely have had over 50 instructors. Each of these teachers certainly has a valuable impact on the student, but no single teacher’s impact is greater than that of the overall pipeline in which the student exists — the one that moves them from teacher to teacher and year to year.

            Breakdowns in that pipeline will invalidate the efforts of good, skilled teachers by creating unwinnable situations for them. A focus on education only at the individual level teaches us that this is their fault, even when that’s completely inaccurate.

            So, to come back to the teacher with a large number of failing students: it’s entirely possible that they’re a bad teacher with poor pedagogy who doesn’t want to help their students succeed.

            On the other hand, it’s entirely possible — and I’ll even go so far as to say probably more likely — that those grades are not indicative of the teacher’s quality as an educator and are instead representative of some other issue in the pipeline.

            12 votes
            1. [2]
              rosco

              Wow, sorry for the terse original comment and thank you for the in-depth explanation! That makes so much sense and, as skybrian pointed out, does make it much harder to assess teachers/courses.

              I didn't know the grade breakdown was a common trope and dig against teachers, but it makes a lot of sense. My perspective on it actually sprouted during my time as a student in Algebra II. We had a teacher who was... difficult, but not in the engaging way. There was a test that the whole class failed and he berated us for it. While most of us were keeping our heads as low as possible, one of the younger advanced placement students spoke up and said "well maybe if we all failed it's your fault!" It left quite the impression.

              Interestingly enough, what followed mirrors the example you gave. Most of us really struggled with the basic concepts of Algebra II and did abysmally in Trigonometry. I decided I wasn't a "math person" because of it and ended up shifting majors to one that didn't have numerous math requirements at its core, from Biology to Anthropology. It wasn't until I went back for grad school and started preparing for stats and calculus that I found I could absolutely learn the math. I ended up retaking Algebra II and Trig through an online program and did fine with stats/calc. I think it cemented my feeling about my Algebra II teacher even more. But I never considered how my Trig teacher may have been perceived if she was subject to grade or evaluation review.

              Thanks again for taking the time to give such an in-depth reply. I always appreciate your insight.

              3 votes
              1. kfwyre

                No need to apologize! The “if the whole class is failing, it’s the teacher’s fault” maxim is essentially the same as “if everyone you meet is a jerk, then you’re the jerk” but for teachers. It can feel good and validating, especially when it’s directed at someone who deserves scorn, but I generally find that it’s an oversimplification of a much more complex situation.

                I appreciate your kind words and you sharing your experience. What you went through is completely valid, and helps texture my post in a very valuable way: even though a widespread grade breakdown isn’t always a teacher’s fault, sometimes it is. That’s an important perspective too, and it gets no pushback from me. You specifically, but also all kids in general, deserve teachers who won’t derail entire life pathways for students.

                I’m sorry that was your experience. I’m glad you were able to route around it, but you also should have never been put in that situation in the first place.

                3 votes
    2. [2]
      Spacepope

      This looks like an interesting article. I was really disappointed to see that it's paywalled and I could only read the first couple paragraphs. It's hard to intelligently discuss something I cannot read.

      1 vote
      1. DefinitelyNotAFae

        Did you click the provided archive link?

        6 votes
  2. [4]
    boxer_dogs_dance

    I am not sure how evaluations should best be handled, but evaluations are one way to identify and possibly prevent abuse.

    4 votes
    1. [2]
      skybrian

      There should definitely be independent channels to report problems in a classroom. I’m not sure waiting until the end of the semester for the evaluations is the way to go, though? If considered as a noisy source of hints about things that might be improved, it seems fine?

      7 votes
      1. boxer_dogs_dance

        Some people will only report anonymously. Having many options for identification of extreme bad actors seems wise.

        However, I agree that using student evaluations for professional evaluation of educators is not ideal and is likely to cause problems. Learning is hard and can be boring and painful. Instructors should be encouraged to demand excellent work.

        4 votes
    2. kovboydan

      I’ve yet to read the linked article, but that’s a neat question.

      It would seem possible to do a survey/course evaluation that evaluates an entire department for the semester - not an individual class - and use a regression model to estimate the impact of individual professors.

      This surely wouldn’t eliminate all the bias but I suspect it would minimize it, like blind orchestra auditions but less…blind.
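
A rough sketch of what that regression could look like, under loudly stated assumptions: each student gives one rating for the whole department, the rating is modeled as the average of a hidden per-professor score over the professors that student had that semester, and the scores are recovered by ordinary least squares. The professor names, the ratings, and the helper names (`estimate_professor_effects`, `solve_linear_system`) are all made up for illustration; nothing here comes from the article or from any real survey.

```python
from typing import Dict, FrozenSet, List, Tuple

# One survey response: (professors the student had, department-wide rating).
# Assumption: every response lists at least one professor.
Response = Tuple[FrozenSet[str], float]


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            # Happens when some professors are always taken together.
            raise ValueError("professor effects are not separable from this data")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def estimate_professor_effects(responses: List[Response]) -> Dict[str, float]:
    """Least-squares estimate of one score per professor.

    Hypothetical model: rating = mean of the scores of the professors taken.
    Builds the normal equations (X^T X) beta = X^T y and solves them.
    """
    professors = sorted({p for taken, _ in responses for p in taken})
    index = {p: i for i, p in enumerate(professors)}
    k = len(professors)
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for taken, rating in responses:
        weight = 1.0 / len(taken)  # each professor's share of this rating
        for p in taken:
            xty[index[p]] += weight * rating
            for q in taken:
                xtx[index[p]][index[q]] += weight * weight
    return dict(zip(professors, solve_linear_system(xtx, xty)))


# Made-up responses for three hypothetical professors.
responses: List[Response] = [
    (frozenset({"A", "B"}), 3.0),
    (frozenset({"A", "C"}), 3.5),
    (frozenset({"B", "C"}), 2.5),
    (frozenset({"A"}), 4.0),
]
estimates = estimate_professor_effects(responses)
for name, score in estimates.items():
    print(f"{name}: {score:.2f}")
```

The estimates are only identifiable when students' schedules overlap in different combinations; if two professors are always taken together, the solver raises an error rather than guessing how to split the credit. A real version would also need to account for course difficulty, class size, and the biases discussed in the letter, none of which this sketch models.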

      2 votes
  3. kfwyre

    I worked with a teacher who would, on most days, let students just go on their phones and play games on their laptops for the entire period.

    The students loved him.

    They (possibly jokingly?) nominated him for a “Teacher of the Year” award. On account of his uniformly strong student support, he almost won.

    This is my roundabout (and, for me, unusually concise) way of saying that student surveys are not a good measure of teacher quality.

    4 votes