
The Ethics of Weed-Out Courses

On October 3rd, The New York Times reported that organic chemistry adjunct Professor Maitland Jones, Jr. had been fired by N.Y.U. after 82 of his 350 students signed a petition against him. Students complained that their grades did not reflect the time and effort they put into the course and that Jones did not prioritize student well-being and learning. (Jones, meanwhile, reported a significant drop-off in student performance over the past decade, and especially after the pandemic.) Before firing Jones, university officials had first offered to review student grades and allowed students to withdraw from the class retroactively.

Immediate responses varied: Jones’ supporters protested the decision; some students who had a bad experience in his class celebrated it in online reviews; and faculty critiqued the administration’s decision to appease students as tuition-payers.

More broadly, a wide range of takes has been offered on the whole situation: Jones’ firing illustrates the precarity of contingent faculty and an administration run amok; “weed-out” classes like organic chemistry exacerbate student inequalities; students are becoming coddled and entitled, which will make them bad doctors; academic degrees are becoming a consumer product – and the consumers with financial power are the parents, not the students; organic chemistry isn’t actually necessary to be a good doctor, and it only became a weed-out course through policy decisions to limit the number of new doctors, leaving us with a shortage of physicians; and the systemic and structural factors that create out-of-touch professors, entitled students, and pandering administrators are what we should actually blame.

This case raises rich possibilities for discussion, but I would like to focus on the following question: What purpose do weed-out courses actually serve, and is it a purpose we can get behind?

I will limit the discussion for now to pre-med classes, but the question could be asked of other disciplines as well.

Let’s start with the more positive aspects of weed-out courses. The main purpose of such a course seems to be to allow students to assess whether they have the aptitude necessary to survive medical school and become good doctors. Ideally, a professor would facilitate this task by ensuring the class has adequate rigor to support students in their pursuits, while also kindly counseling struggling students to consider another career path.

One apparent benefit of this kind of course would be to prevent students from spending a great deal of time and money pursuing a career they are increasingly unlikely to attain.

Unfortunately, it is a hard truth that effort alone is not enough to get one through medical school, even though dedication and determination are necessary ingredients.

Another benefit of this kind of course would be to encourage students to cultivate the studying and test-taking skills they will need to do well in medical school and to become good doctors.

These considerations seem reasonable to me, but I’m not sure the language of “weeding out” best captures this set of aims. Instead, it suggests hoops that students are required to jump through in order to demonstrate their commitment and thus be granted access to continuing along their career path. There are plenty of questions here as to which courses should serve as the key benchmarks for success as a physician (a bioethics course might be on the list of courses that should be included, and an organic chemistry class might be less central than an immunology class), but having such benchmarks does not itself seem to be a problem.

Doctors need to have a variety of skills to be effective physicians: from the people-skills required in doctor/patient interactions, to a good problem-solving ability to catch diseases and health conditions before they progress, to the vast memorization needed to keep up with best practices and treatments. These are all abilities that should be fostered by pre-med education. If a student lacks one or more of these core capacities, it seems best for them (and their potential patients) to turn to another career path where their abilities might shine.

At the same time, we need to ensure that students of all backgrounds receive the resources (and opportunity) needed to acquire these skills during the course of their undergraduate education so that we do not simply reify existing inequalities.

So, let’s turn to the more negative aspects of weed-out courses. Often, it seems that the goal of a weed-out course is to get a certain portion of the class to withdraw or fail. Even if the express reason for this design is to promote rigor and provide a benchmark for student success, the learning environment can become toxic in several ways. If the professor sets up the class so that only the “truly bright” students can pass and treats student confusion as a sign of laziness or stupidity, this creates a host of problems.

First, students who can keep up with the learning environment, whether through advantages in past tutelage or an ability to more quickly grasp the material, may start to see themselves as superior to those who do not do as well in the class. Second, students of all aptitudes may feel immense pressure and dedicate excessive time to studying in order to succeed in the class, contributing to mental distress. Third, students who do not do well in the class despite putting in the same intensive effort may see themselves as failures or as less worthy than other students.

What this kind of weed-out mentality amounts to is a kind of bullying that identifies some people as superior and others as inferior, only loosely tracking a student’s academic merit.

This can create problems not only in the pre-med weed-out courses but also in medical school and beyond. Hierarchies might arise between different medical subspecialties, with physicians in some elite residencies seeing themselves as superior to those who did not make the cut. These dynamics might also lead to epistemic overconfidence in practicing physicians, causing disruptions in doctor/patient interactions and negatively impacting the quality of patient care.

More specifically, I worry that some of our initial defenses of these weed-out classes tend to reify bullying practices rather than establish the benchmarks one needs to meet in order to be a good physician – in the same way that there are certain benchmarks one should be able to meet to be a good teacher, a good lawyer, a good journalist, a good businessperson, a good caretaker, and more. While the pandemic has negatively impacted student learning and well-being, the student petition can be read as reflecting an unwillingness to put up with a certain kind of bullying and as a demand for better institutional support.

The pandemic tested us all in a number of ways, and it has made apparent to many of us that some forms of treatment are untenable, especially in times of crisis.

A look at the last 10 years of comments about Jones’ teaching on RateMyProfessors (for whatever the review site is worth) shows that negative student ratings of his classes have been fairly consistent in quantity and content over time. Students have raised the same concerns again and again, regardless of the grades they earned: no partial credit on tests, the need to study far more than in other organic chemistry classes, accusations that Jones did not respect students or respond well to questions, consistently low test averages (there was conflicting information about whether tests were curved), and high drop rates. Students of all academic backgrounds reported feeling excessively stressed by the course, and many complained that it was made intentionally difficult. While other students gave glowing reviews, it is clear that the instructional problems raised in the petition are not new.

In the end, we’re left – like some of Jones’ students – with what feels like an impossible task: How can we design weed-out classes to be sufficiently rigorous and supportive? And how would we know when we’ve done it right?

Should You Rate Your Professor? The Ethics of Student Evaluations

With the academic semester coming to a close, students all over North America will be filling out student evaluation of teaching forms, or SETs. Typically, students answer a range of questions about their experience with a class, including how difficult it was, whether work was returned in a timely manner and graded fairly, and how effective their professor was overall. Some students will even go the extra step and visit RateMyProfessors.com, the oldest and arguably most popular website for student evaluations. There, students can report how difficult the course was, indicate whether they would take a class with the professor again, and, alas, note whether one really needed to show up to class in order to pass.

While many students likely do not give their SETs much of a second thought once they’ve completed them (if they complete them at all), many universities take student ratings seriously and often refer to them when determining raises, promotions, and even tenure for early-career faculty. Recently, however, there have been growing concerns over the use of SETs in making such decisions. These concerns fall broadly into two categories: first, that such evaluations may not, in fact, be a good indicator of an instructor’s quality of teaching, and second, that student evaluations correlate highly with a number of factors that are irrelevant to teaching quality.

The first concern is one of validity: namely, whether SETs are actually indicative of a professor’s teaching effectiveness. There is reason to think that they are not. For instance, as reported at University Affairs, a recent pair of studies by Berkeley professors Philip Stark and Richard Freishtat indicates that student evaluations correlate most highly with actual or expected performance in a class. In other words, a student who gets an A is much more likely to give their professor a higher rating, and a student who gets a D is much more likely to give them a lower rating. Evaluations also correlated highly with student interest (disinterested students gave lower ratings overall, interested students higher ones) and perceived easiness (easier courses received higher ratings than difficult ones). These findings cast serious doubt on whether students are evaluating how effective their instructors were, rather than simply how well they did in, or how much they liked, the class.

These and other studies recently led Ryerson University in Toronto, Canada, to officially stop using SETs as a metric in promotion and tenure decisions – a decision reached at the behest of professors who had long argued that SETs are unreliable. Perhaps even more troubling than the correlation of student evaluations with class performance and interest, though, was the finding that SETs showed biases on the basis of “gender, ethnicity, accent, age, even ‘attractiveness’…making SETs deeply discriminatory against numerous ‘vulnerable’ faculty.” If SETs are indeed biased in this way, that would constitute a good reason to stop using them.

Perhaps the most egregious form of explicit bias in student ratings could, until recently, be found on the RateMyProfessors website, which allowed students to rate a professor’s “hotness” on a “chili pepper” scale. The existence of such a scale stripped the website of a significant amount of credibility; the fact that there are no controls over who can make ratings on the site is another major reason why few take it seriously. The “hotness” rating was removed only after complaints from numerous professors, who argued that it contributed to the objectification of female professors and, more generally, to a climate in which it is somehow seen as appropriate to evaluate professors on the basis of their looks.

While there might not exist any official SETs administered by a university that approximate the chili pepper scale, the effects of bias when it comes to perceived attractiveness are present regardless. The above-mentioned reports, for instance, found that when it comes to overall evaluations of professors, “attractiveness matters” – “more attractive instructors received better ratings,” and when it came to female professors, specifically, students were more likely to directly comment on their physical appearance. The study provided one example from an anonymous student evaluation that stated: “The only strength she [the professor] has is she’s attractive, and the only reason why my review was 4/7 instead of 3/7 is because I like the subject.” As the report emphasizes, “Neither of these sentiments has anything at all to do with the teacher’s effectiveness or the course quality, and instead reflect gender bias and sexism.”

It gets worse: in addition to evaluations correlating with perceived attractiveness, characteristics like gender, ethnicity, race, and age all affect evaluations of professors as well. As Freishtat reports, “When students think an instructor is female, students rate the instructor lower on every aspect of teaching,” white professors are rated generally higher than professors of other races and ethnicities, and age “has been found to negatively impact teaching evaluations.”

If the only problem with SETs were their unreliability, universities would have a practical reason to stop using them: if the goal of SETs is to help identify which professors are deserving of promotion and tenure, and they are unable to contribute to this goal, it seems that they should be abandoned. But as we’ve seen, there is a potentially much more pernicious side to SETs, namely that they systematically display student bias in numerous ways. It seems, then, that universities have a moral obligation to revise the way that professors are assessed.

Given that universities need some means of gauging professors’ teaching ability, what is a better way of doing so? Freishtat suggests that SETs, if they are to be used at all, should represent only one component of a professor’s assessment. Ultimately, those evaluations must be made a part of a more complete dossier in order to be put to better use; they need to be accompanied by letters from department heads, reviews from peers, and a reflective self-assessment of the instructor’s pedagogical approach.

But even if we can’t agree on the best way of evaluating instructor performance, it seems clear that a system that provides unreliable and biased results ought to be reformed or abandoned.