Should You Rate Your Professor? The Ethics of Student Evaluations

photograph of students packed into lecture

With the academic semester coming to a close, students all over North America will be filling out student evaluation of teacher forms, or SETs. Typically, students answer a range of questions regarding their experience with a class, including how difficult the class was, whether work was returned in a timely manner and graded fairly, and how effective their professors were overall. Some students will even go the extra step and visit RateMyProfessors.com, the oldest and arguably most popular website for student evaluations. There, students can report how difficult the course was, indicate whether they would take a class with the professor again, and, alas, whether one really did need to show up to class or not in order to pass.

While many students likely do not give their SETs a second thought once they’ve completed them (if they complete them at all), many universities take student ratings seriously, and often refer to them in determining raises, promotion, and even tenure decisions for early-career faculty. Recently, however, there have been growing concerns over the use of SETs in making such decisions. These concerns broadly fall into two categories: first, whether such evaluations are, in fact, a good indicator of an instructor’s quality of teaching, and second, whether student evaluations correlate highly with a number of factors that are irrelevant to an instructor’s quality of teaching.

The first concern is one of validity, namely whether SETs are actually indicative of a professor’s teaching effectiveness. There is reason to think that they are not. For instance, as reported at University Affairs, a recent pair of studies performed by Berkeley professors Philip Stark and Richard Freishtat indicates that student evaluations correlate most highly with actual or expected performance in a class. In other words, a student who gets an A is much more likely to give their professor a higher rating, and a student who gets a D is much more likely to give them a lower rating. Evaluations also correlated highly with student interest (uninterested students gave lower ratings overall, interested students gave higher ratings) and perceived easiness (easier courses received higher ratings than difficult ones). These findings cast serious doubt on whether students are evaluating how effective their instructors were rather than simply how well they did in, or how much they liked, the class.

These and other studies have recently led Ryerson University in Toronto, Canada, to officially stop using SETs as a metric to determine promotion and tenure – a decision reached at the behest of professors who had long argued that SETs are unreliable. Perhaps even more troubling than student evaluations correlating with class performance and interest, though, was the finding that SETs showed biases towards professors on the basis of “gender, ethnicity, accent, age, even ‘attractiveness’…making SETs deeply discriminatory against numerous ‘vulnerable’ faculty.” If SETs are indeed biased in this way, that would constitute a good reason to stop using them.

Perhaps the most egregious form of explicit bias in student ratings could be found up until recently on the RateMyProfessors website, which allowed students to rate professor “hotness,” a score that was indicated on a “chili pepper” scale. The existence of such a scale significantly undermined the website’s credibility; the fact that there are no controls over who can make ratings on the site is also a major reason why few take it seriously. The removal of the “hotness” rating came only after complaints from numerous professors, who argued that it contributed to the objectification of female professors and, more broadly, to a climate in which it is somehow seen as appropriate to evaluate professors on the basis of their looks.

While there might not exist any official SETs administered by a university that approximate the chili pepper scale, the effects of bias when it comes to perceived attractiveness are present regardless. The above-mentioned reports, for instance, found that when it comes to overall evaluations of professors, “attractiveness matters” – “more attractive instructors received better ratings,” and when it came to female professors, specifically, students were more likely to directly comment on their physical appearance. The study provided one example from an anonymous student evaluation that stated: “The only strength she [the professor] has is she’s attractive, and the only reason why my review was 4/7 instead of 3/7 is because I like the subject.” As the report emphasizes, “Neither of these sentiments has anything at all to do with the teacher’s effectiveness or the course quality, and instead reflect gender bias and sexism.”

It gets worse: in addition to evaluations correlating with perceived attractiveness, characteristics like gender, ethnicity, race, and age all affect evaluations of professors as well. As Freishtat reports, “When students think an instructor is female, students rate the instructor lower on every aspect of teaching,” white professors are rated generally higher than professors of other races and ethnicities, and age “has been found to negatively impact teaching evaluations.”

If the only problem with SETs were that they are unreliable, universities would have a practical reason to stop using them: if the goal of SETs is to help identify which professors are deserving of promotion and tenure, and they are unable to contribute to this goal, it seems that they should be abandoned. But as we’ve seen, there is a potentially much more pernicious side to SETs, namely that they systematically display student bias in numerous ways. It seems, then, that universities have a moral obligation to revise the way that professors are assessed.

Given that universities need some means of gauging professors’ teaching ability, what is a better way of doing so? Freishtat suggests that SETs, if they are to be used at all, should represent only one component of a professor’s assessment. Ultimately, those evaluations must be made a part of a more complete dossier in order to be put to better use; they need to be accompanied by letters from department heads, reviews from peers, and a reflective self-assessment of the instructor’s pedagogical approach.

But even if we can’t agree on the best way of evaluating instructor performance, it seems clear that a system that provides unreliable and biased results ought to be reformed or abandoned.

Does Implicit Bias Explain Gender Discrimination?

Photo of men's and women's bathroom stall signs

Implicit bias is a concept that’s been enormously useful to feminists grappling with the way progress for women has stalled in some areas. Women are still under 5 percent of CEOs of Fortune 500 companies. They still make considerably less per hour than men for doing the same work. Women are still just 20 percent of PhD engineers and around the same percentage of philosophers. They still haven’t made it into the pantheon of US presidents, and only 23 of the 100 current members of the US Senate are women.

It’s all difficult to explain, especially if you don’t believe that women as a group have distinctive interests or aptitudes. But then, what’s going on? Outright sexism and misogyny aren’t exactly rare in the US, but neither are they common. Thus, if you suspect bias is at the root of the underrepresentation problem, implicit bias is a welcome concept.

Continue reading “Does Implicit Bias Explain Gender Discrimination?”

Is the Media to Blame for Police Brutality?

Photograph of protest with boy in foreground, a sign in the background saying "end police brutality"

Police brutality is a painful and all-too-familiar concept when the plight of black people is brought up. Although police abuse of African Americans has been prevalent in the United States for decades, the years 2012 and 2013 are especially significant. It was in 2012 that Trayvon Martin was murdered by George Zimmerman. The following year, Zimmerman was found not guilty of second-degree murder and was acquitted of manslaughter. Since then, there’s been a trend of police killing unarmed black people. Since Martin’s death, African American males such as Tamir Rice, Michael Brown, Philando Castile, and most recently, Stephon Clark have lost their lives because of police brutality. After so many lives lost, one might wonder why there is no solution to prevent the police from killing unarmed African American men. Police departments have tried retraining their officers in the hope that they will make the right decision when dealing with suspects, particularly suspects of color. Yet black men still lose their lives. Perhaps, in order to solve the issue of police brutality, we need to truly understand it. Although police brutality stems from bigotry and carelessness, especially the former, the key to why police officers kill black males might be rooted in how they developed their racist conventions. Could it be that the contemporary media landscape is contributing to the death of black males by police officers?

Continue reading “Is the Media to Blame for Police Brutality?”

Should DePauw be Concerned about First-Year Students of Color?

A photo of DePauw's music school.

DePauw’s student of color community is unusually diverse, in the sense that its members hail from a myriad of backgrounds. However, that diversity can call for major adaptation when coming to DePauw, a predominantly white institution (PWI). The process of adaptation can be made even more difficult if a student of color’s identity is tested through negative interactions with their white counterparts, as well as through negative forces that push into DePauw’s campus.

Continue reading “Should DePauw be Concerned about First-Year Students of Color?”

Move Over, Mercator: World Maps in Boston’s Public Schools

Schools in Boston recently decided to switch from the Mercator projection of world maps to the Gall-Peters projection, becoming the first American school system to do so. While the change may seem trivial, moving away from the Mercator projection is a step toward inclusivity and one that other schools should consider making.

Continue reading “Move Over, Mercator: World Maps in Boston’s Public Schools”

Multiracial Representation in Japan

In March 2015, Ariana Miyamoto, the daughter of a Japanese mother and an African-American father, was crowned Miss Universe Japan. In September 2016, Priyanka Yoshikawa, the daughter of a Japanese mother and an Indian father, was crowned Miss World Japan. Both are the first biracial women to represent Japan in these international competitions. While this is cause for celebration, some controversy has arisen among the Japanese about sending “non-Japanese” people into the world to represent Japan.

Continue reading “Multiracial Representation in Japan”