
Honesty in Academia

photograph of Harvard's coat of arms

Honesty researcher Francesca Gino, a professor at Harvard Business School, has been accused of fabricating data in multiple published articles.

In one study, participants were given 20 math puzzles and awarded $1 for each one they solved. After grading their own worksheets, test subjects threw them out and reported their results on another form. Some participants were asked to sign at the bottom of the form to confirm that their report was accurate, while others signed at the top. Gino’s hypothesis was that signing at the top would prime honest behavior, but she then allegedly tampered with the results to produce the intended effect. Gino is now on administrative leave while Harvard conducts a full investigation.

While it would obviously be ironic if Gino had been dishonest while researching honesty, there is a further reason such dishonesty would be particularly galling: dishonest research violates one of the cardinal virtues of the academic vocation.

Let me explain. Some readers might already be familiar with the traditional list of the cardinal virtues: Justice, Courage, Prudence, and Temperance. Honesty, of course, is nowhere on this list. So what do I mean when I call honesty a cardinal virtue?

Different vocations have their own characteristic virtues. It is not possible to be a good judge without being particularly just. Likewise, it is not possible to be a good soldier on the front lines without being particularly courageous. That is because each of these vocations emphasizes certain virtues. A soldier must have the virtue of courage to repeatedly thrust themselves into battle, and a judge must have the virtue of justice in order to consistently reach fair verdicts.

Are there any characteristic virtues of the academic vocation? Professors typically have two primary tasks: the generation and transmission of knowledge. For both of these tasks, an emphasis on truth takes center stage. And this focus on truth means that professors will do better at both of these tasks by cultivating the intellectual virtues – virtues like open-mindedness, curiosity, and intellectual humility. For this reason, we can think of these intellectual virtues as cardinal virtues of the academic vocation.

But along with these intellectual virtues, honesty is also particularly important for the academic vocation. When students learn from their professors, they often simply take them at their word. Professors are the experts, after all. This makes students especially vulnerable, because if their professors deceive them, they cannot detect it.

This is true to an even greater extent with cutting-edge research. If professors are being dishonest, it could be that no intellectual discoveries are being made in the first place. In Gino’s case, for example, she may have concealed the fact that the study she performed did not actually support her conclusions. But without specialized training, few people can understand how new knowledge is generated in the first place, leaving them completely vulnerable to the possibility of academic dishonesty. Only other academics were able to spot the irregularities in Gino’s data that have led to further questions.

We thus have reason to take honesty as a cardinal virtue of the academic vocation as well. Not only do academics need to be open-minded, curious, and humble, but they must also be honest so that they use their training to further higher education’s most important goals. If academics regularly passed off false research and deceived their students, it would threaten to undermine the university enterprise altogether.

Distrust in higher education is on the rise, and to the extent that academics acquire a reputation for dishonesty, trust is sure to decline even further. Gino’s work is just the tip of the iceberg. One of Gino’s co-authors has also been accused of faking his data, and Stanford’s president is stepping down due to questions about his research, but these are isolated incidents in comparison to the widespread replication crisis. When researchers tried to reproduce the results from 98 published psychology papers, only 39 of the studies could be replicated, meaning that over half of the “research” led to no new discoveries whatsoever.

While a failure of replication does not necessarily mean that the researchers who produced that work were being dishonest, there are many dishonest means that can lead to a study that can’t be replicated, including throwing out data that does not confirm a hypothesis or questionable methods of data analysis. Until the replication crisis, and discoveries of fake data, begin to wane, it will be difficult to restore public trust in social science research.

Is there anything that can be done? While public trust in higher education will not be restored overnight, there are several changes that could potentially help professors cultivate the virtue of honesty. One strategy for curbing our vices is limiting the situations in which we are tempted to do the wrong thing. As one example, pre-registering a study commits a researcher to the design of a study before they run it, removing the opportunity to engage in questionable statistical analysis or disregard the results.

Another way to increase virtuous behavior is to remind ourselves of our values. At the college level, for instance, commitment to an honor code can serve as a moral reminder that reduces cheating. Academic institutions or societies could develop honor codes that academics have to sign in order to submit to journals, or even a signed honor code that is displayed on published articles. While some professors might still be undeterred, others will be reminded of their commitment to the moral values inherent to their vocation.

Universities could also reconsider which professors they hold up as exemplars. In many academic disciplines, researchers who produce the most surprising results, and produce them on a regular basis, are held up as the ideal. But this of course increases the incentive to fudge the numbers to produce interesting “research.” By promoting and honoring professors with well-established, replicable research, colleges and universities could instead encourage results that will stand the test of time.

None of these solutions is perfect, but by adopting a combination of measures, academics can structure their vocation so that it is more conducive to the development of honesty. It is impossible to eliminate all opportunities for dishonesty, but by creating a culture of honesty and transparency, professors can restore trust in the research they publish and in higher education more generally.

For her 2018 book, Rebel Talent, Francesca Gino opted for the tagline “Why it pays to break the rules at work and in life.” The jury is still out on whether that was true in Gino’s case. If she was dishonest, it enabled her to ascend the ranks, landing at the top of the ladder as a professor at Harvard. To prevent more accusations like these, universities need to put in the work to ensure that honesty is what’s rewarded in academia.

 

This work was supported by the John Templeton Foundation grant “The Honesty Project” (ID#61842). Nevertheless, the opinions expressed here are those of the author and do not necessarily reflect the views of the Foundation.

What Titan Teaches About Technological Recklessness

render of Titan submersible

Since it was discovered that the Titan submersible had imploded, many have noted how reckless and irresponsible it was to “MacGyver” a deep-sea submersible. The use of off-the-shelf camping equipment and game controllers seemed dodgy, and everyone now knows it is absolute folly to build a pressure capsule out of carbon fiber. Yet, for years Stockton Rush claimed that safety regulations stifle innovation. Rush’s mindset has been compared to the “move fast and break things” mentality prominent in Silicon Valley. Perhaps there’s something to be learned here. Perhaps we’re hurtling toward disasters in other areas of science and technology that, in hindsight, we will recognize as easily avoidable if not for our recklessness.

Since the disaster, many marine experts have come forward publicly to complain about the shortcuts that OceanGate was taking. Even prior to the implosion, experts had raised concerns directly to Rush and OceanGate about the hull. While search efforts were still underway for the Titan, it came to light that the vessel had not been approved by any regulatory body. Of the 10 submersibles claiming the capacity to descend to the depths where the Titanic lies, only OceanGate’s was not certified. Typically, submersibles can be safety rated in terms of their diving depth by an organization like Lloyd’s Register. But as the Titan was an experimental craft, there was no certification process to speak of.

What set the Titan apart from every other certified submersible was its experimental use of carbon fiber. While hulls are typically made of steel, Titan’s designers opted for a lightweight alternative. Despite being five inches thick, the hull was made of thousands of strands of carbon fiber that had the potential to move and crack. It was known that carbon fiber is tough and could potentially withstand immense pressure, but it wasn’t known exactly how repeated dives would affect the hull at different depths, or how different hull designs would affect the strength of a carbon fiber submersible. This is why repeated dives, inspections, and ultrasounds of the material would be necessary before we could have a firm understanding of what to expect. While some risks can be anticipated, many can’t be fully predicted without thorough testing. The novel nature of the submersible meant that old certification tests wouldn’t be reliable.

Ultimately, Titan’s failure wasn’t the use of carbon fiber or lack of certification. There is a significant interest in testing carbon fiber for uses like this. What was problematic, however, was essentially engaging in human experimentation. In previous articles I have discussed Heather Douglas’s argument that we are all responsible for not engaging in reckless or negligent behavior. Scientists and engineers have a moral responsibility to consider the sufficiency of evidence and the possibility of getting it wrong. Using novel designs where there is no clear understanding of risk, where no safety test yet exists because the exact principles aren’t understood, is reckless.

We can condemn Rush for proceeding so brashly, but his is not such a new path. Many algorithms and AI-derived models are similarly novel; their potential uses and misuses are not well known. Many design cues on social media, for example, are informed by the study of persuasive technology, which takes a limited understanding of human psychology and applies it in a new way and on a massive scale. Yet, despite the potential risks and a lack of understanding about long-term impacts on a large scale, social media companies continue to incorporate these techniques to augment their services.

We may find it hard to see the parallel because the two cases seem so different, but effectively social media companies and AI development firms have trapped all of us in their own submersibles. It is known that humans have evolved to seek social connection, and it is known that the anticipation of social validation can release dopamine. But the effects of gearing millions of people up to seek social validation from millions of others online are not well known. Social media is essentially a giant open-ended human experiment that tests how minor tweaks to the algorithm can affect our behavior in substantial ways. It not only has us engage socially in ways we aren’t evolutionarily equipped to handle, but also bombards us with misinformation constantly. All this is done on a massive scale without fully understanding the potential consequences. Again, like the carbon fiber sub, these are novel creations with no clear safety standards or protocols.

Content-filtering algorithms today essentially create novel recommendation models personalized to you in ways that remain opaque. It turns out that this kind of personalization may mean that each model is completely unique. And this opacity makes it easier to forget just how experimental the model is, particularly given that corporations can easily hide design features. Developing safety standards for each model (that is, for each person) would essentially be its own little experiment. As evidence mounts of social media contributing to bad body image, self-harm, and suicide in teenagers, what are we to do? Former social media executives, like Chamath Palihapitiya, fear things will only get worse: “I think in the deep, deep recesses of our minds we knew something bad could happen…God only knows what it’s doing to our children’s brains.” Like Rush, we continue to push without recognizing our recklessness.

So, while we condemn Rush and OceanGate, it is important that we understand what precisely the moral failing is. If we acknowledge the particular negligence at play in the Titan disaster, we should likewise be able to spot similar dangers that lie before us today. In all cases, proceeding without fully understanding the risks and effectively experimenting on people (particularly on a massive scale) is morally wrong. Sometimes when you move fast, you don’t break things, you break people.

Pathogenic Research: The Perfect Storm for Moral Blindness?

microscopic image of virus cells

In October, scientists at Boston University announced that they had created a COVID-19 variant as contagious as omicron (very) but significantly more lethal. “In K18-hACE2 mice [engineered mice vulnerable to COVID],” their preprint paper reported, “while Omicron causes mild, non-fatal infection, the Omicron S-carrying virus inflicts severe disease with a mortality rate of 80%.” If this beefed-up omicron were somehow released, it would have the potential to cause a much more severe pandemic.

The National Science Advisory Board for Biosecurity has now released new guidelines which seek to strike a significantly more cautious balance between the dangers and rewards of risky research involving PPPs — potential pandemic pathogens. The previous standards, under which the Boston University research was allowed to be conducted without any safety review, were, according to the NSABB, reliant on definitions of a PPP that were “too narrow” and likely to “result in overlooking… pathogens with enhanced potential to cause a pandemic.” (The researchers at Boston University claimed their enhanced COVID-19 variant was marginally less deadly than the original virus, and hence that they were not conducting risky “gain of function” research requiring oversight. But this argument is flawed, since the danger posed by a virus with pandemic potential is a function of both its infectiousness and its deadliness. Since the novel variant combined close-to-original-COVID-19 deadliness with omicron infectiousness, it is likely significantly more dangerous than the original strain.)

Experiments like these are not merely a question of public policy. Apart from the legal and regulatory issues, we can also ask: is it morally permissible to be personally involved in such research? To fund it, administer it, or conduct it?

On the positive side, research with PPPs, including some forms of the heavily politicized “gain-of-function” research, promises valuable insight into the origins, risks, and potential treatment of dangerous pathogens. We may even prevent or mitigate future natural pandemics. All of this seems to give us strong moral reasons to conduct such research.

However, according to Marc Lipsitch and Alison Galvani, epidemiologists at Harvard and Yale, these benefits are overblown and achievable by safer methods. The risks of such research, on the other hand, are undeniable. Research with dangerous pathogens is restricted to the labs with the highest safety ratings. But even top-rated BSL-3 and BSL-4 research labs leak viruses with regularity. The COVID-19 lab leak theory remains contentious, but the 1977 Russian flu pandemic was very likely the result of a lab leak. It killed 700,000 people. Anthrax, SARS, smallpox, Zika virus, Ebola, and COVID-19 (in Taiwan) have all leaked from research labs, often with deadly results. One accident in a lab could cause hundreds of millions of deaths.

Given the scale of risk involved, why don’t we see mass refusals to conduct such research? Why do the funders of such work not outright reject contributing to such risk-taking? Why does this research not spark strong moral reactions from those involved?

Perhaps part of the reason is that we seem particularly vulnerable to flawed moral reasoning when it comes to subjects like this. We often struggle to recognize the moral abhorrence of risky research. What might explain our “moral blindness” on this issue?

Stalin supposedly said, “One death is a tragedy. A million deaths is a statistic.” Morally, he was wrong. But psychologically, he was right. Our minds are better suited to the small scale of hunter-gatherer life than to the modern interconnected world where our actions can affect millions. We struggle to scale our moral judgments to the vast numbers involved in a global pandemic. Moral psychologists call this effect “scope neglect” and I discuss it in more detail here.

When a lab worker, research ethics committee member, or research funder thinks about what might go wrong with PPP research, they may fail to “scale up” their moral judgments to the level needed to consider the moral significance of causing a worldwide pandemic. More generally, principles of research ethics were (understandably) built to address the risks that research poses to the particular individuals involved in the research (subjects and experimenters), rather than the billions of innocents that could be affected. But this, in effect, institutionalizes scope neglect.

To compound this clouding effect of scope neglect, we tend to mentally round up tiny probabilities to “maybe” (think: lottery) or round them down to “it will never happen” (think: being hit by a meteorite while sleeping, the unfortunate fate of Ann Hodges of Alabama). Lipsitch and Inglesby’s 2014 study assigns gain-of-function research on virulent flu viruses a 0.01-0.6% probability of causing a pandemic per lab worker per year.

But rounding this probability down to “it won’t happen” would be a grave moral error.

Because a severe pandemic could cause hundreds of millions of deaths, even the lower-bound 0.01% risk of causing a global pandemic each year would mean that a gain-of-function researcher should expect to cause an average of 2,000 deaths per year. If that math is even remotely close to right, working on the most dangerous PPPs could be the most deadly job in the world.
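
To see the arithmetic behind that figure, here is a minimal sketch of the expected-value calculation. The 0.01% probability is the lower bound cited above; the assumed pandemic death toll of 20 million is my own illustrative figure, chosen because it is consistent with the 2,000-deaths-per-year estimate and is far more conservative than the worst-case scenarios of hundreds of millions of deaths.

```python
# A minimal expected-value sketch: expected annual deaths attributable to one
# gain-of-function researcher = (probability of causing a pandemic per year)
# x (deaths per pandemic). The 20 million death toll is an illustrative
# assumption, not a figure reported in the study.

annual_pandemic_probability = 0.0001      # 0.01% per lab worker per year (lower bound)
assumed_deaths_per_pandemic = 20_000_000  # hypothetical severe-pandemic death toll

expected_annual_deaths = annual_pandemic_probability * assumed_deaths_per_pandemic
print(f"Expected deaths per researcher-year: {expected_annual_deaths:,.0f}")  # 2,000
```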

Of course, we don’t act like it. Psychologically, it is incredibly hard to recognize what is “normal” as morally questionable, or even profoundly wrong. If your respected peers are doing the same kind of work, the prestigious scientific journals are publishing your research, and the tenure board is smiling down from above, it’s almost impossible to come to the disturbing and horrifying conclusion that you’re doing something seriously unethical. But if the risks are as severe as Lipsitch and Co. claim (and the benefits as mediocre), then it is difficult to see how working with PPPs could be ethically defensible. What benefit to the world would your work have to provide to justify causing an expected 2,000 deaths each year?

Even putting the first-order ethical debate to one side, extreme caution seems warranted when reasoning about the morality of lab research on PPPs. It is a topic that could create the “perfect storm” for flawed moral reasoning.

Ivermectin, Hydroxychloroquine, and the Dangers of Scientific Preprints

photograph of "In Evidence We Trust" protest sign

There is a new drug of choice among those who have refused to get vaccinated for COVID-19, or are otherwise looking for alternative treatments: ivermectin, an antiparasitic drug that is used primarily in farm animals. The drug recently made headlines in the U.S. after a judge in Ohio ordered a hospital to treat a patient with it, and a number of countries in Latin America and Europe have begun using it, as well. It is not the first time a drug developed for something else entirely has been touted as the new miracle cure for COVID-19: hydroxychloroquine, an anti-malarial, was an early favorite alternative treatment of former president Trump, despite the FDA’s statement that it had no real effect on patients with COVID-19, and indeed could be very dangerous when used improperly. The FDA has recently issued a statement to a similar effect when it comes to ivermectin, warning that the drug can be “highly toxic in humans.”

It is not surprising that there has been continued interest in alternative treatments for COVID-19: given the existence of vaccine skepticism and various surrounding conspiracy theories, people who do not trust the science of vaccinations, for one reason or another, will look for other ways of fighting the disease. What is perhaps surprising is why this particular drug was chosen as the new alternative treatment. There is, after all, seemingly no good reason to think that a horse de-wormer would be effective at killing the coronavirus. So where did this idea come from?

Not, it turns out, from nowhere. As was the case with hydroxychloroquine, the U.S.-based health analytics company Surgisphere produced a study that purported to show that ivermectin was effective at treating COVID-19, albeit in just “a handful of in vitro and observational studies.” The study was not published in any peer-reviewed outlet, but was instead uploaded as a preprint.

A preprint is a “version of a scientific manuscript posted on a public server prior to formal review”: it’s meant to be a way of rapidly disseminating results to the scientific community at large. Preprints can have significant benefits when it comes to getting one’s results out quickly: peer-review can be a lengthy process, and during a global pandemic, time is certainly of the essence. At the same time, there are a number of professional and ethical considerations that surround the use of preprints in the scientific community.

For example, a recent study on preprints released during the pandemic found a “remarkably low publication rate” for sampled papers, with one potential explanation being that “some preprints have lower quality and will not be able to endure peer-reviewing.” Others have cautioned that while the use of preprints has had positive effects in the physical sciences, there is potentially more reason for concern in the medical sciences: given that developments in medical science are typically of much more interest to the general public, “Patients may be exposed to early, unsubstantiated claims relevant to their conditions, while lacking the necessary context in which to interpret [them].” Indeed, this seems to be what happened with regard to alternative treatments for COVID-19, which have been uploaded online amongst an explosion of new preprint studies.

Additional problems arise when it comes to the use of medical preprints in the media. Another recent study found that while it is common practice for online media outlets to link to preprints, those preprints were often framed inconsistently: media outlets often failed to mention that the preprints had not been peer reviewed, instead simply referring to them as “research.” While the authors of the study were encouraged that discussions of preprints in the media could foster “greater awareness of the scientific uncertainty associated with health research findings,” they were again concerned that failing to appropriately frame preprint studies risked misleading readers into thinking that the relevant results were accepted in the scientific community.

So what should we take away from this? We have seen that there are clear benefits to the general practice of publishing scientific preprints online, and that in health crises in particular the rapid dissemination of scientific results can lead to faster progress. At the same time, preprints making claims that are not adequately supported by the evidence can get picked up by members of the general public, as well as the media, who may be primarily concerned with breaking new “scientific discoveries” without properly contextualizing the results or doing their due diligence regarding the reliability of the source. Certainly, then, there is an obligation on the part of media outlets to do better: given that many preprints do not survive peer review, it is important for the media to note, when they do refer to preprint studies, that the results are provisional.

It’s not clear, though, whether highlighting the distinction would make much of a difference in the grand scheme of things. For instance, in response to the FDA’s statement that there is no scientific basis for using ivermectin to treat COVID-19, Kentucky senator Rand Paul stated that it was really a “hatred for Trump” that stood in the way of investigating the drug, and not, say, the fact that the preprint study did not stand up to scientific scrutiny. It seems unlikely that, for someone like Paul, the difference between preprints and peer-reviewed science is a relevant one when it comes to pushing a political narrative.

Nevertheless, a better understanding of the difference between preprints and peer-reviewed science could still be beneficial when helping people make decisions about what information to believe. While some preprints certainly do go on to pass peer review, if the only basis that one has for some seemingly implausible medical claims is a preprint study, it is worth approaching those claims with skepticism.

Medical Challenge Trials: Time to Embrace the Challenge?

photograph of military personnel receiving shot

The development of the COVID-19 vaccines is worthy of celebration. Never has a vaccine for a novel virus been so quickly developed, tested, and rolled out. Despite this success, we could have done much better. In particular, a recent study estimates that by allowing “challenge trials” in the early months of the pandemic, we would have completed the vaccine licensing process between one and eight months faster than we did using streamlined conventional trials. The study also provides a conservative estimate of the years of life that an earlier vaccine rollout would have saved: between 720,000 and 5,760,000. However, whether we should have used challenge trials depends on a number of ethical considerations.

Here is an extraordinary fact: we first genetically sequenced the virus in January 2020. Moderna then developed their RNA vaccine in just two days. But the F.D.A. did not grant the vaccine emergency authorization until late December — almost a year later. Over this period the virus killed approximately 320,000 U.S. citizens. The vast majority of the delay between development and approval was due to the time needed to run the necessary medical trials. Enough data needed to be collected to show the vaccines were effective and, even more importantly, safe.

Here’s how those trials worked. Volunteers from a large pool (for example, 30,420 volunteers in Moderna’s phase three trial) were randomly provided either a vaccine or a placebo. They then went about their lives. Some caught the virus, others didn’t. Researchers, meanwhile, were forced to wait until enough volunteers caught the illness for the results to be statistically valid. The fact that the virus spread so quickly was a blessing in this one respect; it sped up their research considerably.

So-called “challenge trials” are an alternative way to run medical trials. The difference is that in a challenge trial healthy (and informed) volunteers are intentionally infected with the pathogen responsible for the illness researchers want to study. The advantages are that statistically significant results can be found with far fewer volunteers far more quickly. If we vaccinate volunteers and then expose them to the virus, we’ll have a good idea of the vaccine’s effectiveness within days. This means faster licensing, faster deployment of the vaccine, and, therefore, thousands of saved lives.

Challenge trials are generally blocked from proceeding on ethical grounds. Infecting healthy people with a pathogen they might never otherwise be exposed to — a pathogen which might cause them serious or permanent harm or even death — might seem difficult to justify. Some medical practitioners consider it a violation of the Hippocratic oath they have sworn to uphold — “First, do no harm.” Advocates of challenge trials point out that slow, traditional medical trials can cause even greater harm. Hundreds of thousands of lives could likely have been saved had COVID-19 challenge trials been permitted and the various vaccines’ emergency approval occurred months earlier.

Admittedly, challenge trials effectively shift some risk of harm from the public at large to a small group of medical volunteers. Can we really accept a greater risk of harm and death in a small group in order to protect society as a whole? Or are there moral limits to what we can do for the ‘greater good’? Perhaps it is this unequal distribution of burdens and benefits that critics object to as unethical or unjust.

Advocates of challenge trials point out that volunteers consent to these risks. Hence, permitting challenge trials is, fundamentally, simply permitting fully consenting adults to put themselves at risk to save others. We don’t ban healthy adults from running into dangerous water to save drowning swimmers (even though these adults would be risking harm or death). So, the reasoning goes, nor should we ban healthy adults from volunteering in medical trials to save others’ lives.

Of course, if a volunteer is lied to or otherwise misinformed about the risks of a medical trial, their consent to the trial does not make participation ethically permissible. For consent to be ethically meaningful, it must be informed. Volunteers must understand the risks they face and judge them to be acceptable. But making sure that volunteers fully understand the risks involved (including the ‘unknown’ risks) can be difficult. For example, a well-replicated finding from psychology is that people are not very good at understanding the likelihood of very low- (or high-) probability events occurring. We tend to “round down” low probability events to “won’t happen” and “round up” high probability events to “will happen”. A 0.2% probability of death doesn’t seem very different from a 0.1% probability to most of us, even though it’s double the risk.

Informed consent also cannot be obtained from children or those who are mentally incapable of providing it, perhaps due to extreme old age, disability, or illness. So members of these groups cannot participate in challenge trials. This limitation, combined with the fact that younger, healthier people may be more likely to volunteer for challenge trials than their more vulnerable elders, means that the insights we gain from the trial data may not translate well to the broader population. This could weaken the cost-benefit ratio of conducting challenge trials, at least in certain cases.

A further ethical worry about challenge trials is that the poor and the disadvantaged, those with no other options, might be indirectly coerced to take part. If individuals are desperate enough for financial resources, for example to pay for food or shelter, they might take on incredible personal risk to obtain them. This dynamic is called “desperate exchange,” and it must be avoided if challenge trials are to be ethically permissible.

One way to prevent desperate exchanges is to place limits on the financial compensation provided to volunteers, for example by merely covering travel and inconvenience costs. But this solution might seem to threaten the possibility of running challenge trials at all. Who is going to volunteer to put their life at risk for nothing?

There’s some evidence that people would be willing to volunteer even without serious financial compensation. In the case of blood donation, unpaid voluntary systems see high donation rates and higher donor quality than market-based, paid-donation systems such as the U.S.’s. As I write this, 38,659 volunteers from 166 countries have already signed up as challenge trial volunteers with “1 Day Sooner,” a pro-challenge-trial organization focusing on COVID-19 trials. These volunteers expect no monetary compensation, and are primarily motivated by ethical considerations.

The advocates of challenge trials systematically failed to win the argument as COVID-19 spread across the globe in 2020. Medical regulators deemed the ethical concerns too great. But the tide may now be changing. This February, British regulators approved a COVID-19 challenge trial. When time-in-trial equates with lives lost, the promise of challenge trials may prove too strong to ignore.

The Ethics of Self-Citation

image of man in top hat on pedestal with "EGO" sash

In early 2021, the Swiss Academies of Arts and Sciences (SAAS) published an updated set of standards for academic inquiry; among other things, this new “Code of Conduct for Scientific Integrity” aims to encourage high expectations for academic excellence and to “help build a robust culture of scientific integrity that will stand the test of time.” Notably, whereas the Code’s previous version (published in 2008) treated “academic misconduct” simply as a practice based on spreading deceptive misinformation (either intentionally or due to negligence), the new document expands that definition to include a variety of bad habits in academia.

In addition to falsifying or misrepresenting one’s data and committing various forms of plagiarism (one of the most familiar academic sins), the following is a partial list of practices the SAAS will now also consider “academic misconduct”:

  • Failing to adequately consider the expert opinions and theories that make up the current body of knowledge and making incorrect or disparaging statements about divergent opinions and theories;
  • Establishing or supporting journals or platforms lacking proper quality standards;
  • Unjustified and/or selective citation or self-citation;
  • Failing to consider and accept possible harm and risks in connection with research work; and
  • Enabling funders and sponsors to influence the independence of the research methodology or the reporting of research findings.

Going forward, if Swiss academics perform or publish research failing to uphold these standards, they might well find themselves sanctioned or otherwise punished.

To some, these guidelines might seem odd: why, for example, would a researcher attempting to write an academic article not “adequately consider the expert opinions and theories that make up the current body of knowledge” on the relevant topic? Put differently: why would someone seek to contribute to “the current body of knowledge” without knowing that body’s shape?

As Katerina Guba, the director of the Center for Institutional Analysis of Science and Education at the European University at St. Petersburg, explains, “Today, scholars have to publish much more than they did to get an academic position. Intense competition leads to cutting ethical corners apart from the three ‘cardinal sins’ of research conduct — falsification, fabrication and plagiarism.” Given the painful state of the academic job market, researchers can easily find incentives to pad their CVs and puff up their resumes in an attempt to save time and make themselves look better than their peers vying for interviews.

So, let’s talk about self-citation.

In general, self-citation is simply the practice of an academic who cites their own work in later publications they produce. Clearly, this is not necessarily ethically problematic: indeed, in many cases, it might well be required for a researcher to cite themselves in order to be clear about the source of their data, the grounding of their argument, the development of the relevant dialectical exchange, or many other potential reasons — and the SAAS recognizes this. Notice that the new Code warns against “unjustified and/or selective citation or self-citation” — so, when is self-citation unjustified and/or unethical?

Suppose that Moe is applying for a job and lists a series of impressive-sounding awards on his resume; when the hiring manager double-checks Moe’s references, she confirms that Moe did indeed receive the awards of which he boasts. But the manager also learns that one of Moe’s responsibilities at his previous job was selecting the winners of the awards in question — that is to say, Moe gave the awards to himself.

The hiring manager might be suspicious of at least two possibilities regarding Moe’s awards:

  1. It might be the case that Moe didn’t actually deserve the awards and abused his position as “award-giver” to personally profit, or
  2. It might be the case that Moe could have deserved the awards, but ignored other deserving (potentially more-deserving) candidates for the awards that he gave to himself.

Because citation metrics of publications are now a prized commodity among academics, self-citation practices can raise precisely the same worries. Consider the h-index: a score for a researcher’s publication record, defined as the largest number h such that the researcher has h publications that have each been cited at least h times. In short, the h-index claims to offer a handily quantified measurement of how “influential” someone has been on their academic field.

But, as C. Thi Nguyen has pointed out, these sorts of quantifications not only reduce complicated social phenomena (like “influence”) to thinned-out oversimplifications, but they can be gamified or otherwise manipulated by clever agents who know how to play the game in just the right way. Herein lies one of the problems of self-citation: an unscrupulous academic can distort their own h-index (and other such metrics) to make it look artificially larger (and more impressive) by intentionally “awarding themselves” citations, just like Moe granted himself awards in the first scenario above.
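
To make the mechanics concrete, here is a minimal sketch of how an h-index is computed and how self-citation can inflate it. The citation counts below are invented purely for illustration; nothing in this example is drawn from any real researcher’s record.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical publication record: (total citations, of which self-citations) per paper.
papers = [(12, 6), (10, 5), (9, 4), (8, 5), (7, 4), (6, 4), (3, 1), (2, 2)]

counting_self = [total for total, _ in papers]
excluding_self = [total - self_cites for total, self_cites in papers]

print("h-index counting self-citations: ", h_index(counting_self))   # prints 6
print("h-index excluding self-citations:", h_index(excluding_self))  # prints 3
```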

But, perhaps even more problematic than this, self-citation limits the scope of a researcher’s attention when they are purporting to contribute to the wider academic conversation. Suppose that I’m writing an article about some topic and, rather than review the latest literature on the subject, I instead just cite my own articles from several years (or several decades) ago: depending on the topic, it could easily be the case that I am missing important arguments, observations, or data that have been made in the interim period. Just like Moe in the second scenario, I would have ignored other worthy candidates for citation in order to give the attention to myself — and, in this case, the quality of my new article would suffer as a result.

For example, consider a forthcoming article in the Monash Bioethics Review titled “Can ‘Eugenics’ Be Defended?” Co-written by a panel of six authors, many of whom are well-known in their various fields, the 8-page article’s reference list includes a total of 34 citations — 14 of these references (41%) were authored by one or more of the article’s six contributors (and 5 of them are from the lead author, making him the most-cited researcher on the reference list). While the argument of this particular publication is indeed controversial, my present concern is restricted to the article’s form, rather than its contentious content: the exhibited preference to self-cite seems to have led the authors to ignore almost any bioethicists or philosophers of disability who disagree with their (again, extremely controversial) thesis (save for one reference to an interlocutor of this new publication and one citation of a magazine article). While this new piece repeatedly cites questions that Peter Singer (one of the six co-authors) asked in the early 2000s, it fails to cite any philosophers who have spent several decades providing answers to those very questions, thereby reducing the possible value of its purported contributions to the academic discourse. Indeed, self-citation is not the only dysgenic element of this particular publication, but it is one trait that attentive authors should wish to cull from the herd of academic bad habits.

Overall, recent years have seen increased interest among academics in the sociological features of their disciplinary metrics, with several studies and reports being issued about the nature and practice of self-citation (notably, male academics — or at least those without “short, disrupted, or diverse careers” — seem to be far more likely to self-cite, as are those under pressure to meet certain quantified productivity expectations). In response, some have proposed additional metrics to specifically track self-citations, alternate metrics intended to be more balanced, and upending the culture of “curated scorekeeping” altogether. The SAAS’s move to specifically highlight self-citation’s potential as professional malpractice is another attempt to limit self-serving habits that can threaten the credibility of academic claims to knowledge writ large.

Ultimately, much like the increased notice that “p-hacking” has recently received in wider popular culture — and indeed, the similar story we can tell about at least some elements of “fake news” development online — it might be time to have a similarly widespread conversation about how people should and should not use citations.

On “Doing Your Own Research”

photograph of army reserve personnel wearing neck gaiter at covid testing site

In early August, American news outlets began to circulate a surprising headline: neck gaiters — a popular form of face covering used by many to help prevent the spread of COVID-19 — could reportedly increase the infection rate. In general, face masks work by catching respiratory droplets that would otherwise contaminate a virus-carrier’s immediate environment (in much the same way that traditional manners have long prescribed covering your mouth when you sneeze); however, according to the initial report by CBS News, a new study found that the stretchy fabric typically used to make neck gaiters might actually work like a sieve to turn large droplets into smaller, more transmissible ones. Instead of helping to keep people safe from the coronavirus, gaiters might even “be worse than no mask at all.”

The immediate problem with this headline is that it’s not true; but, more generally, the way that this story developed evidences several larger problems for anyone hoping to learn things from the internet.

The neck gaiter story began on August 7th when the journal Science Advances published new research on a measurement test for face mask efficacy. Interested by the widespread use of homemade face-coverings, a team of researchers from Duke University set out to identify an easy, inexpensive method that people could use at home with their cell phones to roughly assess how effective different commonly-available materials might be at blocking respiratory droplets. Importantly, the study was not about the overall efficacy rates of any particular mask, nor was it focused on the length of time that respiratory droplets emitted by mask-wearers stayed in the air (which is why smaller droplets could potentially be more infectious than larger ones); the study was only designed to assess the viability of the cell phone test itself. The observation that the single brand of neck gaiter used in the experiment might be “counterproductive” was an off-hand, untested suggestion in the final paragraph of the study’s “Results” section. Nevertheless, the dramatic-sounding (though misleading) headline exploded across the pages of the internet for weeks; as recently as August 20th, The Today Show was still presenting the untested “result” of the study as if it were a scientific fact.

The ethics of science journalism (and the problems that can arise from sensationalizing and misreporting the results of scientific studies) is a growing concern, but it is particularly salient when the reporting in question pertains to an ongoing global pandemic. While it might be unsurprising that news sites hungry for clicks ran a salacious-though-inaccurate headline, it is far from helpful and, arguably, morally wrong.

Furthermore, the kind of epistemic malpractice entailed by underdeveloped science journalism poses larger concerns for the possibility of credible online investigation more broadly. Although we have surrounded ourselves with technology that allows us to access the internet (and the vast amount of information it contains), it is becoming ever-more difficult to filter out genuinely trustworthy material from the melodramatic noise of websites designed more for attracting attention than disseminating knowledge. As Kenneth Boyd described in an article here last year, the algorithmic underpinnings of internet search engines can lead self-directed researchers into all manner of over-confident mistaken beliefs; this kind of structural issue is only exacerbated when the inputs to those algorithms (the articles and websites themselves) are also problematic.

These sorts of issues cast an important, cautionary light on a growing phenomenon: the credo that one must “Do Your Own Research” in order to be epistemically responsible. Whereas it might initially seem plain that the internet’s easily-accessible informational treasure trove would empower auto-didacts to always (or usually) draw reasonable conclusions about whatever they set their minds to study, the epistemic murkiness of what can actually be found online suggests that reality is more complicated. It is not at all clear that non-expert researchers who are ignorant of a topic can, on their own, justifiably identify trustworthy information (or information sources) about that topic; but, on the other hand, if a researcher does have enough knowledge to judge a claim’s accuracy, then it seems like they don’t need to be researching the topic to begin with!

This is a rough approximation of what philosophers sometimes call “Meno’s Paradox” after its presentation in the Platonic dialogue of that name. The Meno discusses how inquiry works and highlights that uninformed inquirers have no clear way to recognize the correct answer to a question without already knowing something about what they are questioning. While Plato goes on to spin this line of thinking into a creative argument for the innateness of all knowledge (and, by extension, the immortality of the soul!), subsequent thinkers have often taken different approaches to argue that a researcher only needs to have partial knowledge either of the claim they are researching or of the source of the claim they are choosing to trust in order to come to justified conclusions.

Unfortunately, “partial knowledge” solutions have problems of their own. On one hand, human susceptibility to a bevy of psychological biases makes a researcher’s “partial” understanding of a topic a risky foundation for subsequent knowledge claims; it is exceedingly easy, for example, for the person “doing their own research” to be unwittingly led astray by their unconscious prejudices, preconceptions, or the pressures of their social environment. On the other hand, grounding one’s confidence in a testimonial claim on the trustworthiness of the claim’s source seems to (in most cases) simply push the justification problem back a step without really solving much: in much the same way that a non-expert cannot make a reasonable judgment about a proposition, that same non-expert also can’t, all by themselves, determine who can make such a judgment.

So, what can the epistemically responsible person do online?

First, we must cultivate an attitude of epistemic humility (of the sort summarized by the famous Socratic comment “I know that I know nothing”) — something which often requires us to admit not only that we don’t know things, but that we often can’t know things without the help of teachers or other subject matter experts doing the important work of filtering the bad sources of information away from the good ones. All too often, “doing your own research” functionally reduces to a triggering of the confirmation bias and lasts only as long as it takes to find a few posts or videos that satisfy what a person was already thinking in the first place (regardless of whether those posts/videos are themselves worthy of being believed). If we instead work to remember our own intellectual limitations, both about specific subjects and the process of inquiry writ large, we can develop a welcoming attitude to the epistemic assistance offered by others.

Secondly, we must maintain an attitude of suspicion about bold claims to knowledge, especially in an environment like the internet. It is a small step from skepticism about our own capacities for inquiry and understanding to skepticism about that of others, particularly when we have plenty of independent evidence that many of the most accessible or popular voices online are motivated by concerns other than the truth. Virtuous researchers have to focus on identifying and cultivating relationships with knowledgeable guides (who can range from individuals to their writings to the institutions they create) on whom they can rely when it comes time to ask questions.

Together, these two points lead to a third: we must be patient researchers. Developing epistemic virtues like humility and cultivating relationships with experts that can overcome rational skepticism — in short, creating an intellectually vibrant community — takes a considerable amount of effort and time. After a while, we can come to recognize trustworthy informational authorities as “the ones who tend to be right, more often than not” even if we ourselves have little understanding of the technical fields of those experts.

It’s worth noting here, too, that experts can sometimes be wrong and nevertheless still be experts! Even specialists continue to learn and grow in their own understanding of their chosen fields; this sometimes produces confident assertions from experts that later turn out to be wrong. So, for example, when the Surgeon General urged people in February to not wear face masks in public (based on then-current assumptions about the purportedly low risk of asymptomatic patients), it made sense at the time; the fact that those assumptions later proved to be false (at which point the medical community, including the epistemically humble Surgeon General, then recommended widespread face mask usage) is simply a demonstration of the learning/research process at work. On the flip side, choosing to still cite the outdated February recommendation simply because you disagree with face mask mandates in August exemplifies a lack of epistemic virtue.

Put differently, briefly using a search engine to find a simple answer to a complex question is not “doing your own research” because it’s not research. Research is somewhere between an academic technique and a vocational aspiration: it’s a practice that can be done with varying degrees of competence and it takes training to develop the skill to do it well. On this view, an “expert” is simply someone who has become particularly good at this art. Education, then, is not simply a matter of “memorizing facts,” but rather a training regimen in performing the project of inquiry within a field. This is not easy, requires practice, and still often goes badly when done in isolation — which is why academic researchers rely so heavily on their peers to review, critique, and verify their discoveries and ideas before assigning them institutional confidence. Unfortunately, this complicated process is far less sexy (and far slower) than a scandalous-sounding daily headline that oversimplifies data into an attractive turn of phrase.

So, poorly-communicated science journalism not only undermines our epistemic community by directly misinforming readers, but also by perpetuating the fiction that anyone is an epistemic island unto themselves. Good reporting must work to contextualize information within broader conversations (and, of course, get the information right in the first place).

Please don’t misunderstand me: this isn’t meant to be some elitist screed about how “only the learned can truly know stuff, therefore smart people with fancy degrees (or something) are best.” If degrees are useful credentials at all (a debatable topic for a different article!), they are so primarily as proof that a person has put in considerable practice to become a good (and trustworthy) researcher. Nevertheless, the Meno Paradox and the dangers of cognitive biases remain problems for all humans, and we need each other to work together to overcome our epistemic limitations. In short: we would all benefit from a flourishing epistemic community.

And if we have to sacrifice a few splashy headlines to get there, so much the better.

Back to School: America’s Uncontrolled and Unethical Experiment

photograph of middle school science classroom

As of this writing, several school districts in the United States have already reopened at some level, but most of the nation’s 40 million school-age children are scheduled to return sometime from mid to late August. One major argument for reopening is that it allows parents to return to work (assuming there is a job to go to) and help rebuild America’s faltering economy. The American Academy of Pediatrics has also supported this back-to-school movement, though its support centers on the emotional and social needs of students that can be better met by returning to school.

There is, however, one argument against going back to school that few consider: going back to school amid a pandemic is America’s uncontrolled experiment using our children as the sample. Even the nation’s top infectious disease expert, Anthony Fauci, told teachers in a recent interview: “You’ll be part of the experiment in reopening schools.” This experiment is neither scientific nor ethical.

We scientists live in a world of unknowns, and we traverse that world through the use of the scientific method and research ethics. The controlled scientific experiment goes like this: (1) a research question is formulated, with the researcher making the best “guess” as to what to expect from the data to be collected, based on what is already known about the topic; (2) a sample of people is identified that will participate in the experiment with as little risk to them as possible; (3) variables are identified which, as much as reasonably possible, are controlled for; (4) after any risks are considered and consent to participate is obtained from the sample members, the experiment is run; (5) the data are collected; (6) the data are analyzed; and (7) conclusions are drawn. Through this controlled and ethical study, hopefully we find some answers that can be used to solve the problem at hand. Of utmost importance, however, is that these steps must be accomplished within the boundaries of research ethics. In the field of healthcare, these ethical considerations are typically four in number.

The four basic ethical considerations when doing research in public health and healthcare in general are (1) autonomy, the power to give informed, uncoerced, freely given consent to participate in the research; (2) justice, assuring a fair distribution of risks, benefits, and resources over participants; (3) beneficence, ensuring that the research does good for participants and society; and (4) nonmaleficence, doing no harm and keeping participants out of harmful situations. These ethical considerations came about after WWII, when atrocities of uncontrolled experiments on human subjects by the Nazi regime were discovered. They now serve as guides in designing ethical research. By carefully adhering to the scientific method and ethical principles of research, controlled experiments can be carried out.

Unfortunately, none of these guidelines are being met in the uncontrolled experiment America is about to run on its children when they go back to school this fall. The assumption is that getting students back in school will help solve the economic problem as well as the social and psychological problems the nation’s children are facing. These are important problems, and there are ethical ways of addressing them; the uncontrolled experiment on which America is embarking is not one of them.

If we compare this uncontrolled experiment with an ethically sound controlled experiment, we can see the many pitfalls, ones that may have dire consequences for all involved.

First of all, there is no research question. There is only a hope that things go OK and not too many people get hurt. We don’t have enough information about the virus and its effect on children to even formulate a research question. What are we looking for and hoping to find? In essence, we are saying, “Let’s reopen schools, get the economy going, and help meet students’ social and emotional needs,” implying that this is the only avenue open to us to accomplish these goals.

Secondly, the age, race, and gender of students, teachers, school staff, and bus drivers, along with their underlying medical conditions, are just some of the many variables that are difficult, if not impossible, to control for in the school environment. Even when good-faith attempts are made to control for some of these variables, several ethical problems emerge.

One example is school transportation. The average school bus occupancy is 56; if social distancing without masking is practiced, only 6 students can ride the bus; if masking alone is practiced, only 28 can ride. It costs districts about $1,000 per pupil per year to transport students to and from school. The additional cost of adding routes and making more trips to get students to school under either masking or social-distancing rules will strain precious resources that could instead be spent equipping students for remote learning.
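To put rough numbers on that strain, here is a minimal back-of-the-envelope sketch in Python. The district ridership figure and the assumption that transportation cost scales with the number of trips are hypothetical; only the capacity figures and the $1,000 baseline come from the paragraph above.

```python
# A crude illustration (not from the article) of how the quoted capacity
# limits translate into extra bus trips and cost under distancing or masking.
# The district ridership and the "cost scales with trips" assumption are
# hypothetical; the capacities and $1,000 baseline come from the text above.

NORMAL_OCCUPANCY = 56      # average riders per bus trip (from the article)
MASKED_CAPACITY = 28       # riders per trip with masking only
DISTANCED_CAPACITY = 6     # riders per trip with distancing, no masks

STUDENTS_TRANSPORTED = 5_600       # hypothetical district ridership
BASELINE_COST_PER_PUPIL = 1_000    # dollars per pupil per year (from the article)

def trips_needed(capacity: int, riders: int = STUDENTS_TRANSPORTED) -> int:
    """Number of one-way trips needed to move all riders at a given capacity."""
    return -(-riders // capacity)  # ceiling division

baseline_trips = trips_needed(NORMAL_OCCUPANCY)
for label, cap in [("masking only", MASKED_CAPACITY), ("distancing", DISTANCED_CAPACITY)]:
    trips = trips_needed(cap)
    multiplier = trips / baseline_trips
    # If cost scales roughly with trips run, per-pupil cost scales the same way.
    est_cost = BASELINE_COST_PER_PUPIL * multiplier
    print(f"{label}: {trips} trips (~{multiplier:.0f}x), ~${est_cost:,.0f} per pupil per year")
```

Even on these crude assumptions, distancing without masks implies roughly an order of magnitude more trips than normal operations, which is the strain on resources described above.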

Additionally, many states have regulations mandating that only students who live beyond a one-mile radius of the school they attend may ride a bus. Others must walk, ride their bikes, or use public or private transportation. Assuming a family can afford public transportation or has a car, lives in a neighborhood that is safe for walking, and has weather that cooperates, these options work. However, marginalized children who live within this one-mile radius (and are thus not candidates for school transportation) may be further marginalized, kept from the emotional and social contacts they need and potentially missing vital instructional activities. These concerns are further complicated when we think about special-needs students, whose medical vulnerabilities might put them at risk in these new school environments.

Thirdly, the sample used (children) is a protected one. The Office for Human Research Protections (OHRP) identifies several protected populations that deserve special consideration when they are involved in research on humans. Pregnant women, prisoners, those with diminished cognitive abilities, and children are a few examples. Extra precautions must be taken to ensure these subjects are not simply being used without protection from the specific harms that may come. Children are not mature enough to make their own decisions about whether they want to participate in a research project; they seldom, if ever, are even allowed to make their own medical decisions. Children have no say in whether they want to go back to school amid a pandemic projected to have taken the lives of more than 180,000 people in our nation by the end of August. We are sending this protected group back to school blindly, with few safety precautions. We also know that when schools were closed statewide during March through May, there was a temporal association with decreased COVID-19-related deaths in those states.

Fourthly, how will we keep the participants (children, faculty and staff, bus drivers) from harm? Masking and social distancing can be practiced at school; however, some age groups will be better at that than others. The benefits and risks involved are not spread evenly over the sample of students, and students are not the only ones at risk: teachers are, as well.

Education Week recently reported that as many as 1.5 million public school teachers are at higher risk of contracting COVID-19 due to their underlying health problems. The research on school staff vulnerability is sparse, but, given the sheer number of people involved, many staff members in a building of several hundred children are at high risk as well. Children do get COVID-19, and with 5.5 million children suffering from asthma alone, this could be a disaster waiting to happen. When race is taken into account, African-American children are 2.5 times as likely to contract COVID-19 as Caucasian children, and American Indian and Hispanic children are 1.5 times as likely. Schools may be breeding grounds for transmitting the virus to these vulnerable populations. Children carry more of the COVID-19 virus in their noses and throats than adults do; they may not get the disease as easily as adults, but they transmit it just as easily.

Do the benefits of returning to school (and there are many) outweigh the associated costs of spreading the disease?

There are many reasons other than academic ones for children to be in school. We know that at least 14 million children do not get enough to eat on a daily basis, and this varies by race: 30% of these children are Black, 25% are Hispanic, and less than 10% are Caucasian. Additionally, when children are home for extended periods of time with adults, the probability of child abuse increases. Yet this summer, schools found a way to deliver lunches, if not breakfasts as well, to students who were in need of that service.

Some municipal police departments and county sheriffs have instituted “Drop By” programs. In these programs, officers make irregular, unannounced visits to homes where abuse may be more likely to occur, to see how things are going and whether anyone needs anything. During these visits, law enforcement officers are able to get a feel for any evidence of domestic violence in a non-threatening and non-accusatory manner.

School attendance helps mitigate the potential problems of food insecurity and abuse. But, as programs like those outlined above show, there are other ways to ameliorate these injustices to our children. A reallocation of dollars is needed, along with creative ways to supply the services that children and families need during this pandemic. Sending kids back to school under the current implementation is not the solution; the potential nonmonetary costs are not worth the benefits that may accrue from returning to school under present conditions.

Eventually, we will have to come to terms with the outcomes of this uncontrolled experiment. Will we have learned that it was a bad idea? That there should have been more planning to protect the safety and well-being of all at school? That we should have controlled for transportation safety? That dollars should have been reallocated for technology and given to those without it for remote learning? That home visits by school personnel to aid those experiencing difficulty learning remotely would have been worth the money?

Is America prepared to deal with the outcomes of this uncontrolled experiment in which children are the sample? Neither science nor the ethics of research accepts the premise of “we’ll do it and then see what happens.” But uncontrolled experiments do just that, at the peril of their participants. America sits poised to conduct such a trial.

In Search of an AI Research Code of Conduct

image of divided brain; fluid on one side, circuitry on the other

The evolution of an entire industry devoted to artificial intelligence has presented a need to develop ethical codes of conduct. Ethical concerns about privacy, transparency, and the political and social effects of AI abound. But a recent study from the University of Oxford suggests that borrowing from other fields like medical ethics to refine an AI code of conduct is problematic. Developing an AI ethics means that we must be prepared to predict and address ethical problems and concerns that are entirely new, and this makes it a significant ethical project. How we should proceed in this field is itself a dilemma: should we take a top-down, principled approach or a bottom-up, experimental approach?

AI ethics can concern itself with everything from the development of intelligent robots to machine learning, predictive analytics, and the algorithms behind social media websites. This is why it is such an expansive area: some focus on the ethics of how we should treat artificial intelligence, others on how we can protect privacy, and still others on how the AI behind social media platforms, including AI capable of generating and distributing ‘fake news,’ can influence the political process. In response, many have focused on generating a particular set of principles to guide AI researchers, in many cases borrowing from codes governing other fields, like medical ethics.

The four core principles of medical ethics are respect for patient autonomy, beneficence, non-maleficence, and justice. Essentially these principles hold that one should act in the best interests of a patient while avoiding harm and ensuring a fair distribution of medical services. But the recent Oxford study by Brent Mittelstadt argues that the analogical reasoning relating the medical field to the AI field is flawed. There are significant differences between medicine and AI research which make these principles unhelpful or irrelevant for AI.

The field of medicine is more centrally focused on promoting health and has a long history of attending to the fiduciary duties of those in the profession toward patients. AI research, by contrast, is less homogeneous, with researchers in both the public and private sectors pursuing different goals and owing duties to different bodies. AI developers, for instance, do not commit to public service in the way that a doctor does, as they may be responsible only to shareholders. As the study notes, “The fundamental aims of developers, users, and affected parties do not necessarily align.”

In her book Towards a Code of Ethics for Artificial Intelligence, Paula Boddington highlights some of the challenges of establishing a code of ethics for the field. For instance, those working with AI are not required to receive accreditation from any professional body. In fact,

“some self-taught, technically competent person, or a few members of a small scale start up, could be sitting in their mother’s basement right now dreaming up all sorts of powerful AI…Combatting any ethical problems with such ‘wild’ AI is one of the major challenges.”

Additionally, there are mixed attitudes towards AI and its future potential. Boddington notes a divide in opinion: the West tends to be more alarmist, while nations like Japan and Korea are more likely to be open and accepting.

Given these challenges, some have questioned whether an abstract ethical code is the best response. High-level principles abstract enough to cover the entire field will be too vague to be action-guiding, and because of the variety of sub-fields and interests involved, oversight will be difficult. According to Edd Gent,

“AI systems are…created by large interdisciplinary teams in multiple stages of development and deployment, which makes tracking the ethical implications of an individual’s decisions almost impossible, hampering our ability to create standards to guide those choices.”

The situation is not that different from work done in the sciences. Philosopher of science Heather Douglas has argued, for instance, that while ethical codes and ethical review boards can be helpful, constant oversight is impractical, and that only scientists can fully appreciate the potential implications of their work. The same could be true of AI researchers. A code of ethical principles will not replace ethical decision-making; in fact, such codes can be morally problematic. As Boddington argues, “The very idea of parceling ethics into a formal ‘code’ can be dangerous.” This is because many ethical problems are going to be new and unique, so ethical choice cannot be a matter of mere compliance. Following ethical codes can lead to complacency as one seeks to check certain boxes and avoid certain penalties without taking the time to critically examine what may be new and unprecedented ethical issues.

What this suggests is that any code of ethics can only be suggestive: it offers abstract principles that can guide AI researchers, but ultimately the researchers themselves will have to make individual ethical judgments. Thus, part of the moral project of developing an AI ethics is going to be the development of good moral judgment by those in the field. Philosopher John Dewey noted this relationship between principles and individual judgment, arguing:

“Principles exist as hypotheses with which to experiment…There is a long record of past experimentation in conduct, and there are cumulative, verifications which give many principles a well earned prestige…But social situations alter; and it is also foolish not to observe how old principles actually work under new conditions, and not to modify them so that they will be more effectual instruments in judging new cases.”

This may mirror the thinking of Brent Mittelstadt, who argues for a bottom-up approach to AI ethics in which sub-fields develop ethical principles in response to challenging novel cases. Boddington, for instance, notes the importance of equipping researchers and professionals with the ethical skills to make nuanced decisions in context; they must be able to make contextualized interpretations of rules and to judge when rules are no longer appropriate. Still, such an approach has its challenges: researchers must be aware of the ethical implications of their work, and there still needs to be some oversight.

Part of the solution to this is public input. We as a public need to make sure that corporations, researchers, and governments are aware of our ethical concerns. Boddington recommends that such input reflect a diversity of opinion, thinking style, and experience. This includes not only those who may be affected by AI, but also professional experts outside the AI field, like lawyers, economists, and social scientists, and even those who have no interest in the world of AI at all, in order to maintain an outside perspective.

Codes of ethics in AI research will continue to develop. The dilemma we face as a society is what such a code should mean, and particularly whether or not it will be institutionalized and enforced. If we adopt a bottom-up approach, then such codes will likely serve only as guidance, or multiple codes will need to be adopted for different areas. If a more principled, top-down approach is adopted, then there will be additional challenges in dealing with the novel and with oversight. Either way, the public will have a role to play in ensuring that its concerns are heard.

Some Ethical Problems with Footnotes

scan of appendix title page from 1978 report

I start this article with a frank confession: I love footnotes; I do not like endnotes.
Grammatical quarrels over the importance of the Oxford comma, the propriety of the singular “they,” and whether or not sentences can rightly end with a preposition have all, in their own ways and for their own reasons, broken out of the ivory tower. However, the question of whether a piece of writing is better served with footnotes (at the bottom of each page) or endnotes (collected at the end of the document) is a dispute which, for now, remains distinctly scholastic.1 Although, as a matter of personal preference, I am selfishly partial to footnotes, I must admit – and will hereafter argue – that, in some situations, endnotes can be the most ethical option for accomplishing a writer’s goal; in others, eliminating the note entirely is the best option.
As Elisabeth Camp explains in a TED Talk from 2017, just like a variety of rhetorical functions in normal speech, footnotes typically do four things for a text:

  1. they offer a quick method for citing references;
  2. they supplement the footnoted sentence with additional information that, though interesting, might not be directly relevant to the essay as a whole;
  3. they evaluate the point made by the footnoted sentence with quick additional commentary or clarification; and
  4. they extend certain thoughts within the essay’s body in speculative directions without trying to argue firmly for particular conclusions.

For each of these functions (though, arguably less so for the matter of citation), the appositive commentary is most accessible when directly available on the same page as the sentence to which it is attached; requiring a reader to turn multiple pages (rather than simply flicking their eyes to the bottom of the current page) to find the note erects a barrier that, in all likelihood, leads to many endnotes going unread. As such, one might argue that if notes are to be used, then they should be easily usable and, in this regard, footnotes are better than endnotes.
However, this assumes something important about how an audience is accessing a piece of writing: as Nick Byrd has pointed out, readers who rely on text-to-speech software are often presented with an unusual barrier precisely because of footnotes, when their computer program fails to distinguish between text in the main body of the essay and text elsewhere. Imagine trying to read this page from top to bottom with no attention to whether some portions are notes or not:

(From The Genesis of Yogācāra-Vijñānavāda: Responses and Reflections by Lambert Schmithausen; thanks to Bryce Huebner for the example)
Although Microsoft Office offers features for managing the flow of its screen reader for Word document files, the fact that many (if not most) articles and books are available primarily in .pdf or .epub formats means that, for many readers, heavily footnoted texts are extremely difficult to read.
Given this, two solutions seem clear:

  1. Improve text-to-speech programs (and the various other technical apparatuses on which they rely, such as optical character recognition algorithms) to accommodate heavily footnoted documents.
  2. Diminish the practice of footnoting, perhaps by switching to the already-standardized option of endnoting.

And, since (1) is far easier said than done, (2) may be the most ethical option in the short term, given concerns about accessibility.
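To get a feel for why (1) is harder than it sounds, here is a minimal sketch, in Python, of the kind of heuristic such a tool might use to separate footnotes from body text in text extracted from a PDF. The pattern is hypothetical, and its obvious failure modes hint at the real difficulty.

```python
import re

# A minimal sketch (not a production tool) of what option (1) might involve:
# heuristically separating footnote lines from body text in text extracted
# from a PDF, so a text-to-speech pipeline can skip or defer the notes.
# The pattern below -- "a line that starts with a small number followed by
# text" -- is a hypothetical heuristic and will misfire on numbered lists
# and page numbers, which is part of why (1) is easier said than done.

FOOTNOTE_LINE = re.compile(r"^\s*\d{1,4}\s+\S")

def split_body_and_notes(extracted_text: str) -> tuple[str, str]:
    """Split extracted page text into (body, notes) using a crude heuristic."""
    body_lines, note_lines = [], []
    for line in extracted_text.splitlines():
        (note_lines if FOOTNOTE_LINE.match(line) else body_lines).append(line)
    return "\n".join(body_lines), "\n".join(note_lines)

page = """The argument continues in the main text of the page.
1 See the earlier discussion of this point.
2 A qualification that most listeners can safely skip."""

body, notes = split_body_and_notes(page)
print(body)   # read aloud first
print(notes)  # read aloud afterwards, or on request
```

A real tool would also need to handle multi-line notes, running headers, and PDFs with no reliable line structure at all, which is why (1) is a long-term project rather than a quick fix.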
Technically, though, there is at least one more option immediately implementable:
3. Reduce (or functionally eliminate) current academic notation practices altogether.
While it may be true that authors like Vladimir Nabokov, David Foster Wallace, Susanna Clarke, and Mark Z. Danielewski (among plenty of others) have used footnotes to great storytelling effect in their fiction, the genre of the academic text is something quite different. Far less concerned with “world-building” or “scene-setting,” an academic book or article, in general, presents a sustained argument about, or consideration of, a focused topic – something that, arguably, is not well-served by interruptive notation practices, however clever or interesting they might be. Recalling three of Camp’s four notational uses mentioned above, if an author wishes to provide supplementation, evaluation, or extension of the material discussed in a text, then that may either need to be incorporated into the body of the text proper or reserved for a separate text entirely.
Consider the note attached to the first paragraph of this very article – though the information it contains is interesting (and, arguably, important for the main argument of this essay), it could potentially be either deleted or incorporated into the source paragraph without much difficulty. Although this might reduce the “augmentative beauty” of the wry textual aside, it could (outside of unusual situations such as this one where a footnote functions as a recursive demonstration of its source essay’s thesis) make for more streamlined pieces of writing.
But what of Camp’s first function for footnotes: citation? Certainly, giving credit fairly for ideas found elsewhere is a crucial element of honest academic writing, but footnotes are not required to accomplish this, as anyone familiar with parenthetical citations can attest (nor, indeed, are endnotes necessary either). Consider the caption to the above image of a heavily footnoted academic text (as of page 365, the author is already up to note 1663); anyone interested in the source material (both objectively about the text itself and subjectively regarding how I, personally, learned of it) can discover this information without recourse to a foot- or endnote. And though this is a crude example (buttressed by the facility of hypertext links), it is far from an unusual one.
Moreover, introducing constraints on our citation practices might well serve to limit certain unusual abuses that can occur within the system of academic publishing as it stands. For one, concerns about intellectual grandstanding already abound in academia; packed reference lists are one way that this manifests. As Camp describes in her presentation,

“Citations also accumulate authority; they bring authority to the author. They say ‘Hey! Look at me! I know who to cite! I know the right people to pay attention to; that means I’m an authority – you should listen to what I have to say.’…Once you’ve established that you are in the cognoscenti – that you belong, that you have the authority to speak by doing a lot of citation – that, then, puts you in a position to use that in interesting kinds of ways.”

Rather than using citations simply to give credit where it is due, researchers can sometimes cite sources to gain intellectual “street cred” (“library-aisle cred”?) for themselves – a practice particularly easy in the age of the online database and particularly well-served by footnotes which, even if left unread, will still lend an impressive air to a text whose pages are packed with them. And, given that so-called bibliometric data (which tracks how and how frequently a researcher’s work is cited) is becoming ever-more important for early-career academics, “doing a lot of citation” can also increasingly mean simply citing oneself or one’s peers.
Perhaps the most problematic element of citation abuse, however, stems from the combination of easily accessed digital databases with lax (or over-taxed) researchers; as Ole Bjørn Rekdal has demonstrated, “academic urban legends” – such as the false belief that spinach is a good source of iron or that sheep are anatomically incapable of swimming – often spread through the literature, and then through society, because researchers fail to double-check their sources. Much like a game of telephone, sloppy citation practices allow mistakes to survive within an institutionally approved environment that is, in theory, designed to squash them. And while sustaining silly stories about farm animals is one thing, when errors spread unchecked in a way that ultimately influences demonstrably harmful policies – as in the case of a 101-word paragraph, cited hundreds of times since its publication in a 1980 issue of the New England Journal of Medicine, which (in part) laid the groundwork for today’s opioid abuse crisis – the ethics of citations become sharply important.
All of this is to say: our general love for academic notational practices, and my personal affinity for footnotes, are not neutral positions and deserve to be, themselves, analyzed. In matters both epistemic and ethical, those who care about the accessibility and the accuracy of a text would do well to consider what role that text’s notes are playing – regardless of their location on a given page.
 
1  Although there have been a few articles written in recent years about the value of notes in general, the consistent point of each has been to lament a perceived downturn amongst the general attitude regarding public disinformation (with the thought that notes of some kind could help to combat this). None seem to specifically address the need for a particular location of notes within a text.

Forbidden Knowledge in Scientific Research

closeup photograph of lock on gate with iron chain

It is no secret that science has the potential to have a profound effect on society. This is often why scientific results can be so ethically controversial. For instance, researchers have recently warned of the ethical problems associated with scientists growing lumps of human brain tissue in the laboratory. The blobs of brain tissue grown from stem cells developed spontaneous brain waves like those found in premature babies. The hope is that the study offers the potential to better understand neurological disorders like Alzheimer’s, but it also raises a host of ethical worries concerning the possibility that this brain tissue could reach sentience. In other news, this week a publication in the journal JAMA Pediatrics ignited controversy by reporting a supposed link between fluoride exposure and IQ scores in young children. In addition to several experts questioning the results of the study itself, there is also concern about the potential effect this could have on the debate over the use of fluoride in the water supply; anti-fluoride activists have already jumped on the study to defend their cause. Scientific findings have an enormous potential to dramatically affect our lives. This raises an ethical issue: should certain topics, owing to these ethical concerns, be off-limits for scientific study?

This question is studied in both science and philosophy, and is sometimes referred to as the problem of forbidden knowledge. The problem can include issues of experimental method and whether proper ethical protocols are followed (certain knowledge may be forbidden if obtaining it requires human experimentation), but it can also include the impact that the discovery or dissemination of certain kinds of knowledge could have on society. For example, a recent study found that girls and boys are equally good at mathematics and that children’s brains function similarly regardless of gender. However, there have been several studies going back decades which tried to explain differences between mathematical abilities in boys and girls in terms of biological differences. Such studies have the possibility of reinforcing gender roles and potentially justifying them as biologically determined. This has the potential to spill over into social interactions. For instance, Helen Longino notes that such findings could lead to lower priority being placed on encouraging women to enter math and science.

So, such studies have the potential to impact society, which is an ethical concern, but is this reason enough to make them forbidden? Not necessarily. The bigger problem involves how adequate these findings are, the concern that they could be incorrect, and what society is to do about that until correct findings are published. For example, in the case of math testing, it is not that difficult to find significant correlations between variables, but the limits of those correlations, and a study’s limited ability to identify causal factors, are often lost on the public. There are also methodological problems: some standardized tests rely on male-centric questions that can skew results, and different kinds of tests, and different strategies for preparing for them, can also distort our findings. So even when correlations are found and there are no major flaws in the assumptions of the study, the results may not be very generalizable. In the meantime, such findings, even if they are corrected over time, can create stereotypes in the public that are hard to get rid of.
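To make that point concrete, here is a minimal simulation sketch; the data are invented for illustration and are not from any study discussed here. It shows how a “statistically significant” difference between two large groups can coexist with an effect so small it tells us almost nothing about any individual.

```python
import numpy as np
from scipy import stats

# Hypothetical data illustrating why "statistically significant" does not
# mean "large" or "generalizable": with a big enough sample, even a trivial
# true difference produces a tiny p-value.

rng = np.random.default_rng(0)
n = 100_000                                              # large sample per group
group_a = rng.normal(loc=500.0, scale=100.0, size=n)     # simulated test scores
group_b = rng.normal(loc=502.0, scale=100.0, size=n)     # true gap of 2 points on a 100-point spread

t_stat, p_value = stats.ttest_ind(group_a, group_b)
cohens_d = (group_b.mean() - group_a.mean()) / np.sqrt(
    (group_a.var(ddof=1) + group_b.var(ddof=1)) / 2
)

print(f"p-value: {p_value:.2e}")      # "significant" by conventional thresholds
print(f"Cohen's d: {cohens_d:.3f}")   # yet the effect size is negligible (~0.02)
```

The p-value alone, in other words, says nothing about whether a finding is large enough, or stable enough, to generalize from.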

Because of these concerns, some philosophers argue either that certain kinds of questions should be banned from study, or that studies should avoid trying to explain differences in abilities and outcomes in terms of race or sex. For instance, Janet Kourany argues that scientists have moral responsibilities to the public and that they should thus conduct themselves according to egalitarian standards. If a scientist wants to investigate the differences between racial and gender groups, they should seek to explain these differences without assuming that they are biologically determined.

In one of her examples, she discusses studying differences in the incidence of domestic violence between white and black communities. A scientist should highlight the similarities in domestic violence across white and black communities and seek to explain dissimilarities in terms of social factors like racism or poverty. On a stance like this, research that appeals to racial differences to explain differences in rates of domestic violence would constitute forbidden knowledge. Only if these alternative egalitarian explanations empirically fail can a scientist then choose to explore race as a possible explanation of differences between communities. Proceeding this way avoids perpetuating a possibly empirically flawed account suggesting that black people might be more violent than other ethnic groups.

She points out that the alternative risks keeping stereotypes alive even while scientists slowly prove them wrong. Just as in the case of studying mathematical differences, the slow settlement of opinion within the scientific community leaves society free to entertain stereotypes as “scientifically plausible” and to adopt potentially harmful policies in the meantime. In his research on the matter, Philip Kitcher notes that we are susceptible to cognitive asymmetries: it takes far less empirical evidence to maintain stereotypical beliefs than it takes to get rid of them. This is why studying the truth of such stereotypes can be so problematic.

These types of cases seem to offer significant support for labeling particular lines of scientific inquiry forbidden. But the issue is more complicated. First, telling scientists what they should and should not study raises concerns over freedom of speech and freedom of research. We already acknowledge limits on research on the basis of ethical concerns, but this represents a different kind of restriction. One might claim that so long as science is publicly funded, there are reasonable, democratically justified limits on research, but the precise boundaries of such restrictions will prove difficult to identify.

Secondly, and perhaps more importantly, such a policy has the potential to exacerbate the problem. According to Kitcher,

“In a world where (for example) research into race differences in I.Q. is banned, the residues of belief in the inferiority of the members of certain races are reinforced by the idea that official ideology has stepped in to conceal an uncomfortable truth. Prejudice can be buttressed as those who opposed the ban proclaim themselves to be the gallant heirs of Galileo.”

In other words, one reaction to such bans on forbidden knowledge, so long as our own cognitive asymmetries are unknown to us, will be to object that this is an undue limitation on free speech for the sake of politics. In the meantime, those who push for such research can become martyrs, and censoring them may only serve to draw more attention to their cause.

This obviously presents us with an ethical dilemma. Given that there are scientific research projects that could have a potentially harmful effect on society, whether the science involved is adequate or not, is it wise to ban such projects as forbidden knowledge? There are reasons to say yes, but implementing such bans may cause more harm or draw more public attention to the very issues in question. Banning research on the development of brain tissue from stem cells, for example, may be wise, but it may also cause such research to move to another country with more relaxed ethical standards, meaning that the potential harms could be much worse. These issues surrounding how science and society relate are likely only going to be resolved with greater public education and open discussion about what ethical responsibilities we think scientists should have.

The Problem with “Google-Research”

photograph of computer screen with empty Google search bar

If you have a question, chances are the internet has answers: research these days tends to start with plugging a question into Google, browsing the results on the first (and, if you’re really desperate, second) page, and going from there. If you’ve found a source that you trust, you might go to the relevant site and call it a day; if you’re more dedicated, you might double-check your source against others from your search results, just to make sure they say the same thing. This is not the most robust kind of research – that might involve cracking a book or talking to an expert – but we often consider it good enough. Call this kind of research Google-research.

Consider an example of Google-researching in action. When doing research for my previous article – Permalancing and What it Means for Work – I needed to get a sense of what the state of freelancing was like in America. Some quick Googling turned up a bunch of results, the following being a representative sample:

‘Permalancing’ Is The New Self-Employment Trend You’ll Be Seeing Everywhere

More Millennials want freelance careers instead of working full-time

Freelance Economy Continues to Roar

Majority of U.S. Workers Will be Freelancers by 2027, Report Says

New 5th Annual “Freelancing in America” Study Finds That the U.S. Freelance Workforce, Now 56.7 Million People, Grew 3.7 Million Since 2014

While not everyone’s Googling will return exactly the same results, you’ll probably be presented with a similar set of headlines if you search for the terms “freelance” and “America”. The picture that’s painted by my results is one in which freelance work in America is booming, and welcome: not only do “more millennials want freelance careers,” but the freelance economy is currently “roaring,” growing by millions of people over the course of only a few years. If I were simply curious about the state of freelancing in America, or satisfied with the widespread agreement among my results, then I would probably have been happy to accept the results of my Google-researching, which tell me that freelancing in America is not only healthy but thriving. I could, of course, have gone the extra mile and tried to consult an expert (perhaps I could have found an economist at my university to talk to). But I had stuff to do, and deadlines to meet, so it was tempting to take these results at face value.

While Google-researching has become a popular way to do one’s research (whenever I ask my students how they would figure out the answer to basically any question, for example, their first response is invariably that they Google it), there are a number of ways that it can lead one astray.

Consider my freelancing example again: while the above headlines generally agree with each other, there are reasons to be worried about whether they are conveying information that’s actually true. One problem is that all of the above articles summarize the results of the same study: the “Freelancing in America” study, mentioned explicitly in the last headline. A little more investigating reveals some troubling information about the study: in addition to concerns I raised in my previous article – including concerns that the study glosses over disparities in freelance incomes and fails to distinguish between the earning potentials of, and differences in the number of jobs across, different types of freelance work – the study itself was commissioned by the website Upwork, which describes itself as a “global freelancing platform where businesses and independent professionals connect and collaborate.” Such a site, one would think, has a vested interest in presenting the state of freelancing as positively as possible, and so we should at the very least take the results of the study with a grain of salt. The articles, however, merely relay information from the study and do little in the way of quality control.

One worry, then, is that by merely Google-researching the issue I can end up feeling overly confident that the information presented in my search results is true: not only is the information I’m reading presented uncritically as fact, but all my search results agree with and support one another. Part of the problem lies, of course, with how the information was presented in the first place: while I should take these articles with a grain of salt, the various websites and news outlets wrote them in a way that takes the results of the study at face value. As a result, although it was almost certainly not the intention of the authors of the various articles, they end up presenting misleading information.

The phenomenon of journalists reporting on studies by taking them at face value is unfortunately commonplace in many different areas of reporting. For example, writing on problems with science journalism, philosopher Carrie Figdor argues that since “many journalists take, or frequently have no choice but to take, a stance toward science characteristic of a member of a lay community,” they do not possess the relevant skills required to determine whether the information that they’re presenting is true, and cannot reliably distinguish between those studies that are worth reporting on and those that are not. This, Figdor argues, does not necessarily absolve journalists of blame, as they are at least partially responsible for choosing which studies to report on: if they choose to report on a field that is not producing reliable research, then they should “not [cover] the affected fields until researchers get their act together.”

So it seems that there are at least two major concerns with Google-research. The first relates to the way that information is presented by journalists – often lacking the specialized background that would help them better present the information they’re reporting on, journalists may end up presenting information that is inaccurate or misleading. The second is with the method itself – while it may sometimes be good enough to do a quick Google and believe what the headlines say, oftentimes getting at the truth of an issue requires going beyond the headlines.

Should We All Take a Bit of Lithium?

Lithium is classically associated with severe mental illness and has a somewhat negative connotation among the public. In its concentrated form, it has been documented to alleviate several symptoms of mental illness, but it can also have (and has had) severe negative health consequences for people taking it in high doses.

But this op-ed raises the question: should we all be taking a (very) little bit of lithium? It turns out that lithium occurs naturally, in very small amounts, in many drinking water sources. And there is some evidence that these small amounts may have a surprisingly positive effect: several studies have shown correlations between levels of naturally occurring lithium and positive social outcomes.

Researchers began to ask whether low levels of lithium might correlate with poor behavioral outcomes in humans. In 1990, a study was published looking at 27 Texas counties with a variety of lithium levels in their water. The authors discovered that people whose water had the least lithium had significantly higher rates of suicide, homicide, and rape than people whose water had higher levels. The group whose water had the highest lithium level had nearly 40 percent fewer suicides than the group with the lowest lithium level.

Almost 20 years later, a Japanese study that looked at 18 municipalities with more than a million inhabitants in total over a five-year period confirmed the earlier study’s finding: suicide rates were inversely correlated with the lithium content in the local water supply.

More recently, there have been corroborating studies in Greece and Austria.

This raises several interesting questions. First, should the government and the scientific community be devoting more resources to studying the effects of lithium in these small doses? Second, suppose we found out that there are positive effects. Should we all be drinking water with these naturally occurring levels of lithium or not? Would you want your municipal water supply augmented to achieve this result? What do you think?