ACL ARR Ethics Review Flagging Guidelines

February 14, 2024

By Malihe Alikhani and Vinodkumar Prabhakaran

Ethics reviews are a cornerstone of maintaining integrity and responsibility in academic research. As we navigate through a multitude of submissions for our upcoming conference proceedings, it’s vital to ensure that our papers adhere to ethical standards, rules, and regulations. This guide aims to assist reviewers and action editors (AEs) in detecting papers that may need an in-depth ethics review, as well as preparing the reviewers to conduct effective and efficient ethics reviews, ensuring a high standard of compliance throughout our conference proceedings.

The Importance of Ethics Review

Ethics reviews are not just a formality; they are critical to the integrity of academic discourse. In particular, the ethics reviews are meant to assess whether the submitted research exhibits any substantial ethical issues, that is, issues that pose an increased risk of harm outside the current norms of NLP or CL research. It is especially critical to uphold these standards as we engage in research that deals with sensitive data and/or controversial topics. As ethics reviewers, you play a pivotal role in safeguarding these standards.

Before diving into the review process, familiarize yourself with the specific guidelines for ethics reviewing that ARR has drafted. Remember, an ethics review is not just about ticking boxes, nor is it a technical review; it’s about engaging critically with the material and its implications. You can find more information about this and about the process in our previous blog posts:

Guidelines for Ethics Reviewing:

ACL code of conduct:

Guidelines for Flagging Papers for Ethics Review

One of the critical steps in the ARR pipeline is the flagging of papers for ethics reviewing. This is done by the action editors during the early stages of the reviewing cycle, or by the reviewers during the deeper technical review they perform. Whenever they flag a paper they deem to require an in-depth ethics review, they are required to provide a justification.

This is an important step: it routes papers to the ethics chairs, who make an initial assessment and then assign the papers for full in-depth ethics reviews. While ethics reviews are essential in maintaining the integrity of research, it’s vital to discern when they are truly needed. Over-flagging not only burdens the review process but might also hinder valuable research from moving forward.

Based on the past few cycles of ARR ethics review operationalization, we have identified a few patterns that result in false flags. We surface these recurring patterns here because we believe they will help reviewers and action editors correctly identify the papers that should be flagged for deeper ethics reviewing, while also eliminating false flags that stem from misconceptions about ethics reviewing.

What to Avoid?

  1. No justification: A large majority of flagged papers have the justification field left empty or filled with “N/A”, “None”, “No ethical concerns”, etc. This unnecessarily adds to the workload of the ethics chairs and reviewers. If you are flagging a paper for full ethics review, please give a clear and succinct justification so that ethics chairs can appropriately assign ethics reviewers for the paper.
  2. Flagging for Missing Section(s): Another substantial number of papers get incorrectly flagged because of a missing ethical considerations section, a missing limitations section, or an incompletely or incorrectly filled Responsible AI checklist. While these issues may need to be brought to the AE’s attention, and in some cases may justify desk rejection (in the case of the Limitations section for certain conferences), they do not justify a full ethics review. Adding an ethical considerations section is not mandatory; however, if you believe there are specific ethical issues that warrant a full review, please flag the paper.
  3. Flagging for Copyright, Consent, Transparency: It is important that the ARR process catches submissions that may violate the copyright policies of the datasets used, rely on an inadequate informed consent process, or lack transparency about the presented artifacts. However, these issues in and of themselves do not justify a fuller ethics review. Instead, since you have already identified the issue, you can ask the authors to address it in your review or meta review itself. If there are aspects that require a deeper look, you can of course flag the paper for a full ethics review, outlining the concern.

Common Misconceptions

  1. All Data-centric Papers require Ethics Review: A common misstep is assuming that any paper discussing data usage needs a full ethics review. This is not the case. Instead, the focus should be on potential misuse or unethical handling of data. A mere description of data usage does not automatically warrant an in-depth review.
  2. Use of Human Annotators requires Ethics Review: The involvement of human annotators in a study does not on its own justify an ethics review. However, if there are specific concerns about the human annotation step, such as potential exploitation or ethical lapses, these may warrant an in-depth ethics review.
  3. All Datasets on Sensitive Topics require Ethics Review: Datasets that feature sensitive content, such as those built for misinformation detection, might raise eyebrows due to their potential dual use. But again, the mere presence of such content does not automatically demand a full ethics review. If the work involves engaging with specific marginalized communities (e.g., linguistic minorities) or if the annotation involves potentially traumatizing data (e.g., hate speech or other graphic content), then it may justify an in-depth ethics review to assess whether adequate safeguards were taken. In such cases, explain your reasons as justification while flagging for ethics review.
  4. All Papers on High-stakes Domains require Ethics Review: While all of NLP research can arguably have downstream human impact, data/methodological innovations on high-stakes domains may have an immediate impact on users or communities. These include educational use cases such as automated grading scenarios, or medical use cases such as in psychotherapy, or legal use cases such as criminality prediction. These may also include challenges in de-identification in high-stakes domains such as healthcare. However, just because a paper focuses on such a high-stakes domain doesn’t by default necessitate a full ethics review. What matters is the context and application of the research. If regulations have been followed and necessary precautions taken, and these are detailed in the paper, then sensitive research areas can be explored without automatically qualifying for a full ethics review.
  5. All Papers on NLP ethics require Ethics Review: A paper engaging with questions on NLP ethics and fairness does not, in and of itself, justify an in-depth ethics review. However, many of these papers do engage with potentially sensitive topics and may sometimes need an in-depth review. This determination should be based on such additional context, rather than merely on the topic being NLP ethics.
