arXiv tightens AI-authorship rules with one-strike year-long ban
Authors caught posting unverified LLM output, with hallucinated references and stray model chatter as the telltale signs, face a 12-month ban and a requirement that future submissions clear peer review first.
The pre-print server arXiv has formalised what it calls a "one-strike" rule against authors whose submissions show "incontrovertible evidence that the authors did not check the results of LLM generation," according to TechCrunch's reporting on 16 May 2026. The penalty is a one-year ban, after which the author may submit again only via a paper that has already cleared a "reputable peer-reviewed venue".
Thomas Dietterich, chair of arXiv's computer science section, framed the rationale in plain terms: "if a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can't trust anything in the paper". The triggers arXiv cites include hallucinated references and messages to and from the model accidentally left in the manuscript body.
Procedurally, moderators flag suspicious submissions and section chairs confirm the evidence before any ban is imposed. Authors retain the right to appeal. Responsibility for content remains with the human authors "irrespective of how the contents are generated".
The move continues a pattern of arXiv tightening submission integrity. Earlier measures included endorsement requirements for first-time posters. arXiv's recent transition to independent non-profit status is given as part of the context for the policy change.
Two questions the reporting does not resolve are worth flagging. First, scope: the announcement focuses on the computer science section, but whether the same rule applies to physics, mathematics, biology, and the other arXiv archives is not stated. Second, enforcement granularity: there is no public number on how many submissions arXiv has already caught under this standard, or what fraction of incoming computer-science submissions show LLM-generation artefacts.
For working ML researchers, the practical implication is unambiguous. If a paper draft contains references that don't resolve, or any visible trace of the model that wrote the prose, that paper is now a career-affecting risk. Read your own citations.
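A rough pre-submission check along those lines can be scripted. The sketch below is illustrative only: the phrase list and the function names `find_chatter` and `extract_dois` are hypothetical, chosen for this example, and have nothing to do with whatever criteria arXiv's moderators actually apply. It flags lines that look like leftover assistant boilerplate and collects the DOIs in a draft so that each one can be resolved by hand.

```python
import re

# Hypothetical phrase list: common assistant boilerplate, NOT arXiv's criteria.
CHATTER_PATTERNS = [
    r"as an ai language model",
    r"as a large language model",
    r"certainly[!,] here(?: is|'s)",
    r"i hope this helps",
    r"regenerate response",
]
CHATTER_RE = re.compile("|".join(CHATTER_PATTERNS), re.IGNORECASE)

# Loose DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>{}]+")


def find_chatter(text):
    """Return (line_number, line) pairs that match any chatter pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if CHATTER_RE.search(line):
            hits.append((number, line.strip()))
    return hits


def extract_dois(text):
    """Return the unique DOIs in the text, in order of first appearance."""
    seen = []
    for match in DOI_RE.findall(text):
        doi = match.rstrip(".,;)")  # drop trailing sentence punctuation
        if doi not in seen:
            seen.append(doi)
    return seen


if __name__ == "__main__":
    draft = (
        "Intro.\n"
        "Certainly! Here is the revised abstract:\n"
        "See doi:10.1000/xyz123.\n"
    )
    for number, line in find_chatter(draft):
        print(f"line {number}: possible model chatter: {line}")
    for doi in extract_dois(draft):
        print(f"check by hand: https://doi.org/{doi}")
```

A clean run proves nothing. A fabricated reference can carry a well-formed DOI, or none at all, so the output is a list of things to open and read, not a verdict.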