Detecting Verbatim LLM Copy-Paste in Homework

Authors: Aizierjiang Aiersilan

Abstract

Large language models (LLMs) have made fluent essay writing, code drafting, and quiz answering instantly available to students at every level, from secondary school through graduate study. Many educators do not object to LLM use per se; what they need to detect is the case in which a student pastes the assignment prompt into a chatbot and submits the model's reply verbatim, without engaging with the work. Existing post-hoc AI-text detectors remain unreliable and have been shown to penalise non-native English writers, while output-side watermarks require cooperation from the model provider. We propose an alternative that the educator controls directly: an input-side watermark in which an invisible instruction is embedded inside the visible assignment prompt itself. An LLM that ingests the prompt verbatim quietly reads the hidden instruction and writes a tell-tale signature into its reply, exposing the copy-and-paste pathway specifically. We describe SteganoPrompt, a single-page, zero-dependency web tool that encodes an arbitrary printable-ASCII payload into the deprecated Unicode Tags block (U+E0000-U+E007F). The encoded string is visually identical to the original, survives common copy-paste channels (Word, Google Docs, PDF, Markdown, Slack, e-mail, the major learning-management systems), and is reliably tokenized by frontier models. We evaluate compliance across seven LLM families and a representative set of educational content channels. The work is informed by my experience as a graduate teaching assistant for an undergraduate software engineering course. The tool is released under the MIT licence at https://ezharjan.github.io/SteganoPrompt/.

Download Full PDF

Bibliographic Reference

Copied successfully!
@article{aiersilan2026detecting,
  title={Detecting Verbatim LLM Copy-Paste in Homework},
  author={Aiersilan, Aizierjiang},
  journal={arXiv preprint arXiv:2605.16336},
  year={2026}
}