Plagiarism and “self-plagiarism” in scientific works in the age of digital technologies

Marina A. Rozhkova,

Olga V. Isaeva

Цифровое право [Digital Law Journal], 2022, No. 2, pp. 25-35, https://doi.org/10.38044/2686-9136-2022-3-2-25-35

Published: June 30, 2022

Abstract

As well as streamlining academic research activities, contemporary technologies provide opportunities to infringe on the intellectual property of others through plagiarism. However, plagiarism has yet to be adequately dealt with in national legislations, which either do not contain any provision in this connection or fail to adequately define the relevant terms. Moreover, there continues to be much discussion as to what plagiarism is, as well as how and in what cases it should be punishable. The situation is further complicated by the various approaches to plagiarism and the lack of consensus on whether so-called “self-plagiarism” should be treated separately from the intentional infringement of other people’s intellectual property or be deemed as a form of plagiarism. With the aim of clarifying these questions, the authors of the present paper compare different approaches taken to the problem of plagiarism and consider some contemporary approaches to plagiarism detection.

INTRODUCTION

The digital age has undoubtedly made it much easier to obtain information necessary for conducting research or preparing a scholarly work, as well as to find materials supporting or refuting particular arguments concerning the subject matter at hand. However, the abundance and variety of information available on the Internet also constitutes a temptation for its more “negligent” use, which, among other things, may infringe the copyright of other authors.

At the same time, contemporary authors and other interested parties may avail themselves of a wide range of technological means of establishing unfair borrowing of their ideas or wordings used in other people’s works. Due to the inherent conflicts arising from such activities, the detection of plagiarism in scientific research regularly causes a public outcry, which, being disseminated on the Internet, becomes widely known.

In this context, it is no longer surprising when unfair borrowings in scientific works, including theses, come to light, along with demands on the part of authors for compensation for infringements of their intellectual property or impairment of their reputations. Meanwhile, practice reveals many problems in this regard that remain unresolved and warrant more thorough consideration.

The problem of improper borrowing has been recognised since at least the first century CE: if someone presented a work of another person or parts thereof as his own, it was equated to theft and called “plagiarism”.¹ However, although the term has become very widespread, it is rarely used in the national legislations of modern countries; moreover, where it does appear, its meaning is not always interpreted in a uniform manner, constituting a significant obstacle to the prevention of such dishonest behaviour.

For this reason, this article presents a considered analysis of the concept of plagiarism along with a review of currently prevalent approaches to its understanding in the academic and legal environments.

RESEARCH QUESTIONS

In previous research (Rozhkova, 2021) we concluded that there are nowadays two main concepts of plagiarism: from the perspective of academic ethics (academic integrity), plagiarism is considered as academic fraud, while, from a legal point of view, it comprises an infringement of intellectual property. Due to being based on different approaches, these concepts are quite distinct. Thus, attempts to “construct” ethical norms on the foundation of copyright or fit them into a Procrustean bed of legal provisions are equally doomed to failure, resulting in a proliferation of inconclusive discussion.

PLAGIARISM FROM THE POINT OF VIEW OF ACADEMIC ETHICS

When referring to academic ethics, we mean the ethical standards that apply to the conduct of academic and scientific research, which are based on the institution of integrity. As the International Centre for Academic Integrity (ICAI) specifically points out, fairness, along with honesty, trust, respect, responsibility and courage, is one of the fundamental elements of academic integrity.²

Here, it should be emphasized that plagiarism not only affects the interests of the authors whose scientific work is being dishonestly borrowed, but also teachers, students and the general public, who are consequently misled as to whom the relevant scientific achievements belong. For this reason, academic ethics is fundamentally distinct from copyright law, which prioritizes the legitimate interests of the author (copyright holder) of the work³ (Atanasova, 2019).

Although generally defining plagiarism as academic fraud, academic and educational institutions often add their own nuances to highlight various characteristics enshrined in their ethical policies. Generally speaking, academic plagiarism refers to giving the false impression that the research has been conducted by the individual author(s), i.e., all the research work has been conducted by the authors themselves, and the final conclusions described as a result of this work therefore belong to these particular authors. Passing off someone else’s research work and/or results as one’s own is considered dishonest, unfair, fraudulent, and not meeting the requirements of academic ethics.

From the perspective of academic ethics, plagiarism is interpreted quite broadly⁴ (Ansorge et al., 2021). For example, it includes the unfair use of others’ ideas or concepts, copying fragments of others’ publications in violation of citation rules, reproduction of others’ work or preparatory materials (including unpublished texts), borrowing data from others’ research or experiments, as well as copying individual phrases from other people’s work without indicating the original author, etc. (Blok, 2021). Thus, it is clear that the scope of objects covered by academic ethics, which also protects scientific theories and mathematical methods, differs significantly from that of copyright, which, as is known, does not protect ideas, concepts, principles, methods, etc.

Therefore, we should agree with the idea often expressed in the literature that the rules of academic ethics are both more stringent than intellectual property law and cover a wider range of phenomena.

Although disclosure of academic plagiarism does not necessarily expose the plagiarist to legal liability, its consequences are generally detrimental and may go as far as the complete destruction of the plagiarist’s academic reputation. In addition, should plagiarism be detected in thesis research on the basis of which the plagiarist has been awarded an academic degree, that person may be stripped of his or her qualification.

For example, in 2000, the former MEP Jorgo Chatzimarkakis was awarded a doctorate for his thesis on information globalism. However, in 2011, the VroniPlag wiki found that almost 72% of his work had been borrowed without sufficient acknowledgement of the original authors.⁵ As a result, in 2012, the University of Bonn found Chatzimarkakis guilty of plagiarism, violation of academic ethics and failure to cite sources properly, and revoked the young politician’s doctorate.⁶

In connection with the foregoing, there is another thing to be mentioned. Plagiarists often complain that their dishonest borrowings have been brought to light not by the allegedly infringed authors of the original works, but by third parties. Nevertheless, since academic ethics is designed to protect the interests not only of authors, but also of the public in a broad sense, such a position is unfounded. Thus, any member of the public or association thereof can legitimately investigate or report fraudulent conduct by a researcher.

PLAGIARISM IN LEGAL TERMS

As considered above, the term “plagiarism” rarely appears in national legislations due to being considered primarily as a matter of institutional policy. However, as a consequence, the number of its possible interpretations is becoming significant.

In the widest sense, plagiarism is thought of as any “intellectual theft” where a person usurps the authorship of another’s intellectual product in whole or in part (this is the interpretation reflected in Russian criminal law). In this legislative context, the objects of plagiarism are not limited to copyrighted works but include everything deemed to constitute intellectual property. Thus, some authors refer to plagiarism of inventions, industrial designs or even trademarks (Hung et al., 2021).

For instance, in 2014, a Russian court heard a case where an author successfully argued that a scheme appearing in a scientific paper published by third parties was a generalized scheme that he had synthesized and researched in his doctoral thesis. As a result, his copyright was recognized as having been infringed and the dishonest researchers were compelled to prepare and publish a correction to the authorship of the scheme presented in their article⁷.

A narrower concept of plagiarism is set out in the provisions of the 1886 Berne Convention for the Protection of Literary and Artistic Works (the Berne Convention). Although authors are free to quote the works of other authors that have been made public, such quotation must be made in good faith and to the extent justified by the purpose. When referring to the works of others, authors in countries that are party to the Berne Convention must indicate the sources of borrowing and the authors of the corresponding work if their name is provided (Article 10). Russian copyright law similarly allows the free use of other people’s works subject to obligatory citation.

In the light of the foregoing, plagiarism is narrowly understood to be an improper borrowing of another author’s work (or its parts) when creating one’s own work, i.e., a borrowing that is in violation of citation rules due to its failure to specify the author or source of borrowing. Whereas copyright covers only works of authorship, the scope of objects of plagiarism in this regard shall be deemed narrower: including only copyright items that are capable of being quoted, i.e., mainly literary and scientific works. In the digital era, the possibility of “quoting” photographic works has become the subject of much discussion. However, in the case of non-compliance with citation rules when referring to someone else’s photographic works, such improper usage would seem to be more about illegal reproduction or communication to the public than plagiarism in the narrow sense.

Since a failure to comply with the obligation to indicate the name of the author and source of borrowing when borrowing (using) parts of another’s work constitutes an offence, the author of the original work is entitled to claim compensation for their violated rights. Here, the legal liability is independent of the negative consequences that may occur in case of academic fraud detection. For example, even if the revelation of plagiarism in a thesis results in the annulling of a degree previously granted to that plagiarist, this does not present an obstacle for the author(s) of the original work(s) to claim compensation for copyright infringement.

To summarize, the academic and narrow legal concepts of plagiarism are not mutually exclusive, but together provide effective remedies to those who have suffered as a result of a plagiarist’s unfair actions, which can be implemented either separately or in combination as appropriate.

PLAGIARISM AND WORD-FOR-WORD COPYING ARE NOT SYNONYMOUS

According to a currently widely held opinion, only a literal match of sentences or whole paragraphs in different works of different authors may be considered as plagiarism in the narrow sense. According to this belief, the later of any two such works is automatically considered to be a result of plagiarism. In particular, current Russian legal doctrine and case law typically rule plagiarism to have occurred when there is a word-for-word copying of someone else’s intellectual result (or its parts) in violation of the statutory rules of citation, i.e., without specifying the author of the original text and the source of borrowing (Isaeva, 2021).

However, this simplistic point of view fails to take a number of factors into account. First of all, it should be understood that professional texts (legal, technical, natural science, etc.) are often structured around the same commonly used phrases or expressions. In particular, legal texts are often characterized by use of extracts of laws and regulations. When drafting papers, legal scholars refer to the same laws, not only mentioning their titles, but also quoting (with or without quotation marks) particular legislative provisions. Independent researchers may rely on the same reference sources in their research, as well as supporting their own reasoning with the same citations. In such cases, even if there is verbatim coincidence, the issue of plagiarism does not arise.

As noted above, from the point of view of academic ethics, even a mere borrowing of others’ ideas or work materials (which do not fall under copyright protection) constitutes plagiarism. The latter can be revealed not only when parts of different authors’ works coincide word for word, but also when someone else’s work is paraphrased, i.e., without being copied literally. Plagiarists may achieve such a result by various means including automated solutions: today, there are many applications that automate the paraphrasing of a text. Even translation applications, while not specifically designed for this purpose, can be used to carry out such “cosmetic” changes with no change of sense.

Out of the variety of methods used by plagiarists to conceal plagiarism, edited, disguised and translated plagiarism variants can be distinguished.

Disguised and edited plagiarism are similar in nature: in both cases, the borrowed text is reproduced not word for word, but with some alterations. However, in the case of disguised plagiarism, the modifications of the original text are insignificant: the plagiarists change word forms, rearrange phrases and sentences, use synonyms, etc. In the case of edited plagiarism, the borrowed text undergoes comprehensive literary editing; as a result, while it may look completely different from the original, it preserves the content and general idea (Levin, 2018). Translated plagiarism, as the name implies, corresponds to the translation of an original text into another language and publication of such a translation under the guise of an independent study, with no information about the fact of borrowing (Mazov & Gureyev, 2017).

On a practical level, the diversity of technical means for concealing plagiarism significantly complicates its detection, as well as preventing the identification of individual plagiarists. Nevertheless, the use of various applications and services by authors when preparing their works is not in itself evidence of plagiarism — or an attempt to disguise it. Modern technologies support a wide variety of legitimate tools for editing texts to improve the quality of created works.

CONTEMPORARY TOOLS FOR REVEALING PLAGIARISM

Experts usually draw attention to the fact that special tools and methods aimed at identifying plagiarism have been developed due to the diversity of its forms and variety of methods of its concealment.

First of all, we should mention automated text analysis conducted by means of specially created algorithms, which is undoubtedly much more effective than the detection of plagiarism by human agents (Mazov et al., 2016). Curiously, some researchers hold the opinion that it is unacceptable to use automated review tools to identify plagiarism in a text, and that plagiarism should only be detected manually. While the present authors do not agree that the use of technical means to detect plagiarism is unreasonable, in order to exclude accidental repetitions, commonly used phrases and expressions, titles or citations of laws and articles, etc., text borrowings detected automatically should be subsequently analysed by a competent human. In other words, in order to correctly identify plagiarism, it is necessary to use both automated and “manual” analysis, using a variety of means and methods — as well as, if necessary, special expertise.

The latter is due to the possibility that, although an author of a scientific work may use their own research which has already been published (i.e., conduct new research based on the results of their previous one), an automated check is likely to consider such text fragments to be borrowed, when, in fact, they are not⁸ (the concept of “self-plagiarism” will be considered further). The researchers point out that automated checking is clearly insufficient in such cases: only an expert can provide a substantive, evidence-based opinion that the fragments of the text defined by a machine as borrowed actually belong to the author; moreover, in principle, the evidence used to support such an opinion can only be provided by the original author him- or herself (Kirillova, 2019).
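The division of labour described above, in which automated matches are only a starting point for expert review, can be sketched roughly in Python. The `Match` record, the `triage` function, the three queue names and the sample data are all hypothetical and do not correspond to any real detection service. The sketch assumes that the automated check reports each overlapping fragment together with the author of the source in which it was found.

```python
import re
from dataclasses import dataclass


@dataclass
class Match:
    fragment: str        # overlapping text reported by the automated check
    source_author: str   # author of the work in which the fragment was found


def normalise(text):
    """Lower-case the text and drop punctuation and extra spacing."""
    return " ".join(re.findall(r"\w+", text.lower()))


def triage(matches, author, stock_phrases):
    """Sort automated matches into three queues for the human reviewer."""
    stock = {normalise(phrase) for phrase in stock_phrases}
    queues = {"stock_phrase": [], "own_earlier_work": [], "possible_borrowing": []}
    for match in matches:
        if normalise(match.fragment) in stock:
            # Titles of laws, set expressions and similar common wording.
            queues["stock_phrase"].append(match)
        elif normalise(match.source_author) == normalise(author):
            # Reuse of the author's own publication: needs expert confirmation.
            queues["own_earlier_work"].append(match)
        else:
            queues["possible_borrowing"].append(match)
    return queues


if __name__ == "__main__":
    found = [
        Match("Civil Code of the Russian Federation", "Legislator"),
        Match("results of my earlier experiment", "A. Author"),
        Match("a novel synthesis of the scheme", "B. Other"),
    ]
    for name, items in triage(found, "A. Author", ["Civil Code of the Russian Federation"]).items():
        print(name, [m.fragment for m in items])
```

None of the queues amounts to a finding of plagiarism. The code only orders the material so that a competent reviewer can concentrate on the fragments that cannot be explained by common wording or by the author's own earlier research.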

In addition to the usual collation approach, other less well-known comparison methods are used in cases where, while there are no grounds to speak about the coincidence of texts, there are reasons to suspect the author of a later scientific work to be guilty of plagiarism, or in cases where plagiarism is detected on the Internet.

One such method is “digital fingerprinting”. With this technique, a summary of the materials is fed into a program, which compiles “fingerprints” that allow any subsequent “borrowings” from the works to be identified automatically. If the “fingerprints” match, a search for common text fragments is initiated. Another comparison method uses keywords to identify the research topic, allowing one work to be compared with another automatically to reveal their similarity. These may be contrasted with the citation analysis method: rather than relying on the similarity of texts, it studies every mention of particular information in the text in order to identify similarities of citation patterns (Osipov & Beresnev, 2016).
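As a rough illustration of fingerprint comparison and citation analysis, consider the following Python sketch. It is not the algorithm of any particular detection service: the word-level k-grams, the SHA-1 hashing, the Jaccard measure, the use of the longest common subsequence for citation patterns and all function names are assumptions chosen for illustration.

```python
import hashlib
import re


def fingerprints(text, k=3):
    """Return hashes of every run of k consecutive words in the text.

    Word-level k-grams and SHA-1 are assumptions made for this sketch.
    """
    words = re.findall(r"\w+", text.lower())
    grams = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    return {hashlib.sha1(gram.encode("utf-8")).hexdigest() for gram in grams}


def jaccard(a, b):
    """Shared items divided by all distinct items in the two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def shared_citation_sequence(refs_a, refs_b):
    """Length of the longest sequence of sources cited in the same order."""
    table = [[0] * (len(refs_b) + 1) for _ in range(len(refs_a) + 1)]
    for i, ref_a in enumerate(refs_a):
        for j, ref_b in enumerate(refs_b):
            if ref_a == ref_b:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    return table[-1][-1]


if __name__ == "__main__":
    earlier = "the quick brown fox jumps"
    later = "the quick brown fox sleeps"
    print(jaccard(fingerprints(earlier), fingerprints(later)))
    print(shared_citation_sequence(["A", "B", "C", "D"], ["A", "C", "E", "D"]))
```

The same `jaccard` function could be applied to two sets of keywords instead of fingerprints. A high score from any of these measures is only a signal that a closer comparison of the two works is warranted.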

The above discussion demonstrates a kind of arms race in the use of modern technologies in the context of plagiarism: plagiarists use various means, including automation, to conceal plagiarism, while the identification of plagiarism involves tools based on specially developed algorithms, which, in turn, forces plagiarists to use even more sophisticated methods, and so on.

“SELF-PLAGIARISM” IS NOT A FORM OF PLAGIARISM

The term “self-plagiarism” is used to refer to the full or partial reproduction of an author’s previous work in another scientific work by the same author without any acknowledgement thereof. At the same time, many experts highlight the ambiguity of the term itself, as well as its application to excessively diverse practices (Bird, 2002), which include: dual or redundant publication (republishing by the author of the same research paper under different titles and in different editions); flow publication / autoplagiarism (publishing a series of articles on one topic while only a little new material is added in each of them); “salami” publication (where significant research is split into small articles).

Since national legislations contain no general rule restricting authors from reusing their own intellectual results, most legal scholars tend to believe that there are no grounds to find an offence where an author reproduces their work (or its parts) in their other works. It is logically impossible for an author to infringe his or her own intellectual property; indeed, as mentioned above, copyright law assigns primary importance to protecting the rights holder. From a legal perspective, “self-plagiarism” cannot therefore be considered as a form of plagiarism.

However, from the perspective of publication ethics, such a misuse of the results of one’s creative work is generally recognized as unfair behaviour. Standards of publication ethics are established by scientific publishers with due regard to the recommendations of the Committee on Publication Ethics (COPE) (https://publicationethics.org) and set out in the publication policies / guidelines of peer reviewed journals and publishing houses (such policies are typically made publicly available on publishers’ websites). It is within the context of publication ethics, which regulates the relationship between authors, reviewers, editors and readers during the creation, publication, distribution and use of scientific publications, that the term “self-plagiarism” has grown in popularity.

Nevertheless, the concept of “self-plagiarism” seems to be in some conflict with copyright law, as can be seen in the following.

In legal terms, when considering the publication of the results of scientific research, the points at issue are generally covered by a license agreement between the author and the publisher, which provides for the transfer of an exclusive or non-exclusive license from the former to the latter. Where an exclusive license is transferred to the publisher, the author’s republication of previous research (or part thereof) without the consent of the previous publisher is an offence because it breaches that agreement; here, no issue connected with “self-plagiarism” arises. Where the author transfers a non-exclusive license to the publisher (which may be granted due to the author’s desire to republish their scientific work in order to expand their readership), it is the publisher that commits the offence when prohibiting the author from using his or her work contrary to the terms of the agreement; in this case, the concept of “self-plagiarism” does not change the situation from a legal point of view.

Although multiple publication of the same material is recognized as dishonest behaviour, the subsequent retraction of such papers, which operates as a coercive measure applied for the violation of ethical standards, raises several legal questions to which there are no decisive answers. This tendency can be considered as reducing the regulatory status of copyright law, creating an alternative to legal mechanisms for regulating intellectual property relations and consequently decreasing the lawfulness of actions in this field (Bogustov, 2021).

The concept of self-plagiarism is not so relevant for academic ethics, which applies to academic and scientific research relations. In this context, “self-plagiarism” is mentioned much less frequently; in the majority of cases, as an overview of contemporary publications shows, it refers to situations when course papers are duplicated and submitted by the same student to different university departments. However, the same issue may be raised with regard to theses. For instance, after analysing a case where almost one third of a lawyer’s thesis was reproduced from an earlier dissertation without including any reference thereto, Stefan Weber stresses that it is the extent of borrowing and the academic level of the thesis that should be decisive⁹. In this connection, while “self-plagiarism” may be deemed to be dishonest behaviour, it may by no means be equated with “intellectual theft” (Resnik, 1998).

Finally, it is sometimes even obligatory to use one’s own prior works when preparing a new one. In this context, we refer to the rules of awarding postgraduate degrees or applying for a research grant. In the first case, prior to commencing work on the thesis itself, it is necessary to publish the results of research in the form of articles in order to present them to the wider academic community. In the second case, when applying for a grant, the author should provide a list of his or her own publications, which are to be further developed if the grant is awarded. In both cases, while there should certainly be no word-for-word reproduction of previously published works, a failure to comply with the rule to base the research on some previous results would constitute a breach of procedure.

To summarize, the concepts of academic plagiarism and plagiarism in the narrow (legal) sense can be applied simultaneously in particular cases; the concept of “self-plagiarism”, by contrast, may come into conflict with the rules of contract law or intellectual property law.

CONCLUSION

This study supports the conclusion that the contemporary concepts of academic plagiarism and plagiarism in the narrow (legal) sense can be applied simultaneously, allowing authors to protect their scientific research and copyrights effectively. However, the use of contemporary technological means to conceal plagiarism complicates, though it does not exclude, the process of obtaining satisfaction for rights holders. The detection of plagiarism also increasingly relies on specially developed technological approaches. Moreover, both definitions of plagiarism must be clearly distinguished from the concept of “self-plagiarism”, which is not always consistent with the law.

It is hoped that the problems outlined in this article and the opinions expressed herein will serve as a basis for further discussion of the problems in the area under consideration.

REFERENCES

1. Ansorge, L., Ansorgeová, K., & Sixsmith, M. (2021). Plagiarism through paraphrasing tools - The story of one plagiarized text. Publications, 9(4), 48. https://doi.org/10.3390/publications9040048

2. Atanasova, I. (2019). Copyright infringement in digital environment. Economics and Law, 1(1), 13-22. http://el.swu.bg/wp-content/uploads/2019/07/COPYRIGHT-INFRINGEMENT-IN-DIGITAL-ENVIRONMENT.pdf

3. Bird, S. (2002). Self-plagiarism and dual and redundant publications: What is the problem? Commentary on ‘Seven ways to plagiarize: Handling real allegations of research misconduct’. Science and Engineering Ethics, 8(4), 543-544. https://doi.org/10.1007/s11948-002-0007-4

4. Blok, P. (2021). Plagiarism and copyright. In J. Soeharno, A. Keimpe, D. Supported, V. Schot, M. Karen, P. Blok, E. Dommering, J. Vugt, & J. Zweistra (Eds.), Plagiarism in academic research and education (pp. 42-51). Association of Universities in the Netherlands (VSNU).

5. Bogustov, A. (2021). Pravovye aspekty retraktsii [Legal aspects of retraction]. In M. A. Rozhkova (Ed.), Pravo cifrovoj ekonomiki - 2021 (17): Ezhegodnik-antologiya [The Law of the Digital Economy - 2021 (17): Annual-anthology] (pp. 448-461). Statut.

6. Hung, K. M., Chen, L. M., & Chen, T. W. (2021). Trademark infringement recognition assistance system based on human visual gestalt psychology and trademark design. EURASIP Journal on Image and Video Processing, (2021), Article 27. https://doi.org/10.1186/s13640-021-00566-2

7. Isaeva, O. (2021). Predely zaimstvovaniya chuzhogo proizvedeniya pri sozdanii sobstvennogo [Limits of borrowing someone else’s work when creating one’s own]. Hozyastvo i Pravo, (10), 62-71.

8. Kirillova, T. (2019). Problema plagiata pri podgotovke nauchno-pedagogicheskikh kadrov vysshey kvalifikatsii v obrazovatelnikh organizatsiyakh FSIN Rossii [The problem of plagiarism in the training of top-qualification academics in educational organisations of the Russian Federal Penitentiary Service]. In Penitetiarnaya besopasnost’: Nationalniye traditii i zarubezhny opyt [Penitentiary security: National traditions and international experience] (pp. 94-96). Samarsky Yuridichesky Institute FSIN Rossii.

9. Levin, V. (2018). Plagiat i problem etiki v epokhu kumpiyuterov [Plagiarism and ethics issues at the age of computers]. In V. O. Sheleketa (Ed.), Iskusstveny intellect: Eticheskie problemy “cyfrovogo obschestva” [Artificial intelligence: Ethics problems of the “digital society”] (pp. 36-44). Belgorodsky Gosudrstveny Tekhnologichesky Universitet im. V.G. Shukhova.

10. Mazov, N., Gureev, V., & Kosyakov, D. (2016). On the development of a plagiarism detection model based on citation analysis using a bibliographic database. Scientific and Technical Information Processing, (43), 236-240. https://doi.org/10.3103/S0147688216040092

11. Mazov, N., & Gureyev, V. (2017). Vyyavleniye plagiata na osnove analiza tsitirovaniya: Problem i resheniya [Plagiarism detection based on citation analysis: Problems and solutions]. Proceedings of SPSTL SB RAS, (12), 355-362.

12. Osipov, I., & Beresnev, A. (2016). Metody obnaruzheniya plagiata v tekstakh studencheskikh rabot [Methods of plagiarism detection in student papers]. In Almanac of Young Researchers’ Papers of ITMO University, Proceedings from University of information technologies, mechanics and optics, (4), (pp. 95-98). Sankt-Peterburgsky Nationalny Issledovatel’sky Universitet Informatsionykh Technology, Mekhaniki i Optiki.

13. Resnik, D. (1998). The ethics of science: An introduction. Routledge.

14. Rozhkova, M. (2021). Plagiat i inye vidy nekorrektnykh zaimstvivaniy v dissertatsiyakh: Pravoviye i eticheskiye voprosy [Plagiarism and other types of incorrect borrowings in dissertations: Legal and ethical issues]. Zhurnal Suda Po Intellectual’nym Pravam, 3(33), 124-140.

15. Tlitova, A., & Toshchev, A. (2019). Obzor sushchestvuyushchikh instrumentov vyyavleniya plagiata i samoplagiata [Review of Existing Tools for Detecting Plagiarism and Self-Plagiarism]. Russian Digital Libraries Journal, 22(3), 143-159. https://doi.org/10.26907/1562-5419-2019-22-3-143-159
